A Hitchhiker’s Guide to Knowledge Galaxies (WIP)

Mapping Conceptual Galaxies (Georg von Welling)



The aim of ontologies is to make sense of the discourses about the nature of things. To that end they must weave together thesauruses, for the meaning of discourses, and models, for the representation of things:

  • Thesauruses deal with the meaning of terms, categories, and concepts.
  • Models add syntax to build structured representations of contexts and concerns.
  • Ontologies add pragmatics in order to ensure the functional integration of models and thesauruses in physical and symbolic environments.

Ontology Architecture

Ontologies can be built from scratch or on top of a kernel, the conceptual equivalent of an API (Application Programming Interface).
The scratch option aims to redefine the whole of syntactic and semantic constructs, mixing meanings and representations; as a result, its “open” capability (or ecumenism) can only be achieved through translations, and the cumulative updates of those translations create exponential complexity.
By contrast, the kernel option only adds ontological constructs to unambiguous conceptual primitives, maintaining a distinction between meanings (thesauruses) and representations (modeling languages). That makes for a two-tiered built-in interoperability, with models on the one hand and thesauruses on the other.

Basically, ontological entries can be organized in terms of:

  • Concepts: for pure semantic constructs defined independently of instances or categories
  • Categories: for symbolic descriptions of sets of objects or phenomena; Categories can be associated with actual (descriptive, extensional) or intended (prescriptive, intensional) sets of instances
  • Aspects: for symbolic descriptions of features (properties or behaviors)
  • Facts: for observations of objects or phenomena 
  • Documents: for the symbolic contents of media, with instances representing the actual (digital or physical) copies
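
The five kinds of entries above could be sketched as a small data structure. This is only an illustration under stated assumptions, not a reference schema: the names `EntryKind`, `Extension`, and `Entry` are hypothetical, and restricting the actual/intended distinction to categories is a reading of the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntryKind(Enum):
    """Kinds of ontological entries, following the list above."""
    CONCEPT = "concept"      # pure semantic construct
    CATEGORY = "category"    # symbolic description of a set of objects or phenomena
    ASPECT = "aspect"        # symbolic description of features
    FACT = "fact"            # observation of objects or phenomena
    DOCUMENT = "document"    # symbolic contents of media


class Extension(Enum):
    """How a category relates to its set of instances."""
    ACTUAL = "descriptive, extensional"
    INTENDED = "prescriptive, intensional"


@dataclass(frozen=True)
class Entry:
    name: str
    kind: EntryKind
    # Assumption of this sketch: only categories carry an extension.
    extension: Optional[Extension] = None

    def __post_init__(self):
        if self.extension is not None and self.kind is not EntryKind.CATEGORY:
            raise ValueError("only categories carry an extension")


# Hypothetical entries, for illustration only.
vehicle = Entry("Vehicle", EntryKind.CONCEPT)
car = Entry("Car", EntryKind.CATEGORY, Extension.ACTUAL)
```

Keeping the kind of an entry explicit is one way to check, entry by entry, that meanings and representations are not mixed.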

The challenge is then to ensure the consistency and interoperability of meanings and representations.

Contents & Communication

As far as enterprises are concerned, the architecture of ontologies should match the capabilities of information systems; to that end the contents of thesauruses and models can be expounded in terms of data (observations from environments), information (data plus structures and semantics), and knowledge (information put to use):

Ontologies as Glue between Meanings & Purposes
  • Thesauruses federate meanings across digital (observations) and business (analysis) environments
  • Models deal with the integration of digital flows (environments) and symbolic representations (systems)
  • Ontologies ensure the integration of enterprises’ data, information, and knowledge, enabling a long-term alignment of enterprises’ organization, systems, and business objectives

The objective is then to ensure the consistency of meanings independently of the ways contents are obtained:

  • Direct (conversational) communication: when meanings can be set directly (without mediation) by actual context or through dialog
  • Mediated communication: when meanings are obtained through a mapping of contents and shared categories

That can be achieved by combining layered representations of semantics (networks) and knowledge (graphs).

Networks, Graphs, & Galaxies

For thesauruses set in ontologies, the primary objective is to attach words to environments’ objects and phenomena, whatever their nature (actual or symbolic) and naming source (external or internal). To that effect, names are first attached to observations, taking into account the specificities of contexts (e.g., business domains) and the heterogeneity of sources (e.g., data lakes or factories). Alongside traditional models, and beyond various terminologies, different kinds of networks and graphs can be used, often with overlaps, to organize and refine representations:

  • Neural networks are built from tokenised sentences, sounds, and images, with deep learning used to identify connectors
  • Semantic networks represent words and their relationships
  • Syntactic graphs map documents to grammar categories
  • Conceptual graphs add reasoning capabilities to concepts and categories
  • Knowledge graphs add modal (aka epistemic) categories

Cut to the bone, these representations are all made of nodes and connectors, and the objective is to ensure a seamless and consistent integration of semantic, conceptual, and knowledge representations, and consequently the interoperability of corresponding applications, typically natural language processing, business intelligence, supporting systems, and decision-making.

Semantic Networks & Gravity

The first stage pertains to observations from physical (facts) or symbolic (documents) environments. Assuming music-like lexical interoperability, thesauruses in ontologies should combine top-down and bottom-up perspectives:

  • Top-down meanings set by convention and specific to domains, defined literally on discrete scales.
  • Bottom-up meanings built from observed communications independently of domains, with meanings weighted relatively on continuous scales.

Semantic networks could thus provide a common launching pad, with connecting weights defined or observed depending on perspective:

Observed and defined semantic bounds

That clockwise approach would then be extended to support conceptual and knowledge representation. Concomitantly, a counterclockwise approach could build syntactic graphs using lexical markers for grammatical categories in order to combine syntax and semantics.

Conceptual Networks & Planetary Systems

As figured above, networks and graphs overlap and don’t come with built-in distinctions between nodes mapped to natural language constructs (e.g., Role and Reference Grammar) and nodes mapped to modeling constructs (e.g., UML). Yet such distinctions are needed in order to ensure the transparency and interoperability of representations; that can be achieved with astronomy and gravitational forces as a metaphor.

On that account semantic networks would be represented as planetary systems built from:

  • Stars, planets, and satellites, respectively for concepts, categories, and facts
  • Gravity forces, for positive or negative semantic bonds between terms
  • Star systems and orbits, for the abstraction levels within (subtypes) and between (realization) perspectives

Semantic weights (SW) would vary between -1 (antonym) and +1 (synonym), between concepts, from concepts to primary terms, and from primary to secondary terms; e.g.:

Semantic Galaxy
  • T1 meaning is set by Ca (SW=.8)
  • T2 meaning is set by Ca (SW=.4) and Cb (SW=.7)
  • T3 meaning is set by Ca (SW=.2) and Cc (SW=-.7)
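
The weights listed above can be written down as a small weighted graph. The following is a minimal sketch, taking the Cc weight of T3 to be -0.7; the function names `validate` and `strongest_bond` are hypothetical, and picking the bond with the largest absolute weight is just one possible way to read such a table.

```python
# Semantic weights (SW) from the example above: term -> {concept: weight}.
SEMANTIC_WEIGHTS = {
    "T1": {"Ca": 0.8},
    "T2": {"Ca": 0.4, "Cb": 0.7},
    "T3": {"Ca": 0.2, "Cc": -0.7},
}


def validate(weights):
    """Check that every weight lies between -1 (antonym) and +1 (synonym)."""
    for term, bonds in weights.items():
        for concept, sw in bonds.items():
            if not -1.0 <= sw <= 1.0:
                raise ValueError(f"{term}->{concept}: weight {sw} out of range")


def strongest_bond(term, weights=SEMANTIC_WEIGHTS):
    """Return the (concept, weight) pair with the largest absolute weight."""
    bonds = weights[term]
    concept = max(bonds, key=lambda c: abs(bonds[c]))
    return concept, bonds[concept]


validate(SEMANTIC_WEIGHTS)
print(strongest_bond("T2"))  # ('Cb', 0.7)
print(strongest_bond("T3"))  # ('Cc', -0.7)
```

Note that the strongest bond of T3 is a negative one: the term is mostly defined by what it is opposed to.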

It must be noted that such semantic planetary systems are set at thesaurus level and thus deal with meanings, whether defined top-down from business or institutional domains, or emerging bottom-up from communications independently of specific contexts. It ensues that planetary systems can also be associated with domains, ensuring a seamless integration of defined and observed meanings (thesauruses) with canned ones (models).

Knowledge Graphs & Galaxies

Planetary systems introduce three tiers of nodes and a transition to conceptual networks, providing a bridge between semantic networks (for thesauruses), conceptual graphs (for models), and knowledge graphs (for ontologies). These tiers can also be aligned with the facets of the ontological prism:

  • Semantic networks are used to map facts (data) to words (thesauruses)
  • Conceptual graphs (semantic networks with planetary systems) can then integrate thesauruses with models (information)
  • Finally, knowledge graphs are used to build ontologies from conceptual networks and galaxies

For instance:

Weaving the Fabric of Meanings
  • ‘Wheels’ is a familiar equivalent for ‘Car’; ‘Bike’ is opposed to ‘Car’ (definition) and ‘Wheels’ (usage)
  • The star system for the concept Vehicle encompasses the Car, Boat, and Plane categories, as well as the Time Capsule concept
  • Motorbike and Bicycle are actual subtypes of Bike, as instances can be mapped to identified elements in environments
  • Compact, Van, and SUV are functional (aka symbolic) subtypes of rental categories, as instances can be mapped to sets of identified elements in environments
  • Retired and Employed commuters are actual realizations of Commuter (as opposed to subtypes) because, while there are instances identified in environments, there are no managed categories representing them
  • Ford Mustang and Fiat 500 are functional realizations of Car Model (as opposed to subtypes) because there are no identified instances or managed categories
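
Part of the example above can be restated as typed connectors between nodes. The sketch below is an assumption-laden illustration: the connector labels (`in_star_system`, `actual_subtype`, and so on) and the `related` helper are hypothetical names, and only the pairs whose source and target are both named explicitly in the list are included.

```python
# (source, connector, target) triples taken from the example above.
EDGES = [
    ("Car", "in_star_system", "Vehicle"),
    ("Boat", "in_star_system", "Vehicle"),
    ("Plane", "in_star_system", "Vehicle"),
    ("Time Capsule", "in_star_system", "Vehicle"),
    ("Motorbike", "actual_subtype", "Bike"),
    ("Bicycle", "actual_subtype", "Bike"),
    ("Retired commuter", "actual_realization", "Commuter"),
    ("Employed commuter", "actual_realization", "Commuter"),
    ("Ford Mustang", "functional_realization", "Car Model"),
    ("Fiat 500", "functional_realization", "Car Model"),
]


def related(target, connector, edges=EDGES):
    """Return the sources linked to `target` by `connector`, sorted by name."""
    return sorted(s for s, c, t in edges if c == connector and t == target)


print(related("Bike", "actual_subtype"))
# ['Bicycle', 'Motorbike']
print(related("Car Model", "functional_realization"))
# ['Fiat 500', 'Ford Mustang']
```

Keeping subtypes and realizations as separate connector types is what lets a query tell them apart without inspecting the nodes themselves.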

It must be stressed that ensuring the interoperability of representations makes no assumption about their validity.

Sailing Through Knowledge Galaxies

Ontological languages like OWL can provide for a seamless implementation of networks, graphs, and galaxies:

  • Nodes: individuals (or instances) for observations; categories for stars and planets
  • Connectors: realization (dashed, yellow), generalization/specialization (solid, yellow), and semantics (solid, blue)

Graph & Galaxies Representation

Ontologies can then support a dynamic integration of live and managed meanings:

  • Live meanings: set through communications, represented by planetary systems
  • Managed meanings: defined by organizations, represented by domains

In return, that integration opens the door to universal knowledge interfaces.

The digital transformation of enterprises, combined with the ubiquity of AI and ML technologies, raises new questions regarding access to knowledge:

  • At the system level: how to balance rule-based (symbolic, explicit knowledge) and deep-learning (non-symbolic, implicit knowledge) technologies
  • At the user level: how to guarantee the traceability and accountability of decision-making processes

Both questions, and more, can best be answered when set in a dual representation/communication perspective subject to the double constraint of explainability and expansibility.

Knowledge Representation: Ontologies & Learning

Rule-based and ML technologies are often misrepresented in terms of traditional vs new AI, missing their technical and functional continuity and complementarity.

With regard to continuity, advances in ML technologies have been driven by the simultaneous availability of massive data and computing power. With regard to complementarity, networks can serve as technical bridges between implicit (ML) and explicit (rule-based) knowledge.

Conceptual galaxies add functional dimensions to that complementarity:

  • Planetary systems with differentiated semantics (e.g. RRG and UML) are used to improve the consistency and interoperability of natural and modeling languages.
  • The dynamic nature of galaxies enables a continuous expansion of knowledge (aka learning), with ML for the mining of semantic weights observed in the communication soup, planetary systems to map them to models, and galaxies to integrate the whole into knowledge graphs.

That holistic perspective can be translated into learning capabilities: observation (ML), reasoning (models), and judgment (knowledge graphs).

Knowledge Communication: Interfaces & Languages

As one would expect, the distinction between communication and representation in relation to knowledge is mirrored by the roles of languages, the nexus being the gear between syntax and semantics.

With regard to communication three tiers can be summarily identified:

  • Digital interfaces, for agents devoid of symbolic processing capabilities, commonly known as “things”.
  • Symbolic interfaces, for agents with unambiguous syntactic and semantic capabilities, typically traditional systems without AI abilities.
  • Natural language interfaces, for agents empowered with the whole of syntactic, semantic, and pragmatic capabilities, typically but not exclusively people.
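
The three tiers above can be compared by the capabilities they assume. The following sketch is one possible encoding, not a definitive model: `Capability`, `INTERFACE_TIERS`, and `shared_capabilities` are hypothetical names, and treating digital interfaces as having none of the three capabilities is an assumption drawn from "devoid of symbolic processing capabilities".

```python
from enum import Flag, auto


class Capability(Flag):
    NONE = 0
    SYNTAX = auto()
    SEMANTICS = auto()
    PRAGMATICS = auto()


# Capabilities per tier, following the three tiers listed above.
INTERFACE_TIERS = {
    "digital": Capability.NONE,
    "symbolic": Capability.SYNTAX | Capability.SEMANTICS,
    "natural_language": (
        Capability.SYNTAX | Capability.SEMANTICS | Capability.PRAGMATICS
    ),
}


def shared_capabilities(tier_a, tier_b):
    """Capabilities both tiers can rely on when exchanging contents."""
    return INTERFACE_TIERS[tier_a] & INTERFACE_TIERS[tier_b]


common = shared_capabilities("symbolic", "natural_language")
print(Capability.PRAGMATICS in common)  # False
```

In this encoding, whatever two tiers exchange has to be expressed with the capabilities they share, which is one way to picture why interoperability across tiers is difficult.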

The interoperability between these tiers has been hampered by theoretical as well as practical difficulties stemming mainly from the ways modeling and natural languages can be processed.

Being defined upfront, modeling languages can set apart syntax, semantics, and pragmatics, with interfaces organized accordingly. By contrast, being practiced before being defined, natural languages mix formal with pragmatic tiers, inducing semantics-based interfaces. The result is a structural discrepancy between communication interfaces at system and user levels, which hampers the consistency and interoperability of functions performed across levels.

Such discrepancies can be ironed out using conceptual galaxies to set apart lexical semantics unambiguously tied to syntax and lexicons, from pragmatic ones mixing meanings from lexicons, managed domains, and customary communications. Interfaces based on lexical semantics would ensure the consistency and interoperability of observation and reasoning across levels; they could be extended with pragmatic semantics supporting judgment expressed through natural languages.

