A Hitchhiker’s Guide to Knowledge Galaxies (2nd ed.)

The limits of my language mean the limits of my world

Ludwig Wittgenstein

*Mapping Conceptual Galaxies (Georg von Welling)*

Preamble

Ontologies

The aim of ontologies is to make sense of the discourses about the nature of things. To that end they must weave together thesauruses, for the meaning of discourses, and models, for the representation of things:

Thesauruses deal with the meaning of terms, categories, and concepts.
Models add syntax to build structured representations of contexts and concerns.
Ontologies add pragmatics in order to ensure the functional integration of models and thesauruses in physical and symbolic environments.

Ontologies can be built from scratch or on top of a kernel, the conceptual equivalent of APIs (Application Programming Interfaces).
The scratch option aims to redefine the whole of syntactic and semantic constructs, mixing meanings and representations; as a result, its “open” capability (or ecumenism) can only be achieved through translations, which are subject to cumulative updates that create exponential complexity.
By contrast, the kernel option only adds ontological constructs to unambiguous conceptual primitives, maintaining a distinction between meanings (thesauruses) and representations (modeling languages). That makes for a two-tiered built-in interoperability, with models on one hand and thesauruses on the other.

Basically, ontological entries can be organized in terms of:

Concepts: for pure semantic constructs defined independently of instances or categories
Categories: for symbolic descriptions of sets of objects or phenomena; Categories can be associated with actual (descriptive, extensional) or intended (prescriptive, intensional) sets of instances
Aspects: for symbolic descriptions of features (properties or behaviors)
Facts: for observations of objects or phenomena
Documents: for the symbolic contents of media, with instances representing the actual (digital or physical) copies
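As an illustration, the five kinds of entries could be typed as follows. This is a minimal sketch, not the kernel's actual schema: the names `EntryKind`, `Extent`, and `Entry` are hypothetical, and the rule that only categories carry an actual or intended extent is a reading of the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntryKind(Enum):
    CONCEPT = "concept"      # pure semantic construct
    CATEGORY = "category"    # symbolic description of a set of objects or phenomena
    ASPECT = "aspect"        # symbolic description of a feature
    FACT = "fact"            # observation of an object or phenomenon
    DOCUMENT = "document"    # symbolic contents of media


class Extent(Enum):
    ACTUAL = "actual"        # descriptive, extensional
    INTENDED = "intended"    # prescriptive, intensional


@dataclass(frozen=True)
class Entry:
    name: str
    kind: EntryKind
    extent: Optional[Extent] = None  # assumption: only categories carry an extent

    def __post_init__(self):
        if self.extent is not None and self.kind is not EntryKind.CATEGORY:
            raise ValueError("only categories carry an actual/intended extent")


entries = [
    Entry("Mobility", EntryKind.CONCEPT),
    Entry("Car", EntryKind.CATEGORY, Extent.ACTUAL),
    Entry("flat tyre reported", EntryKind.FACT),
]
```

Keeping the kind explicit on each entry is what lets meanings (concepts) and representations (categories) be checked separately.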

The challenge is then to ensure the consistency and interoperability of meanings and representations.

Contents & Communication

As far as enterprises are concerned, the architecture of ontologies should match information systems capabilities; to that end the contents of thesauruses and models can be expounded in terms of data (observations from environments), information (data+structures and semantics), and knowledge (information put to use):

*Ontologies as Glue between Meanings & Purposes*

Thesauruses federate meanings across digital (observations) and business (analysis) environments
Models deal with the integration of digital flows (environments) and symbolic representations (systems)
Ontologies ensure the integration of enterprises’ data, information, and knowledge, enabling a long-term alignment of enterprises’ organization, systems, and business objectives

The objective is then to ensure the consistency of meanings independently of the ways contents are obtained:

Direct (conversational) communication: when meanings can be set directly (without mediation) by actual context or through dialog
Mediated communication: when meanings are obtained through a mapping of contents and shared categories

That can be achieved by combining layered representations of semantics (networks) and knowledge (graphs).

Networks, Graphs, & Galaxies

For thesauruses set in ontologies, the primary objective is to attach words to environments’ objects and phenomena whatever their nature (actual or symbolic) and naming source (external or internal). To that effect, names are first attached to observations, taking into account the specificities of contexts (eg business domains) and the heterogeneity of sources (eg data lakes or factories). In line with traditional models, and beyond various terminologies, different kinds of networks and/or graphs can be used, often with overlaps, to organize and refine representations:

Neural networks are built from tokenised sentences, sounds, and images, with deep learning used to identify connectors
Semantic networks represent words and their relationships
Syntactic graphs map documents to grammar categories
Conceptual graphs add reasoning capabilities to concepts and categories
Knowledge graphs add modal (aka epistemic) categories

Cut to the bone, these representations are all made of nodes and connectors, and the objective is to ensure a seamless and consistent integration of semantic, conceptual, and knowledge representations, and consequently the interoperability of corresponding applications, typically natural language processing, business intelligence, systems engineering, and decision-making.
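Since all these representations boil down to nodes and connectors, a single structure can in principle hold them all. The sketch below assumes three layers (word, category, concept) borrowed from the example used later in this guide; the `Graph` class and the connector labels are hypothetical.

```python
from collections import defaultdict


class Graph:
    """Nodes and labelled connectors; the same structure serves every layer."""

    def __init__(self):
        self.nodes = {}                      # name -> layer
        self.connectors = defaultdict(list)  # source -> [(label, target)]

    def add_node(self, name, layer):
        self.nodes[name] = layer

    def connect(self, source, label, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a connector must be known nodes")
        self.connectors[source].append((label, target))

    def neighbours(self, name, layer=None):
        """Targets reachable in one hop, optionally restricted to a layer."""
        return [target for _, target in self.connectors[name]
                if layer is None or self.nodes[target] == layer]


g = Graph()
g.add_node("flat", "word")
g.add_node("Flat", "category")
g.add_node("Dwelling", "concept")
g.connect("flat", "denotes", "Flat")        # hypothetical label
g.connect("Flat", "specializes", "Dwelling")  # hypothetical label
```

Tagging nodes with their layer, rather than using one graph per layer, is what makes cross-layer navigation a plain filter.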

Semantics & Gravity

The first stage pertains to observations from physical (facts) or symbolic (documents) environments. Assuming music-like lexical interoperability, thesauruses in ontologies should combine top-down and bottom-up perspectives:

Top-down meanings set by convention and specific to domains, defined literally on discrete scales.
Bottom-up meanings built from observed communications independently of domains, with meanings weighted relatively on continuous scales.

Semantic networks could thus provide a common launching pad, with connecting weights defined or observed depending on perspective.

That clockwise approach would then be extended to support conceptual and knowledge representation. Concomitantly, a counterclockwise approach could build syntactic graphs using lexical markers for grammatical categories in order to combine syntax and semantics.
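The two perspectives on connecting weights can be contrasted in a few lines. This is a sketch under stated assumptions: the `DEFINED` table and its values are invented for illustration, and sentence-level co-occurrence frequency is only one possible stand-in for "observed" weights.

```python
from collections import Counter
from itertools import combinations

# Top-down: bonds set by convention on a discrete scale (hypothetical values).
DEFINED = {("car", "wheel"): 1, ("flat", "tyre"): 1, ("car", "flat"): 0}


def observed_weights(sentences):
    """Bottom-up: relative co-occurrence frequencies on a continuous [0, 1] scale."""
    pairs = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        pairs.update(combinations(words, 2))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def bond(a, b, observed):
    """Report both perspectives side by side for a pair of words."""
    key = tuple(sorted((a, b)))
    return {"defined": DEFINED.get(key), "observed": observed.get(key, 0.0)}


observed = observed_weights(["flat tyre", "flat car tyre"])
```

Here `bond("tyre", "flat", observed)` reports a defined weight of 1 alongside an observed weight of 0.5, which shows the two scales coexisting on the same connector.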

Stars & Planets

As figured above, networks and graphs overlap without built-in distinctions between nodes mapped to natural language constructs (eg Role and Reference Grammar) and modeling ones (eg UML). Yet, such distinctions are needed in order to ensure the transparency and interoperability of representations; to that end formats (networks, graphs, models …) must be clearly associated with contents (facts, concepts, categories).

Using celestial systems and gravitational forces as a metaphor, stars, planets, and meteors would stand respectively for concepts, categories, and facts, and gravity forces for positive or negative semantic bonds.

The objective is then to integrate two kinds of contents:

Emerging semantics as observed from actual discourses (eg LLMs)
Normative semantics as defined by grammatical (eg RRGs) and/or domain categories and rules

To that end stereotyped connectors are defined depending on their nature:

Nominal, for connectors involving facts and words
Functional, for connectors pertaining to behaviors
Structural, for connectors pertaining to structures

Stereotypes are also defined with regard to scope in order to ensure the interoperability of representations:

Subsets, for the partitioning of facts
Subtypes, for concepts or categories
Associations, for the whole of representations
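The two families of stereotypes can be sketched as enumerations attached to connectors. The names `Nature`, `Scope`, and `Connector` are hypothetical, and the check that restricts each scope to given kinds of contents is one reading of the lists above, not a stated rule.

```python
from dataclasses import dataclass
from enum import Enum


class Nature(Enum):
    NOMINAL = "nominal"        # connectors involving facts and words
    FUNCTIONAL = "functional"  # connectors pertaining to behaviors
    STRUCTURAL = "structural"  # connectors pertaining to structures


class Scope(Enum):
    SUBSET = "subset"            # partitioning of facts
    SUBTYPE = "subtype"          # concepts or categories
    ASSOCIATION = "association"  # the whole of representations


# Assumption: kinds of contents each scope may connect.
ALLOWED = {
    Scope.SUBSET: {"fact"},
    Scope.SUBTYPE: {"concept", "category"},
    Scope.ASSOCIATION: {"fact", "concept", "category"},
}


@dataclass(frozen=True)
class Connector:
    source: str
    source_kind: str
    target: str
    target_kind: str
    nature: Nature
    scope: Scope

    def __post_init__(self):
        allowed = ALLOWED[self.scope]
        if self.source_kind not in allowed or self.target_kind not in allowed:
            raise ValueError(f"{self.scope.value} cannot connect "
                             f"{self.source_kind} to {self.target_kind}")


ok = Connector("Car", "category", "Mobility", "concept",
               Nature.STRUCTURAL, Scope.SUBTYPE)
```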

Ontological prisms and stereotyped connectors can then be used to build enterprises’ KM dashboards in support of data and business analysis, decision-making, and information systems.

Sailing Through Symbolic Galaxies

From Nebulas to Galaxies

Nebulae are clouds of dust and gas that may eventually coalesce into meteors, planets, or stars; it’s a matter of time and, taking into account the intervals between events and observations, what looks like dust may in actuality have turned into stars. A similar impediment affects the observation of facts (data can be successively raw, eligible, and obsolete) and the understanding of words (contrary to words, meanings cannot be etched in stone); hence the benefit of using ontological prisms as telescopes:

Semantic networks, to map facts (data) first with words and then with meanings (thesauruses)
Conceptual graphs, to combine thesauruses and categories and build models of physical and symbolic environments
Finally, knowledge graphs adding ontological modalities to conceptual graphs

Given sequences of textual tokens, discovering meanings can be done directly through semantics, or by combining semantics and grammatical categories.

Taking for example the sequence “Due to a flat Tiger had no wheels to drive Amber to the prom”, each element (aka token) can be taken as a piece of data that must be interpreted in the context of the sequence.

Syntactic agents are first used to mark grammatical roles (eg “flat” as noun/adjective) and names
Semantic agents like thesauruses are then used to associate words with possible (eg “Tiger” as name or animal) and/or probable (eg “flat” as adjective or noun) meanings
Pragmatic agents use context (eg Habitation or Vehicle) to decide between alternative meanings
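The three agents can be chained on the example sentence as a toy pipeline. Everything below is a sketch under loud assumptions: the `THESAURUS` and `CONTEXT_CUES` tables are invented, the syntactic agent only flags capitalised tokens as names instead of marking full grammatical roles, and the pragmatic agent picks a context by simple cue counting.

```python
from collections import Counter

SENTENCE = "Due to a flat Tiger had no wheels to drive Amber to the prom"

# Hypothetical thesaurus: word -> [(meaning, context)]; context None = context-free.
THESAURUS = {
    "flat": [("flat tyre", "Vehicle"), ("apartment", "Habitation")],
    "tiger": [("animal", None)],
}
# Hypothetical cues used by the pragmatic agent to pick a context.
CONTEXT_CUES = {"wheels": "Vehicle", "drive": "Vehicle", "rent": "Habitation"}


def syntactic_agent(sentence):
    """Flag capitalised tokens (other than the first one) as names."""
    tokens = sentence.split()
    return [(tok, "name" if i > 0 and tok[0].isupper() else "word")
            for i, tok in enumerate(tokens)]


def semantic_agent(marked):
    """Attach possible meanings to each token that has any."""
    candidates = {}
    for tok, role in marked:
        meanings = list(THESAURUS.get(tok.lower(), []))
        if role == "name":
            meanings.insert(0, ("proper name", None))
        if meanings:
            candidates[tok] = meanings
    return candidates


def pragmatic_agent(marked, candidates):
    """Pick the context with the most cues, then decide between meanings."""
    votes = Counter(CONTEXT_CUES[tok.lower()] for tok, _ in marked
                    if tok.lower() in CONTEXT_CUES)
    context = votes.most_common(1)[0][0] if votes else None
    resolved = {}
    for tok, meanings in candidates.items():
        in_context = [m for m, ctx in meanings if ctx == context]
        context_free = [m for m, ctx in meanings if ctx is None]
        choice = in_context or context_free
        if choice:
            resolved[tok] = choice[0]
    return context, resolved


marked = syntactic_agent(SENTENCE)
context, resolved = pragmatic_agent(marked, semantic_agent(marked))
```

With these tables the context resolves to Vehicle, so “flat” is read as a flat tyre rather than an apartment, and “Tiger” as a proper name rather than an animal.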

*Weaving the Fabric of Meanings (excerpts)*

It must be stressed that while ontological prisms ensure the integration and interoperability of representations, they make no assumption about their utilisation, as illustrated by large language models (LLMs), which rely on deep learning methods to build networks and graphs with or without support from syntactic, semantic, or pragmatic agents.

Ontological Representation

Ontological languages like OWL can provide for a seamless implementation of all kinds of networks, graphs, and models; using the CaKe 4.0 ontological kernel:

Nodes: individuals (or instances) for facts, categories for concepts and categories
Connectors: realization (dashed, yellow), generalization/specialization (solid, yellow); semantics (solid, blue)
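For readers unfamiliar with OWL, the sketch below emits a few Turtle statements along those lines. How CaKe actually encodes its connectors is not stated here, so the mapping is an assumption: realization is rendered as `rdf:type`, generalization/specialization as `rdfs:subClassOf`, and semantic bonds as a hypothetical `ex:semanticBond` property in a made-up `ex:` namespace.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "ex": "http://example.org/galaxy#",  # hypothetical namespace
}

# Assumed mapping from connector stereotypes to RDF/OWL predicates.
CONNECTORS = {
    "realization": "rdf:type",
    "generalization": "rdfs:subClassOf",
    "semantics": "ex:semanticBond",  # hypothetical property
}


def to_turtle(classes, individuals, links):
    """Serialise nodes and connectors as Turtle text."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    lines += [f"ex:{name} rdf:type owl:Class ." for name in classes]
    lines += [f"ex:{name} rdf:type owl:NamedIndividual ." for name in individuals]
    lines += [f"ex:{s} {CONNECTORS[kind]} ex:{o} ." for s, kind, o in links]
    return "\n".join(lines)


turtle = to_turtle(
    classes=["Flat", "Dwelling"],
    individuals=["flat_001"],
    links=[
        ("flat_001", "realization", "Flat"),
        ("Flat", "generalization", "Dwelling"),
    ],
)
```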

*Galaxies Representation* *with CaKe 4.0*

Ontologies can thus support a dynamic integration of emerging and managed meanings, the former observed and the latter defined, and consequently the interoperability of symbolic representations pertaining to business analysis (words), information systems (categories), and business models (concepts).

*Interoperability of business analysis (“flat”, “wheels”, “drive”), managed information (Flat, Car), and business concepts (Dwelling, Mobility)*

Last but not least, ontological prisms can be used to build maps serving a wide range of intents and purposes.

Galactic Maps & Travels

As far as travels are concerned, maps serve two kinds of purposes: charting the lay of the land, and assisting travellers along the road. Likewise, language charts must first be established according to agreed-upon contexts and purposes, and then used in conversations driven by specific intents. The corresponding language paradigms can be summarily aligned with the divide between symbolic and non symbolic artificial intelligence:

*Symbolic (left) and non symbolic (right) language models*

Symbolic approaches combine grammars (syntax), thesauruses (semantics), and ontologies (types and pragmatics) to navigate through documents and datasets (left)
Non symbolic approaches apply neural networks and deep learning algorithms to massive samples of documents and datasets in order to train large language models (LLMs), which can then be prompted through natural language (right)

After an initial flurry of frantic and extravagant expectations, the intrinsic limits of non symbolic approaches, and more generally generative models, have been broadly identified with regard to:

Mapping: the costs of training language models on undifferentiated contents grow exponentially
User experience: travellers are not geographers and have neither the time nor the skills to elaborate and tune prompts or delve into maps’ semantic or conceptual layers
Reliability: being by design devoid of knowledge compasses LLMs are prone to factual and reasoning hallucinations when employed in open-ended contexts

Mapping

Setting apart boutiques and would-be start-ups pursuing end-user applications despite these caveats, leading tech incumbents put the focus on the benefits of LLMs as copilots or assistants adding intelligent capabilities to their platforms. While little is known about the actual mechanics under the hood, conjectures can be made about two training options, namely bounded contexts and targeted training.

LLMs are designed to deal with character strings, extracting meanings from the arrangement of words. But meanings are set by contexts, and when contexts are open-ended the hazards of circular references cannot be avoided; that’s not the case when LLMs are operated within bounded semantic contexts.

LLMs are meant to be trained on extensive accumulations of documents, the aim being to encode semantic networks using billions if not trillions of parameters; given the limits of one-size-fits-all and brute force strategies, the trend is towards targeted training combined with scripted prompts. Set in a broader perspective, such training can be described in terms of mapping protocols:

Syntactic mapping: a network backbone for grammatical commons is built according to the natural or modeling languages considered (a).
Semantic mapping: the shared backbone is fleshed out with the semantics of bounded contexts (b).
Pragmatic mapping: unambiguous semantic networks are anchored to relevant factual references (c).

Such maps (aka language models) can then be customised according to the contexts and purposes of navigation.

Navigation

Like travellers’ maps, language models can be used to prepare or to proceed:

Planning tours (in vitro): users take time to consider alternative locations and itineraries, and search for additional information; that would entail scripted prompts accessing knowledge graphs.
Actual travelling (in vivo): conversations are carried on in context, with relevant maps supposedly at hand and no need for prompting scripts and knowledge graphs.

Planning use cases (clockwise) aim at:

Objectives (Ξ/≈), ie translating values and goals into categories
Realisations (≈/#), identifying relevant actual resources

Actual navigation use cases (counterclockwise) include:

Fetches (#?), meant to retrieve datasets and/or documents based on names and/or features
Searches (≈?), meant to retrieve datasets and/or documents based on topics and/or features
Queries (Ξ?), meant to add intents to searches
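The three navigation use cases can be told apart by what they filter on. The sketch below runs them against a tiny in-memory catalog; the `CATALOG` contents are invented, and modelling an intent as a predicate applied on top of a search is an assumption, not a definition taken from this guide.

```python
# Hypothetical catalog of datasets and documents.
CATALOG = [
    {"name": "fleet_2023.csv", "topics": {"vehicles", "maintenance"},
     "features": {"tabular"}},
    {"name": "lease_terms.pdf", "topics": {"dwellings"},
     "features": {"text"}},
    {"name": "tyre_incidents.csv", "topics": {"vehicles"},
     "features": {"tabular"}},
]


def _has(resource, key, wanted):
    """True when every wanted value is present in the resource's set."""
    return set(wanted) <= resource[key]


def fetch(name=None, features=()):
    """Fetch (#?): retrieve by names and/or features."""
    return [r for r in CATALOG
            if (name is None or r["name"] == name)
            and _has(r, "features", features)]


def search(topics=(), features=()):
    """Search (≈?): retrieve by topics and/or features."""
    return [r for r in CATALOG
            if _has(r, "topics", topics) and _has(r, "features", features)]


def query(intent, topics=(), features=()):
    """Query (Ξ?): a search narrowed by an intent (here, a predicate)."""
    return [r for r in search(topics, features) if intent(r)]


def wants_maintenance(resource):
    """Example intent: keep only resources useful for maintenance planning."""
    return "maintenance" in resource["topics"]


hits = query(wants_maintenance, topics={"vehicles"})
```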

These use cases are meant to provide the nuts and bolts of enterprise architecture governance.

Enterprise Architecture Interfaces

The digital transformation of enterprises combined with the ubiquity of AI and ML technologies raises new issues regarding the integration of enterprises’ governance remits:

At the system level: how to balance rule-based (symbolic, explicit knowledge) and deep-learning (non-symbolic, implicit knowledge) technologies
At the organisation level: how to guarantee the traceability and accountability of decision-making processes

Hence the need for integrated dashboards with interfaces covering the whole of EA governance remits.

Engineering Interfaces

As already noted, enterprise architecture is meant to encompass three kinds of concerns and consequently three kinds of interfaces:

Data/Facts: data files (eg JSON), datasets (eg SQL), statistical series (eg SAS, MATLAB, SPSS), references (eg NIEM)
Information/Categories: general (eg UML) and specialised (eg ArchiMate, SysML, BPMN) modeling languages and methods (eg OOD)
Knowledge and concepts: ontology (eg OWL)

In addition, cross interfaces should ensure the consistency and interoperability of symbolic resources according to names (thesauruses), structure (XML), and purpose (RDF):

Thesauruses ensure access to the semantics attached to names
Resource Description Framework (RDF) is the Swiss Army Knife for the graph-based representation of symbolic contents
Extensible Markup Language (XML) is the Swiss Army Knife for the representation of documents structure, ensuring the storing, transmitting, and reconstructing of arbitrary contents independently of their meaning
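The division of labour between the two formats can be shown with the Python standard library alone. In this sketch the document, its element names, and the `ex:` identifiers are all hypothetical; RDF is reduced to a bare set of subject-predicate-object triples rather than a full RDF library.

```python
import xml.etree.ElementTree as ET

# XML: structure only; the element names carry no agreed meaning by themselves.
doc = ET.Element("report", attrib={"id": "r1"})
ET.SubElement(doc, "title").text = "Fleet maintenance"
ET.SubElement(doc, "section", attrib={"n": "1"}).text = "Flat tyres by month"

serialised = ET.tostring(doc, encoding="unicode")  # storing / transmitting
restored = ET.fromstring(serialised)               # reconstructing

# RDF-like triples: what the document is about, independently of its layout.
triples = {
    ("ex:r1", "ex:about", "ex:Car"),
    ("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
}


def objects(subject, predicate):
    """All objects linked to a subject through a given predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}
```

The XML round trip preserves structure without knowing what a “section” means, whereas the triples state what the report is about without knowing how it is laid out.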

More specific interfaces can also be introduced pertaining to semantics (eg LLMs, SPIRES, Schema), models (eg ORM), and reasoning (eg Prolog).

Governance Interfaces

Last but not least, enterprise architecture is bound to remain an afterthought without a functional integration and interoperability across main governance use cases:

Requirements, for actual business needs
Data analytics, for virtual business needs
Business analysis, for the alignment of virtual business needs with business models and objectives
Business intelligence, for the definition of business models and objectives
Strategic planning, for the alignment of business models and objectives with organisation and supporting systems
Systems engineering
Systems modeling, for the alignment of requirements and systems

Using ontological prisms as governance hubs can thus ensure separation of concerns as well as interoperability of resources.
