Knowledge Management Booklet

Driving the Spinoza Car (Thomas Hirschhorn)

Whereas EA as a discipline is still in its infancy, practitioners are making effective advances, building on groundwork laid in a wide range of enterprises. Given the variety of technical and organizational circumstances, adding an organizational layer with roles and responsibilities would be pointless or even counterproductive. Instead, enterprise architects should be seen as proxies balancing long-term business revenues with asset sustainability. To that end they should ensure:

  • The consistency of meanings and representations of business objectives, enterprise organization, and systems architecture
  • The transparency and traceability of decision-making processes

These objectives can be best achieved with ontologies serving as enterprise architects’ handbooks.

Preamble

Knowledge & Ontologies

From a functional perspective the role of ontologies is to manage knowledge representations (KR), which, as defined by Davis, Shrobe, and Szolovits, can be summarised in five basic functions:

  1. Surrogates: manage the symbolic counterparts of objects, events and relationships identified in context and pertaining to concerns.
  2. Ontological commitments: maintain a set of consistent statements about the categories of things that may exist in the domain under consideration.
  3. Fragmentary theory of intelligent reasoning: support an actionable representation of what things can do or what can be done with them.
  4. Medium for efficient computation: make knowledge understandable by computers and support smooth learning curves.
  5. Medium for human expression: improve the communication between specific domain experts and generic knowledge managers.

Traditional models can support points 1, 4, and 5, but often fall short on points 2 and 3. Ontologies are meant to generalize the representation of, and the reasoning about, any kind of realm, actual or virtual. To that effect, ontologies should be organized in terms of language, models, and thesauruses:

  • Thesauruses deal with the meaning of terms, categories, and concepts.
  • Models add syntax to build representations of contexts and concerns.
  • Ontologies add pragmatics in order to ensure the functional integration of models and thesauruses in physical and symbolic environments.

Ontology Architecture

The aim of the Caminao Kernel (CaKe) is to demonstrate the benefits of this approach by providing a formal (based on axioms and non-circular definitions) and ecumenical (no assumption regarding modeling languages) basis for ontology development.

Kernel Architecture

The CaKe architecture is organized around five kinds of nodes, illustrated by the sketch after the list:

  • Concepts: for pure semantic constructs defined independently of instances or categories
  • Categories: for symbolic descriptions of sets of objects or phenomena; Categories can be associated with actual (descriptive, extensional) or intended (prescriptive, intensional) sets of instances
  • Aspects: for symbolic descriptions of features (properties or behaviors)
  • Facts: for observations of objects or phenomena 
  • Documents: for the symbolic contents of media, with instances representing the actual (digital or physical) copies
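
As an illustration, here is a minimal sketch of these node kinds as an RDF vocabulary, using rdflib; the namespace and instance names are hypothetical stand-ins, not the kernel’s actual vocabulary:

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

CAKE = Namespace("http://example.org/cake#")  # hypothetical namespace
g = Graph()
g.bind("cake", CAKE)

# The five kernel node kinds, declared as top-level classes
for kind in ("Concept", "Category", "Aspect", "Fact", "Document"):
    g.add((CAKE[kind], RDF.type, RDFS.Class))

# A category describing a set of objects, with a labelled instance
g.add((CAKE.MusicInstrument, RDF.type, CAKE.Category))
g.add((CAKE.piano, RDF.type, CAKE.MusicInstrument))
g.add((CAKE.piano, RDFS.label, Literal("piano")))

print(g.serialize(format="turtle"))
```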

To ensure a seamless integration of these descriptions, the contents of thesauruses and models are expressed in terms of data (observations from environments), information (data plus structure and semantics), and knowledge (information put to use):

Ontologies as Glue between Meanings & Purposes
  • Thesauruses federate meanings across digital (observations) and business (analysis) environments
  • Models deal with the integration of digital flows (environments) and symbolic representations (systems)
  • Ontologies ensure the integration of enterprises’ data, information, and knowledge, enabling a long-term alignment of enterprises’ organization, systems, and business objectives

Kernel Implementation

A testbed of the Caminao Kernel is implemented on the OWL 2 Stanford/Protégé portal, with an enterprise architecture case study that can be consulted with a Protégé account (Cake_Diner22Q1).

The kernel’s thesaurus is based on the music keys principle: a compact set (or key) of modeling axioms (or notes) is meant to guarantee the validity (or harmony) of all definitions. While the kernel comes with its own specific key, the models developed with that key, like musical pieces, can be expressed or translated (played) using alternative sets of axioms (or keys) without unwarranted assumptions with regard to the truth of representations.

The Protégé graphical user interface comes with a number of generic filters:

  • Negative filters operate on nodes and are used to mask root categories, e.g.: xActivities, xObjects
  • Basic hierarchy filters (1,2,3) operate selectively on OWL subclass and instance properties
  • The variants filter operates on kernel partitioning properties: sortedBy_, subset_, extendedby_
  • Association and composition filters operate selectively on structural (include_) and functional (connectRef_, connectExe_) kernel properties
  • Thesauruses and inference connectors operate selectively on semiotic and reasoning kernel properties

These filters can be used to define integrated views according to perspective.

Shared Meanings: Thesauruses

The primary objective of thesauruses is to attach words to physical/digital and symbolic/business environments.

Contrary to a naive understanding, there can be no universal attachment of words to reality, even for concrete objects, and business affairs are generally more symbolic than actual. For enterprises the primary objective of ontologies is therefore to manage the various meanings associated with environments’ objects and phenomena whatever their nature (concrete or symbolic) and naming source (external or internal). That’s the role of thesauruses.

  • From a business perspective, thesauruses are used to attach names to data, taking into account the specificities of business domains and the heterogeneity of sources (e.g., data lakes or factories). The focus is put on the federation of data semantics across sources and business domains, e.g., using data meshes.
  • From a system perspective, thesauruses are used to map named data to their use by applications and functions. The focus is put on the consolidation of business domains’ semantics and on shared resources, metadata, and master data management (MDM).

The integration of thesauruses within ontologies can be achieved through their refactoring as semantic networks, with a built-in distinction between semantic connectors (aka connecting nodes) and syntactic ones, as sketched below.
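
A minimal sketch of that refactoring with rdflib, using SKOS for the semantic connectors; the namespace is hypothetical, and the syntactic connector names (include_, connectRef_) are borrowed from the kernel properties listed above:

```python
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import SKOS

TH = Namespace("http://example.org/thesaurus#")   # hypothetical namespace
g = Graph()

# Semantic connectors: meaning-bearing links between terms (SKOS-style)
g.add((TH.Piano, RDF.type, SKOS.Concept))
g.add((TH.Piano, SKOS.prefLabel, Literal("piano")))
g.add((TH.Piano, SKOS.broader, TH.MusicInstrument))   # conceptual "kind of"

# Syntactic connectors: structure-only links, free of domain semantics
g.add((TH.Concert, TH.include_, TH.Performance))      # structural (include_)
g.add((TH.Performance, TH.connectRef_, TH.Piano))     # functional (connectRef_)
```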

At the technical and functional levels, such normalization opens the door for the integration of thesauruses with Entity-relationship (E/R) and Relational data models.

At the conceptual level, removing domain-specific semantics from connectors has two major consequences:

  • They can be uniformly defined so as to ensure syntactic interoperability across modeling languages
  • Their semantics can be aligned with ontological (or epistemic) meanings

The distinction between domain-specific connectors and ontological ones should be the cornerstone of effective ontologies, enabling the interoperability of models across epistemic levels as well as built-in modal reasoning capabilities.

Shared Representations: Models

Whereas thesauruses put the focus on meanings, attaching words to facts and ideas, the primary objective of models is to manage representations:

  • From a business perspective, models are used to define managed objects and processes, and plan for changes
  • From a system perspective, they are used to define systems architectures and to support model-based systems engineering (MBSE).

But facts are not given: to ensure a perennial and consistent representation of objects and phenomena, observations must be identified, structured, and put into context; that is done by adding syntax (for identified structures) and domain semantics (for interpretation) to data. Models can then serve to align enterprises’ environments and objectives on the one hand, organization and supporting systems on the other.
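
A minimal sketch of that transition from data to information, assuming Python dataclasses; the record and field names are illustrative:

```python
from dataclasses import dataclass
from datetime import date

# Data: a raw, anonymous observation from the environment
raw = {"label": "ticket", "value": "45.0", "when": "2023-04-01"}

# Information: the same observation identified, structured, and interpreted
@dataclass(frozen=True)
class TicketSale:
    sale_id: int        # identity: required to manage the representation
    amount_eur: float   # domain semantics: the currency is made explicit
    sold_on: date       # syntax: a typed structure replaces the raw string

info = TicketSale(sale_id=1,
                  amount_eur=float(raw["value"]),
                  sold_on=date.fromisoformat(raw["when"]))
```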

Identities, Structures, Aspects

The distinction between thesauruses and models takes on particular importance for enterprises immersed in digital environments, as it marks the limit between unregulated realms (the meanings of words) and regulated ones (business representations), with models serving as gatekeepers between observed data and managed information. To ensure that double transition (between thesauruses and models, and between data and information), identification and structures must complement thesauruses:

  • Since representations are meant to associate concepts or categories (as defined in thesauruses) with sets of objects or phenomena, principles are required to identify individuals pertaining to business concerns.
  • Then, a distinction must be maintained between structural (or intrinsic) features on the one hand, and functional or behavioural ones on the other hand.
  • Finally, conditions must be added for the structural and functional integrity of representations.

Whereas leading modeling languages (like E/R, Relational, or OO) come with corresponding constructs, there is no easy match; the objective of the kernel is thus to provide a common and unambiguous set of syntactic connectors that could serve as a bridge.

Assuming homogeneous domains for identified structures and feature semantics, such connectors should be enough to ensure the mapping of normalized thesauruses to models, except for a caveat: syntactic connectors cannot deal with abstraction semantics.

Semantics & Abstractions

As long as abstractions are about meanings (as they are in thesauruses), they can be translated into “kind of” or “is a” relationships, and represented as such by OWL2 hierarchies. But that understanding falls short when abstractions apply to representations, as they are supposed to be in models. In that case different semantics must be considered depending on the kind of abstraction:

  • Conceptual, as originally defined in thesauruses (yellow color)
  • Structural and functional, represented by subsets and extensions respectively (blue color)

Specific subset_ and extend_ connectors are thus introduced to deal with structural and functional abstractions. Moreover, the kernel also introduces partitioning connectors (sortedBy_) between primary categories and classifying ones. Classifying categories defined in thesauruses would directly translate as powertypes in models, to be realized through instances (MusicInstr2Source) or subtypes (MusicInstr2Function).

Using instances to associate powertypes and subtypes ensures the continuity of abstraction semantics across thesauruses, models, and ontologies.
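
A minimal sketch of these connectors as RDF properties, using rdflib; the namespace is hypothetical, with connector and category names taken from the kernel’s example:

```python
from rdflib import Graph, Namespace, RDF

K = Namespace("http://example.org/cake#")   # hypothetical namespace
g = Graph()

# Structural abstraction: string instruments as a subset of instruments
g.add((K.StringInstrument, K.subset_, K.MusicInstrument))
# Functional abstraction: electric instruments extend the base category
g.add((K.ElectricInstrument, K.extend_, K.MusicInstrument))
# Partitioning: a classifying category (powertype) sorts the primary one
g.add((K.MusicInstrument, K.sortedBy_, K.MusicInstr2Source))
# The powertype realized through instances, as in the kernel's example
g.add((K.Strings, RDF.type, K.MusicInstr2Source))
```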

Shared Knowledge: Ontologies

To keep a competitive edge in digital environments enterprises must turn into learning machines; to that end their organization must weave together the processing of data (observation), information (reasoning), and knowledge (judgment) with their decision-making processes. That should be the role of ontologies.

From Data to Knowledge

At the system level, data is meant to be mapped to information models; but at the enterprise level, data may also be processed directly into knowledge, which calls for some direct mapping between thesauruses and ontologies.

From Data to Categories

The raison d’être of ontologies is to make explicit the relationship between languages and what they denote, according to the domains of concern. For enterprises having to make sense of a plurality of business and digital environments, ontologies help dispel a double delusion:

  • The implicit assumption that meanings can be extracted from raw data like gold nuggets from river beds independently of customary contexts and purposes
  • The explicit assumption that concepts can be organized into final and universal hierarchies

Assuming that putting names on facts is the first step of complexity management, machine learning technologies can be used to make sense of unstructured observations obtained from environments (data mining) and operations (process mining). The tentative groupings of data items can then be translated into meaningful categories supporting information models, as sketched below.
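
A minimal sketch of that labelling step, assuming scikit-learn’s KMeans over made-up listening data; the resulting group names are placeholders, to be reviewed and named by domain experts:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical observations: minutes of classical, jazz, and electro
# streamed per (anonymous) listener
X = np.array([[120, 5, 0], [90, 10, 5], [2, 80, 60],
              [0, 95, 70], [5, 8, 110], [10, 2, 130]])

# Tentative groupings of data items (labelling)...
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# ...to be reviewed and named by domain experts before being promoted
# to categories in information models (modelling)
names = {c: f"profile_{c}" for c in set(clusters)}
print([names[c] for c in clusters])
```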

From a functional perspective, that twofold undertaking (labelling and modelling) corresponds to the distinction between data analytics and data analysis, the former for marketing data (e.g. profiles of anonymous music listeners), the latter for managed information (e.g., ticket sales). It also corresponds to key governance distinctions:

  • Between unregulated observations and regulated managed information
  • Between the intrinsic uncertainties of observations and the necessary reliability of models

In terms of cybernetics it marks the frontier between entropy (random data) and information (structured data).

Abstraction & Epistemic Levels

Abstraction is arguably the Swiss Army Knife of complexity management as it can be applied in different situations, e.g.,

  • Between actual things and the subset of features deemed relevant
  • Between agents and their roles
  • Between designs and realizations
  • Between concepts

But to be effective in tackling complexity, the modalities of abstractions must be explicit with regard to the nature (or epistemic level) of the realms being represented: conceptual, actual, virtual, fictional, etc.

The benefits of epistemic levels have been illustrated above with the regulatory distinction between anonymous data and managed information. Taking a simple parent/children example, information models can deal with the fact that persons have parents, and represent the corresponding individuals and relationships, e.g.:

  • Entity Person_ with instances Mike and Jerry,
  • Role Parent() with standard connector (connectRef_)
  • Connectors’ cardinalities for integrity constraints

But unidentified instances and associated relationships cannot be explicitly represented by models: if Mike and Jerry are siblings, they are supposed to share a parent, but that parent (JDoe) is not necessarily identified as an instance in the database. Hence the benefit of a built-in ontological distinction between identified and induced occurrences of Person_.
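
A minimal sketch of that distinction with rdflib, using a blank node to stand for the induced occurrence; the namespace and the connector name (connectRef_Parent) are hypothetical:

```python
from rdflib import BNode, Graph, Namespace, RDF

EX = Namespace("http://example.org/diner#")   # hypothetical namespace
g = Graph()

# Identified occurrences: managed information
g.add((EX.Mike, RDF.type, EX.Person_))
g.add((EX.Jerry, RDF.type, EX.Person_))

# Induced occurrence: the shared parent must exist but is not identified
jdoe = BNode()                                # anonymous node for "JDoe"
g.add((jdoe, RDF.type, EX.Person_))
g.add((EX.Mike, EX.connectRef_Parent, jdoe))  # hypothetical connector name
g.add((EX.Jerry, EX.connectRef_Parent, jdoe))
```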

That simple example illustrates the epistemic distinction between anonymous data (JDoe) and managed information (Mike and Jerry), and its significance for enterprises’ compliance with privacy regulations.

Applied to systems engineering, epistemic levels would typically enable the integrated representation of environments, models, and components; for EA governance they would constitute the mainstay of speculative environments and alternative strategies.

Taking the example of modernization, that would enable the representation of overlapping solutions combining current (actual, default) and planned (virtual, explicit) configurations, thereby ensuring the alignment of changes between models (maps) and configurations (territories).

Compared to ad hoc solutions supported by modeling tools (e.g., for temporal databases), ontologies would enable the interoperability of epistemic representations of any kind.

From Information to Knowledge

The raison d’être of organizations is to define the relationships between individual and collective agents. Supporting ontologies should therefore be explicit about agents’ representations and purposes: actual circumstances, expectations, objectives, projects, alternative courses of action, etc. These epistemic levels can then be combined with reasoning so as to enhance the governance capabilities of systems and organizations, the former for information and reasoning, the latter for knowledge and decision-making.

Reasoning with models can be achieved through computation (functions), deduction (dependencies), and induction (statistics). The objective of ontologies is to extend these capabilities so as to support epistemic (aka modal) reasoning, i.e., reasoning about different levels of reality, typically what agents think about the knowledge and objectives of other agents.

That issue can first be expressed in terms of closed- and open-world assumptions (contrasted in the sketch after the list):

  • The closed-world assumption (CWA) states that whatever is true is also known to be true; in other words, what is not known to be true is false
  • The open-world assumption (OWA) states that nothing is false until proven so
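
A minimal sketch contrasting the two assumptions over a hypothetical fact base, in plain Python without any ontology machinery:

```python
facts = {("Mike", "hasParent", "JDoe")}   # hypothetical fact base

def holds_cwa(triple):
    # Closed world: absence from the fact base means false
    return triple in facts

def holds_owa(triple):
    # Open world: absence only means unknown, never false
    return True if triple in facts else None   # None stands for "unknown"

query = ("Jerry", "hasParent", "JDoe")
print(holds_cwa(query))   # False under CWA: not recorded, hence false
print(holds_owa(query))   # None under OWA: not recorded, merely unknown
```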

But ontologies of enterprise knowledge must go further and take into account the open-ended modalities of truth, ensuring the representation of temporal or alternative realms and built-in modal reasoning capabilities.

Being limited to meanings, thesauruses can only deal with logical inferences, not with inferred realms. System modeling languages often extend reasoning capabilities either natively (object-oriented languages) or through extensions (e.g., UML, the Unified Modeling Language, with OCL, the Object Constraint Language). Graph-based representations like RDF offer broader possibilities, either through extensions (e.g., SHACL, the Shapes Constraint Language) or natively (e.g., OWL, the Web Ontology Language).

While languages like OWL come with reasoning capabilities, these are defined at the conceptual level and thus rely on connectors’ business semantics. Introducing ontological connectors makes the reasoning cognizant of the epistemic status of nodes. Taking the modernization example (sketched below):

  • Strategies (DinerStrategies2) are sorted by M&A negotiations (DinerM&A2)
  • Two options are considered (dinerM&A:ScenA, dinerM&A:ScenB), with induced business expectations (dinerExpect:ScenA, dinerExpect:ScenB)
  • Two corresponding instances of deduced strategies (dinerStrategy:PlanA, dinerStrategy:PlanB) can thus be defined
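
A minimal sketch of these statements as RDF triples with rdflib; the induce_ and deduce_ connectors are hypothetical stand-ins for the kernel’s inference connectors, and the case study’s M&A identifiers are renamed (MnA, ExpectA, PlanA) for URI safety:

```python
from rdflib import Graph, Namespace, RDF

D = Namespace("http://example.org/diner#")   # hypothetical namespace
g = Graph()

# Strategies sorted by M&A negotiations (partitioning connector)
g.add((D.DinerStrategies2, D.sortedBy_, D.DinerMnA2))

# Two options with induced expectations and deduced strategies
for scen, expect, plan in (("ScenA", "ExpectA", "PlanA"),
                           ("ScenB", "ExpectB", "PlanB")):
    g.add((D[scen], RDF.type, D.DinerMnA2))
    g.add((D[scen], D.induce_, D[expect]))   # induced business expectations
    g.add((D[expect], D.deduce_, D[plan]))   # deduced strategic plans
```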

The integration of modal reasoning across epistemic levels would thus ensure the consistency of decision-making processes across operational, tactical, and strategic levels.

Reasoning Patterns

Logic distinguishes between three basic forms of reasoning: deductive, inductive, and abductive. These forms, complemented with a fourth (intuition), can be expressed in terms of data, information, and knowledge:

  • Deduction: infers individual facts from categories’ features
  • Induction: infers categories’ features from observed facts
  • Abduction: adds conceptual explanations to induction
  • Intuition: makes direct associations between experience (facts) and beliefs (concepts)

Necessary (solid line) and non-necessary (dotted line) inference patterns

Reasoning patterns can thus be built from predicates and clauses derived from models (powertypes, subtypes, etc.).
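
A minimal sketch contrasting a necessary (deductive) pattern with a non-necessary (inductive) one, over a hypothetical fact base:

```python
# Hypothetical fact base and category descriptions
category_features = {"MusicInstrument": {"produces": "sound"}}
instance_of = {"piano": "MusicInstrument", "violin": "MusicInstrument"}
observations = {("piano", "has_strings"), ("violin", "has_strings")}

def deduce(instance):
    # Deduction (necessary): individual facts from the category's features
    return category_features[instance_of[instance]]

def induce(feature):
    # Induction (non-necessary): a candidate category feature from facts
    if all((i, feature) in observations for i in instance_of):
        return {"tentative": feature}
    return {}

print(deduce("piano"))        # {'produces': 'sound'}
print(induce("has_strings"))  # {'tentative': 'has_strings'}
```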

Decision-making & Strategies

With regard to governance, the key discriminant between systems and organization is the traceability and accountability of decision-making processes; that entails four intertwined undertakings:

  1. Reducing uncertainties (observation): facts are not given but must be “mined” from data whose quality often improves with time until it becomes irrelevant.
  2. Determining causalities (orientation): contrary to science, business causal chains are set in minds as well as circumstances, and can change accordingly.
  3. Managing risks (decision): business competition being by nature a time-dependent nonzero-sum game, the quality of observations and the reliability of orientations should be part and parcel of risk assessment.
  4. Improving efficiency (action): enterprises’ performances primarily depend on the coupling between operations and organization; digital architectures can boost improvements through experience and collective learning.

Sorting out these undertakings into intertwined yet manageable threads can best be achieved through iterations of the OODA loop, backed by ontologies integrating forms (thesauruses, models, graphs), contents (data, information, knowledge), and functions (observation, reasoning, decision-making), as sketched below.
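
A minimal sketch of one such iteration, with a plain dict standing in for the ontology:

```python
def ooda_iteration(data, ontology):
    """One OODA pass backed by an ontology (here a hypothetical dict
    mapping observed items to categories); a sketch, not an engine."""
    facts = [d for d in data if d is not None]         # observe: reduce uncertainty
    oriented = {f: ontology.get(f) for f in facts}     # orient: map to categories
    decisions = [f for f, c in oriented.items() if c]  # decide: keep what is understood
    for f in decisions:                                # act: feed back to operations
        print(f"acting on {f} (category: {oriented[f]})")
    return decisions

ooda_iteration(["late_delivery", None, "price_drop"],
               {"late_delivery": "operational_risk"})
```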

Decision-making & Governance

That dynamic integration of the representations of environments and assets with decision-making processes becomes a critical factor for enterprises’ digital transformation:

  • Continuous iterations between business and physical considerations enable a smooth alignment of planned changes in systems and organization with emerging forces in business and physical environments.
  • Feedback across operations (action, observation) and representations (orientation, decision) can be used to iron out the folds between operational, tactical, and strategic time frames.

In a broader perspective the integration of individual knowledge, experience, and creativity into collective learning can be decisive for complexity management.

Collective Learning & Complexity Management

Organization & Collective Learning

Fusing individual experience into collective knowledge should arguably be a mainstay of any organization, all the more so for enterprises competing in knowledge-driven environments. That can be achieved along a twofold dimension: between people and supporting systems, and between individual and collective agents. Compared to the well-trodden former dimension, the latter is often seen as a foggy realm of intangible assets. Introducing machine learning technologies and ontologies can help clarify the learning momentum in terms of implicit and explicit knowledge:

Fusing Individual & Collective Knowledge
  • Between people and organizations, it’s typically done through a mix of experience and collaboration (a)
  • Between systems and representations, it’s the nuts and bolts of Machine learning and Knowledge graphs technologies (b)
  • Between people and systems, learning relies on the experience feedback achieved through the integration of ML into the OODA loop (c)
  • Between organization and systems, learning relies on the functional distinction between judgment, to be carried out at the organizational level, and observation and reasoning, supported by systems (d)

Finally, ontologies can be turned into a learning machine weaving observations (data), thesauruses (meanings), models (information), and knowledge into a collective fabric of systems and organizations supporting enterprises’ self-learning capabilities:

Ontologies as learning machines

Governance, Complexity, Entropy

Enterprises competing in digital environments have to strike a delicate balance between overwhelming flows of data and manageable information. Ingesting too much of these inflows will decrease the effectiveness of information systems; ingesting too little will decrease the understanding of business environments.

Taking a leaf from Shannon’s theory of information, that conundrum can be expressed in terms of complexity management, the objective being to optimise the representation of environments given assets and objectives. To that end enterprise architects could use the ratio between the number of targeted items (or micro-states, in cybernetics parlance) and the number of categories needed to represent them (or macro-states), the former corresponding to data, the latter to information. Entropy would then be defined as the part of data unaccounted for by information models.
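
A hedged formalization of that ratio, assuming equiprobable micro-states evenly partitioned into macro-states (an illustrative assumption, not the author’s formula):

```latex
% W micro-states (data items), M macro-states (categories), W >= M
H_{total} = \log_2 W \qquad
H_{model} = \log_2 M \qquad
H_{residual} = H_{total} - H_{model} = \log_2 \frac{W}{M}
```

The residual term corresponds to the part of data left unaccounted for by information models; reducing entropy then means closing the gap between W and M.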

Along that line of reasoning, managing complexity means reducing entropy, which can be done through better analysis of data and/or better design of models.

On the physical side (territories), data improvements can be achieved through a digital osmosis between environments and systems. On the symbolic side (maps), model improvements can be achieved through homeostasis, i.e., a continuous realignment of the categories supporting orientation and decisions.

Managing Complexity

Enterprise governance can thus be defined within a framework combining three seminal paradigms: cybernetics & complexity, knowledge & learning, organization & decision-making.
