“Nothing that is worth knowing can be taught” – Oscar Wilde
As illustrated by the Symbolic Systems Program (SSP) at Stanford University, advances in computing and communication technologies bring information and knowledge systems under a single functional roof, namely the processing of symbolic representations.
Within that understanding one will expect Knowledge Management to shadow systems architectures and concerns: business contexts and objectives, enterprise organization and operations, systems functionalities and technologies. On the other hand, knowledge being by nature a shared resource of reusable assets, its organization should support the needs of its different users independently of the origin and nature of information. Knowledge Management should therefore bind knowledge of architectures with architecture of knowledge.
In their pivotal article Davis, Shrobe, and Szolovits set five principles for knowledge representation:
- Surrogate: KR provides a symbolic counterpart of actual objects, events and relationships.
- Ontological commitments: a KR is a set of statements about the categories of things that may exist in the domain under consideration.
- Fragmentary theory of intelligent reasoning: a KR is a model of what the things can do or can be done with.
- Medium for efficient computation: making knowledge understandable by computers is a necessary step for any learning curve.
- Medium for human expression: one of the prerequisites of KR is to improve communication between specific domain experts on one hand and generic knowledge managers on the other.
That puts information systems as a special case of knowledge ones, as they fulfill the five principles, yet with a functional qualification:
- Like knowledge systems, information systems manage symbolic representations of external objects, events or activities purported to be relevant.
- System models are assertions regarding legitimate business objects and operations.
- Likewise, information systems are meant to support efficient computation and user-friendly interactions.
The only difference is about coupling: contrary to knowledge systems, information and control ones play a role in their context, and operations on surrogates are not neutral.
Knowledge constructs are empty boxes that must be properly filled with facts. But facts are not given: they must be observed, which necessarily entails some observer, set on task if not with vested interests, and some apparatus, natural or made on purpose. And if they are to be recorded, even “pure” facts observed through the naked eyes of innocent children will have to be translated into some symbolic representation. Taking wind as an example, wind socks support immediate observation of facts, free of any symbolic meaning. In order to make sense of their behaviors, vanes and anemometers are necessary, respectively for azimuth and speed; but that also requires symbolic frameworks for directions and metrics. Finally, knowledge about the risks of strong winds can be added when such risks must be considered.
As far as enterprises are concerned, knowledge boxes are to be filled with facts about their business context and processes, organization and applications, and technical platforms. Some of them will be produced internally, others obtained from external sources, but all should be managed independently of specific purposes. Whatever its nature (business, organization or systems), information produced by the enterprises themselves is, from inception, ready to use, i.e. organized around identified objects or processes, with defined structures and semantics. That’s not necessarily the case with data reflecting external contexts (markets, regulations, technology, etc.), which must be mapped to enterprise concerns and objectives before being of any use. That translation of data into information may be done immediately by mapping data semantics to identified objects and processes; it may also be delayed, with rough data managed as such until being used at a later stage to build information.
From Data to Information
Information is meaningful, data is not. Even “facts” are not manna from heaven but have to be shaped from phenomena into data and then information, as epitomized by binary, fragmented, or “big” data.
- Binary data are direct recordings of physical phenomena, e.g. sounds or images; even when indexed with key words they remain useless until associated, as non-symbolic features, with identified objects or activities.
- Contrary to binary data, fragmented data comes in symbolic guise, but as floating nuggets with sub-level granularity; and like their binary cousin, those fine-grained descriptions are meaningless until attached to identified objects or activities.
- “Big” data is usually understood in terms of scalability, as it refers to lumps too large to be processed individually. It can also be defined as a generalization of fragmented data, with identified targets regrouped into more meaningful aggregates, moving the targeted granularity up the scale to some “overwhelming” level.
Since knowledge can only be built from symbolic descriptions, data must be first translated into information made of identified and structured units with associated semantics. Faced with “rough” (aka unprocessed) data, knowledge managers can choose between two policies: information can be “mined” from data using statistical means, or the information stage simply bypassed and data directly used (aka interpreted) by “knowledgeable” agents according to their context and concerns.
As a matter of fact, both policies rely on knowledgeable agents, the question being who the “miners” are and what they should know. Theoretically, miners could be fully automated tools able to extract patterns of relevant information from rough data without any prior information; practically, such tools will have to be fed with some prior “intelligence” regarding what should be looked for, e.g. samples for neural networks, or variables for statistical regression. Hence the need for some kind of formats, blueprints or templates that will help to frame rough data into information.
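The role of templates in framing rough data can be sketched in a few lines. This is only an illustration under stated assumptions: the `Template` class, the `frame` function, and the customer fields are hypothetical names introduced here, not part of any method described above. Fragments that carry an identity are attached to identified units; the others remain rough data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Template:
    """Hypothetical blueprint telling a 'miner' what to look for."""
    unit: str                    # kind of identified object, e.g. "customer"
    key: str                     # field carrying the identity of the unit
    features: Tuple[str, ...]    # fields retained as features of the unit


def frame(fragments, template):
    """Attach fragments to identified units; return (information, leftovers)."""
    information = {}
    leftovers = []
    for fragment in fragments:
        identity = fragment.get(template.key)
        if identity is None:
            leftovers.append(fragment)   # no anchor: stays rough data
            continue
        unit = information.setdefault(identity, {"unit": template.unit})
        for name in template.features:
            if name in fragment:
                unit[name] = fragment[name]
    return information, leftovers


# Illustrative fragments (made-up values).
fragments = [
    {"customer_id": "C1", "city": "Lyon"},
    {"customer_id": "C1", "segment": "retail"},
    {"city": "Paris"},                   # cannot be attached to any customer
]
template = Template(unit="customer", key="customer_id", features=("city", "segment"))
information, leftovers = frame(fragments, template)
print(information)
print(leftovers)
```

The template is the prior “intelligence”: without the `key` there is nothing to anchor fragments to, whatever the volume of data.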
Knowledge must be built from accurate and up-to-date information regarding external and internal states of affairs, and for that purpose information items must be managed according to their source, nature, type of anchor, life-cycle, and relevancy:
- Source: Government and administrations, NGOs, corporate media, social media, enterprises, systems, etc.
- Nature: events, decisions, data, opinions, assessments, etc.
- Type of anchor: individual, institution, time, space, etc.
- Life-cycle: instant, time-related, final.
- Relevancy: traceability with regard to business objectives, business operations, organization and systems management.
On that basis, knowledge management will have to map knowledge to its information footprint in terms of reliability (source, accuracy, consistency, obsolescence, etc.) and risks.
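As a minimal sketch, the five criteria above can be carried by each information item as plain attributes, with obsolescence derived from the life-cycle. The `InformationItem` class, the `valid_for` attribute, and the obsolescence rule are assumptions made for illustration; in particular, treating an “instant” item as obsolete after its recording date is only one possible reading.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class LifeCycle(Enum):
    INSTANT = "instant"
    TIME_RELATED = "time-related"
    FINAL = "final"


@dataclass(frozen=True)
class InformationItem:
    content: str
    source: str            # e.g. "government", "social media"
    nature: str            # e.g. "event", "decision", "opinion"
    anchor: str            # e.g. "individual", "institution", "time", "space"
    life_cycle: LifeCycle
    relevancy: str         # traced objective, operation, or management concern
    recorded_on: date
    valid_for: Optional[timedelta] = None   # assumed: used for time-related items only


def is_obsolete(item, today):
    """Assumed rule: final items never expire, instant ones expire after the
    recording date, time-related ones expire once their validity period is over
    (and never expire if no period is given)."""
    if item.life_cycle is LifeCycle.FINAL:
        return False
    if item.life_cycle is LifeCycle.INSTANT:
        return today > item.recorded_on
    return item.valid_for is not None and today > item.recorded_on + item.valid_for


# Made-up example item.
notice = InformationItem(
    content="New emission ceiling for delivery vehicles",
    source="government",
    nature="decision",
    anchor="institution",
    life_cycle=LifeCycle.TIME_RELATED,
    relevancy="business operations",
    recorded_on=date(2024, 1, 1),
    valid_for=timedelta(days=30),
)
print(is_obsolete(notice, date(2024, 1, 15)))   # False
print(is_obsolete(notice, date(2024, 3, 1)))    # True
```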
From Information to Knowledge
Information is meaningful, knowledge is also useful. Like information models, knowledge representations must first be anchored to persistency and execution units in order to support the consistency and continuity of surrogates’ identities (principle #1). Those anchors are to be assigned to domains managed by single organizational units in charge of ontological commitments, and enriched with structures, features, and associations (principle #2). Depending on their scope, structure or feature, semantics are to be managed respectively by persistent or application domains. Likewise, ontologies may target objects or aspects, the former being associated with structural sub-types, the latter with functional ones. Differences between information models and knowledge representation appear with rules and constraints. While the objective of information and control systems is to manage business objects and activities, the purpose of knowledge systems is to manage symbolic contents independently of their actual counterparts (principle #3). Standard rules used in system modelling describe allowed operations on objects, activities and associated information; they can be expressed forward or backward:
- Forward (aka push) rules are conditions on when and how operations are to be performed.
- Backward (aka pull) rules are constraints on the consistency of symbolic representations or on the execution of operations.
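The two kinds of rules can be illustrated with a small sketch. The `Order` class, the review threshold, and the function names are hypothetical; they only show a forward rule triggering an operation when its condition holds, and a backward rule checking the consistency of a symbolic representation.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 10_000.0   # hypothetical business threshold


@dataclass
class Order:
    quantity: int
    unit_price: float
    total: float
    needs_review: bool = False


def flag_large_order(order):
    """Forward (push) rule: when the condition holds, the operation is performed."""
    if order.total > REVIEW_THRESHOLD:
        order.needs_review = True
    return order


def is_consistent(order):
    """Backward (pull) rule: constraint on the consistency of the representation."""
    expected = order.quantity * order.unit_price
    return order.quantity > 0 and abs(order.total - expected) < 1e-9


large = flag_large_order(Order(quantity=3, unit_price=5000.0, total=15000.0))
broken = Order(quantity=2, unit_price=10.0, total=25.0)
print(large.needs_review, is_consistent(large), is_consistent(broken))
```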
Assuming a continuity between information and knowledge representations, the inflection point would be marked by the introduction of modalities used to qualify truth values, e.g. according to temporal and fuzzy logic:
- Temporal extensions put time stamps on truth values of information.
- Fuzzy logic puts confidence levels on truth values of information.
That is where knowledge systems depart from information and control ones as they introduce a new theory of intelligent reasoning, one based upon the fluidity and volatility of knowledge.
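A minimal sketch of such qualified truth values, assuming a validity interval for the temporal modality and a single confidence number in [0, 1] for the fuzzy one. The `QualifiedStatement` class, the `believed` function, and the 0.8 threshold are illustrative assumptions; this is a simplified reading, not a full temporal or fuzzy logic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class QualifiedStatement:
    statement: str
    confidence: float                    # assumed scale: 0.0 (none) to 1.0 (certain)
    valid_from: date                     # time stamp on the truth value
    valid_until: Optional[date] = None   # None means no known end of validity

    def holds_on(self, day):
        """True if the given day falls within the validity interval."""
        if day < self.valid_from:
            return False
        return self.valid_until is None or day <= self.valid_until


def believed(stmt, day, threshold=0.8):
    """A statement is acted upon only if it is valid on that day and trusted enough."""
    return stmt.holds_on(day) and stmt.confidence >= threshold


# Made-up statement.
solvency = QualifiedStatement(
    statement="supplier X is solvent",
    confidence=0.9,
    valid_from=date(2024, 1, 1),
    valid_until=date(2024, 12, 31),
)
print(believed(solvency, date(2024, 6, 1)))                   # True
print(believed(solvency, date(2025, 1, 1)))                   # False: out of date
print(believed(solvency, date(2024, 6, 1), threshold=0.95))   # False: not trusted enough
```

The same statement can thus be true for one decision and unusable for another, which is one way to picture the fluidity and volatility of knowledge.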
Meanings are in the Hands of Beholders
Seen in a corporate context, knowledge can be understood as information framed by contexts and driven by purposes: how to run a business, how to develop applications, how to manage systems. Hence the dual perspective: on one hand information is governed by enterprise concerns, systems functionalities, and platforms technology; on the other hand knowledge is driven by business processes, systems engineering, and services management.
That provides a clear and comprehensive taxonomy of artifacts, to be used to build knowledge from lower layers of information and data:
- Business analysts have to know about business domains and activities, organization and applications, and quality of service.
- System engineers have to know about projects, systems functionalities and platform implementations.
- System managers have to know about locations and operations, services, and platform deployments.
The dual perspective also points to the dynamics of knowledge, with information being pushed by its sources, and knowledge being pulled by its users.
A Time for Every Purpose
As understood by Cybernetics, enterprises are viable systems whose success depends on their capacity to counter entropy, i.e. the progressive downgrading of the information used to govern interactions both within the organization itself and with its environment. Compared to architecture knowledge, which is organized according to information contents, knowledge architecture is organized according to functional concerns and information lifespan, and its objective is to keep internal and external information in sync:
- Planning of business objectives and requirements (internal) relative to markets evolutions and opportunities (external).
- Assessment of organizational units and procedures (internal) in line with regulatory and contractual environments (external).
- Monitoring of operations and projects (internal) together with sales and supply chains (external).
That puts meanings (that would be knowledge) in the hands of decision makers, respectively for corporate strategy, organization, and operations. Moreover, enterprises being living entities, lifespan and functional sustainability are meant to coalesce into consistent and homogeneous layers:
- Enterprise (aka business, aka strategic) time-scales are defined by environments, objectives, and investment decisions.
- Organization (aka functional) time-scales are set by availability, versatility, and adaptability of resources.
- Operational time-scales are determined by process features and constraints.
Such a congruence of time-scales, architectures and purposes into Shearing Layers is arguably a key success factor of Knowledge Management.
Search and Stretch
As already noted, knowledge is driven by purposes, and purposes, not being confined to domains or preserves, are bound to stretch knowledge across business contexts and organizational boundaries. That can be achieved through search, logic, and classification.
- Searches collect the information relevant to users’ concerns (1). That may satisfy all the knowledge needs, or provide a backbone for further extension.
- Searches can be combined with ontologies (aka classifications) that put the same information under new lights (1b).
- Truth-preserving operations using mathematics or formal languages can be applied to produce derived information (2).
- Finally, new information with reduced confidence levels can be produced through statistical processing (3,4).
For instance, observed traffic at toll roads (1) is used for accounting purposes (2), to forecast traffic evolution (3), to analyze seasonal trends (1b) and simulate seasonal and variable tolls (4).
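The toll road example can be sketched as follows, with the numbered steps marked in comments. All figures, the flat and seasonal tolls, the confidence value, and the `Derived` class are made up for illustration; the forecast is a deliberately naive average, standing in for whatever statistical processing would actually be used.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Derived:
    value: float
    confidence: float   # 1.0 for facts and truth-preserving derivations


TOLL = 2.0                  # hypothetical flat toll per vehicle
SUMMER_TOLL = 3.0           # hypothetical seasonal toll per vehicle
FORECAST_CONFIDENCE = 0.7   # hypothetical confidence for statistical outputs

# (1) Search: observed monthly traffic (made-up figures).
observations = [
    {"month": 1, "vehicles": 1000},
    {"month": 2, "vehicles": 1200},
    {"month": 7, "vehicles": 2000},
    {"month": 8, "vehicles": 2200},
]


# (1b) Classification: the same facts seen under a seasonal light.
def season(month):
    return "summer" if month in (6, 7, 8) else "off-season"


by_season = {}
for obs in observations:
    by_season.setdefault(season(obs["month"]), []).append(obs["vehicles"])

# (2) Truth-preserving derivation: accounting keeps full confidence.
revenue = Derived(sum(obs["vehicles"] for obs in observations) * TOLL, confidence=1.0)

# (3) Statistical processing: a naive forecast, with reduced confidence.
summer_forecast = Derived(mean(by_season["summer"]), FORECAST_CONFIDENCE)

# (4) Simulation built on the forecast: it inherits the reduced confidence.
simulated_summer_revenue = Derived(summer_forecast.value * SUMMER_TOLL,
                                   summer_forecast.confidence)

print(revenue, summer_forecast, simulated_summer_revenue)
```

Keeping the confidence level attached to each derived value is what lets later decisions tell accounting figures from forecasts.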
Those operations entail clear consequences for knowledge management. As long as computational distances do not affect confidence levels, truth-preserving operations are neutral with regard to KM. Classifications are symbolic tools designed on purpose; as a consequence all knowledge associated with a classification should remain under the responsibility of its designer. Challenges arise when confidence levels are affected, either directly or through obsolescence. And since decision-making is essentially about risk management, dealing with partial or unreliable information cannot be avoided. Hence the importance of managing knowledge along shearing layers, each with its own information life-cycle, confidence requirements, and decision-making rules.
From Knowledge Architecture to Architecture Capability
Knowledge architecture is the corporate central nervous system, and as such it plays a primary role in the support of operational and managerial processes. That point is partially addressed by frameworks like Zachman’s, whose matrix organizes Information System Architecture (ISA) along capabilities and design levels. Yet, as illustrated by the design levels, the focus remains on information technology without explicitly addressing the distinction between enterprise, systems, and platforms.
That distinction is pivotal because it governs the distinction between corresponding processes, namely business processes, systems engineering, and services management. And once the distinction is properly established, knowledge architecture can be aligned with the assessment of processes.
Yet that will not be enough now that digital environments are invading enterprise systems, blurring the distinction between managed information assets and the continuous flows of big data.
A way has to be found to bridge the gap between big data and enterprise information models.
Knowledge Representation & Profiled Ontologies
Faced with digital business environments, enterprises must sort relevant and accurate information out of continuous and massive inflows of data. As modeling methods cannot cope with the open range of contexts, concerns, semantics, and formats, looser schemes are needed; that is precisely what ontologies are meant to provide:
- Thesaurus: ontologies covering terms and concepts.
- Documents: ontologies covering documents with regard to topics.
- Business: ontologies of relevant enterprise organization and business objects and activities.
- Engineering: symbolic representation of organization and business objects and activities.
Profiled ontologies can then be designed by combining that taxonomy of concerns with contexts, e.g:
- Institutional: Regulatory authority, steady, changes subject to established procedures.
- Professional: Agreed upon between parties, steady, changes subject to accords.
- Corporate: Defined by enterprises, changes subject to internal decision-making.
- Social: Defined by usage, volatile, continuous and informal changes.
- Personal: Customary, defined by named individuals (e.g. research paper).
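Combining the two lists above amounts to a cross product of concerns and contexts. The sketch below assumes plain Python enumerations and a hypothetical `Profile` class; it is not the structure of the ontological kernel mentioned further down, only an illustration of how profiles could be enumerated and qualified by who defines them.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Concern(Enum):
    THESAURUS = "terms and concepts"
    DOCUMENTS = "documents with regard to topics"
    BUSINESS = "enterprise organization, business objects and activities"
    ENGINEERING = "symbolic representation of organization and business objects"


class Context(Enum):
    INSTITUTIONAL = "institutional"
    PROFESSIONAL = "professional"
    CORPORATE = "corporate"
    SOCIAL = "social"
    PERSONAL = "personal"


# Who defines the ontology in each context, as listed above.
DEFINED_BY = {
    Context.INSTITUTIONAL: "regulatory authority",
    Context.PROFESSIONAL: "agreement between parties",
    Context.CORPORATE: "enterprise",
    Context.SOCIAL: "usage",
    Context.PERSONAL: "named individuals",
}


@dataclass(frozen=True)
class Profile:
    concern: Concern
    context: Context

    @property
    def defined_by(self):
        return DEFINED_BY[self.context]


# One profile per (concern, context) pair.
profiles = [Profile(concern, context) for concern, context in product(Concern, Context)]

corporate_business = Profile(Concern.BUSINESS, Context.CORPORATE)
print(len(profiles), corporate_business.defined_by)
```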
Last but not least, external (regulatory, businesses, …) and internal (i.e. enterprise architecture) ontologies could be integrated, for instance with the Zachman framework.
Using profiled ontologies to manage enterprise architecture and corporate knowledge will help to align knowledge management with EA governance by setting apart ontologies defined externally (e.g. regulations) from the ones set through decision-making, whether strategic (e.g. platform) or tactical (e.g. partnerships).
An ontological kernel has been developed as a Proof of Concept using Protégé/OWL 2; a beta version is available for comments on the Stanford/Protégé portal with the link: Caminao Ontological Kernel (CaKe).
From Data Analysis to Deep Learning
Set between an all-inclusive onslaught of data on one side and pervasive smart bots on the other, information systems could lose their identity and purpose. And there is a good reason for that, namely the confusion between data, information, and knowledge.
As it happened aeons ago, ontologies have been explicitly thought up to deal with that issue.
- Architecture Capabilities
- Architecture Capabilities and Requirements
- Ontologies & Enterprise Architecture
- Economics of Reuse
- Sifting through a web of things
- Semantic Web: from Things to Memes