Conceptual Thesaurus: Overview


As far as systems are concerned, the purpose of conceptual modeling is to align specific business concerns with shared systems representations. That can only be achieved with some kind of neutrality between knowledge management on one side and systems design on the other.

Thesauruses are Ontologies in binders (Giorgio de Chirico)

Ontologies & Thesauruses

Ontologies are made of symbolic categories pertaining to the description and understanding of a particular domain of discourse. With regard to the conceptual modeling of systems, an ontology would use open concepts to deal simultaneously with two different domains of discourse: business operations on one hand, supporting systems on the other hand. Since that duality is the essence of conceptual modeling, it should be reflected in the organization of the associated thesauruses.

Individuals, Concepts, Artifacts

As noted above, the aim of a system conceptual thesaurus is to bring together two types of descriptions: one targeting the instances of business objects and processes, the other the software artifacts to be implemented by supporting systems. Not by chance, that distinction neatly coincides with different modeling targets and schemes:

  • Processes vs Capabilities. Business concerns are specific and changing, systems architectures are supposed to be shared and stable.
  • Extensions vs Intensions. The former describe sets of instances (business individuals), the latter specify types (software artifacts).

A conceptual thesaurus will introduce concepts to serve as pivots between sets of business individuals (extensions) and types of systems artifacts (intensions).

With regard to systems conceptual modeling, a thesaurus is meant to cross extensions (left) with intensions (right)
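
The pivot role of concepts can be pictured with a minimal sketch. It is only an illustration under stated assumptions: the `Concept` and `Thesaurus` classes, the `artifacts_for` helper, and the sample individual and artifact names are hypothetical and do not come from any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A thesaurus entry used as a pivot (hypothetical structure)."""
    name: str
    # Extension: the business individuals the concept denotes.
    extension: set[str] = field(default_factory=set)
    # Intension: the types of system artifacts that realize the concept.
    intension: set[str] = field(default_factory=set)


class Thesaurus:
    def __init__(self) -> None:
        self._concepts: dict[str, Concept] = {}

    def concept(self, name: str) -> Concept:
        # Create the entry on first use, then always return the same one.
        return self._concepts.setdefault(name, Concept(name))

    def artifacts_for(self, individual: str) -> set[str]:
        """Cross from a business individual to artifact types via concepts."""
        found: set[str] = set()
        for entry in self._concepts.values():
            if individual in entry.extension:
                found |= entry.intension
        return found


# Illustrative data only.
thesaurus = Thesaurus()
patient = thesaurus.concept("Patient")
patient.extension.add("patient admitted on ward 3")
patient.intension.update({"PatientRecord", "AdmissionForm"})
```

Neither side refers to the other directly: individuals and artifact types only meet through the concept, which is the point of the pivot.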

Moreover, being a conceptual pivot between business and systems descriptions, such a thesaurus can also be used to bring the business specific domains under a shared conceptual roof. That conclusion is borne out by the congruence of that approach with the traditional triangle popularized by the theories of signs (cf. J. Sowa, “Signs and Reality”).

Using a Thesaurus

Translating business requirements into software artifacts can be done directly and iteratively or through phased processes.

The direct approach is epitomized by agile development models, with project teams taking full responsibility for the realization of user stories into applications. In that case a thesaurus will help to map the specifics of stories to shared systems features and functionalities.

Phased approaches are necessary when shared ownership and continuous delivery cannot be guaranteed. In that case reusing existing shared features and functionalities will not be enough, as the new artifacts will have to take into account external concerns and dependencies.

As illustrated by waterfall development schemes, administrative management of cross requirements often turns into cumbersome and error-prone procedures. Yet most of these procedures may be preempted by weaving the individual threads of business stories through established blueprints of systems functional architecture. That should be the primary requirement for the thesaurus user interface and associated use cases.

User Interface (Mockup)

As far as the conceptual modeling of systems is concerned, a thesaurus has to meet two core requirements:

  • Concepts must be modeled independently of their business or system realization (no encroaching from system architecture capabilities).
  • The scope and relevancy of modeled concepts are to be defined by their potential realization by system capabilities (no inroads into knowledge management).

On that basis the thesaurus distinguishes between textual requirements (bottom left), concepts (bottom center), and the system capabilities meant to support the actualization of concepts (bottom right and top).

With regard to concepts, a distinction should be made between domain specific concepts and the ones open to shared semantics and realization.

With regard to capabilities, a thesaurus should prevent any confusion between actual (top left, green border) and symbolic ones (top right, blue border).

Concepts (bottom) can be associated with actual categories (top left) and their symbolic representation (top right). Diagrams can be opened in between.

Taking the concept of Patient for example, it can inherit its structure and identification mechanism from the Person open concept, and functional features from a more general, enterprise-defined Customer concept. When capabilities are considered, the concept will be associated with a physical entity (e.g., for geo-localization), a business entity, and roles (actors in UML parlance). Features and connectors will be defined specifically for each model, respectively: location and communication channel (physical entity); responsibilities and authority (organizational entity); managed properties and relationships (symbolic surrogates).
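
The Patient example could be sketched as follows. This is a hedged illustration, not the thesaurus's actual data model: the `Concept` class, the `scope` labels ("open", "enterprise", "domain"), and the `realizations` dictionary are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    scope: str  # hypothetical labels: "open", "enterprise", or "domain"
    parents: list["Concept"] = field(default_factory=list)
    # Kind of capability -> features defined for that kind of model.
    realizations: dict[str, list[str]] = field(default_factory=dict)

    def ancestors(self) -> list[str]:
        """Names of the concepts this one inherits from, depth first."""
        names: list[str] = []
        for parent in self.parents:
            names.append(parent.name)
            names.extend(parent.ancestors())
        return names


person = Concept("Person", scope="open")
customer = Concept("Customer", scope="enterprise")
patient = Concept(
    "Patient",
    scope="domain",  # assumption: Patient is treated as domain specific
    parents=[person, customer],
    realizations={
        "physical entity": ["location", "communication channel"],
        "organizational entity": ["responsibilities", "authority"],
        "symbolic surrogate": ["managed properties", "relationships"],
    },
)
```

Inheritance stays on the concept side, while features are attached per kind of capability, so the same concept can be realized differently in each model.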

Concepts and capabilities can then be combined in models (center) describing enterprise, systems (functional architecture), and platforms (technical architecture).

It must be stressed that these levels of indirection between concepts, business individuals, and systems artifacts are a key success factor for systems conceptual thesauruses:

  • Being freed from any modeling contingencies, concepts can be defined without being overloaded with ambiguous or irrelevant information.
  • Semantics of features and connectors defined by business stories can be managed independently of systems artifacts semantics.

With the benefit of that double distinction (individuals vs concepts vs artifacts), it will be easier to weave specific business categories into shared types of systems artifacts.

Blueprints of Concepts Realization

Blueprints of conceptual realization (through system capabilities) are best understood as plain functional (aka analysis) patterns, as they describe straightforward translations of business categories into software artifacts. As with all patterns, blueprints should be built around a limited number of functional constructs meant to apply to all individuals, objects as well as behaviors:

  • Identification of individuals on both sides of the actual/symbolic divide.
  • Structures binding elements to identified individuals.
  • Partitioning of individuals.
  • References between individuals.

Blueprints are built with only four basic constructs for identification (#), structures (+), partitions (/2), and connectors (>).
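
The four constructs can be treated as a small closed vocabulary. The sketch below is only one possible encoding, assuming hypothetical `Construct` and `Element` types and a `render` helper; only the four symbols themselves are taken from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Construct(Enum):
    IDENTIFICATION = "#"
    STRUCTURE = "+"
    PARTITION = "/2"
    CONNECTOR = ">"


@dataclass(frozen=True)
class Element:
    construct: Construct
    subject: str
    target: Optional[str] = None  # only connectors reference another individual

    def __post_init__(self) -> None:
        is_connector = self.construct is Construct.CONNECTOR
        if is_connector and self.target is None:
            raise ValueError("a connector needs a target")
        if not is_connector and self.target is not None:
            raise ValueError("only connectors take a target")


def render(element: Element) -> str:
    """Write one blueprint element with the symbols used in the text."""
    if element.construct is Construct.CONNECTOR:
        return f"{element.subject} > {element.target}"
    return f"{element.construct.value} {element.subject}"


# Illustrative elements only.
blueprint = [
    Element(Construct.IDENTIFICATION, "Person"),
    Element(Construct.STRUCTURE, "Role"),
    Element(Construct.CONNECTOR, "Treatment", "Pathology"),
]
```

Keeping the vocabulary in an enumeration makes it closed by construction: a blueprint cannot use a construct outside the four listed.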

Blueprints should also come as a small and closed set of well-defined configurations. Two simple examples from health management can illustrate the point.

Example: Persons and Roles

Proper modeling of persons and roles is a key success factor of functional architecture as it will determine consistency of data as well as confidentiality.

  • Persons are identified (#) as active physical individuals possibly characterized (+) by states for expectations.
  • Individuals identified externally are to be anchored to their symbolic surrogates identified by the system (#).
  • The roles they play (nurse, patient, etc.) are to be represented separately (+), with power-types (2) added to describe roles independently of the persons playing them.

A conceptual blueprint for active persons and roles
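
A minimal sketch of this blueprint, under stated assumptions, might look as follows. The class names, the `may_access` attribute used to illustrate confidentiality, and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoleType:
    """Power-type: describes a role independently of who plays it."""
    name: str
    may_access: frozenset[str]  # assumption: confidentiality as access rights


@dataclass
class Role:
    role_type: RoleType
    since: str  # ISO date, illustrative


@dataclass
class PersonSurrogate:
    system_id: int         # identified by the system (#)
    external_id: str       # anchor to the externally identified individual
    state: str = "active"  # characterized by states (+)
    roles: list[Role] = field(default_factory=list)  # kept separate (+)

    def can_access(self, resource: str) -> bool:
        return any(resource in role.role_type.may_access for role in self.roles)


# Illustrative data only.
nurse_type = RoleType("Nurse", frozenset({"care plan", "medication chart"}))
alice = PersonSurrogate(system_id=1, external_id="national-id-0001")
alice.roles.append(Role(nurse_type, since="2024-01-15"))
```

Because access rights hang on the role type rather than on the person, the same individual can hold several roles without the person record having to change.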

Variants could be introduced, e.g., for passive roles (no interactions between individuals and systems) or exclusive ones.

Example: Processes and Activities

Processes are best described along two perspectives: execution control and business logic. Assuming that business logic may be applied in various operational contexts, it should be defined independently of execution.

  • Pathologies are documentary entities (#) and categories (2).
  • Treatments are symbolic descriptions of activities (#) associated with pathologies (>), information (+), and categories (2).
  • Their execution is to be identified by reference to individuals (#>) and associated with states (+).

A conceptual blueprint with variants for process execution

That should provide the basic blueprint, with variants set with regard to real-time constraints:

  • Real-time treatments driven by activities (e.g., surgery) must anchor actual states and their symbolic representation.
  • Real-time treatments driven by events (e.g., emergencies) are identified accordingly.
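
Leaving the real-time variants aside, the basic blueprint could be sketched as below. It is a sketch under assumptions: the class names, the three execution states, and the sample pathology code are illustrative and not part of the original blueprint.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionState(Enum):
    PLANNED = "planned"
    ONGOING = "ongoing"
    DONE = "done"


@dataclass(frozen=True)
class Pathology:
    code: str      # documentary entity (#)
    category: str  # category (2)


@dataclass(frozen=True)
class Treatment:
    """Business logic: a symbolic description, independent of execution."""
    name: str             # identified description of activities (#)
    pathology: Pathology  # connector to a pathology (>)
    instructions: str     # attached information (+)


@dataclass
class TreatmentExecution:
    """Execution control: identified by reference to individuals (#>)."""
    treatment: Treatment
    patient_id: str
    state: ExecutionState = ExecutionState.PLANNED  # associated states (+)

    def advance(self) -> None:
        # Move to the next state; stay put once the last one is reached.
        order = list(ExecutionState)
        position = order.index(self.state)
        if position < len(order) - 1:
            self.state = order[position + 1]


# Illustrative data only.
flu = Pathology(code="P-001", category="respiratory")
rest = Treatment("Rest and fluids", flu, "Seven days of rest")
first = TreatmentExecution(rest, patient_id="patient-1")
second = TreatmentExecution(rest, patient_id="patient-2")
first.advance()
```

The same `Treatment` description is shared by both executions, which is what defining business logic independently of execution amounts to here.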

But if lessons from design patterns are to be learnt, the potential benefits of reusing tested blueprints largely depend on a supporting environment. That would be where a thesaurus would help.

Blueprints & Thesaurus

If the aim of conceptual modeling is to manage and align definitions and descriptions on the two sides of the business/system divide, each side must be managed separately. On that basis, conceptual thesauruses could be used to support the mapping of business requirements into analysis models.

That can be illustrated by applying the “Person & Role” blueprint to the concept of Nurse. To begin with, a conceptual thesaurus will get rid of two vexing hurdles:

  • Crossed abstractions: concepts are by nature polymorphic, e.g., nurses can be understood as persons or staff. Since concepts are not tied to artifact descriptions, there is no need for hasty decisions about inheritance semantics.
  • Naming: instead of imposing labeling conventions (with additional routines), concept names can be applied as they are to different artifacts.
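
Both points can be illustrated with a small sketch in which one concept name is applied unchanged to artifacts in different models. The `Artifact` and `ConceptIndex` classes and the model and kind labels are hypothetical, introduced only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    model: str  # which model the artifact belongs to
    kind: str   # what the concept is realized as in that model
    name: str   # reuses the concept name unchanged


class ConceptIndex:
    def __init__(self) -> None:
        self._by_concept: dict[str, list[Artifact]] = {}

    def realize(self, concept: str, model: str, kind: str) -> Artifact:
        # The artifact takes the concept name as it is: no labeling convention.
        artifact = Artifact(model=model, kind=kind, name=concept)
        self._by_concept.setdefault(concept, []).append(artifact)
        return artifact

    def views(self, concept: str) -> list[str]:
        return [
            f"{a.model}: {a.name} as {a.kind}"
            for a in self._by_concept.get(concept, [])
        ]


# Illustrative data only.
index = ConceptIndex()
index.realize("Nurse", model="organization", kind="staff member")
index.realize("Nurse", model="functional architecture", kind="person record")
```

No inheritance link is declared between the two realizations; the concept alone ties them together, so the choice between "nurse as person" and "nurse as staff" can be deferred.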

At first sight, a conceptual thesaurus may look like DB generation languages that transform conceptual models into logical ones; that would be misleading:

  • While generation languages perform well for data models or for domain specific class diagrams, they don’t deal with conceptual modeling.
  • While the so-called wizards used to generate logical models seem to rely on stereotypes, those are hard-coded formats that cannot be managed as concepts footprints.

Nonetheless, smart DB management tools point in the right direction, and a conceptual thesaurus could be seen as some upstream intelligent associate.

Thesaurus, Model Based Systems Engineering, & Enterprise Architecture

A thesaurus would be of limited use were it not to feed the engineering process. And since thesauruses stand on the business/system divide, they have to support the translation of business expectations, defined by business analysts, into functional requirements put into form by systems analysts. Continuing with the Nurse and Hospital examples:

  • Business analysts may consider a “work in” connector as organizational (nurse as staff member) or physical (nurse as person) and leave to systems analysts the definition of the corresponding features and relationships in the nurse record.
  • Likewise, the way nurses attend to patients, deal with emergencies (events), or carry out treatments (activities) is better defined independently of the corresponding systems artifacts.

That brings back the problem of mapping specific business concepts to their system realization without coercing business concerns into predefined functionalities or, conversely, degrading systems architectures by heaping up user stories without functional consolidation. Tackling the issue locally for each domain is arguably a poor option compared to one based on a conceptual consolidation of business domains. Whereas a conceptual thesaurus built on the distinction between concepts and artifacts will mark a significant advance, the benefits will be compounded by using open concepts, first as a semantic roof over business specific domains, and then as a nexus between businesses and systems.

Set within the broader perspective of model based system engineering, these benefits could be reaped whatever the development model:

  • For phased ones, a conceptual thesaurus would wrap up processes hitherto missing a formal inception, as can be illustrated by MDA’s computation independent models (CIMs).
  • For agile ones, a conceptual thesaurus could provide scaled agile schemes with the glue holding user stories together.

That is to bring business processes and systems architectures under a shared conceptual roof; the next step would be to bring in data analytics and business intelligence.

From Thesaurus to Ontologies

Thesauruses and models can be understood as special cases of ontologies, opening the door to a comprehensive and consistent approach to information systems:

  • Thesaurus: ontologies covering terms and concepts.
  • Document Management: ontologies covering documents with regard to topics.
  • Organization and Business: ontologies pertaining to enterprise organization, objects and activities.
  • Engineering: ontologies pertaining to the symbolic representation of products and services.


That could be used to support the integration of information processing, from data mining to knowledge management and decision making.


Thesauruses built with ontology languages can be used as a glue between enterprise architecture and software engineering, as illustrated by basic blueprints defined in terms of identification (#), updates (+) and references (:):

  • New concept represented by a new entity with possible updates to domains and partitions.
  • Business domains, with possible updates to events, activities (business functions), and partitions.
  • Application domains, with possible updates to roles, events, activities, entities, and functional partitions.
  • Business processes, with possible updates to domains, entities, roles, events, activities (business functions), and partitions.
  • User stories, identified as activities, with possible updates to roles (as defined by organization), events, entities, and partitions (branches).
  • Use cases, identified as activities, with possible updates to locations, domains, entities, roles (as UML actors), events, activities (business functions), and partitions (extension points).
  • Business (external) events, as represented by entities, with possible updates to roles (as UML actors), processes, and partitions.
  • Business roles, as represented by entities, with possible updates to agents, events, activities, processes, and partitions.
  • Real time applications, as processes with possible updates to active objects and agents.

Basic thesaurus blueprints

A workbench built with OWL 2 is available for comments on the Stanford/Protégé portal using the link: Caminao Ontological Kernel (CaKe_WIPg).
