NIEM & Information Exchanges

Preamble

The objective of the National Information Exchange Model (NIEM) is to provide a “dictionary of agreed-upon terms, definitions, relationships, and formats that are independent of how information is stored in individual systems.”

(Alfred Jensen)
NIEM’s model makes no difference between data and information (Alfred Jensen)

For that purpose NIEM’s model combines commonly agreed core elements with community-specific ones. Weighted against the benefits of simplicity, this architecture overlooks critical distinctions:

  • Inputs: Data vs Information
  • Dictionary: Lexicon and Thesaurus
  • Meanings: Lexical Items and Semantics
  • Usage: Roots and Aspects

That shallow understanding of information significantly hinders the exchange of information between business or institutional entities across overlapping domains.

Inputs: Data vs Information

Data is made of unprocessed observations, information makes sense of data, and knowledge makes use of information. Given that NIEM is meant to be an exchange between business or institutional users, it should have no concern with data mining or knowledge management.

Data is meaningless, information meaning is set by semantic domains.
As an exchange, NIEM should have no concern with data mining or knowledge management.

The problem is that, as conveyed by “core of data elements that are commonly understood and defined across domains, such as person, activity, document, location”, NIEM’s model makes no explicit distinction between data and information.

As a corollary, it implies that data may not only be meaningful, but universally so, which leads to a critical trap: as substantiated by data analytics, data is not supposed to mean anything before processed into information; to keep with examples, even if the definition of persons and locations may not be specific, the semantics of associated information is nonetheless set by domains, institutional, regulatory, contractual, or otherwise.

Data is meaningless, information meaning is set by semantic domains.
Data is meaningless, information meaning is set by semantic domains.

Not surprisingly, that medley of data and information is mirrored by NIEM’s dictionary.

Dictionary: Lexicon & Thesaurus

As far as languages are concerned, words (e.g “word”, “ξ∏¥” ,”01100″) remain data items until associated to some meaning. For that reason dictionaries are built on different levels, first among them lexical and semantic ones:

  • Lexicons take items on their words and gives each of them a self-contained meaning.
  • Thesauruses position meanings within overlapping galaxies of understandings held together by the semantic equivalent of gravitational forces; the meaning of words can then be weighted by the combined semantic gravity of neighbors.

In line with its shallow understanding of information, NIEM’s dictionary only caters for a lexicon of core standalone items associated with type descriptions to be directly implemented by information systems. But due to the absence of thesaurus, the dictionary cannot tackle the semantics of overlapping domains: if lexicons alone can deal with one-to-one mappings of items to meanings (a), thesauruses are necessary for shared (b) or alternative (c) mappings.

vv
Shared or alternative meanings cannot be managed with lexicons

With regard to shared mappings (b), distinct lexical items (e.g qualification) have to be mapped to the same entity (e.g person). Whereas some shared features (e.g person’s birth date) can be unequivocally understood across domains, most are set through shared (professional qualification), institutional (university diploma), or specific (enterprise course) domains .

Conversely, alternative mappings (c) arise when the same lexical items (e.g “mole”) can be interpreted differently depending on context (e.g plastic surgeon, farmer, or secret service).

Whereas lexicons may be sufficient for the use of lexical items across domains (namespaces in NIEM parlance), thesauruses are necessary if meanings (as opposed to uses) are to be set across domains. But thesauruses being just tools are not sufficient by themselves to deal with overlapping semantics. That can only be achieved through a conceptual distinction between lexical and semantic envelops.

Meanings: Lexical Items & Semantics

NIEM’s dictionary organize names depending on namespaces and relationships:

  • Namespaces: core (e.g Person) or specific (e.g Subject/Justice).
  • Relationships: types (Counselor/Person) or properties (e.g PersonBirthDate).
vvv
NIEM’s Lexicon: Core (a) and specific (b) and associated core (c) and specific (d) properties

But since lexicons know only names, the organization is not orthogonal, with lexical items mapped indifferently to types and properties. The result being that, deprived of reasoned guidelines, lexical items are chartered arbitrarily, e.g:

Based on core PersonType, the Justice namespace uses three different schemes to define similar lexical items:

  • “Counselor” is described with core PersonType.
  • “Subject” and “Suspect” are both described with specific SubjectType, itself a sub-type of PersonType.
  • “Arrestee” is described with specific ArresteeType, itself a sub-type of SubjectType.

Based on core EntityType:

  • The Human Services namespace bypasses core’s namesake and introduces instead its own specific EmployerType.
  • The Biometrics namespace bypasses possibly overlapping core Measurer and BinaryCaptured and directly uses core EntityType.
Lexical items are meshed disregarding semantics
Lexical items are chartered arbitrarily

Lest expanding lexical items clutter up dictionary semantics, some rules have to be introduced; yet, as noted above, these rules should be limited to information exchange and stop short of knowledge management.

Usage: Roots and Aspects

As far as information exchange is concerned, dictionaries have to deal with lexical and semantic meanings without encroaching on ontologies or knowledge representation. In practice that can be best achieved with dictionaries organized around roots and aspects:

  • Roots and structures (regular, black triangles) are used to anchor information units to business environments, source or destination.
  • Aspects (italics, white triangles) are used to describe how information units are understood and used within business environments.
nformation exchanges are best supported by dictionaries organized around roots and aspects
Information exchanges are best supported by dictionaries organized around roots and aspects

As it happens that distinction can be neatly mapped to core concepts of software engineering.

P.S. Thesauruses & Ontologies

Ontologies are systematic accounts of existence for whatever is considered, in other words some explicit specification of the concepts meant to make sense of a universe of discourse. From that starting point three basic observations can be made:

  1. Ontologies are made of categories of things, beings, or phenomena; as such they may range from simple catalogs to philosophical doctrines.
  2. Ontologies are driven by cognitive (i.e non empirical) purposes, namely the validity and consistency of symbolic representations.
  3. Ontologies are meant to be directed at specific domains of concerns, whatever they can be: politics, religion, business, astrology, etc.

With regard to models, only the second one puts ontologies apart: contrary to models, ontologies are about understanding and are not supposed to be driven by empirical purposes.

On that basis, ontologies can be understood as thesauruses describing galaxies of concepts (stars) and features (planets) held together by semantic gravitation weighted by similarity or proximity. As such ontologies should be NIEM’s tool of choice.

Further Reading

External Links

One thought on “NIEM & Information Exchanges”

  1. I would suggest that NIEM will go the way of the dinosaurs as Natural Language algorithms take hold in a few years. The problem with NIEM is that it isn’t finding and developing new concepts that it can support and build upon but instead it grows knowledge in the rear view mirror. It isn’t a total waste because you can put the NIEM dictionary within the algorithms, but why spend all that time and energy plus money to do so when you can get better integration of data from recognizing the patterns of the language instead of the language itself.

Leave a Reply

Discover more from Caminao's Ways

Subscribe now to keep reading and get access to the full archive.

Continue reading