When its first version was published twenty years ago, the prognosis for OMG’s UML (Unified Modeling Language) was one of rapid and wide expansion. That did not happen: despite noteworthy usage, UML has not become “the” unified modeling language. Beyond the diverse agendas of methods and tools providers, this shortfall may have something to do with a lack of robust semantics, as illustrated by UML 2.5’s halfhearted attempt to define individuals.
UML 2.5’s Aborted Attempt with Individuals
UML 2.5 has often been presented as an attempt to redress the wayward and increasingly convoluted paths taken by the previous versions. Yet, besides some useful (and long needed) clarifications and adjustments, its governing group failed to agree on a compact and unambiguous semantics and had to content itself with perfunctory guidelines introduced as an afterthought.
As a matter of fact the OMG committee may have tried to get its semantics in order, as suggested by the chapter “On semantics” making the point right away with a distinction between the things to be described and the categories to be applied:
“A UML model consists of three major categories of model elements [classifiers, events, and behaviors], each of which may be used to make statements about different kinds of individual things within the system being modeled (termed simply “individuals” in the following)”.
That straightforward understanding (UML is meant to describe individual objects, events, or behaviors) could have provided the semantic cornerstone of a sound approach. Surprisingly, and for obscure reasons, it is soon smudged by syntactical overlapping and semantic ambiguity, the term “individual” being used indifferently as adjective and noun, and then appears to be restricted to classifiers only. That leaves UML with a dearth of clear semantics regarding its scope.
Individual as a Semantic Master Key
The early dismissal of the individual as a constitutive concept is unfortunate because, taking a leaf from Archimedes, it could have been the fulcrum on which to place the UML lever.
To begin with, no modeling language, especially one supposed to be unified, can do without some convincing semantics about what it is supposed to describe:
- Analysis (aka extensional) models are descriptive as they deal with external individuals, things or behaviors, capturing their relevant aspects.
- Design (aka intensional) models are prescriptive as they describe how to build software components and execute processes.
As analysis and design models serve different purposes, they clearly differ. Nonetheless, since UML is meant to straddle the divide between business and systems realms, some rigorous mechanism must ensure a persistent and consistent mapping of individuals.
That could have been neatly achieved with a comprehensive and unified interpretation of individuals, combined with a clear taxonomy of the aspects to be modeled:
- Individuals are whatever occurrences (in business or systems contexts) with identities of their own.
- These individuals (objects, events, or behaviors) can be specified with regard to their structure and relationships.
The logical primacy of this approach is reinforced by its immediate, practical, and conclusive benefits for business process modeling on one side and model based engineering processes on the other.
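The unified reading of individuals outlined above can be illustrated with a minimal sketch. It assumes nothing from the UML specification itself: the names `Kind`, `Individual`, and `relate` are hypothetical and only show objects, events, and behaviors sharing one identity mechanism while being described by structure and relationships.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class Kind(Enum):
    """The three kinds of individuals named in the text (illustrative)."""
    OBJECT = "object"
    EVENT = "event"
    BEHAVIOR = "behavior"


_ids = count(1)  # hypothetical identity generator shared by all kinds


@dataclass
class Individual:
    kind: Kind
    label: str
    # Structure: features bound to the individual's own identity.
    structure: dict = field(default_factory=dict)
    # Relationships: (role, identity of another individual).
    relationships: list = field(default_factory=list)
    identity: int = field(default_factory=lambda: next(_ids))

    def relate(self, role, other):
        """Record a relationship by reference to the other's identity."""
        self.relationships.append((role, other.identity))


customer = Individual(Kind.OBJECT, "customer", {"name": "Ada"})
order_placed = Individual(Kind.EVENT, "order placed", {"at": "2024-01-01T10:00"})
fulfil_order = Individual(Kind.BEHAVIOR, "fulfil order")

order_placed.relate("raised by", customer)
fulfil_order.relate("triggered by", order_placed)
```

In this sketch a behavior is identified in exactly the same way as an object or an event, so it can be referenced across models instead of being folded into the description of some active object.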
A Key to Business Processes Modeling
As far as business processes are concerned, modeling the part played by supporting systems turns on a few critical issues, and these issues can be dealt with more clearly and consistently when set in reference to the taxonomy of individuals (objects, behaviors, events), e.g.:
- Functional or non-functional requirements? The former can be associated with individuals, the latter cannot.
- Architecture or application? The former affects the specification of interactions between individuals, the latter affects only their local features.
- Synchronous or asynchronous? Specifications can only be made with regard to life-cycles and time-frames: system (objects), process (behaviors), or instant (events).
- Structures or relationships? The former are bound to individuals’ identity, the latter are used between different individuals.
- Inheritance or delegation? The former is used for the specification of structural or functional features, the latter for individuals’ behaviors.
More generally that understanding of individuals should greatly enhance the alignment of systems functional architectures with business processes.
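The last distinction in the list above, inheritance versus delegation, can be sketched as follows. This is only an illustration under assumed names (`Account`, `SavingsAccount`, and `AuditService` are hypothetical): features are specified once and inherited, whereas behavior is delegated to another individual with its own identity.

```python
class Account:
    """Structural and functional features, specified once and inherited."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


class AuditService:
    """A distinct individual to which other individuals delegate behavior."""

    def __init__(self):
        self.log = []

    def record(self, account, action):
        self.log.append((account.owner, action))


class SavingsAccount(Account):
    """Inherits features from Account; delegates auditing to a collaborator."""

    def __init__(self, owner, auditor, balance=0, rate=0.02):
        super().__init__(owner, balance)  # inheritance: shared features
        self.rate = rate
        self.auditor = auditor            # delegation: a separate individual

    def deposit(self, amount):
        super().deposit(amount)
        self.auditor.record(self, f"deposit {amount}")


auditor = AuditService()
savings = SavingsAccount("Ada", auditor)
savings.deposit(100)
```

The auditor keeps its own identity and life-cycle, so it can be shared between accounts or replaced without touching the features inherited from `Account`.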
A Key to Model Based Systems Engineering
As should be expected from a lack of semantic foundations, one of the main characteristics of the UML community is its fragmented practices, regrouped around diagrams (e.g., Use case or Class) or tasks (e.g., requirements analysis or code generation).
The challenge can be directly observed in model based system engineering and software development: with the exception of Statecharts for RT modeling, Class diagrams are the only ones used all along engineering processes; the others, when used, are reduced to documentation purposes. That bottleneck in development flows can be seen as the direct consequence of UML’s restricted semantics: since behaviors are not identified as individuals in their own right, their descriptions cannot be directly translated into software artifacts, but have to be understood as part of active objects’ descriptions before being translated into class diagrams. Hence the apparent redundancy of the corresponding diagrams.
As a corollary, reinstating a unified semantics of individual for both classifiers and behaviors could be the key to a seamless integration of the main UML diagrams. Most important, that would bear out the cross benefits of combining UML and MBSE.
A Key to Enterprise Architecture
Enterprise architecture can be defined in terms of territories and maps, the former materialized by the realities of enterprise organization and business operations, the latter by the assortment of charts, blueprints, or models used to describe enterprise organization, business processes, IT systems, and software applications. On that understanding, the whole endeavor depends on the ability to manage the continuity and consistency of charting; and that cannot be done without a unified and persistent mechanism for identifying objects and processes across business, enterprise, and systems.
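What such an identification mechanism might look like can be sketched in a few lines. The `IdentityRegistry` class, its methods, and the layer names below are all hypothetical, chosen only to show one identity being charted consistently on several maps.

```python
import uuid
from collections import defaultdict


class IdentityRegistry:
    """Hypothetical registry: one persistent identity, many maps (layers)."""

    def __init__(self):
        # identity -> {layer name: model element describing the individual}
        self._maps = defaultdict(dict)

    def register(self, layer, element, identity=None):
        """Chart an individual on a layer, reusing its identity if given."""
        identity = identity or str(uuid.uuid4())
        self._maps[identity][layer] = element
        return identity

    def trace(self, identity):
        """Return every representation of the same individual."""
        return dict(self._maps.get(identity, {}))

    def missing(self, layers):
        """Identities not charted on every one of the given layers."""
        required = set(layers)
        return [i for i, charted in self._maps.items()
                if not required.issubset(charted)]


registry = IdentityRegistry()
customer_id = registry.register("business", "Customer (sales process)")
registry.register("systems", "customers table", identity=customer_id)
order_id = registry.register("business", "Order fulfilment")

uncharted = registry.missing(["business", "systems"])
```

Here `uncharted` holds only the identity of the order fulfilment process, which is described at the business level but has no counterpart yet at the systems level: the kind of continuity gap that a shared identification mechanism is meant to expose.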
7 thoughts on “UML’s Semantic Master Key, Lost & Found”
William’s remark goes a long way to explain UML’s lack of traction: “why all this bother about modelling my program, I can just write it.”
I think that Remy is right, that the *original* sin of UML was the insistence of its fathers that it was a way to ‘model your program’, not to model reality. This violated what I understood to be the foundation for o-o programming, and the two decades of success with information modelling in the E/R style. I thought we would be carrying that forward. Some, of course, did. So, UML was really schizophrenic with respect to semantics. Programs or things.
I published a paper with Joaquin Miller and Kevin Tyson in 2004 in IEEE Computer claiming similar things about UML. “Clear, Concise, Consistent UML”.
They thought it was ‘easier to sell’, and maybe it was, till people wondered, ‘why all this bother about modelling my program, I can just write it.’
OTOH, the people who continue to use UML successfully use it to model reality, either informally, like Francis McCabe, or some even following a model driven development style, or something in between, such as domain driven design.
Back to the craziness and ignorance of the UML fathers, though, I think their insistence that UML was a way to model programs led to the idea that the individuals were in ‘another language’ rather than not necessarily in a language at all. Which resulted in strictly partitioning what you could say at each ‘language level’, as in the now 100 year old version of type theory, instead of the higher order logic approach that has been current for the last 50 years. The UML so-called ‘semantics’ drove this to absurdity, if anyone ever bothered to use it strictly.
Ashley (like RM-ODP) correctly points out that individuals, events, and behaviors are intended as names for modelling concepts, which are *different* from the things modeled. This *IS* important. We need a language with grammatical categories, and a semantics where the linguistic items in these categories, such as proper names, are mapped to the modeled things, such as individuals. ISO 10746 did this in the early 1990s.
Remy tells us that UML 2.5 briefly considered a language with three grammatical categories, classifiers, events and behaviors, each of which referred to individuals of the appropriate kinds. They dropped this, apparently not even troubling to explain how these kinds of individuals differ.
There is nothing sacred about dividing our experiences of the world into three kinds. We might have two or seven. (Say, relations, roles, things, events, capacities, …). Human languages can differ from one another. Some languages, for instance, have no names for things, only for the event of the appearance of something. Instead of saying ‘President Obama is over there’, you say ‘it is President Obamaing over there.’
But however many, as Remy says, you are doomed without a simple semantics, one that says “this is the universe of individuals that we can talk about in this language”.
You are absolutely right, and the difference points to a critical semiotic issue (see link below John Sowa, “Signs and Reality”). But it wasn’t intentional and I don’t think it affects the argument.
There is a subtle but perhaps important difference between the text that you quote from the Chapter “On Semantics” and your commentary on it.
The former says:
“A UML model consists of three major categories of model elements [classifiers, events, and behaviors], each of which may be used to make statements about different kinds of individual things within the system being modeled.”
This implies that “classifiers, events, and behaviors” are components of a language which can be used to make descriptions of individuals, but *are not themselves the individuals*.
The latter says:
“UML is meant to describe individual objects, events, or behaviors” and later: “These individuals (objects, events, or behaviors) can be specified with regard to their structure and relationships”
This implies that “objects, events, and behaviors” *are themselves the individuals* with which we should be concerned.
Is this difference intentional?
1. At some point, and 20 years is already UML maturity, clear core semantics are to be expected, all the more because logicians did the homework a century ago.
2. I agree with your point about models and code being two different things, but MBSE is reduced to code generation only because UML lacks the semantics to deal with upstream (aka analysis) models.
3. I’ve had the same experience, which makes me regret that more formal benefits can’t be achieved.
Some lessons can only be learned in the rearview mirror. IMO, the lessons that the UML community could (should) learn are:
1. Getting a clear semantics is a long term win. (And, BTW, the logicians typically got there first.)
2. UML was hijacked by the Model Driven programming community. This is definitely one of those rearview things: the idea that a UML diagram is a kind of visual representation of a Java program is (again, IMO) a distraction.
3. Personally, I get the most value from UML informally. It helps me to organize myself in a way that is both more rigorous than English and is easier as a focal point of conversations in the engineering team.
IMHO any idea which cannot be distilled and expressed into a single descriptive sentence cannot be presented in a convincing semantic.
Following from the observation that “no modeling language, especially one supposed to be unified, can do without some convincing semantics about what it is supposed to describe”, I submit that any idea or system which cannot be distilled and expressed into a single descriptive sentence cannot be presented in a convincing semantic.
My experience dictates that such simplicity is requisite for clarity, and in the best semantic of the English language.