GDPR Ontological Primer

Preamble

The European Union’s General Data Protection Regulation (GDPR), due to come into effect this month, is a seminal milestone for data privacy.

Nothing Personal (Arthur Szyk)

Yet, as reported by Reuters correspondents, European enterprises and regulators are not ready; more worryingly, few (except consultants) are confident about GDPR direction.

Misgivings and uncertainties should come as no surprise considering GDPR’s two innate challenges:

  • Regulating privacy rights represents a very ambitious leap into a digital space now at the core of corporate business strategies.
  • Compliance will not be put under a single authority but be overseen by an assortment of national and regional authorities across the European Union.

On that account, ontologies appear as the best (if not the only) conceptual approach able to bring contexts (EU nations), concerns (business vs privacy), and enterprises (organization and systems) into a shared framework.

A workbench built with the Caminao ontological kernel is meant to explore the scope and benefits of that approach, with a beta version (Protégé/OWL 2) available for comments on the Stanford/Protégé portal using the link: Caminao Ontological Kernel (CaKe_GDPR).

Enterprise Architectures & Regulations

Compared to domain-specific regulations, GDPR is a governance-oriented regulation set across business concerns and enterprise organization; but unlike similarly oriented ones like accounting, GDPR aims at the nexus of business competition, namely the processing of data into information and knowledge. With such a strategic stake, compliance is bound to become a game-changer cutting across business intelligence, production systems, and decision-making. Hence the need for an integrated, comprehensive, and consistent approach to the different dimensions involved:

  • Concepts upholding businesses, organizations, and regulations.
  • Documentation with regard to contexts and statutory basis.
  • Regulatory options and compliance assessments.
  • Enterprise systems architecture and operations.

Moreover, as with most projects affecting enterprise architectures, achieving GDPR compliance will involve continuous, deep, and wide-ranging changes that must be carried out without affecting overall enterprise performance.

Ontologies arguably provide a conclusive solution to the problem, if only because there is no other way to bring code, models, documents, and concepts under a single roof. That could be achieved by using ontology profiles to frame GDPR categories along enterprise architecture models and components.

Basic GDPR categories and concepts (black color) as framed by the Caminao Kernel

Compliance implementation could then be carried out iteratively across four perspectives:

  • Personal data and managed information.
  • Lawfulness of activities.
  • Time and events.
  • Actors and organization.

Data & Information

To begin with, GDPR defines ‘personal data’ as “any information relating to an identified or identifiable natural person (‘data subject’)”. Insofar as logic is concerned, that definition implies an equivalence between ‘data’ and ‘information’, an assumption clearly challenged by the onslaught of big data: if proof were needed, the Cambridge Analytica episode demonstrates how easily raw data can become a personal affair. Hence the need to keep an ontological level of indirection between regulatory intents and the actual semantics of data as managed by information systems.

Managing the ontological gap between regulatory understandings and compliance footprints

Once lexical ambiguities are set aside, the question is less about databases of well-identified records than about the flows of data continuously processed: while identities and ownership are usually set upfront by business processes, attributions may have to be credited to enterprises’ know-how if and when they are carried out through data analytics.

Given that the distinctions are neither uniform, exclusive, nor final, ontologies will be needed to keep tabs on moves and motives. OWL 2 constructs (cf. Annex A) could also help, first to map GDPR categories to relevant information managed by systems, second to sort out natural data from nurtured knowledge.
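
The level of indirection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the "collected"/"derived" origins, and the category labels are hypothetical and are not GDPR terms; the point is that the regulatory category is computed from a mapping kept apart from the system's own definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str                # attribute as managed by an information system
    origin: str              # "collected" (natural data) or "derived" (nurtured knowledge)
    identifies_person: bool  # can the value be related to a natural person?


# Hypothetical system fields, for illustration only.
FIELDS = [
    Field("customer.email", "collected", True),
    Field("customer.credit_score", "derived", True),
    Field("store.opening_hours", "collected", False),
]


def regulatory_category(field: Field) -> str:
    """Map a system field to an illustrative (non-official) regulatory category."""
    if not field.identifies_person:
        return "out_of_scope"
    if field.origin == "collected":
        return "personal_data"
    return "derived_personal_information"


data_footprint = {f.name: regulatory_category(f) for f in FIELDS}
```

Because the mapping lives in one function, a change in regulatory understanding would only affect `regulatory_category`, not the fields managed by systems.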

Activities & Purposes

Given footprints of personal data, the objective is to ensure the transparency and traceability of the processing activities subject to compliance.

Setting aside specific add-ons for notification and personal access (see below for events), charting compliance footprints is bound to be a complex endeavor: as there is no reason to assume any innate alignment between intended (regulation) and actual (enterprise) definitions, deciding where and when compliance should apply potentially calls for a review of all processing activities.

After taking into account the nature of activities, their lawfulness is to be determined by contexts (‘purpose limitation’ and ‘data minimization’) and time-frames (‘accuracy’ and ‘storage limitation’). And since lawfulness is meant to be transitive, a comprehensive map of the GDPR footprint is to rely on the logical traceability and transparency of information systems as a whole, independently of GDPR.
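
As a rough illustration of why transitivity calls for system-wide traceability, the sketch below propagates a compliance footprint along data flows between activities. The flow graph is hypothetical (only “rateCustomerCredit” appears in this article), and reachability is used here as a simplified stand-in for transitive lawfulness.

```python
from collections import deque

# Hypothetical data flows: activity -> activities consuming its output.
FLOWS = {
    "collectCustomerProfile": ["rateCustomerCredit"],
    "rateCustomerCredit": ["approveLoan"],
    "approveLoan": [],
    "publishStoreHours": [],
}


def compliance_footprint(flows, sources):
    """Return every activity reachable from activities known to handle personal data."""
    reached = set(sources)
    queue = deque(sources)
    while queue:
        current = queue.popleft()
        for successor in flows.get(current, []):
            if successor not in reached:
                reached.add(successor)
                queue.append(successor)
    return reached


gdpr_footprint = compliance_footprint(FLOWS, ["collectCustomerProfile"])
```

In this toy graph the footprint covers the three activities chained from the collection of customer profiles, while “publishStoreHours” stays outside; the result is only as good as the recorded flows, which is the traceability requirement in a nutshell.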

That is arguably a challenging long-term endeavor, all the more so given that some kind of Chinese Wall has to be maintained around enterprise strategies, know-how, and operations. It ensues that an ontological level of indirection is again necessary between regulatory intents and effective processing activities.

Following that reasoning, compliance categories, defined on their own, are first mapped to categories of functionalities (e.g. authorization) or models (e.g. use cases).

Compliance categories are associated upfront to categories of functionalities (e.g authorization) or models (e.g use cases).

Then, actual activities (e.g. “rateCustomerCredit”) can be progressively brought into the compliance fold, either through direct associations with regulations or indirectly through associated models (e.g. the “ucRateCustomerCredit” use case).

Compliance as carried out through Use Case
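
The two routes into the compliance fold, direct and through models, might be sketched as follows. Apart from “rateCustomerCredit” and “ucRateCustomerCredit”, all names (including the compliance labels and the `sendNewsletter` activity) are hypothetical placeholders.

```python
# Compliance categories associated upfront with models (hypothetical labels).
MODEL_COMPLIANCE = {"ucRateCustomerCredit": {"lawfulness_of_processing"}}

# Direct associations between actual activities and regulations (hypothetical).
DIRECT_COMPLIANCE = {"sendNewsletter": {"consent"}}

# Which model, if any, an actual activity realizes.
REALIZES = {"rateCustomerCredit": "ucRateCustomerCredit"}


def compliance_of(activity):
    """Collect compliance categories reached directly or through an associated model."""
    direct = DIRECT_COMPLIANCE.get(activity, set())
    model = REALIZES.get(activity)
    indirect = MODEL_COMPLIANCE.get(model, set()) if model is not None else set()
    return direct | indirect
```

With these tables, `compliance_of("rateCustomerCredit")` is resolved through the use case, whereas `compliance_of("sendNewsletter")` is resolved directly; activities can thus be brought in one at a time without touching the upfront mapping.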

The compliance backbone can be fleshed out using OWL 2 mechanisms (see Annex A) in order to:

  • Clarify the logical or functional dependencies between processing activities subject to compliance.
  • Qualify their lawfulness.
  • Draw equivalence, logical, or functional links between compliance alternatives.

That is to deal with the functional compliance of processing activities; but the most far-reaching impact of the regulation may come from the way time and events are taken into account.

Time & Events

As noted above, time is what makes the difference between data and information, and setting rules for notification makes that difference lawful. Moreover, by adding time constraints to the notifications of changes in personal data, regulators put systems’ internal events on the same standing as external ones. That apparently incidental departure echoes the immersion of systems into digitized business environments, making all time-scales equal whatever their nature. Such flattening is to induce crucial consequences for enterprise architectures.

That shift, together with the regulatory intent, is best taken into account by modeling events as changes in expectations, physical objects, process execution, and symbolic objects, with changes in personal data belonging to the last category.

Mapping internal (symbolic) and external (actual) events is a critical element of GDPR compliance

Setting aside events specific to GDPR (e.g. data breaches), compliance with accuracy and storage limitation rules will require that all events affecting personal data:

  • Are set in time-frames, possibly overlapping.
  • Have notification constraints properly documented.
  • Have likelihood and costs of potential risks assessed.
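
The three requirements above could be recorded along the following lines. This is a minimal sketch: the event names, dates, notification delays, and risk figures are invented for illustration, and counting the notification delay from the start of the time-frame is an assumption, not a rule taken from the regulation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PersonalDataEvent:
    name: str
    start: datetime           # beginning of the time-frame
    end: datetime             # end of the time-frame
    notify_within: timedelta  # documented notification constraint (assumed value)
    likelihood: float         # estimated probability of the associated risk
    cost: float               # estimated cost should the risk materialize

    def overlaps(self, other: "PersonalDataEvent") -> bool:
        """True when the two time-frames intersect (touching end points do not count)."""
        return self.start < other.end and other.start < self.end

    def notification_deadline(self) -> datetime:
        """Deadline counted from the start of the time-frame (an assumption)."""
        return self.start + self.notify_within

    def expected_risk(self) -> float:
        """Likelihood-weighted cost of the potential risk."""
        return self.likelihood * self.cost


address_update = PersonalDataEvent(
    "addressUpdate",
    datetime(2018, 5, 25, 9, 0), datetime(2018, 5, 25, 10, 0),
    timedelta(hours=72), likelihood=0.25, cost=4000.0,
)
retention_review = PersonalDataEvent(
    "retentionReview",
    datetime(2018, 5, 25, 9, 30), datetime(2018, 5, 26, 9, 30),
    timedelta(days=30), likelihood=0.5, cost=1000.0,
)
```

Here the two time-frames overlap, and each event carries its own notification constraint and risk estimate, so that internal and external events can be compared on the same footing.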

As with data and activities, OWL 2 constructs are to be used to qualify compliance requirements.

Actors & Organization

GDPR introduces two specific categories of actors (aka roles): one (data subject) for natural persons, and one for actors set by organizations, either specifically for GDPR assignment, or by delegation to already defined actors.

GDPR roles can be set specifically or delegated

OWL 2 can then be used to detail how regulatory roles can be delegated to existing ones, enabling a smooth transition and a dynamic adjustment of enterprise organization with regulatory compliance.

It must be stressed that the semantic distinction between identified agents (e.g. natural persons) and the roles (aka UML actors) they play in processes is of particular importance for GDPR compliance: who (or even what) is behind an actor interacting with a system remains unknown to the system until the actor can be authentically identified. If that ontological gap is overlooked, there is no way to define and deal with security, confidentiality, or privacy regulations.
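
Both points, delegation of regulatory roles and the separation of roles from identified agents, can be sketched as below. The role names and the delegation chain are hypothetical examples of how an organization might set things up.

```python
from dataclasses import dataclass
from typing import Optional

# Regulatory roles delegated to already defined enterprise roles (hypothetical).
DELEGATION = {
    "DataProtectionOfficer": "ComplianceManager",
    "ComplianceManager": "LegalCounsel",
}


def resolve_role(role, delegation):
    """Follow delegations until a role with no further delegation is reached."""
    seen = set()
    while role in delegation:
        if role in seen:
            raise ValueError(f"circular delegation at {role}")
        seen.add(role)
        role = delegation[role]
    return role


@dataclass
class Actor:
    role: str                       # role played in a process (UML actor)
    agent_id: Optional[str] = None  # identified agent, unknown until authentication

    def authenticate(self, agent_id: str) -> None:
        """Bind the role to an identified agent once identity has been established."""
        self.agent_id = agent_id

    @property
    def identified(self) -> bool:
        return self.agent_id is not None


visitor = Actor(role="Customer")
was_identified_before = visitor.identified
visitor.authenticate("person-42")
```

Keeping `agent_id` empty until authentication makes the ontological distinction explicit: privacy rules can only bite once a role is tied to an identified agent.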

Conclusion

The use of ontologies brings clear benefits for regulators, enterprise governance, and systems architects.

Without shared conceptual guidelines, chances are that the European regulatory orchestra will get lost in squabbles about minutiae before sliding into cacophony.

With regard to governance, bringing systems and regulations into a common conceptual framework should enable clear and consistent compliance strategies and policies, as well as smooth learning curves.

With regard to architects, ontology-based compliance should bring cross benefits and externalities, e.g. from improved traceability and transparency of systems and applications.

Annex A: Mapping Regulations to Models (sample)

To begin with, OWL 2 can be used to map GDPR categories to relevant resources as managed by information systems:

  • Equivalence: GDPR and enterprise definitions coincide.
  • Logical intersection, union, complement: GDPR categories defined by, respectively, a cross, merge, or difference of enterprise definitions.
  • Qualified association between GDPR and enterprise categories.

Assuming the categories properly identified, the language can then be employed to define the sets of regulated instances:

  • Logical property restrictions, using existential and universal quantification.
  • Functional property restrictions, using joins on attribute values.

Other constructs, e.g. cardinality or enumerations, could also be used for specific regulatory constraints.
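
The set-theoretic reading of these constructs can be illustrated with plain Python sets. The record identifiers and activity names are hypothetical, and the sketch does not use any OWL tooling: Python sets are closed-world, whereas OWL 2 reasoners work under an open-world assumption, so this only conveys the intuition.

```python
# Enterprise categories as sets of record identifiers (hypothetical data).
customers = {"c1", "c2", "c3"}
employees = {"e1", "c3"}                      # "c3" is both customer and employee
all_records = customers | employees | {"s1"}  # "s1" stands for a supplier company

# GDPR categories defined from enterprise definitions.
natural_persons = customers | employees    # union (merge)
staff_customers = customers & employees    # intersection (cross)
non_personal = all_records - natural_persons  # complement (difference)

# Records touched by each processing activity (hypothetical).
PROCESSING = {
    "rateCustomerCredit": ["c1", "c2"],
    "indexCatalog": ["s1"],
    "mailing": ["c1", "s1"],
}


def some_values_from(activity):
    """Existential restriction: at least one processed record is a natural person."""
    return any(r in natural_persons for r in PROCESSING[activity])


def all_values_from(activity):
    """Universal restriction: every processed record is a natural person."""
    return all(r in natural_persons for r in PROCESSING[activity])


regulated = sorted(a for a in PROCESSING if some_values_from(a))
```

An existential restriction brings “mailing” into the regulated set even though it also touches non-personal records, while a universal restriction would leave it out; choosing between the two is a compliance policy decision.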

Finally, some OWL 2 built-in mechanisms can significantly improve the assessment of alternative compliance policies by expounding regulations with regard to:

  • Equivalence, overlap, or complementarity.
  • Symmetry or asymmetry.
  • Transitivity.
  • etc.

Annex B: Mapping Regulations to Capabilities

GDPR can be mapped to systems capabilities using the well-established Zachman taxonomy, set by crossing architecture functionalities (Who, What, How, Where, When) with architecture layers: enterprise (business and organization), systems (logical structures and functionalities), and platforms (technologies).

Regulatory Compliance vs Architectures Capabilities
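
A grid of that kind can be represented as a simple table keyed by layer and capability. The requirements registered below are hypothetical placeholders, meant only to show how coverage gaps could be spotted.

```python
CAPABILITIES = ("Who", "What", "How", "Where", "When")
LAYERS = ("enterprise", "systems", "platforms")

# One cell per (layer, capability) pair, each holding regulatory requirements.
grid = {(layer, cap): [] for layer in LAYERS for cap in CAPABILITIES}


def register(layer, capability, requirement):
    """Attach a requirement to a cell, rejecting unknown layers or capabilities."""
    if (layer, capability) not in grid:
        raise KeyError(f"unknown cell: {layer}/{capability}")
    grid[(layer, capability)].append(requirement)


# Hypothetical requirements, for illustration only.
register("enterprise", "Who", "assign regulatory roles")
register("systems", "What", "identify personal data")
register("systems", "When", "document notification constraints")

# Cells with no requirement yet: candidates for review.
uncovered = [cell for cell, requirements in grid.items() if not requirements]
```

Listing the uncovered cells gives a quick view of where regulatory policies have not yet been mapped to capabilities.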

These layers can be extended so as to apply uniformly across external ontologies, from well-defined ones (e.g. regulations) to fuzzy ones (e.g. business prospects or new technologies):

Ontologies, capabilities (Who, What, How, Where, When), and architectures (enterprise, systems, platforms).

Such a mapping should significantly enhance the transparency of regulatory policies.
