LLMs & The Matter of Transparency

The more is found the less is known (Wang Qingsong)

The tsunami-like arrival of large language models (LLMs) has taken both business and technology communities by surprise, provoking knee-jerk as well as cogent reactions: while the former call for new regulations or even a pause, the latter turn their minds to transparency issues and the potential benefits of symbolic AI for conversational and mediated communication.

In the actual (aka digital) world of LLMs, semantic vectors are expressed through the encoding of parameters; that is where transparency issues should be dealt with, except that they cannot be, due to the exponential complexity of the lineages induced by billions of parameters. As it stands, pure LLMs must therefore rely on one of two assumptions: an implicit mix of conversational purposes and grammatical constraints free of social or cognitive biases, or a trust in the wisdom of crowds; hence the interest of hybrid models that could meet transparency expectations. To get past the intrinsic opacity of LLMs, one would expect hybrid pre-training to be aligned with the cognitive triptych of:

  • Facts, for data and percepts
  • Categories, for information and reason
  • Concepts, for knowledge and judgment
[Figure: A frame for hybrid models]

and basic pivots:

  • Thesauruses, for the mapping of conversational sequences to meanings according to contexts and purposes
  • Taxonomies, for the mapping of features to categories
  • Logic, for the transformation of phrases and the translation of sentences
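As a purely illustrative sketch (all names and structures below are hypothetical, not taken from the post), the three pivots could be modeled as minimal data structures: a thesaurus keyed by sequence and context, a taxonomy mapping features to categories, and a small set of rewrite rules standing in for logic:

```python
from dataclasses import dataclass, field

@dataclass
class Thesaurus:
    # Maps (conversational sequence, context) to candidate meanings.
    entries: dict = field(default_factory=dict)

    def add(self, sequence, context, meaning):
        self.entries.setdefault((sequence, context), []).append(meaning)

    def meanings(self, sequence, context):
        return self.entries.get((sequence, context), [])

@dataclass
class Taxonomy:
    # Maps observed features to semantic or grammatical categories.
    feature_to_category: dict = field(default_factory=dict)

    def classify(self, features):
        return {self.feature_to_category.get(f, "unknown") for f in features}

@dataclass
class Logic:
    # Toy stand-in for logical transformation: ordered rewrite rules.
    rules: list = field(default_factory=list)  # (pattern, replacement) pairs

    def transform(self, phrase):
        for pattern, replacement in self.rules:
            phrase = phrase.replace(pattern, replacement)
        return phrase
```

The point of the sketch is only that each pivot is an explicit, inspectable mapping, in contrast with meanings implicitly encoded across billions of parameters.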

Based on this frame, the aim of hybrid pre-training should be to ensure the traceability of exchanges between sequences, sentences, and phrases:

  • Sequences/Sentences: temporal capabilities could be added to thesauruses in order to chronicle the formation of concepts
  • Sequences/Phrases: taxonomies could be fleshed out so as to record the arrangements of features into semantic or grammatical categories
  • Phrases/Sentences: ontologies should include logical clauses and predicates in order to ensure the reliability and traceability of transformations
[Figure: The matter of transparency]
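The traceability requirement above could be illustrated, again under purely hypothetical names, by a timestamped trace that chronicles each step of an exchange, from sequence to meaning to category to transformed sentence, so that the lineage of a response can be audited after the fact:

```python
import datetime

class TraceableExchange:
    """Records an ordered, timestamped log of each mapping step."""

    def __init__(self):
        self.trace = []  # ordered log of (timestamp, step, payload)

    def log(self, step, payload):
        now = datetime.datetime.now(datetime.timezone.utc)
        self.trace.append((now, step, payload))

    def process(self, sequence, thesaurus, taxonomy, transform):
        # thesaurus and taxonomy are plain dicts here; transform is any
        # callable standing in for a logical phrase transformation.
        self.log("sequence", sequence)
        meaning = thesaurus.get(sequence, "unknown")
        self.log("meaning", meaning)
        category = taxonomy.get(meaning, "uncategorized")
        self.log("category", category)
        sentence = transform(sequence)
        self.log("sentence", sentence)
        return sentence
```

Each entry in the trace pairs a step with the time it occurred, which is the minimal machinery needed to chronicle how a concept was formed and transformed during an exchange.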

It must be stressed that these capabilities are meant to be part and parcel of the LLMs used by applications, independently of any reinforcement learning carried out during conversations.

