The tsunami-like burst of large language models (LLMs) has taken both business and technology communities by surprise, provoking knee-jerk as well as cogent reactions: while the former call for new regulations or even a pause, the latter turn their attention to transparency issues and to the potential benefits of symbolic AI for conversational and mediated communication.
In the actual (aka digital) world of LLMs, semantic vectors are expressed through the encoding of parameters; that is where transparency issues should be dealt with, except that they cannot be, given the exponential complexity of the lineages induced by billions of parameters. As it stands, pure LLMs must therefore rely on one of two assumptions: either an implicit mix of conversational purposes and grammatical constraints free of social or cognitive biases, or a trust in the wisdom of crowds; hence the interest in hybrid models that could meet transparency expectations.

Beyond the intrinsic opacity of LLMs, one would expect a hybrid pre-training to be aligned with the cognitive triptych of:
- Facts, for data and percepts
- Categories, for information and reason
- Concepts, for knowledge and judgment
and with basic pivots (a tentative sketch follows the list):
- Thesauruses, for the mapping of conversational sequences to meanings according to contexts and purposes
- Taxonomies, for the mapping of features to categories
- Logic, for the transformation of phrases and the translation of sentences
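By way of illustration, the triptych and the first two pivots could be rendered as plain data structures; the class and field names below are assumptions introduced for the sketch, not part of any established hybrid-LLM design (the logic pivot is illustrated in the traceability sketch further down).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Illustrative sketch only: names and shapes are assumptions made for this post.

@dataclass
class Fact:                       # data and percepts
    source: str
    payload: str

@dataclass
class Category:                   # information and reason
    name: str
    features: Set[str] = field(default_factory=set)

@dataclass
class Concept:                    # knowledge and judgment
    name: str
    definition: str

@dataclass
class Thesaurus:
    """Pivot: maps conversational sequences to meanings, indexed by context and purpose."""
    entries: Dict[str, Dict[str, str]] = field(default_factory=dict)  # sequence -> {context: meaning}

    def meaning_of(self, sequence: str, context: str) -> Optional[str]:
        return self.entries.get(sequence, {}).get(context)

@dataclass
class Taxonomy:
    """Pivot: maps features to categories."""
    categories: Dict[str, Category] = field(default_factory=dict)

    def categorize(self, features: Set[str]) -> List[Category]:
        # A phrase's features select every category they intersect with.
        return [c for c in self.categories.values() if features & c.features]
```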
Based on this frame, the aim of hybrid pre-training should be to ensure the traceability of exchanges between sequences, sentences, and phrases (a sketch follows the list below):
- Sequences/Sentences: temporal capabilities could be added to thesauruses in order to chronicle the formation of concepts
- Sequences/Phrases: taxonomies could be fleshed out so as to minute the arrangement of features into semantic or grammatical categories
- Phrases/Sentences: ontologies should include logical clauses and predicates in order to ensure the reliability and traceability of transformations
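A minimal sketch of the three traceability channels, assuming hypothetical `TimedThesaurusEntry`, `TaxonomyTrace`, and `OntologyRule` structures (none of which are prescribed by the frame above):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Tuple

@dataclass
class TimedThesaurusEntry:
    """Sequences/Sentences: a time-stamped thesaurus entry chronicling concept formation."""
    sequence: str
    meaning: str
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class TaxonomyTrace:
    """Sequences/Phrases: a record of how features were arranged into a category."""
    phrase: str
    features: Tuple[str, ...]
    category: str

@dataclass
class OntologyRule:
    """Phrases/Sentences: a logical clause governing a transformation, logged for audit."""
    name: str
    predicate: Callable[[str], bool]   # applicability test on a phrase
    transform: Callable[[str], str]    # the transformation itself

    def apply(self, phrase: str, trace: List[str]) -> str:
        if self.predicate(phrase):
            result = self.transform(phrase)
            trace.append(f"{self.name}: '{phrase}' -> '{result}'")  # traceability record
            return result
        return phrase


# Toy usage: a rule negating a phrase, with its application kept in an auditable trace.
trace: List[str] = []
negate = OntologyRule(
    name="negate",
    predicate=lambda p: not p.startswith("not "),
    transform=lambda p: "not " + p,
)
print(negate.apply("raining", trace))   # "not raining"
print(trace)                            # ["negate: 'raining' -> 'not raining'"]
```

Here the trace list stands in for the auditable record that would make transformations accountable; in an actual hybrid model it would be anchored in the thesauruses, taxonomies, and ontologies themselves.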
It must be stressed that these capabilities are meant to be part and parcel of the LLMs used by applications, independently of any reinforcement learning carried out during conversations.