The growing footprint of generative AI (GAI) in user-oriented applications raises questions about knowledge value chains and, more generally, about the relationship between language and intelligence. The evolution of languages may provide some clues.
Communication & Representation
Evolution established the original distinction between spoken and written languages, as well as the one between communication and representation. For the human species these distinctions have been accompanied by a transition through symbols and mediated communication, which didn’t happen for other species. It can thus be argued that while communication is common to all animal species, combining communication and symbolic representation remains specific to humans, and presumably to human intelligence.
Alphabets & Logograms
As illustrated by the evolution of languages, representation technologies are polymorphic: alphabets (representation) have been employed with phonetics (communication), and logograms (representation) have been expressed not only as signs but also as phonemes (communication). Moreover, as epitomised by Kanji, the technologies are interoperable: a common logographic system supports (written) representations shared between different alphabetic systems for (spoken) communication.
Computational Linguistics
Linguistics took a new turn with the arrival of computers as a third agent alongside humans and nature, with computational linguistics introducing a layered perspective:
- Nominal layer: words used to put names on facts (lexicons)
- Modeling and/or programming layer: grammars meant to be executed by machines (syntax and semantics)
- Natural layer: lexicons, semantics, and grammars as used by humans (pragmatics)
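The three layers above can be sketched as a toy pipeline. Everything here (the lexicon entries, the single grammar rule, the stand-in for pragmatics) is an invented illustration, not part of the original text or any real system:

```python
# Toy illustration of the three computational-linguistics layers.
# Lexicon entries and grammar rules are invented for illustration only.

# Nominal layer: words used to put names on facts (a lexicon)
LEXICON = {"the": "DET", "sun": "NOUN", "rises": "VERB"}

def nominal_layer(sentence):
    """Map each word to its lexical category."""
    return [(w, LEXICON.get(w.lower(), "UNKNOWN")) for w in sentence.split()]

# Modeling/programming layer: a grammar meant to be executed by the machine
def grammar_layer(tagged):
    """Accept only the single pattern DET NOUN VERB (syntax + semantics)."""
    return [tag for _, tag in tagged] == ["DET", "NOUN", "VERB"]

# Natural layer: pragmatics as used by humans (a crude stand-in here)
def natural_layer(sentence):
    """A human reader would weigh context; we only check a surface cue."""
    return not sentence.endswith("!")

sentence = "The sun rises"
tagged = nominal_layer(sentence)
print(tagged)                  # [('The', 'DET'), ('sun', 'NOUN'), ('rises', 'VERB')]
print(grammar_layer(tagged))   # True
print(natural_layer(sentence)) # True
```

The point of the sketch is only the layering: naming is independent of the grammar that consumes the names, and both are independent of the human, contextual layer on top.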
At first, computational linguistics treated natural and machine languages as isomorphic layered constructs, until new cognitive developments undermined the mind-as-a-machine illusion and machine-learning technologies replaced the structural isomorphism with an operational one.
Operational Linguistics & Large Language Models
Broadly speaking, machine-learning technologies tend to replace the mind-as-a-machine paradigm with a machine-as-mind one, using large language models (LLMs) as a test-bed. To that end LLMs, and more generally GAI, treat words as facts whose meanings can be mined from massive communication datasets, with grammars and canned pragmatics providing the backbone of conversations.
That approach can be better understood when set in its operational context and compared to empirical and formal alternatives (see figure above):
- Empirical (or scientific) approaches use domain-specific syntax and semantics to map facts into categories and models
- Formal (or logic) approaches use generic, truth-preserving syntax and semantics to align concepts and presumptive models with categories
- Generative (or nominal) approaches bypass categories and rely instead on semantic grammars (a combination of syntax and semantics) and pragmatics for the alignment of facts and meanings.
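The contrast between the empirical and generative routes can be made concrete with a toy sketch. The categories, words, and co-occurrence vectors below are all invented for illustration; the vectors stand in for what a trained model would mine from a massive dataset:

```python
# Toy contrast: empirical categorization vs. generative alignment.
# All data below is invented for illustration.

# Empirical route: explicit domain categories mediate facts and models.
CATEGORIES = {"sparrow": "bird", "eagle": "bird", "trout": "fish"}

def empirical_meaning(word):
    """Facts -> categories: meaning is an assigned category."""
    return CATEGORIES.get(word)

# Generative route: no categories; meaning is mined from usage statistics.
# Tiny hand-made vectors stand in for learned co-occurrence embeddings.
VECTORS = {
    "sparrow": [0.9, 0.1],
    "eagle":   [0.8, 0.2],
    "trout":   [0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot / (norm(a) * norm(b))

def generative_neighbor(word):
    """Closest word by similarity: alignment of facts and meanings
    without any category ever being named."""
    others = [w for w in VECTORS if w != word]
    return max(others, key=lambda w: cosine(VECTORS[word], VECTORS[w]))

print(empirical_meaning("sparrow"))    # 'bird'
print(generative_neighbor("sparrow"))  # 'eagle'
```

Both routes relate "sparrow" to "eagle", but the empirical one does so through an explicit, inspectable category ("bird"), while the generative one does so through similarity alone, which is exactly why its content remains implicit.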
That operational perspective points to an intrinsic limitation of generative approaches: their reliance on implicit content, represented by the area below the NW/SE diagonal in the diagram above.
Language & Complexity
Depending on whether the “G” in GAI stands for “General” or “Generative”, the focus is put on intelligence or on language; as it happens, both paths lead to the same fundamental issue, namely unifying the views of knowledge anatomy (prism), sharing (languages), and learning (development):
- Anatomy: best described through the diffraction of knowledge into names, concepts, and categories
- Sharing: the two-pronged language distinction (communication/representation and written/spoken) best resolves the conundrum of open-ended linguistic specificities coexisting with the universality of human knowledge
- Learning: based on a revisited Spinoza’s taxonomy (observation, reason, judgement, experience), knowledge development (individual or collective) is best explained as the use of language to translate implicit knowledge into explicit representations
Whether general (intelligence) or generative (language), GAI schemes don’t encompass all these dimensions. More specifically, they cannot deal with the gap between two kinds of complexity, that of nature on one side and that of human minds on the other: bridging it can only be achieved with language used to build bridges from both sides of the gap.