LLMs: A Functional Perspective

Tuning an LLM transponder

Signals & Words

Large language models can be compared to radio transponders relaying signals between emitters and receivers, with documents or datasets at one end and readers at the other.

LLMs as Shortwave Radios

Radio transmitters use frequencies to transmit signals, and listeners’ devices can identify them accordingly. But for radios broadcasting in the shortwave spectrum, interference requires some tuning in order to identify the source of signals. As it happens, LLM tuning can be defined along similar lines.

Taking the example of three shortwave English information channels, and assuming resources indexed for British, American, and international English, radio listeners may have to iterate through four typical stages:

  • Setting the context (a)
  • Numerical tuning, to find a reliable frequency (b)
  • Semantic tuning, to pick an English information channel, taking into account geographical variants (c)
  • Pragmatic tuning, to select their favorite based on broadcasts (d)

Tuning LLMs like shortwave radios

Applying the metaphor to LLMs sheds light on the functional specificity of the corresponding LLM processing stages.

Process Overview

Overall, using large language models involves three basic steps:

  • Encoding generic or specific resources (documents and datasets): tokens
  • Training and tuning models: tokens > words > meanings
  • Using models: conversations
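These three steps can be illustrated with a deliberately minimal sketch in plain Python, where word counts stand in for learned neural weights; nothing here is an actual LLM implementation, only the shape of the process:

```python
from collections import Counter, defaultdict

# Step 1 - encoding: resources are reduced to tokens (here, lowercase words).
def encode(text):
    return text.lower().split()

# Step 2 - training and tuning: weight the observed associations between
# adjacent tokens (a stand-in for the weights a real network would learn).
def train(corpus):
    weights = defaultdict(Counter)
    for document in corpus:
        tokens = encode(document)
        for current, following in zip(tokens, tokens[1:]):
            weights[current][following] += 1
    return weights

# Step 3 - using the model: continue a prompt along the heaviest associations.
def converse(weights, prompt, length=3):
    tokens = encode(prompt)
    word = tokens[-1]
    for _ in range(length):
        if word not in weights:
            break
        word = weights[word].most_common(1)[0][0]
        tokens.append(word)
    return " ".join(tokens)

corpus = ["the model reads the documents", "the documents describe the domain"]
weights = train(corpus)
print(converse(weights, "the"))  # the documents describe the
```

The toy corpus makes the point of the next paragraph concrete: whatever the model "says" is entirely determined by the resources it was encoded from.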

Generic resources are supposed to comprehensively reflect the whole range of a language’s manifestations, written or spoken; specific resources are subsets defined by organisations (typically domains). Resources are encoded into neural networks, with nodes representing tokens and connectors representing observed associations; a panoply of machine learning algorithms is then used to turn these neural networks into semantic ones, made of words and meanings.

LLMs Overview

Beyond the variety of algorithms at work, the training and tuning of LLMs can be characterized by the semantics of nodes and the nature of operations:

  • Meanings can emerge from tokens (terms), be rooted in selected resources (categories), or stem from targeted retrievals (concepts).
  • Concomitantly, meanings are built through the weighting of observed associations between tokens (numerical tuning), the consistency of outcomes (semantic tuning), and the relevance of outcomes to the context and/or expectations of use (pragmatic tuning).

Despite the apparent distinction between the prior training of models and their later tuning when put to use, most of these operations are commonly combined across feedback loops, raising key transparency and traceability issues.

LLMs Functional Components

To be of any use, LLMs must be transparent with regard to contexts and purposes, and their deliverables must be traceable along processes.

Resources

While the quality of resources is clearly a primary factor of LLM usability, its impact on transparency is generally overlooked except for metadata; that is because data is detached from its informational context, namely organisation and models. A primary objective is thus to characterise resources prior to their encoding:

  • Generic: massive, comprehensive, and undifferentiated raw resources
  • Specific: subsets of resources pertaining to selected domains and/or issues
  • Curated: subsets of contents from filtered resources
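One minimal way to keep that characterisation attached to resources before encoding is to record it as metadata that travels with the data; the `Resource` record and `Kind` labels below are illustrative assumptions, not an existing API:

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    GENERIC = "generic"    # massive, comprehensive, undifferentiated raw resources
    SPECIFIC = "specific"  # subsets pertaining to selected domains and/or issues
    CURATED = "curated"    # subsets of contents from filtered resources

@dataclass(frozen=True)
class Resource:
    name: str
    kind: Kind
    domains: tuple = ()    # informational context; empty only for generic resources

    def is_characterised(self):
        # Specific and curated resources must name their domains, otherwise
        # their informational context is lost once they are encoded.
        return self.kind is Kind.GENERIC or bool(self.domains)

corpus = [
    Resource("common-crawl-dump", Kind.GENERIC),
    Resource("contract-archive", Kind.SPECIFIC, domains=("legal",)),
    Resource("reviewed-case-law", Kind.CURATED, domains=("legal", "case-law")),
]
print(all(r.is_characterised() for r in corpus))  # True
```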

While all resources are then numerically encoded into neural networks, their representation as vectors can be used to manage the granularity of deliverables and consequently their footprint in workflows.

Workflow

Numerical vectors are the nuts and bolts of LLMs, as they are used to represent numeric affinities between tokens as well as semantic connections between words. As such they provide vessels for the meanings accrued across intermediate LLM stages; taking into account the kind of targeted contents, three typical “embeddings” can be set apart:

  • Frozen: representation of observed terms with stable semantics
  • Canned: representation of concepts with defined meanings
  • Managed: representation of categories with defined features

LLMs Workflow
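The distinction between frozen, canned, and managed embeddings can be sketched as a matter of mutability; the labels and vectors below are arbitrary placeholders for illustration, not learned embeddings:

```python
from dataclasses import dataclass

@dataclass
class Embedding:
    label: str
    vector: list
    mutable: bool  # managed embeddings may be re-weighted; frozen/canned may not

    def update(self, new_vector):
        if not self.mutable:
            raise ValueError(f"{self.label}: this embedding is immutable")
        self.vector = new_vector

# Frozen: observed terms with stable semantics.
term = Embedding("term:invoice", [0.1, 0.9], mutable=False)
# Canned: concepts with defined meanings.
concept = Embedding("concept:Payment", [0.4, 0.6], mutable=False)
# Managed: categories whose defining features may evolve with the domain.
category = Embedding("category:Receivable", [0.5, 0.5], mutable=True)

category.update([0.6, 0.4])    # allowed: managed embeddings follow the domain
try:
    term.update([0.0, 1.0])    # rejected: frozen semantics must stay stable
except ValueError as err:
    print(err)
```

Marking each embedded motif with its kind is one way of keeping the track mentioned below between resources and outcomes.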

While an open array of deep learning algorithms is combined in feedback loops to weave the LLM fabric, keeping track of embedded motifs may ensure the traceability of processes between resources and outcomes.

Functional Layers

The use of LLMs is by nature iterative and feedback loops may entwine whole cycles of encoding, training, and tuning schemes, hampering LLMs transparency and traceability.

Taking advantage of differentiated embeddings, the traceability issue could be managed through a built-in distinction between numeric and semantic schemes, typically:

  • Semantic access to resources from the operational context, e.g. indexes and queries
  • Numerical encoding and embeddings
  • Numerical training
  • Semantic tuning and prompts

Functional Layers & Contexts
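As a sketch of that built-in distinction, each stage below tags its output with the layer and scheme (numeric or semantic) that produced it, so any deliverable can be walked back to its resources; the stage names follow the list above, everything else is an invented illustration:

```python
# Each stage wraps its output with provenance, so a deliverable can be traced
# back through the numeric and semantic schemes that produced it.
def traced(layer, scheme, payload, upstream=None):
    return {"layer": layer, "scheme": scheme,
            "payload": payload, "upstream": upstream}

def semantic_access(query):              # indexes and queries
    return traced("access", "semantic", f"resources[{query}]")

def numerical_encoding(resources):       # encoding and embeddings
    return traced("encoding", "numeric", "vectors", upstream=resources)

def numerical_training(vectors):
    return traced("training", "numeric", "weights", upstream=vectors)

def semantic_tuning(weights, prompt):    # tuning and prompts
    return traced("tuning", "semantic", f"answer({prompt})", upstream=weights)

result = semantic_tuning(
    numerical_training(numerical_encoding(semantic_access("contracts"))),
    "list obligations")

# Walk the provenance chain back from deliverable to resources.
chain = []
node = result
while node:
    chain.append((node["layer"], node["scheme"]))
    node = node["upstream"]
print(chain)
```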

Transparency is a broader issue that takes into account the matching of means and ends, in this case the congruence of LLMs’ operational (systems) and organisational (knowledge) contexts.

Conversations & Contexts

GenLMs and their environments appear to broadly follow the same configuration. On the side of operational contexts, a variety of technologies commonly called retrieval-augmented generation (RAG) rely on tailored queries to generate different types of training material: all-inclusive, specific, or curated. On the side of organisational (users) contexts, various “chain-of” scripting methods are proposed to define and organize prompts, usually paired with RAGs called dynamically to customize contexts. In between, embeddings provide both static and dynamic semantic gears. By harnessing knowledge graphs (KGs) with retrieval-augmented generation, Agentic Graphs (AGs) achieve a two-fold objective: taking responsibility for incorporating explicit knowledge into graphs, and enabling dynamic management of domain-specific structured contexts.
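A bare-bones RAG loop can be sketched with word-count vectors standing in for real embeddings; the corpus and prompt template below are invented for illustration:

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    # Word counts as a crude stand-in for learned embedding vectors.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    # Rank documents by similarity to the query, keep the top k.
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Augment the prompt with the retrieved context before generation.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

documents = [
    "Invoices are due thirty days after delivery.",
    "Shortwave radios require manual tuning.",
]
print(build_prompt("When are invoices due?", documents))
```

The sketch also shows the limit noted below: retrieval can only surface whatever the underlying resources already contain.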

While graph-enhanced RAGs can significantly improve the relevance of data used by GenLMs, the contribution of knowledge graphs is limited by the resources used to build them: vector-based retrieval from (unstructured) datasets and documents, or conceptual mapping of (structured) domain-specific schemas. Neither of these methods fully addresses the root causes of hallucinations, namely unfounded facts and nonsensical reasoning.

Yet, graph-enhanced RAGs can do more than just adjust training and tuning parameters, provided that ontological prisms are employed to bind LLMs with their semantic, conceptual, and logical contexts.

Binding Contexts

  • Semantic binding: between individual (glossaries/thesauruses) and system-defined (taxonomies/schemas) meanings.
  • Conceptual binding: between individual (glossaries/thesauruses) and domain-based (ontologies) meanings.
  • Logical binding: between domain-based (ontologies) and system-defined (taxonomies/schemas) meanings.
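The three bindings can be sketched as cross-checks between three vocabularies; the glossary, taxonomy, and ontology entries below are invented for illustration:

```python
# Three vocabularies standing in for the three context providers.
glossary = {"client": "party buying goods or services"}   # individual meanings
taxonomy = {"client": "Customer"}                         # system-defined meanings
ontology = {"client": "LegalPerson/Customer"}             # domain-based meanings

def semantic_binding(term):
    # Individual <-> system-defined meanings.
    return (glossary.get(term), taxonomy.get(term))

def conceptual_binding(term):
    # Individual <-> domain-based meanings.
    return (glossary.get(term), ontology.get(term))

def logical_binding(term):
    # Domain-based <-> system-defined meanings.
    return (ontology.get(term), taxonomy.get(term))

def is_bound(term):
    # A term is fully bound only when all three bindings resolve.
    return all(None not in binding for binding in
               (semantic_binding(term), conceptual_binding(term), logical_binding(term)))

print(is_bound("client"))    # True
print(is_bound("prospect"))  # False: missing from all three vocabularies
```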

The convergence of generative language models and knowledge graphs points to a specialization of agents between language and knowledge. Along that rationale, agents responsible for communication and user experience would rely on semantics and pragmatics to ensure the consistency of meanings, while agents in charge of shared representations would focus on knowledge-based activities. To that effect, agents would have to be defined by cognitive capabilities such as observation, reasoning, and judgment.

That framing would allow for the subcontracting of GenLMs for knowledge-intensive activities, outlining the upper layers of an agentic collaboration architecture.
