Reason & Knowledge
Reasoning is a cognitive process that uses symbolic representations to extend knowledge beyond what can be observed.
Symbolic Representations
While animal species are endowed with a wide range of cognitive capabilities, humans are the only ones relying on the processing of symbolic representations through observation (facts), communication (meaningful symbols), and reasoning (models):
- Labels are signs used to identify elements in environments
- Symbols are signs that stand for something else in the mind of observers
- Relationships can be defined between labels to build models
- Models can be fleshed out with meanings and rules
Reasoning can thus be carried out formally on models without external meanings (pure logic) or concretely on models meant to represent specific contexts and/or concerns:
- Maths uses symbolic representations for the sole purpose of pure reasoning, independently of environments
- Sciences and engineering apply logic to symbolic representations of physical environments
- Philosophy and politics apply logic to symbolic representations of symbolic environments
In any case maths, and in particular mathematical logic, provides the nuts and bolts of every kind of reasoning.
Edges of Knowledge
Formal logic relies on three major types of inference, depending on whether conclusions are necessary (deduction) or non-necessary (induction, abduction):
Deduction infers individual facts (Ci:f) from categories’ features (C:F); eg if all birds have feathers and owls are birds, then owls have feathers.
Induction infers categories’ features from observed facts; eg given that owls are birds and have feathers, one may assume that birds have feathers.
Abduction adds conceptual (non-factual) arguments in support of inductions; eg given that owls are birds and have feathers, and considering that feathers enable flying, one may assume that birds have feathers.
Intuition can be framed similarly and defined as direct associations between experience (facts) and beliefs (concepts); eg birds sing and like music.
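To make the footprints of these inference types plain, they can be rendered as a minimal sketch over the owl/bird example; everything below (names, data, functions) is hypothetical, meant only to illustrate the necessary/non-necessary distinction:

```python
# A minimal sketch of the three inference types; all names, data,
# and functions are hypothetical, for illustration only.

categories = {"bird": {"feathers"}}   # C:F  -- birds have feathers
facts = {"owl": "bird"}               # Ci:C -- owls are birds

def deduce(individual: str) -> set:
    """Deduction (necessary): individual facts from category features."""
    return categories.get(facts.get(individual), set())

def induce(observed: dict) -> dict:
    """Induction (non-necessary): category features assumed from facts."""
    assumed: dict = {}
    for individual, (category, features) in observed.items():
        assumed.setdefault(category, set()).update(features)
    return assumed

def abduce(assumed: dict, concepts: dict) -> dict:
    """Abduction: keep inductions backed by a conceptual argument."""
    return {c: {f for f in fs if f in concepts} for c, fs in assumed.items()}

print(deduce("owl"))  # {'feathers'}: owls have feathers, necessarily

observed = {"owl": ("bird", {"feathers"})}
print(induce(observed))  # {'bird': {'feathers'}}: assumed, not necessary

concepts = {"feathers": "enable flying"}  # conceptual, non-factual support
print(abduce(induce(observed), concepts))  # the induction, now supported
```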
Pivoting from pure logic to practical reasoning thus seems straightforward for deduction (a truth-preserving operation) but comes with pitfalls for induction and abduction, as can be illustrated by probabilities. Taking a leaf from Aubrey Clayton’s Bernoulli’s Fallacy, sampling probabilities (pure logic) pertain to designed categories like games of chance (eg the probability of throwing a certain number with a pair of dice); by contrast inferential probabilities (applied logic) pertain to actual environments (eg the probability of hurricanes, or loan defaults). The former are deductive as they go from designed categories to managed facts; the latter are inductive or abductive as they go from observed facts to assumed categories. Hence the importance of defining reasoning modalities when inference is applied to environments.
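The dice case can be worked out deductively, while the inferential case can only be estimated; in the sketch below the hurricane figures are invented for illustration and carry no empirical weight:

```python
from fractions import Fraction
from itertools import product

# Sampling probability (pure logic, deductive): the category 'pair of
# dice' is designed, so the probability follows from its definition.
outcomes = list(product(range(1, 7), repeat=2))
p_seven = Fraction(sum(1 for a, b in outcomes if a + b == 7), len(outcomes))
print(p_seven)  # 1/6, necessarily

# Inferential probability (applied logic, inductive): the category is
# assumed from observed facts; these figures are made up for the sketch.
observed_years, hurricane_years = 50, 7
p_hurricane = hurricane_years / observed_years
print(p_hurricane)  # 0.14, an estimate contingent on the assumed category
```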
Reasoning Modalities
As noted above, pure logical inference can be necessary (deduction) or non-necessary (induction, abduction). When applied to practical reasoning, that logical distinction must be extended so as to encompass empirical (facts) or philosophical (concepts) modalities (a sketch follows the list):
- Empirical (extensional) modalities, for reasoning along facts’ actual, temporal and stochastic dimensions
- Philosophical (intensional) modalities, for reasoning across conceptual tiers: abstract (no intended relevancy with regard to environments), concrete (intended relevancy with regard to physical or symbolic environments), virtual (relevancy with regard to hypothetical or fictional environments), or nominal (syntactic or semiotic textual relevancy)
- Logical modalities, for the internal consistency of symbolic representations and rules: modal logic (what is necessary or possible), deontic logic (what is mandatory or permissible), temporal logic, fuzzy logic, etc.
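As a hedged illustration (all tags and names are hypothetical), these three groups of modalities can be pictured as qualifiers attached to statements before any inference is drawn:

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical tags for each group of modalities.
class Empirical(Enum):
    ACTUAL = "actual"
    TEMPORAL = "temporal"
    STOCHASTIC = "stochastic"

class Philosophical(Enum):
    ABSTRACT = "abstract"
    CONCRETE = "concrete"
    VIRTUAL = "virtual"
    NOMINAL = "nominal"

class Logical(Enum):
    NECESSARY = "necessary"
    POSSIBLE = "possible"
    MANDATORY = "mandatory"
    PERMISSIBLE = "permissible"

@dataclass
class Statement:
    text: str
    empirical: Empirical
    philosophical: Philosophical
    logical: Logical

# A statement about loan defaults: stochastic, concrete, possible.
s = Statement("loans may default", Empirical.STOCHASTIC,
              Philosophical.CONCRETE, Logical.POSSIBLE)
print(s)
```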
Expounding reasoning modalities puts the focus on the gears moving the logical, empirical, and nominal wheels supporting reasoning processes.
Reasoning Processes
Reasoning processes are best defined in terms of motion across symbolic representations of facts, categories, and concepts, with corresponding gears ensuring smooth moves between words (facts) and meanings (concepts); observations (facts) and features (categories); and objectives (concepts) and arguments (categories).
Reasoning Cogs & Gears
Reasoning necessarily starts with the attachment of words to facts, and of meanings to words; reasoning can then carry on juggling alternative interpretations, and even reach conclusions, taking documents as facts.
Alternatively, and more concretely, reasoning processes can start with the attachment of named features to datasets and observations, proceed with statistical inference to explore causal chains and models, and finally induce semantic and/or syntactic categories. Such moves may be driven bottom-up with named features emerging from observations, and/or be steered top-down by designed categories.
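Both directions can be sketched minimally; the observations and the designed category below are invented for illustration:

```python
# Hypothetical observations: binary features recorded per individual.
observations = [
    {"wings": 1, "feathers": 1, "fur": 0},
    {"wings": 1, "feathers": 1, "fur": 0},
    {"wings": 0, "feathers": 0, "fur": 1},
]

# Bottom-up: named features emerge from what most observations share.
threshold = 0.5
emerging = {f for f in observations[0]
            if sum(o[f] for o in observations) / len(observations) > threshold}
print(emerging)  # {'wings', 'feathers'}: candidate features for a category

# Top-down: a designed category steers the classification of observations.
bird = {"wings": 1, "feathers": 1, "fur": 0}
print([o == bird for o in observations])  # [True, True, False]
```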
Topping both semantic networks and statistical inference, logical inference is meant to conduct the whole of reasoning processes, ensuring the alignment of arguments (logic) and objectives (intents).
It so appears that the fabric of reasoning is woven from crossed threads whose semantics are set by topics (sciences, society, business, …) as well as modalities (nominal, actual, virtual, …); it ensues that, trivial ones apart, reasoning processes can be organised around three explicit hinges: nominal (words/meanings), logical (objectives/arguments), and empirical (observations/features). That’s where the transparency and traceability of automated reasoning can best be achieved.
Reasoning Footprints
A major caveat of using Generative AI (GAI), in particular Large language models (LLMs), is the lack of transparency and traceability of reasoning processes; hence the benefits of making inference footprints plain.
Bounded reasoning processes are about the meaning of words within finite and unambiguous semantic contexts (sketched after the list):
- Nominal reasoning operates on tokens (“cats”, “eat”, “mice”) with arbitrary meanings detached from context.
- Bounded reasoning deals with the alignment of words and meanings as carried out without non-necessary (ie inductive or abductive) inferences; such reasoning processes may (bounded environments) or may not (bounded semantics) use categories to make necessary (ie deductive) inferences.
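The distinction can be sketched as follows; the tokens, context, and rules are all hypothetical:

```python
# Nominal reasoning: tokens are manipulated with meanings detached
# from context; nothing beyond well-formed strings is produced.
tokens = ["cats", "eat", "mice"]
print(" ".join(tokens))  # a well-formed string, whatever it may mean

# Bounded reasoning: words aligned with meanings within a finite,
# unambiguous semantic context, supporting only necessary inferences.
context = {"cats": "predator", "mice": "prey"}
rules = {("predator", "prey"): "eat"}

def bounded_infer(a: str, b: str) -> str | None:
    """Deduce the admissible verb, if any, from the bounded context."""
    return rules.get((context.get(a), context.get(b)))

print(bounded_infer("cats", "mice"))  # 'eat': a necessary inference
print(bounded_infer("mice", "cats"))  # None: not licensed by the context
```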
Boundless reasoning processes are about the relationships between meanings and environments (sketched after the list):
- Empirical reasoning deals with the alignment of thesauruses (words and meanings) and taxonomies (words and types)
- Conceptual reasoning deals with the alignment of thesauruses (words and meanings) and ontologies (modalities and categories)
- Comprehensive reasoning deals with the closure of empirical and conceptual reasoning, with its locus set between documented and observed facts
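A hedged sketch of these alignments, with all entries hypothetical:

```python
# Hypothetical thesaurus (words -> meanings), taxonomy (words -> types),
# and ontology (categories with modalities and features).
thesaurus = {"owl": "nocturnal bird of prey"}
taxonomy = {"owl": "bird"}
ontology = {"bird": {"modality": "concrete", "features": {"feathers"}}}

def empirical_alignment(word: str) -> bool:
    """Empirical reasoning: does the meaning bear out the taxonomic type?"""
    return word in taxonomy and taxonomy[word] in thesaurus.get(word, "")

def conceptual_alignment(word: str) -> bool:
    """Conceptual reasoning: is the type anchored in an ontological category?"""
    return taxonomy.get(word) in ontology

print(empirical_alignment("owl"))   # True: 'bird' occurs in the meaning
print(conceptual_alignment("owl"))  # True: 'bird' is an ontology category
```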
As should be expected, challenges appear when empirical and conceptual reasoning must be squared, in other words when science must be aligned with philosophy.
Closing Arguments
At first it seems that interpretations and inferences could enable reasoning round trips between discourses (documents) and observations (datasets). That would ignore the hiatus between documents and datasets, which are subject to different temporalities: the former carry canned meanings without explicit time-stamps, the latter time-stamped observations.
It ensues that, reasoning clockwise from facts, documents come with words whose time-related meanings may or may not be congruent with the ones borne out by current observations. Conversely, reasoning counterclockwise from datasets, new meanings rooted in observations and models will not necessarily align with the meanings of legacy documents. Could the closing issue be dealt with solely through nominal inferences? That’s what Large language models (LLMs) aim at.
Once LLMs are encoded, their reasoning capabilities are meant to be sustained in three ways, possibly combined:
- Retrieval augmented generation (RAG) uses selected datasets to focus LLMs’ reasoning processes (sketched after the list)
- Targeted training uses knowledge graphs to align LLMs’ reasoning processes with specific corpora of knowledge
- Prompt engineering extends the Socratic method to dialogues with machines
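A minimal RAG sketch follows; it is not any actual library’s API: 'llm' stands in for any text-generation function, and retrieval is reduced to naive word overlap:

```python
# Hypothetical documents to ground the model's answers.
documents = [
    "Owls are birds of prey, mostly nocturnal.",
    "Loan defaults rose last quarter.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    words = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: -len(words & set(d.lower().split())))[:k]

def rag_answer(query: str, llm) -> str:
    """Focus the model on retrieved context rather than its nominal soup."""
    context = "\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)  # 'llm' is a placeholder for any generation endpoint

# Example with a stub standing in for a real model:
print(rag_answer("Are owls nocturnal?", lambda p: p.splitlines()[1]))
```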
Used separately or combined, these schemes come with major caveats:
- Efficiency: prompt engineering is like reverse learning, as it entails a refactoring of efficient human reasoning in order to align it with machines’ capabilities
- Reliability: without external reference models, LLMs’ interpretations are prone to a wide range of hallucinations
- Interoperability: there is no reason to assume that the meanings emerging from nominal soups can be inherently aligned with the structured contents of knowledge graphs
More generally, LLMs introduce a confusion between nominal (documents) and factual (datasets) correlations on the one hand, and between statistical and logical inferences (causations) on the other.
FURTHER READING
Ontological Prism
- Signs & Symbols
- Generative & General Artificial Intelligence
- Thesauruses, Taxonomies, Ontologies
- EA Engineering interfaces
- Ontologies Use cases
- Complexity
- Cognitive Capabilities
- LLMs & the matter of transparency
- LLMs & the matter of regulations
- Learning
- Uncertainty & The Folds of Time
Caminao Framework
- Caminao Framework Overview
- Knowledge interoperability
- Edges of Knowledge
- The Pagoda Playbook
- ABC of EA: Agile, Brainy, Competitive
- Knowledge-driven Decision-making (1)
- Knowledge-driven Decision-making (2)
- Ontological Text Analysis: Example