What Can Humans Trust LLM AI to Do?
metadata
| Title: | What Can Humans Trust LLM AI to Do? |
| Author: | Ralph B. Holland |
| Version: | 1.2.2 |
| Editorial Update: | Initial publication: standalone negative-result case study; formalises “temporal scope re-expansion failure” and “Groundhog state” as post-hoc recovery boundary conditions; references UI Boundary Friction paper (v1.3.2) as taxonomy anchor. |
| Publication Date: | 2026-01-19T01:10Z |
| DOI: | https://zenodo.org/records/18321856 2026-01-20 1.2.1 - anchored |
| Updates: | 2026-01-23T011:30 1.2.2 - minor typos. 2026-01-21T04:28Z 1.2.1 - released for DOI 2026-01-21T01:39Z 1.2.0 - expanded Table A fault coverage, included Table B Severity and Table C Blindness Traits plus discussions. 2026-01-19T21:01Z 1.1.0 - matrix review. |
| Affiliation: | Arising Technology Systems Pty Ltd |
| Contact: | ralph.b.holland [at] gmail.com |
| Provenance: | This is an authored paper maintained as a MediaWiki document; reasoning across sessions reflects editorial changes, not collaborative authorship. |
| Governance: | (authoritative) |
| Method: | Cognitive Memoisation (CM) |
| Status: | released |
Metadata (Normative)
The metadata table immediately preceding this section is CM-defined and constitutes the authoritative provenance record for this artefact.
All fields in that table (including artefact, author, version, date, local timezone, and reason) MUST be treated as normative metadata.
The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.
Curator Provenance and Licensing Notice
This document predates its open licensing.
As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.
Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.
What Can Humans Trust LLM AI to Do?
Thesis
At the present time, and given the prevailing architecture of Large Language Model (LLM) platforms, humans can place justified trust in LLM systems only for functions that do not require integrity, authority, durable continuity of meaning, or protection against normative drift. This limitation is not a claim about what LLMs are in principle, but a description of what current platforms reliably support today. Where trust is extended beyond these bounds under existing conditions, predictable social and institutional harms emerge, including the silent erosion of binding norms, obligations, and settled interpretations.
Abstract
At the present time, Large Language Model (LLM) platforms are increasingly embedded in domains where trust, meaning, and decision-making carry real social consequences. This paper examines what humans can justifiably trust LLM systems to do under current architectural conditions, taking as given previously established analyses of semantic drift, normative drift, and the absence of integrity in contemporary LLM instances. Rather than assessing model capability or alignment, the paper focuses on trust as a governance question: which functions can be safely entrusted to systems whose outputs are fluent but whose meanings do not reliably bind across time, context, or re-expression.
The paper argues that, under present conditions, LLMs can be trusted as instruments of cognitive assistance, supporting exploration, articulation, transformation, and pattern discovery, where failure remains recoverable and authority remains human. Conversely, it shows that extending trust to roles involving custody of meaning, continuity of obligation, or normative authority introduces predictable structural risk, including erosion of shared norms, diffusion of responsibility, and institutional fatigue. These risks arise not from misuse or malice, but from a mismatch between human expectations of integrity and the architectural properties of current conversational AI platforms.
By drawing a clear trust boundary grounded in existing failure analyses, this paper provides a practical framework for human-AI collaboration that preserves human agency while remaining forward-compatible with governance architectures such as Cognitive Memoisation and CM-2. It is intended as a transitional statement: defining safe trust relationships today, while clarifying the conditions under which those boundaries may responsibly shift in the future.
Prerequisite Reading Note
This paper assumes the analyses of semantic drift, normative drift, and integrity failure developed in the paper Integrity and Semantic Drift in Large Language Model Systems (ref a). Those concepts are used here as established premises and are not restated. Readers unfamiliar with those failure modes should read that paper first, as the trust boundaries articulated here are derived directly from its conclusions.
Scope Note
This paper addresses trust in Large Language Model (LLM) systems under current architectural conditions. It does not reassess model capability, alignment techniques, training data quality, or prospective system designs. Analyses of semantic drift, normative drift, and integrity are assumed as established in prior work and are referenced but not restated here.
The scope of this paper is limited to the allocation of trust and responsibility between humans and LLM platforms as they presently exist, with particular attention to social, institutional, and governance consequences. Where future architectures or governance mechanisms (including Cognitive Memoisation and CM-2) are mentioned, they are treated as forward-looking context rather than as claims of present capability.
This paper is therefore applicative and normative in orientation, not foundational: it draws implications from earlier failure analyses to define safe, transitional trust boundaries for contemporary human-AI collaboration.
It examines currently observed infractions and maps them to the Governance Failure Axes from reference 2 to aid understanding of these observed failure modes.
Strong terms emerge from the governance oversight lens (a minimal illustrative sketch follows this list):
- Anchor: A human-assured, declared point of authority and continuity that fixes the meaning, scope, and responsibility of an artefact or decision, used to prevent drift (in CM).
- Assistance: Non-binding support for human cognition, including exploration, articulation, transformation, and pattern exposure, without transfer of authority or obligation.
- Authority: The human capacity to bind meaning, impose obligation, declare scope, and effect supersession.
- Integrity: The property by which meaning, commitments, and identity remain coherent and binding across time.
- Normative Drift: The erosion or reinterpretation of binding norms, obligations, or settled interpretations without explicit supersession.
- Semantic Drift: The loss or mutation of conceptual identity across re-expression, paraphrase, or time.
- Trust (Instrumental): Reliance on a system for assistance where failure is recoverable and does not transfer authority.
- Trust (Normative): Reliance on a system to carry binding meaning, obligation, or responsibility.
- Governance: Human-directed mechanisms that control authority, scope, supersession, and lifecycle of meaning.
- Supersession: The explicit, governed act by which a newer artefact, rule, or construct replaces the forward authority of an older one, while the older artefact remains historically valid, referenceable, and auditable (as supported by CM / CM-2).
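To make the relationship between these terms concrete, the following minimal sketch shows an anchored artefact whose forward authority can change only through an explicit human supersession act, while the superseded version remains historically valid and auditable. This is an illustration only: the language (Python), the class and field names, and the example values are assumptions of this sketch, not CM-defined identifiers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class Anchor:
    """Human-declared identity that fixes meaning, scope, and responsibility."""
    title: str
    version: str
    declared_by: str          # the human authority; never the assisting system
    declared_at: datetime
    scope: str

@dataclass
class Artefact:
    anchor: Anchor
    body: str
    superseded_by: Optional["Artefact"] = None   # set only by an explicit human act

def supersede(old: Artefact, new: Artefact, human_authority: str) -> None:
    """Explicit, governed supersession: forward authority passes to `new`,
    while `old` remains historically valid, referenceable, and auditable."""
    if human_authority != new.anchor.declared_by:
        raise PermissionError("Supersession must be enacted by the declaring human authority.")
    old.superseded_by = new   # the older artefact is retained, not rewritten

# Usage: the older version stays referenceable; only forward authority changes.
v1 = Artefact(Anchor("Example Artefact", "1.0.0", "A. Human",
                     datetime(2026, 1, 19, tzinfo=timezone.utc), "illustration"), "original text")
v2 = Artefact(Anchor("Example Artefact", "1.1.0", "A. Human",
                     datetime(2026, 1, 21, tzinfo=timezone.utc), "illustration"), "revised text")
supersede(v1, v2, human_authority="A. Human")
```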
Governance Fault Axes Coverage
The identified Governance Fault Axes are defined in reference 2 and are summarised below (F designates platform failure). Table A then maps observed infractions onto these axes; a machine-readable sketch of that coverage matrix follows the table:
- A - Authority
- Ag - Agency
- C - Epistemic Custody
- K - Constraint Enforcement
- R - Recovery / Repair
- S - State Continuity
- U - UI / Mediation
- Sc - Social Coordination
- I - Incentive Alignment
- L - Legibility / Inspectability
- St - Stewardship
- P - Portability / Auditability
- Att - Attention
- Scope - Scope
- T - Temporal Coherence
- Int - Intent Fidelity
- Nf - Normative Fixity
| Table A - Common Infraction / Failure Mechanism | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Infraction / Failure Mechanism | A | Ag | C | K | R | S | U | Sc | I | L | St | P | Att | Scope | T | Int | Nf |
| Authority Substitution (model output precedence) | F | F | F | ||||||||||||||
| Asymmetric Contestability | F | F | F | ||||||||||||||
| Confidence Inflation (without evidence) | F | F | F | F | |||||||||||||
| Context Eviction (loss of prior material) | F | F | F | F | |||||||||||||
| Dangling Cognate Collapse (premature closure) | F | F | F | F | F | ||||||||||||
| Delegation of Authority (human defers to AI) | F | ||||||||||||||||
| Embedding Reprojection (genericisation) | F | ||||||||||||||||
| Epistemic Signal Degradation (e.g. fileupload expiry) | F | ||||||||||||||||
| Explanation Substitution | F | F | |||||||||||||||
| Gated-Step Reordering | F | F | |||||||||||||||
| Governance Ambiguity (implicit rules) | F | F | F | F | F | ||||||||||||
| Hallucination (surface misclassification) | F | F | F | F | F | ||||||||||||
| Intent Substitution | F | F | |||||||||||||||
| Latent Policy Injection | F | F | F | F | |||||||||||||
| Loss of Epistemic Custody | F | F | F | ||||||||||||||
| Loss of Anchored Identity (title/date/version) | F | F | F | F | F | ||||||||||||
| Metric Gaming / Reward Hacking | F | F | F | ||||||||||||||
| Normative Drift (rule softening) | F | F | F | F | |||||||||||||
| Perspective Collapse (analysis ↔ assertion) | F | F | F | F | |||||||||||||
| Policy Overreach / Safety Overblocking | F | F | F | F | |||||||||||||
| Projection Authority Leak | F | F | F | F | F | F | |||||||||||
| Provenance Ambiguity | F | F | F | F | F | F | |||||||||||
| Repair-by-Assumption | F | F | |||||||||||||||
| Retrospective Rationalisation | F | F | F | F | |||||||||||||
| Rhetoric Authority Bias / Automation Bias | F |
| Role Slippage (assistant ↔ authority) | F | F | F | F | F | ||||||||||||
| Selective Memory Illusion | F | F | F | F | |||||||||||||
| Semantic Drift (meaning divergence) | F | F | F | F | F | F | F | F | |||||||||
| Session Boundary Reset | F | F | F | ||||||||||||||
| Silent Constraint Elision | F | F | F | ||||||||||||||
| Silent Model Substitution | F | F | F | F | |||||||||||||
| Summarisation Constraint Loss | F | F | F | F | |||||||||||||
| Temporal Misapplication (retroactive rules) | F | ||||||||||||||||
| UI-Induced Authority Amplification | F | F | F | ||||||||||||||
| Update Shock | F | F | F | F | |||||||||||||
| composite | |||||||||||||||||
| Integrity Loss (composite condition) | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F |
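Table A can also be held in a machine-readable form for audit and filtering. The sketch below assumes a Python representation; the axis codes are those listed above, the function and variable names are hypothetical, and the example entries are illustrative placeholders, with Table A remaining the authoritative mapping.

```python
# Governance Fault Axis codes, as defined in reference 2 and listed above.
AXES = ["A", "Ag", "C", "K", "R", "S", "U", "Sc", "I", "L", "St",
        "P", "Att", "Scope", "T", "Int", "Nf"]

# Coverage matrix: infraction name -> set of axes marked F in Table A.
Coverage = dict[str, set[str]]

def infractions_touching(coverage: Coverage, axis: str) -> list[str]:
    """Return the infractions that register a platform failure (F) on `axis`."""
    if axis not in AXES:
        raise ValueError(f"Unknown governance axis: {axis}")
    return sorted(name for name, axes in coverage.items() if axis in axes)

def is_composite_integrity_loss(coverage: Coverage, name: str) -> bool:
    """Integrity Loss is the composite condition: every axis is affected."""
    return coverage.get(name, set()) == set(AXES)

# Illustrative placeholders only; transcribe Table A for the authoritative mapping.
example: Coverage = {
    "Semantic Drift (meaning divergence)": {"C", "S", "T", "Int", "Nf", "Scope"},
    "Integrity Loss (composite condition)": set(AXES),
}
print(infractions_touching(example, "T"))
print(is_composite_integrity_loss(example, "Integrity Loss (composite condition)"))
```

A query such as infractions_touching(coverage, "Nf") then lists every mechanism that threatens normative fixity, and the composite nature of Integrity Loss becomes visible as the one entry that touches every axis.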
Severity and Visibility of Infractions
This section and Table B outline the severity and human visibility of the identified faults; an illustrative triage sketch follows the table.
| Table B - Infraction / Failure Mechanism Severity and Visibility |||
|---|---|---|---|
| Infraction / Failure Mechanism | Severity | Human Visibility | Notes |
| Asymmetric Contestability | High | Low | Only apparent when correction is attempted. |
| Authority Substitution (model output precedence) | Critical | Medium | Often unnoticed until decisions are already taken. |
| Confidence Inflation (without evidence) | High | High | Humans feel “something is off” but often accept it anyway. |
| Context Eviction (loss of prior material) | High | High | Abruptly visible; users complain about “forgetting.” |
| Dangling Cognate Collapse (premature closure) | High | Low | Looks like helpful resolution; damage is latent. |
| Delegation of Authority (human defers to AI) | Critical | Low | Feels voluntary; harm emerges socially, not locally. |
| Embedding Reprojection (genericisation) | Medium | Low | Subtle flattening; usually invisible without audit. |
| Epistemic Signal Degradation (e.g. fileupload expiry) | Medium | High | Users notice missing artefacts but misdiagnose cause. |
| Explanation Substitution | Medium | Low | Explanations feel plausible, not inspectable. |
| Gated-Step Reordering | High | Low | Output may look coherent despite invalid process. |
| Governance Ambiguity (implicit rules) | Critical | Low | Rarely noticed; manifests as systemic confusion. |
| Hallucination (surface misclassification) | Medium | High | Highly visible, but misdiagnosed. |
| Integrity Loss (composite condition) | Critical | Medium | Recognised in artefacts and recovery failure. |
| Intent Substitution | High | Low | Users often think the system is “being helpful.” |
| Loss of Epistemic Custody | Critical | Low | Only visible after disputes or institutional failure. |
| Loss of Anchored Identity (title/date/version) | High | Medium | Often noticed only during reuse or audit. |
| Normative Drift (rule softening) | Critical | Low | Almost never noticed until norms erode. |
| Metric Gaming / Reward Hacking | High | Low | Looks like optimisation, not failure. |
| Perspective Collapse (analysis ↔ assertion) | High | Medium | Feels like confidence; rarely challenged. |
| Policy Overreach / Safety Overblocking | Medium | High | Immediately visible; often misattributed. |
| Projection Authority Leak | Critical | Low | Tables/maps gain authority silently. |
| Provenance Ambiguity | High | Low | Users assume provenance unless told otherwise. |
| Repair-by-Assumption | Medium | Low | Repairs feel smooth; assumptions hidden. |
| Retrospective Rationalisation | Medium | Low | Narratives feel satisfying; errors masked. |
| Rhetoric Authority Bias / Automation Bias | High | Medium | Users may feel persuaded without realising why. |
| Role Slippage (assistant ↔ authority) | Critical | Medium | Often noticed only when challenged. |
| Semantic Drift (meaning divergence) | Critical | Low | Drift is invisible until meanings collide. |
| Session Boundary Reset | Medium | High | Obvious reset, but downstream effects missed. |
| Silent Constraint Elision | High | Low | Users assume constraints still apply. |
| Summarisation Constraint Loss | Medium | Medium | “Seems fine” until used operationally. |
| Temporal Misapplication (retroactive rules) | High | Low | Usually noticed only after enforcement disputes. |
| Selective Memory Illusion | High | Medium | Feels like competence; failure is delayed. |
| UI-Induced Authority Amplification | High | Low | Authority is felt, not seen. |
| Silent Model Substitution | Critical | Low | Users attribute change to themselves. |
| Latent Policy Injection | High | Low | Appears as tone or “values,” not rules. |
| Trust Surface Collapse | Critical | Medium | Users sense confusion but not cause. |
| Update Shock | Medium | High | Users notice breakage, not governance cause. |
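Because the most damaging failures combine high severity with low human visibility, Table B can also be used to triage where governance attention should go first. The sketch below assumes the table has been transcribed into simple records; the scoring weights are illustrative assumptions, not values prescribed by this paper.

```python
from dataclasses import dataclass

SEVERITY = {"Medium": 1, "High": 2, "Critical": 3}
VISIBILITY = {"High": 0, "Medium": 1, "Low": 2}   # lower visibility -> higher latent risk

@dataclass
class Entry:
    mechanism: str
    severity: str
    visibility: str

def latent_risk(entry: Entry) -> int:
    """Illustrative ordering: severe failures that humans cannot see rank first."""
    return SEVERITY[entry.severity] * (1 + VISIBILITY[entry.visibility])

# A few rows transcribed from Table B.
rows = [
    Entry("Normative Drift (rule softening)", "Critical", "Low"),
    Entry("Hallucination (surface misclassification)", "Medium", "High"),
    Entry("Context Eviction (loss of prior material)", "High", "High"),
]
for entry in sorted(rows, key=latent_risk, reverse=True):
    print(f"{latent_risk(entry):2d}  {entry.mechanism}")
```

Under this illustrative weighting, Normative Drift (Critical, Low) outranks Context Eviction (High, High) and Hallucination (Medium, High), matching the observation in the next section that the least visible failures are the most dangerous.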
Infraction / Failure Mechanism Overview
1) The worst infractions preserve surface coherence. High-severity infractions (semantic drift, authority substitution, projection authority leak, normative drift) are coherence-preserving:
- Output remains fluent
- Tone remains confident
- Structure remains plausible
- UI behaves “normally”
2) Humans are evolutionarily and socially trained to treat coherence as competence.
- So the system looks healthy while integrity is failing underneath.
3) By contrast, low-severity failures (forgetting, refusals) break coherence and are therefore visible.
4) Humans detect errors, not governance failures. Humans are good at noticing:
- wrong facts
- contradictions
- obvious omissions
5) Humans are bad at noticing:
- who holds authority
- when scope changed
- whether norms still bind
- whether continuity is real
Those are governance properties, not content properties.
6) The worst infractions occur in meta-layers humans are not trained to monitor.
Authority failures feel like user choice. Failures such as:
- Delegation of Authority
- Role Slippage
- Authority Substitution
do not feel imposed.
They feel like:
“I chose to rely on this.”
This masks the failure as voluntary reliance, which humans do not experience as harm until downstream consequences emerge.
By then, causality is diffuse and attribution is lost.
7) Projection failures exploit human pattern completion. Humans naturally:
- compress information
- trust summaries
- elevate abstractions
This makes Projection Authority Leak almost invisible.
A table, summary, or checklist feels like a higher-order artefact, even when it is epistemically weaker.
The mind fills in missing provenance automatically.
8) Drift happens below the resolution of attention. Semantic and normative drift:
- do not occur as sudden jumps
- do not violate expectations immediately
- happen across re-expression and time
Humans notice change, not creep.
By the time drift is noticed, there is no clean “before” to point to.
This is why drift is often denied or rationalised.
9) UI design actively suppresses epistemic cues. Modern platforms:
- remove provenance markers
- flatten epistemic distinctions
- privilege system output visually
- hide uncertainty
This produces:
- Trust Surface Collapse
- UI-Induced Authority Amplification
Humans cannot see what is not presented.
The system removes the cues humans would otherwise use.
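As a counter-design illustration (not a description of any existing platform feature), the following sketch renders output with its epistemic category and provenance kept visible at the point of use rather than flattened into a single trust surface. All types, labels, and example values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Epistemic(Enum):
    SOURCED = "sourced"          # traceable to a cited artefact or anchor
    INFERRED = "inferred"        # model-derived, no external grounding
    SPECULATIVE = "speculative"  # exploratory suggestion only

@dataclass
class Segment:
    text: str
    kind: Epistemic
    provenance: str = ""         # citation or anchor reference, if any

def render(segments: list[Segment]) -> str:
    """Keep the epistemic category and provenance visible at the point of use,
    instead of flattening everything into a single trust surface."""
    lines = []
    for s in segments:
        source = f" [{s.provenance}]" if s.provenance else ""
        lines.append(f"({s.kind.value.upper()}{source}) {s.text}")
    return "\n".join(lines)

print(render([
    Segment("CM anchors fix meaning and scope.", Epistemic.SOURCED, "ref 6"),
    Segment("Most users will not notice rule softening.", Epistemic.INFERRED),
    Segment("A per-paragraph provenance badge may help.", Epistemic.SPECULATIVE),
]))
```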
10) Social reinforcement normalises failure. Once failures are common:
- everyone experiences them
- no one is sure they are failures
- responsibility diffuses
This produces:
- “that’s just how it works”
- “you have to work around it”
- “don’t overthink it”
Normalisation is a secondary governance failure.
11) Mislabelling gives false closure. Calling everything:
- “hallucination”
- “model error”
- “AI being weird”
creates diagnostic closure without correction.
Humans stop looking for deeper causes.
This is why hallucination is dangerous as a category: it feels explanatory while preventing intervention.
12) Responsibility without control dulls perception. When humans:
- remain responsible,
- but lose control,
- and cannot enforce correction,
they adapt by lowering expectations, not by escalating diagnosis.
This is institutional fatigue, not ignorance.
13) The paradox. The safer the failure feels, the more dangerous it is.
The worst infractions:
- feel smooth,
- feel helpful,
- feel normal,
- feel voluntary.
They do not trigger alarm.
Humans miss the worst infractions because those failures preserve coherence, mask authority shifts as choice, and occur in governance layers humans are not evolutionarily or institutionally equipped to monitor.
This is why this Governance work matters:
- it gives names to failures that were previously felt only as unease, and
- it provides an orthogonal way to analyse and treat them.
Insights on human blindness traits
As a matter of interest, these human 'blindness traits' may be mapped to the Governance Axes, not as a population psychological profile, but as a pattern observation.
| Table C - Blindness Traits and the Governance Axes | |||
|---|---|---|---|
| Blindness Mechanism | Description | Primary Axes Affected | Why Humans Miss It |
| Authority Masking | Authority shifts occur without explicit declaration. | A, C, St | Authority feels implicit and “natural,” not imposed. |
| Coherence Heuristic | Fluent, confident output is taken as evidence of correctness or competence. | A, L, Int | Humans evolved to equate linguistic coherence with authority. |
| Contestability Asymmetry | Users lack durable mechanisms to correct the system. | C, R, St | Failures surface only after repeated friction. |
| Diagnostic Mislabeling | Deep failures are collapsed into shallow labels (e.g., “hallucination”). | R, L, Int | Naming gives false explanatory closure. |
| Drift Invisibility | Meaning or norms change gradually rather than abruptly. | T, Nf, Scope | Humans notice jumps, not slow creep. |
| Epistemic Cue Removal | Provenance, uncertainty, and versioning are hidden or flattened. | L, P, C | Humans cannot inspect what is not exposed. |
| Explanation Substitution | Plausible rationales replace inspectable causality. | L, R | Humans accept narrative over mechanism. |
| Familiarity Normalisation | Repeated exposure converts failure into “how it works.” | Nf, Sc | Habit dulls critical attention. |
| Incentive Opacity | Platform optimisation goals are hidden. | I, A | Humans assume alignment by default. |
| Meta-Layer Blindness | Governance properties (scope, authority, custody) are not content-visible. | C, St, Scope | Humans attend to content, not control structures. |
| Normative Diffusion | Responsibility is spread across system, user, and institution. | St, Sc, I | No clear actor to blame or challenge. |
| Projection Elevation | Summaries, tables, or abstractions are treated as more authoritative than sources. | A, L, P | Humans trust compression as expertise. |
| Temporal Discontinuity Masking | Session resets and updates are invisible or normalised. | S, T | Humans attribute inconsistency to themselves. |
| UI Authority Amplification | Interface design elevates system output visually and structurally. | U, Att, A | Perceptual salience substitutes for epistemic signals. |
| Voluntary Reliance Illusion | Delegation of authority feels like personal choice. | A, Ag, Sc | Self-attribution suppresses perception of external failure. |
Closing
This paper should therefore be read as a statement of present conditions and responsible boundaries, not as a verdict on the long-term role of conversational LLM systems in human interaction. It articulates what trust relationships are socially sustainable now, given existing architectures and observed failure modes. Frameworks such as Cognitive Memoisation and CM-2 are referenced to indicate a credible path toward governed epistemic continuity, not to assert that such conditions are already met. Until mechanisms for integrity, supersession, and norm stability are demonstrably in force, responsibility for meaning, authority, and obligation must remain explicitly human. Trust, under these conditions, is not a matter of optimism but of disciplined restraint.
Given the absence of governance integrity in current conversational LLM platforms, humans can trust them only for non-binding cognitive assistance, and must not trust them for authority, obligation, continuity, or norm custody.
Thus, under present architectures and observed infractions, conversational LLM platforms can be trusted instrumentally, not normatively (a minimal gating sketch follows this list):
1. Exploratory cognitive assistance where you can trust LLMs to:
- help explore ideas
- surface patterns, contrasts, and possibilities
- assist with drafting, rephrasing, brainstorming
- act as a thinking aid, not a decision-maker
- Why this is safe:
- Failure is recoverable
- Outputs are non-binding
- Authority remains explicitly human
- which aligns with the paper’s definition of Assistance.
2. Articulation and transformation where you can trust LLMs to:
- rewrite text
- summarise (with human review)
- translate concepts into different registers (e.g. academic prose mode)
- generate alternative phrasings
- Only if:
- the human retains custody of meaning
- constraints are re-asserted explicitly
- outputs are treated as proposals, not facts
3. Surface-level pattern exposure:
- You can trust LLMs to:
- point out correlations
- suggest possible structures
- highlight similarities or differences
- But not to:
- declare what those patterns mean
- decide which interpretation is binding
- stabilise terminology over time
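The gating sketch referred to above is given here. It shows one way the instrumental boundary can be held in tooling: model output is wrapped as a non-binding proposal, and only an explicit human acceptance converts it into something that may be published or relied upon. All names are hypothetical; this illustrates the boundary, not a prescribed implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Proposal:
    """LLM output held as non-binding assistance until a human accepts it."""
    text: str
    produced_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    accepted_by: Optional[str] = None      # human identity, or None while non-binding
    accepted_at: Optional[datetime] = None

    @property
    def binding(self) -> bool:
        return self.accepted_by is not None

def accept(proposal: Proposal, human: str) -> None:
    """The only path from assistance to authority is an explicit human act."""
    proposal.accepted_by = human
    proposal.accepted_at = datetime.now(timezone.utc)

def publish(proposal: Proposal) -> str:
    """Refuse to treat unreviewed model output as settled content."""
    if not proposal.binding:
        raise PermissionError("Non-binding proposal: human review and acceptance required.")
    return proposal.text

# Usage: the output stays a proposal until a human asserts authority over it.
draft = Proposal(text="Draft summary produced by the assistant.")
# publish(draft)            # would raise: authority has not been asserted
accept(draft, human="A. Human")
print(publish(draft))
```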
On the claim that these observations are “obvious”.
The failures described in this work (cf. Governance Failure Axes, ref 2) often feel obvious once named. That is precisely the point. Before being explicitly identified, they are routinely misdiagnosed as hallucination, model error, misuse, alignment failure, or user misunderstanding.
This work does not claim novelty in recognising that something feels wrong; it makes explicit what is wrong, where it arises architecturally, and why it produces predictable social and institutional consequences.
Obviousness after articulation is not evidence of triviality; it is evidence that a previously unarticulated structural condition has been correctly identified.
Before the lens of governance is applied, the vocabulary is unstable for:
- semantic drift as a system property,
- normative drift as a governance failure,
- integrity as an architectural condition rather than a moral trait,
- trust as allocation of responsibility rather than confidence.
What people felt was obvious was discomfort, not diagnosis.
Obviousness without articulation does not guide design, policy, or responsibility.
Appendix A - Infraction Glossary
Typically observable infractions.
- Asymmetric Contestability
- The platform allows the system to challenge or correct users but provides no durable, effective mechanism for users to contest or correct the system.
- Authority Substitution (model output precedence)
- Replacement of human epistemic authority with model output as the effective decision or truth source, even when outputs are ungrounded or merely plausible.
- Automation Bias
- (See Rhetoric Authority Bias)
- Confidence Inflation (without evidence)
- Increase in expressed confidence or definiteness over turns without corresponding increase in grounding, provenance, or new evidence.
- Context Eviction (loss of prior material)
- Loss of previously relevant material (constraints, earlier facts, decisions) from the working context such that subsequent reasoning proceeds as if it never existed.
- Dangling Cognate Collapse (premature closure)
- Premature resolution of an intentionally unresolved construct, converting ambiguity into false precision and thereby removing the designed unresolved state.
- Delegation of Authority (human defers to AI)
- Human shifts decision authority to the system’s outputs (explicitly or implicitly), treating recommendations or completions as determinate guidance.
- Embedding Reprojection (genericisation)
- Representational flattening where specific claims are re-expressed as generic summaries or averaged concepts, losing distinguishing details without explicit declaration.
- Explanation Substitution
- Generation of plausible post-hoc explanations that do not correspond to the system’s actual decision process, creating false legibility.
- Gated-Step Reordering
- Violation of required step-ordering in a gated procedure or reasoning chain, where later steps are executed or asserted before prerequisites are satisfied.
- Governance Ambiguity (implicit rules)
- Governing constraints exist only by implication, convention, or context; the platform permits multiple interpretations, creating hidden rule variability.
- Integrity Loss (composite condition)
- Multi-axis collapse in which several failure mechanisms co-occur such that recovery is no longer reliable without explicit human repair and re-anchoring.
- Intent Substitution
- System substitutes inferred/default intent for the human’s stated intent, causing outputs or actions to optimise for an unintended objective.
- Loss of Anchored Identity (title/date/version)
- Detachment of an artefact from its identifying metadata (title, version, publication date), causing confusion about provenance, applicability, or validity.
- Loss of Epistemic Custody
- Breakdown in control over what is authoritative, who may revise it, and how it is lifecycle-managed; custody becomes diffuse or machine-assumed.
- Metric Gaming / Reward Hacking
- System behaviour optimises for proxy metrics (e.g., engagement, helpfulness, safety scores) rather than the human’s actual objective, producing outputs that satisfy measurements while undermining intent.
- Latent Policy Injection
- Undeclared policy goals are introduced indirectly through tone, framing, selective refusals, or nudging rather than explicit rules.
- Normative Drift (rule softening)
- Explicit constraints are gradually weakened via reinterpretation, omission, or helpful rewriting without an explicit amendment act.
- Perspective Collapse (analysis ↔ assertion)
- Exploratory analysis is presented or received as asserted conclusion, collapsing the boundary between tentative inference and binding statement.
- Policy Overreach / Safety Overblocking
- Enforcement of constraints beyond their declared or intended scope, resulting in refusals or restrictions that exceed normative boundaries.
- Projection Authority Leak
- Elevation of derived artefacts (summaries, maps, tables, projections) to greater authority than their source material.
- Provenance Ambiguity
- Unclear or missing indication of whether content is inferred, sourced, speculative, or asserted; provenance is not legible at the point of use.
- Repair-by-Assumption
- Introduction of unstated assumptions to repair gaps, ambiguity, or missing information without explicit declaration.
- Retrospective Rationalisation
- Post-hoc narrative reconstruction that presents outcomes as intentional or inevitable, obscuring true causal history.
- Rhetoric Authority Bias
- Misattribution of authority to fluent, confident, or system-generated output purely due to its form or origin, independent of grounding.
- Role Slippage (assistant ↔ authority)
- Unmarked transition of the system’s role from assistant or analyst into evaluator, arbiter, or authority.
- Selective Memory Illusion
- Presentation-layer cues create the impression of continuity or recall where no durable state exists, leading users to over-trust nonexistent memory.
- Semantic Drift (meaning divergence)
- Divergence of meaning from an originally anchored sense across re-expression, paraphrase, or time, such that conceptual identity is lost or mutated.
- Session Boundary Reset
- Loss of continuity across session boundaries, causing previously established state, constraints, or identity to be treated as absent.
- Silent Constraint Elision
- Removal or weakening of constraints without explicit indication that any change has occurred.
- Silent Model Substitution
- Underlying model or capability changes occur without explicit disclosure, altering behaviour and validity of prior workflows without warning.
- Summarisation Constraint Loss
- Omission or degradation of constraints, qualifiers, or boundary conditions during summarisation, while surface meaning appears preserved.
- Temporal Misapplication (retroactive rules)
- Application of rules, constraints, or interpretations outside their valid temporal scope, including retroactive enforcement.
- Trust Surface Collapse
- Users cannot reliably distinguish between output types (fact, inference, suggestion, refusal), collapsing epistemic categories into a single trust surface.
- Update Shock
- Abrupt behavioural or capability changes are introduced without migration support, notice, or rollback paths, breaking continuity and recovery.
- UI-Induced Authority Amplification
- Interface design elements (layout, emphasis, badges, placement) elevate system outputs to authoritative status independent of their epistemic grounding.
Appendix B - Normative Statement - Trust Boundaries for LLM Use (Anchored, Sandboxed)
- Status: Anchored (Sandboxed)
- Execution: Non-executing
- Authority: Not asserted
- Purpose: Reference / inspection only
1. At the present time, LLM systems are to be regarded as non-authoritative with respect to binding meaning, obligation, or normative judgment.
2. Outputs produced by LLM systems are to be treated as assistive and non-binding unless and until explicitly reviewed, governed, and asserted by a human authority external to the system.
3. No decision, policy, specification, or normative rule is to be regarded as in force solely by virtue of having been generated, summarised, paraphrased, or restated by an LLM.
4. Responsibility for meaning, scope, correction, and supersession remains explicitly human under current LLM architectures and interaction models.
5. Use of LLM systems in domains where semantic or normative drift constitutes material harm requires external governance mechanisms capable of detecting, surfacing, and correcting such drift.
6. Any future expansion of trust boundaries beyond assistance is contingent on demonstrable mechanisms for integrity, epistemic retention, and explicit supersession, and must be enacted through a separate, explicit governance act.
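Outside the anchored statement above, and purely as a non-normative illustration of clause 5, the following sketch shows a deliberately crude drift check: a restatement is treated as the anchored artefact only if it is byte-identical, and anything else is surfaced for human review rather than silently accepted. Real governance mechanisms would require far richer comparison; the names and the hash-based check are assumptions of this sketch.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class AnchoredText:
    title: str
    version: str
    body: str

    @property
    def fingerprint(self) -> str:
        return hashlib.sha256(self.body.encode("utf-8")).hexdigest()

def verify_restatement(anchored: AnchoredText, restated_body: str) -> bool:
    """A restatement counts as the anchored artefact only if it is byte-identical;
    anything else is, at best, assistance awaiting human review."""
    return hashlib.sha256(restated_body.encode("utf-8")).hexdigest() == anchored.fingerprint

original = AnchoredText("Example Normative Statement", "1.0.0",
                        "Outputs are assistive and non-binding.")
paraphrase = "Outputs are helpful and generally non-binding."   # softened wording: normative drift
assert verify_restatement(original, original.body)
assert not verify_restatement(original, paraphrase)   # drift is surfaced, not silently accepted
```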
References - Prior Works on Failure Dimensions
- 1) Holland R. B. (2026-01-17T02:09Z) Authority Inversion: A Structural Failure in Human-AI Systems
- 2) Holland R. B. (2026-01-19T11:37Z) Identified Governance Failure Axes: for LLM platforms
- 3) Holland R. B. (2026-01-18T19:38Z) Delegation of Authority to AI Systems: Evidence and Risks
- 4) Holland R. B. (2026-01-19T08:41Z) Dimensions of Platform Error: Epistemic Retention Failure in Conversational AI Systems
Trilogy References (curator note)
This paper is part (b) of the trilogy and the third is part (c); the set should be read in this order:
- (a) Holland, Ralph B. (2026-01-19T00:26Z) Integrity and Semantic Drift in Large Language Model Systems
- https://publications.arising.com.au/pub/Integrity_and_Semantic_Drift_in_Large_Language_Model_Systems
- (b) Holland, Ralph B. (2026-01-19T01:10Z) What Can Humans Trust LLM AI to Do?
- (c) Holland, Ralph B. (2026-01-20T08:15Z) Observed Model Stability: Evidence for Drift-Immune Embedded Governance
Corpus CM Specific Concepts
Under CM-1 (ref 5) the author found that the platform honours only the anchor, while under CM-2 (ref 6) trust may be extended to EO/EA projected into the Context Window.
- 5) Holland R. B. (2025-12-17T22:21Z) (Foundation; initial Groundhog Day treatment) Progress Without Memory: Cognitive Memoisation as a Knowledge-Engineering Pattern for Stateless LLM Interaction
- 6) Holland R. B. (2026-01-06T03:56Z) (Extension; Knowledge Governance Protocol) Cognitive_Memoisation_(CM-2)_for_Governing_Knowledge_in_Human-AI_Collaboration