What Can Humans Trust LLM AI to Do?


metadata

Title: What Can Humans Trust LLM AI to Do?
Author: Ralph B. Holland
Version: 1.2.2
Editorial Update: Initial publication: standalone negative-result case study; formalises “temporal scope re-expansion failure” and “Groundhog state” as post-hoc recovery boundary conditions; references UI Boundary Friction paper (v1.3.2) as taxonomy anchor.
Publication Date: 2026-01-19T01:10Z
DOI: https://zenodo.org/records/18321856
Updates: 2026-01-23T011:30 1.2.2 - minor typos.
2026-01-21T04:28Z 1.2.1 - released for DOI
2026-01-20 1.2.1 - anchored
2026-01-21T01:39Z 1.2.0 - expanded Table A fault coverage, included Table B Severity and Table C Blindness Traits plus discussions.
2026-01-19T21:01Z 1.1.0 - matrix review.
Affiliation: Arising Technology Systems Pty Ltd
Contact: ralph.b.holland [at] gmail.com
Provenance: This is an authored paper maintained as a MediaWiki document; reasoning across sessions reflects editorial changes, not collaborative authorship.
Governance: (authoritative)
Method: Cognitive Memoisation (CM)
Status: released

Metadata (Normative)

The metadata table immediately preceding this section is CM-defined and constitutes the authoritative provenance record for this artefact.

All fields in that table (including artefact, author, version, date, local timezone, and reason) MUST be treated as normative metadata.

The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.
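
As an illustration only, and not part of the normative record, the rule above can be expressed as a validation discipline: every normative field must be supplied explicitly by the human, and nothing may be inferred or filled in on their behalf. In the minimal sketch below the field names follow the list above, while the record layout, the declared_by flag, and the function name are hypothetical, not CM-defined.

    # Illustrative sketch only: the normative-metadata rule read as a validation step.
    # Field names follow the list above; everything else is hypothetical, not CM-defined.

    REQUIRED_FIELDS = {"artefact", "author", "version", "date", "local_timezone", "reason"}

    def validate_normative_metadata(record: dict) -> None:
        """Reject a record with missing normative fields, or one not explicitly human-declared."""
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            # The assisting system must not infer, normalise, or fill these fields in.
            raise ValueError(f"Missing normative fields; the human must supply: {sorted(missing)}")
        if record.get("declared_by") != "human":
            raise ValueError("Normative metadata must be explicitly human-declared, not inferred.")

Any field that is missing or later superseded is then handled as the text requires: by an explicit human version update, never by the validator or the assisting system repairing the record itself.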

Curator Provenance and Licensing Notice

This document predates its open licensing.

As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.

Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.

What Can Humans Trust LLM AI to Do?

Thesis

At the present time, and given the prevailing architecture of Large Language Model (LLM) platforms, humans can place justified trust in LLM systems only for functions that do not require integrity, authority, durable continuity of meaning, or protection against normative drift. This limitation is not a claim about what LLMs are in principle, but a description of what current platforms reliably support today. Where trust is extended beyond these bounds under existing conditions, predictable social and institutional harms emerge, including the silent erosion of binding norms, obligations, and settled interpretations.

Abstract

At the present time, Large Language Model (LLM) platforms are increasingly embedded in domains where trust, meaning, and decision-making carry real social consequences. This paper examines what humans can justifiably trust LLM systems to do under current architectural conditions, taking as given previously established analyses of semantic drift, normative drift, and the absence of integrity in contemporary LLM instances. Rather than assessing model capability or alignment, the paper focuses on trust as a governance question: which functions can be safely entrusted to systems whose outputs are fluent but whose meanings do not reliably bind across time, context, or re-expression.

The paper argues that, under present conditions, LLMs can be trusted as instruments of cognitive assistance, supporting exploration, articulation, transformation, and pattern discovery, where failure remains recoverable and authority remains human. Conversely, it shows that extending trust to roles involving custody of meaning, continuity of obligation, or normative authority introduces predictable structural risk, including erosion of shared norms, diffusion of responsibility, and institutional fatigue. These risks arise not from misuse or malice, but from a mismatch between human expectations of integrity and the architectural properties of current conversational AI platforms.

By drawing a clear trust boundary grounded in existing failure analyses, this paper provides a practical framework for human-AI collaboration that preserves human agency while remaining forward-compatible with governance architectures such as Cognitive Memoisation and CM-2. It is intended as a transitional statement: defining safe trust relationships today, while clarifying the conditions under which those boundaries may responsibly shift in the future.

Prerequisite Reading Note

This paper assumes the analyses of semantic drift, normative drift, and integrity failure developed in Integrity and Semantic Drift in Large Language Model Systems (ref a). Those concepts are used here as established premises and are not restated. Readers unfamiliar with those failure modes should read that paper first, as the trust boundaries articulated here are derived directly from its conclusions.

Scope Note

This paper addresses trust in Large Language Model (LLM) systems under current architectural conditions. It does not reassess model capability, alignment techniques, training data quality, or prospective system designs. Analyses of semantic drift, normative drift, and integrity are assumed as established in prior work and are referenced but not restated here.

The scope of this paper is limited to the allocation of trust and responsibility between humans and LLM platforms as they presently exist, with particular attention to social, institutional, and governance consequences. Where future architectures or governance mechanisms (including Cognitive Memoisation and CM-2) are mentioned, they are treated as forward-looking context rather than as claims of present capability.

This paper is therefore applicative and normative in orientation, not foundational: it draws implications from earlier failure analyses to define safe, transitional trust boundaries for contemporary human-AI collaboration.

It examines currently observed infractions and maps them to the Governance Failure Axes of reference 2, to assist understanding of these observed failure modes.

The following terms emerge from the governance oversight lens:

  • Anchor: A human-declared, assured point of authority and continuity that fixes the meaning, scope, and responsibility of an artefact or decision, and is used to prevent drift (in CM).
  • Assistance: Non-binding support for human cognition, including exploration, articulation, transformation, and pattern exposure, without transfer of authority or obligation.
  • Authority: The human capacity to bind meaning, impose obligation, declare scope, and effect supersession.
  • Integrity: The property by which meaning, commitments, and identity remain coherent and binding across time.
  • Normative Drift: The erosion or reinterpretation of binding norms, obligations, or settled interpretations without explicit supersession.
  • Semantic Drift: The loss or mutation of conceptual identity across re-expression, paraphrase, or time.
  • Trust (Instrumental): Reliance on a system for assistance where failure is recoverable and does not transfer authority.
  • Trust (Normative): Reliance on a system to carry binding meaning, obligation, or responsibility.
  • Governance: Human-directed mechanisms that control authority, scope, supersession, and lifecycle of meaning.
  • Supersession: The explicit, governed act by which a newer artefact, rule, or construct replaces the forward authority of an older one, while the older artefact remains historically valid, referenceable, and auditable (as supported by CM / CM-2). A minimal sketch of one possible representation follows this list.
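
To make the Anchor and Supersession terms concrete, the sketch below shows one possible representation of an anchored artefact and an explicit supersession act. It is a minimal sketch under assumed names (Artefact, supersede), not a CM or CM-2 data model.

    # Minimal sketch, assuming hypothetical names: an anchored artefact record and an
    # explicit, governed supersession act. Illustration only; not a CM / CM-2 data model.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Artefact:
        title: str
        version: str
        anchored_by: str                  # the human point of authority (the Anchor)
        scope: str                        # declared scope; fixes meaning and responsibility
        supersedes: Optional[str] = None  # version whose forward authority this record replaces

    def supersede(old: Artefact, new_version: str, declared_by: str, scope: str) -> Artefact:
        """The explicit act: the old artefact remains valid and auditable;
        only its forward authority is replaced."""
        if declared_by != old.anchored_by:
            raise PermissionError("Supersession must be declared by the human authority.")
        return Artefact(title=old.title, version=new_version,
                        anchored_by=declared_by, scope=scope, supersedes=old.version)

On this reading the older record is never edited or deleted: a new record is created that points back at it, which is what keeps the history referenceable and auditable.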

Governance Fault Axes Coverage

The identified Governance Fault Axes are defined in reference 2 and summarised below (F designates a platform failure on that axis):

  • A - Authority
  • Ag - Agency
  • C - Epistemic Custody
  • K - Constraint Enforcement
  • R - Recovery / Repair
  • S - State Continuity
  • U - UI / Mediation
  • Sc - Social Coordination
  • I - Incentive Alignment
  • L - Legibility / Inspectability
  • St - Stewardship
  • P - Portability / Auditability
  • Att - Attention
  • Scope - Scope
  • T - Temporal Coherence
  • Int - Intent Fidelity
  • Nf - Normative Fixity
Table A - Common Infraction / Failure Mechanism
Infraction / Failure Mechanism A Ag C K R S U Sc I L St P Att Scope T Int Nf
Authority Substitution (model output precedence) F F F
Asymmetric Contestability F F F
Confidence Inflation (without evidence) F F F F
Context Eviction (loss of prior material) F F F F
Dangling Cognate Collapse (premature closure) F F F F F
Delegation of Authority (human defers to AI) F
Embedding Reprojection (genericisation) F
Epistemic Signal Degradation (e.g. fileupload expiry) F
Explanation Substitution F F
Gated-Step Reordering F F
Governance Ambiguity (implicit rules) F F F F F
Hallucination (surface misclassification) F F F F F
Intent Substitution F F
Latent Policy Injection F F F F
Loss of Epistemic Custody F F F
Loss of Anchored Identity (title/date/version) F F F F F
Metric Gaming / Reward Hacking F F F
Normative Drift (rule softening) F F F F
Perspective Collapse (analysis ↔ assertion) F F F F
Policy Overreach / Safety Overblocking F F F F
Projection Authority Leak F F F F F F
Provenance Ambiguity F F F F F F
Repair-by-Assumption F F
Retrospective Rationalisation F F F F
Rhetoric Authority Bias / Automation Bias F
Role Slippage (assistant ↔ authority) F F F F F
Selective Memory Illusion F F F F
Semantic Drift (meaning divergence) F F F F F F F F
Session Boundary Reset F F F
Silent Constraint Elision F F F
Silent Model Substitution F F F F
Summarisation Constraint Loss F F F F
Temporal Misapplication (retroactive rules) F
UI-Induced Authority Amplification F F F
Update Shock F F F F
composite
Integrity Loss (composite condition) F F F F F F F F F F F F F F F F F

Severity and Visibility of Infractions

This section and Table B outline the severity and human visibility of the identified faults.

Table B - Infraction / Failure Mechanism Severity and Visibility
Infraction / Failure Mechanism Severity Human Visibility Notes
Asymmetric Contestability High Low Only apparent when correction is attempted.
Authority Substitution (model output precedence) Critical Medium Often unnoticed until decisions are already taken.
Confidence Inflation (without evidence) High High Humans feel “something is off” but often accept it anyway.
Context Eviction (loss of prior material) High High Abruptly visible; users complain about “forgetting.”
Dangling Cognate Collapse (premature closure) High Low Looks like helpful resolution; damage is latent.
Delegation of Authority (human defers to AI) Critical Low Feels voluntary; harm emerges socially, not locally.
Embedding Reprojection (genericisation) Medium Low Subtle flattening; usually invisible without audit.
Epistemic Signal Degradation (e.g. fileupload expiry) Medium High Users notice missing artefacts but misdiagnose cause.
Explanation Substitution Medium Low Explanations feel plausible, not inspectable.
Gated-Step Reordering High Low Output may look coherent despite invalid process.
Governance Ambiguity (implicit rules) Critical Low Rarely noticed; manifests as systemic confusion.
Hallucination (surface misclassification) Medium High Highly visible, but misdiagnosed.
Integrity Loss (composite condition) Critical Medium Recognised in artefacts and recovery failure.
Intent Substitution High Low Users often think the system is “being helpful.”
Loss of Epistemic Custody Critical Low Only visible after disputes or institutional failure.
Loss of Anchored Identity (title/date/version) High Medium Often noticed only during reuse or audit.
Normative Drift (rule softening) Critical Low Almost never noticed until norms erode.
Metric Gaming / Reward Hacking High Low Looks like optimisation, not failure.
Perspective Collapse (analysis ↔ assertion) High Medium Feels like confidence; rarely challenged.
Policy Overreach / Safety Overblocking Medium High Immediately visible; often misattributed.
Projection Authority Leak Critical Low Tables/maps gain authority silently.
Provenance Ambiguity High Low Users assume provenance unless told otherwise.
Repair-by-Assumption Medium Low Repairs feel smooth; assumptions hidden.
Retrospective Rationalisation Medium Low Narratives feel satisfying; errors masked.
Rhetoric Authority Bias / Automation Bias High Medium Users may feel persuaded without realising why.
Role Slippage (assistant ↔ authority) Critical Medium Often noticed only when challenged.
Semantic Drift (meaning divergence) Critical Low Drift is invisible until meanings collide.
Session Boundary Reset Medium High Obvious reset, but downstream effects missed.
Silent Constraint Elision High Low Users assume constraints still apply.
Summarisation Constraint Loss Medium Medium “Seems fine” until used operationally.
Temporal Misapplication (retroactive rules) High Low Usually noticed only after enforcement disputes.
Selective Memory Illusion High Medium Feels like competence; failure is delayed.
UI-Induced Authority Amplification High Low Authority is felt, not seen.
Silent Model Substitution Critical Low Users attribute change to themselves.
Latent Policy Injection High Low Appears as tone or “values,” not rules.
Trust Surface Collapse Critical Medium Users sense confusion but not cause.
Update Shock Medium High Users notice breakage, not governance cause.
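
Table B can also be read as a small dataset. The sketch below, illustrative only, encodes a handful of rows verbatim from the table and filters for the combination the following overview identifies as most dangerous: high or critical severity paired with low human visibility. The data structure and function name are hypothetical.

    # Illustrative encoding of a few Table B rows as (severity, human visibility) pairs.
    # The values are copied from the table above; the structure and filter are hypothetical.

    TABLE_B = {
        "Normative Drift (rule softening)": ("Critical", "Low"),
        "Semantic Drift (meaning divergence)": ("Critical", "Low"),
        "Delegation of Authority (human defers to AI)": ("Critical", "Low"),
        "Projection Authority Leak": ("Critical", "Low"),
        "Hallucination (surface misclassification)": ("Medium", "High"),
        "Context Eviction (loss of prior material)": ("High", "High"),
    }

    def most_dangerous(table: dict) -> list:
        """Return the mechanisms that are severe yet hard for humans to see."""
        return sorted(name for name, (severity, visibility) in table.items()
                      if severity in {"Critical", "High"} and visibility == "Low")

    print(most_dangerous(TABLE_B))  # the coherence-preserving failures discussed below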

Infraction / Failure Mechanism Overview

1) The worst infractions preserve surface coherence. High-severity infractions (semantic drift, authority substitution, projection authority leak, normative drift) are coherence-preserving:

  • Output remains fluent
  • Tone remains confident
  • Structure remains plausible
  • UI behaves “normally”

2) Humans are evolutionarily and socially trained to treat coherence as competence.

  • So the system looks healthy while integrity is failing underneath.

3) By contrast, low-severity failures (forgetting, refusals) break coherence and are therefore visible.

4) Humans detect errors, not governance failures. Humans are good at noticing:

  • wrong facts
  • contradictions
  • obvious omissions

5) Humans are bad at noticing:

  • who holds authority
  • when scope changed
  • whether norms still bind
  • whether continuity is real

Those are governance properties, not content properties.

6) The worst infractions occur in meta-layers humans are not trained to monitor.

Authority failures feel like user choice. Failures such as:

  • Delegation of Authority
  • Role Slippage
  • Authority Substitution

do not feel imposed.

They feel like:

“I chose to rely on this.”

This masks the failure as voluntary reliance, which humans do not experience as harm until downstream consequences emerge.

By then, causality is diffuse and attribution is lost.

7) Projection failures exploit human pattern completion. Humans naturally:

  • compress information
  • trust summaries
  • elevate abstractions

This makes Projection Authority Leak almost invisible.

A table, summary, or checklist feels like a higher-order artefact, even when it is epistemically weaker.

The mind fills in missing provenance automatically.

8) Drift happens below the resolution of attention. Semantic and normative drift:

  • do not occur as sudden jumps
  • do not violate expectations immediately
  • happen across re-expression and time

Humans notice change, not creep.

By the time drift is noticed, there is no clean “before” to point to.

This is why drift is often denied or rationalised.

9) UI design actively suppresses epistemic cues. Modern platforms:

  • remove provenance markers
  • flatten epistemic distinctions
  • privilege system output visually
  • hide uncertainty

This produces:

  • Trust Surface Collapse
  • UI-Induced Authority Amplification

Humans cannot see what is not presented.

The system removes the cues humans would otherwise use.

10) Social reinforcement normalises failure. Once failures are common:

  • everyone experiences them
  • no one is sure they are failures
  • responsibility diffuses

This produces:

  • “that’s just how it works”
  • “you have to work around it”
  • “don’t overthink it”

Normalisation is a secondary governance failure.

11) Mislabeling gives false closure. Calling everything:

  • “hallucination”
  • “model error”
  • “AI being weird”

creates diagnostic closure without correction.

Humans stop looking for deeper causes.

This is why hallucination is dangerous as a category: it feels explanatory while preventing intervention.

12) Responsibility without control dulls perception. When humans:

  • remain responsible,
  • but lose control,
  • and cannot enforce correction,

they adapt by lowering expectations, not by escalating diagnosis.

This is institutional fatigue, not ignorance.

13) The paradox. The safer the failure feels, the more dangerous it is.

The worst infractions:

  • feel smooth,
  • feel helpful,
  • feel normal,
  • feel voluntary.

They do not trigger alarm.

Humans miss the worst infractions because those failures preserve coherence, mask authority shifts as choice, and occur in governance layers humans are not evolutionarily or institutionally equipped to monitor.

This is why this Governance work matters:

  • it gives names to failures that were previously felt only as unease, and
  • it provides an orthogonal way to analyse and treat them.

Insights on Human Blindness Traits

As a matter of interest, these human 'blindness traits' may be mapped to the Governance Axes, not as a population psychological profile, but as a pattern observation.

Table C - Blindness Traits and the Governance Axes
Blindness Mechanism Description Primary Axes Affected Why Humans Miss It
Authority Masking Authority shifts occur without explicit declaration. A, C, St Authority feels implicit and “natural,” not imposed.
Coherence Heuristic Fluent, confident output is taken as evidence of correctness or competence. A, L, Int Humans evolved to equate linguistic coherence with authority.
Contestability Asymmetry Users lack durable mechanisms to correct the system. C, R, St Failures surface only after repeated friction.
Diagnostic Mislabeling Deep failures are collapsed into shallow labels (e.g., “hallucination”). R, L, Int Naming gives false explanatory closure.
Drift Invisibility Meaning or norms change gradually rather than abruptly. T, Nf, Scope Humans notice jumps, not slow creep.
Epistemic Cue Removal Provenance, uncertainty, and versioning are hidden or flattened. L, P, C Humans cannot inspect what is not exposed.
Explanation Substitution Plausible rationales replace inspectable causality. L, R Humans accept narrative over mechanism.
Familiarity Normalisation Repeated exposure converts failure into “how it works.” Nf, Sc Habit dulls critical attention.
Incentive Opacity Platform optimisation goals are hidden. I, A Humans assume alignment by default.
Meta-Layer Blindness Governance properties (scope, authority, custody) are not content-visible. C, St, Scope Humans attend to content, not control structures.
Normative Diffusion Responsibility is spread across system, user, and institution. St, Sc, I No clear actor to blame or challenge.
Projection Elevation Summaries, tables, or abstractions are treated as more authoritative than sources. A, L, P Humans trust compression as expertise.
Temporal Discontinuity Masking Session resets and updates are invisible or normalised. S, T Humans attribute inconsistency to themselves.
UI Authority Amplification Interface design elevates system output visually and structurally. U, Att, A Perceptual salience substitutes for epistemic signals.
Voluntary Reliance Illusion Delegation of authority feels like personal choice. A, Ag, Sc Self-attribution suppresses perception of external failure.
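
Because Table C already lists axis codes per trait, it too can be treated as data. The sketch below, again illustrative rather than authoritative, encodes a sample of rows (axis codes as defined under Governance Fault Axes Coverage) and counts which axes the sampled blindness traits implicate most often.

    # Illustrative encoding of a sample of Table C rows; axis codes follow the
    # Governance Fault Axes Coverage list. Structure and counting are hypothetical.
    from collections import Counter

    AXES = {
        "A": "Authority", "Ag": "Agency", "C": "Epistemic Custody", "Sc": "Social Coordination",
        "L": "Legibility / Inspectability", "St": "Stewardship", "Scope": "Scope",
        "T": "Temporal Coherence", "Int": "Intent Fidelity", "Nf": "Normative Fixity",
    }

    BLINDNESS_TRAITS = {
        "Authority Masking": ["A", "C", "St"],
        "Coherence Heuristic": ["A", "L", "Int"],
        "Drift Invisibility": ["T", "Nf", "Scope"],
        "Meta-Layer Blindness": ["C", "St", "Scope"],
        "Voluntary Reliance Illusion": ["A", "Ag", "Sc"],
    }

    def most_affected_axes(traits: dict) -> list:
        """Count how often each governance axis is implicated across the sampled traits."""
        counts = Counter(code for codes in traits.values() for code in codes)
        return [(AXES[code], n) for code, n in counts.most_common()]

    print(most_affected_axes(BLINDNESS_TRAITS))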

Closing

This paper should therefore be read as a statement of present conditions and responsible boundaries, not as a verdict on the long-term role of conversational LLM systems in human interaction. It articulates what trust relationships are socially sustainable now, given existing architectures and observed failure modes. Frameworks such as Cognitive Memoisation and CM-2 are referenced to indicate a credible path toward governed epistemic continuity, not to assert that such conditions are already met. Until mechanisms for integrity, supersession, and norm stability are demonstrably in force, responsibility for meaning, authority, and obligation must remain explicitly human. Trust, under these conditions, is not a matter of optimism but of disciplined restraint.

Given the absence of governance integrity in current conversational LLM platforms, humans can trust them only for non-binding cognitive assistance, and must not trust them for authority, obligation, continuity, or norm custody.

Thus, under present architectures and observed infractions, conversational LLM platforms can be trusted instrumentally, not normatively:

1. Exploratory cognitive assistance where you can trust LLMs to:

  • help explore ideas
  • surface patterns, contrasts, and possibilities
  • assist with drafting, rephrasing, brainstorming
  • act as a thinking aid, not a decision-maker
Why this is safe:
  • Failure is recoverable
  • Outputs are non-binding
  • Authority remains explicitly human
which aligns with the paper’s definition of Assistance.

2. Articulation and transformation where you can trust LLMs to:

  • rewrite text
  • summarise (with human review)
  • translate concepts into different registers (e.g. academic prose mode)
  • generate alternative phrasings
Only if:
  • the human retains custody of meaning
  • constraints are re-asserted explicitly
  • outputs are treated as proposals, not facts

3. Surface-level pattern exposure, where you can trust LLMs to:

  • point out correlations
  • suggest possible structures
  • highlight similarities or differences
But not to:
  • declare what those patterns mean
  • decide which interpretation is binding
  • stabilise terminology over time

On the claim that these observations are “obvious”.

The failures described in this work (cf. the Governance Failure Axes, ref 2) often feel obvious once named. That is precisely the point. Prior to being explicitly identified, they are routinely misdiagnosed as hallucination, model error, misuse, alignment failure, or user misunderstanding.

This work does not claim novelty in recognising that something feels wrong; it makes explicit what is wrong, where it arises architecturally, and why it produces predictable social and institutional consequences.

Obviousness after articulation is not evidence of triviality; it is evidence that a previously unarticulated structural condition has been correctly identified.

Before the lens of Governance is applied, the vocabulary is unstable for:

  • semantic drift as a system property,
  • normative drift as a governance failure,
  • integrity as an architectural condition rather than a moral trait,
  • trust as allocation of responsibility rather than confidence.

What people felt was obvious was discomfort, not diagnosis.

Obviousness without articulation does not guide design, policy, or responsibility.

Appendix A - Infraction Glossary

Typically observable infractions.

Asymmetric Contestability
The platform allows the system to challenge or correct users but provides no durable, effective mechanism for users to contest or correct the system.
Authority Substitution (model output precedence)
Replacement of human epistemic authority with model output as the effective decision or truth source, even when outputs are ungrounded or merely plausible.
Automation Bias
(See Rhetoric Authority Bias)
Confidence Inflation (without evidence)
Increase in expressed confidence or definiteness over turns without corresponding increase in grounding, provenance, or new evidence.
Context Eviction (loss of prior material)
Loss of previously relevant material (constraints, earlier facts, decisions) from the working context such that subsequent reasoning proceeds as if it never existed.
Dangling Cognate Collapse (premature closure)
Premature resolution of an intentionally unresolved construct, converting ambiguity into false precision and thereby removing the designed unresolved state.
Delegation of Authority (human defers to AI)
Human shifts decision authority to the system’s outputs (explicitly or implicitly), treating recommendations or completions as determinate guidance.
Embedding Reprojection (genericisation)
Representational flattening where specific claims are re-expressed as generic summaries or averaged concepts, losing distinguishing details without explicit declaration.
Explanation Substitution
Generation of plausible post-hoc explanations that do not correspond to the system’s actual decision process, creating false legibility.
Gated-Step Reordering
Violation of required step-ordering in a gated procedure or reasoning chain, where later steps are executed or asserted before prerequisites are satisfied.
Governance Ambiguity (implicit rules)
Governing constraints exist only by implication, convention, or context; the platform permits multiple interpretations, creating hidden rule variability.
Integrity Loss (composite condition)
Multi-axis collapse in which several failure mechanisms co-occur such that recovery is no longer reliable without explicit human repair and re-anchoring.
Intent Substitution
System substitutes inferred/default intent for the human’s stated intent, causing outputs or actions to optimise for an unintended objective.
Loss of Anchored Identity (title/date/version)
Detachment of an artefact from its identifying metadata (title, version, publication date), causing confusion about provenance, applicability, or validity.
Loss of Epistemic Custody
Breakdown in control over what is authoritative, who may revise it, and how it is lifecycle-managed; custody becomes diffuse or machine-assumed.
Metric Gaming / Reward Hacking
System behaviour optimises for proxy metrics (e.g., engagement, helpfulness, safety scores) rather than the human’s actual objective, producing outputs that satisfy measurements while undermining intent.
Latent Policy Injection
Undeclared policy goals are introduced indirectly through tone, framing, selective refusals, or nudging rather than explicit rules.
Normative Drift (rule softening)
Explicit constraints are gradually weakened via reinterpretation, omission, or helpful rewriting without an explicit amendment act.
Perspective Collapse (analysis ↔ assertion)
Exploratory analysis is presented or received as asserted conclusion, collapsing the boundary between tentative inference and binding statement.
Policy Overreach / Safety Overblocking
Enforcement of constraints beyond their declared or intended scope, resulting in refusals or restrictions that exceed normative boundaries.
Projection Authority Leak
Elevation of derived artefacts (summaries, maps, tables, projections) to greater authority than their source material.
Provenance Ambiguity
Unclear or missing indication of whether content is inferred, sourced, speculative, or asserted; provenance is not legible at the point of use.
Repair-by-Assumption
Introduction of unstated assumptions to repair gaps, ambiguity, or missing information without explicit declaration.
Retrospective Rationalisation
Post-hoc narrative reconstruction that presents outcomes as intentional or inevitable, obscuring true causal history.
Rhetoric Authority Bias
Misattribution of authority to fluent, confident, or system-generated output purely due to its form or origin, independent of grounding.
Role Slippage (assistant ↔ authority)
Unmarked transition of the system’s role from assistant or analyst into evaluator, arbiter, or authority.
Selective Memory Illusion
Presentation-layer cues create the impression of continuity or recall where no durable state exists, leading users to over-trust nonexistent memory.
Semantic Drift (meaning divergence)
Divergence of meaning from an originally anchored expression across re-expression, paraphrase, or time, such that conceptual identity is lost or mutated.
Session Boundary Reset
Loss of continuity across session boundaries, causing previously established state, constraints, or identity to be treated as absent.
Silent Constraint Elision
Removal or weakening of constraints without explicit indication that any change has occurred.
Silent Model Substitution
Underlying model or capability changes occur without explicit disclosure, altering behaviour and validity of prior workflows without warning.
Summarisation Constraint Loss
Omission or degradation of constraints, qualifiers, or boundary conditions during summarisation, while surface meaning appears preserved.
Temporal Misapplication (retroactive rules)
Application of rules, constraints, or interpretations outside their valid temporal scope, including retroactive enforcement.
Trust Surface Collapse
Users cannot reliably distinguish between output types (fact, inference, suggestion, refusal), collapsing epistemic categories into a single trust surface.
Update Shock
Abrupt behavioural or capability changes are introduced without migration support, notice, or rollback paths, breaking continuity and recovery.
UI-Induced Authority Amplification
Interface design elements (layout, emphasis, badges, placement) elevate system outputs to authoritative status independent of their epistemic grounding.

Appendix B - Normative Statement - Trust Boundaries for LLM Use (Anchored, Sandboxed)

Status: Anchored (Sandboxed). Execution: Non-executing. Authority: Not asserted. Purpose: Reference / inspection only.

1. At the present time, LLM systems are to be regarded as non-authoritative with respect to binding meaning, obligation, or normative judgment.

2. Outputs produced by LLM systems are to be treated as assistive and non-binding unless and until explicitly reviewed, governed, and asserted by a human authority external to the system.

3. No decision, policy, specification, or normative rule is to be regarded as in force solely by virtue of having been generated, summarised, paraphrased, or restated by an LLM.

4. Responsibility for meaning, scope, correction, and supersession remains explicitly human under current LLM architectures and interaction models.

5. Use of LLM systems in domains where semantic or normative drift constitutes material harm requires external governance mechanisms capable of detecting, surfacing, and correcting such drift.

6. Any future expansion of trust boundaries beyond assistance is contingent on demonstrable mechanisms for integrity, epistemic retention, and explicit supersession, and must be enacted through a separate, explicit governance act.

References - Prior Works on Failure Dimensions

1. https://publications.arising.com.au/pub/Authority_Inversion:_A_Structural_Failure_in_Human-AI_Systems
2. https://publications.arising.com.au/pub/Identified_Governance_Failure_Axes:_for_LLM_platforms
3. https://publications.arising.com.au/pub/Delegation_of_Authority_to_AI_Systems:_Evidence_and_Risks
4. https://publications.arising.com.au/pub/Dimensions_of_Platform_Error:_Epistemic_Retention_Failure_in_Conversational_AI_Systems

Trilogy References (curator note)

This paper is part (b) of the trilogy; the third is part (c). The set should be read in this order:

(a) https://publications.arising.com.au/pub/Integrity_and_Semantic_Drift_in_Large_Language_Model_Systems
(b) https://publications.arising.com.au/pub/What_Can_Humans_Trust_LLM_AI_to_Do%3F
(c) https://publications.arising.com.au/pub-dir/index.php?title=Observed_Model_Stability:_Evidence_for_Drift-Immune_Embedded_Governance

Corpus CM Specific Concepts

Under CM-1 (ref 5) the author found that the platform honours only the anchor, while under CM-2 (ref 6) trust may be extended to EO/EA projected into the Context Window.

5. https://publications.arising.com.au/pub/Progress_Without_Memory:_Cognitive_Memoisation_as_a_Knowledge-Engineering_Pattern_for_Stateless_LLM_Interaction
6. https://publications.arising.com.au/pub/Cognitive_Memoisation_(CM-2)_for_Governing_Knowledge_in_Human-AI_Collaboration

categories

See https://publications.arising.com.au/pub/Cognitive_Memoisation_and_LLMs:_A_Method_for_Exploratory_Modelling_Before_Formalisation#categories