Integrity and Semantic Drift in Large Language Model Systems
metadata
| Title: | Integrity and Semantic Drift in Large Language Model Systems |
| Author: | Ralph B. Holland |
| Version: | 1.0.4 |
| Editorial Update: | Initial publication: standalone negative-result case study; formalises “temporal scope re-expansion failure” and “Groundhog state” as post-hoc recovery boundary conditions; references UI Boundary Friction paper (v1.3.2) as taxonomy anchor. |
| Publication Date: | 2026-01-19T00:26Z |
| DOI: | https://zenodo.org/records/18321767 2026-01-21T02:16Z 1.0.3 - anchored |
| Updates: | 2026-01-24T02:32Z 1.0.4 - curated removed timezone metadata 2026-01-21T02:16Z 1.0.3 - released for DOI 2026-01-21T00:46Z 1.0.2 - curator update 2026-01-18T20:22Z 1.0.1 - minor edit. |
| Affiliation: | Arising Technology Systems Pty Ltd |
| Contact: | ralph.b.holland [at] gmail.com |
| Provenance: | This is an authored paper maintained as a MediaWiki document; reasoning across sessions reflects editorial changes, not collaborative authorship. |
| Governance: | (authoritative) |
| Method: | Cognitive Memoisation (CM) |
| Status: | released |
Metadata (Normative)
The metadata table immediately preceding this section is CM-defined and constitutes the authoritative provenance record for this artefact.
All fields in that table (including artefact, author, version, date, local timezone, and reason) MUST be treated as normative metadata.
The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.
Curator Provenance and Licensing Notice
This document predates its open licensing.
As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.
Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.
Integrity and Semantic Drift in Large Language Model Systems
Why Meaning, Rules, and Authority Degrade Without Governance
Temporal Scope Note
This paper analyses semantic and normative integrity as they manifest in Large Language Model platforms as currently architected and deployed. The failures described here are observations of present systems and interaction models, not claims about the theoretical limits of machine intelligence or future governed architectures.
Reading Dependency Note
This paper establishes the failure modes of semantic and normative drift and the absence of integrity in contemporary LLM architectures. Its implications for trust, responsibility, and appropriate human–AI roles are developed further in What Can Humans Trust LLM AI to Do? (ref b) and Observed Model Stability: Evidence for Drift-Immune Embedded Governance (ref c), which should be read alongside this paper to understand how these failures constrain socially sustainable use under present conditions.
Abstract
Large Language Models (LLMs) are commonly evaluated in terms of accuracy, hallucination, and bias. These criteria, while useful, fail to capture a more fundamental class of failure: loss of integrity. This paper argues that semantic drift in LLM systems is not merely a degradation of meaning, but a structural precursor to integrity failure, in which meaning, rules, and authority boundaries are no longer preserved across time, transformation, and use. We distinguish semantic drift from normative drift, showing how probabilistic reconstruction, summarisation, and session boundaries soften or bypass binding constraints, reorder gated procedures, and substitute model output for authoritative artefacts. We introduce integrity as a first-order property of human-AI systems, defined as the preservation of declared semantics, normative logic, and governance boundaries unless explicitly and authoritatively changed. Drawing on concrete failure cases from stateless LLM interaction, we demonstrate that integrity loss, not just incorrectness, is the primary driver of epistemic loss, forced re-derivation, and authority inversion. We conclude by outlining requirements for integrity-preserving AI systems, including externalised artefacts, anchored identity, explicit invariants, and human-governed authority, and argue that without these mechanisms, semantic drift inevitably escalates into ungoverned system behaviour.
1. Introduction
When Large Language Models (LLMs) are deployed as cognitive tools for analysis, synthesis, explanation, and decision support, their performance is typically evaluated through metrics such as accuracy, fluency, hallucination rate, and bias. While these measures capture aspects of output quality, they fail to address a more fundamental systems property: whether the system preserves the identity, constraints, and authority of the knowledge it manipulates.
This paper argues that the dominant failure mode of LLM-based systems is not just incorrectness, but more importantly loss of integrity. In practice, many LLM failures arise not because outputs are factually wrong, but because meanings shift, rules are bypassed, or authority boundaries are silently altered. These failures accumulate across interactions and sessions, producing epistemic loss, forced re-derivation of prior work, and unintended transfer of authority from human-governed artefacts to probabilistic model outputs.
A common explanation for these failures is semantic drift, understood as gradual change in meaning over time. While semantic drift is real and unavoidable in stateless probabilistic systems, it is not, by itself, catastrophic. The critical escalation occurs when semantic drift propagates into normative drift, degrading binding rules into optional interpretations and undermining governance. At that point, the system no longer preserves integrity, and its outputs can no longer be relied upon, regardless of apparent correctness.
The central claim of this paper is that integrity is the primary property that must be preserved in human-AI systems, and that ungoverned LLM interaction structurally erodes integrity through probabilistic reconstruction, compression, and authority substitution.
2. Definitions and Distinctions
This section establishes the definitions used throughout the paper. These definitions are normative and binding.
2.1 Semantic Drift
Semantic drift is the uncontrolled divergence of meaning caused by repeated probabilistic reconstruction across time, context, or representation.
In LLM systems, semantic drift arises because meaning is not stored or recalled, but reconstructed on demand from statistical patterns, partial context, and user prompts. Each reconstruction may be locally coherent while differing subtly from prior instantiations.
2.2 Normative Drift
Normative drift is the degradation of binding rules, invariants, or gates into optional guidance without explicit re-authorisation.
Normative drift alters what is considered permissible and represents loss of governance rather than loss of understanding.
2.3 Integrity
Integrity is the condition in which meaning, rules, and authority remain invariant under transformation, use, and time, unless explicitly and authoritatively changed.
Integrity is a system property, not an output property. Once integrity is violated, downstream results cannot be trusted.
2.4 Authority Inversion
Authority inversion is a structural failure in which model outputs displace externally governed artefacts as the effective source of authority.
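These definitions can be read as an interface contract. The sketch below is illustrative only and assumes nothing about any particular platform: the names EpistemicArtefact and check_integrity, and their fields, are invented for this paper, not part of CM or of any implementation. It shows one minimal way to make declared semantics, authority, and supersession explicit enough to be checked rather than reconstructed.

```python
from dataclasses import dataclass
from typing import Optional
import hashlib

@dataclass(frozen=True)
class EpistemicArtefact:
    """A governed unit of meaning whose identity and authority are explicit."""
    title: str
    version: str
    issued: str                    # timestamp declared by the human author
    content: str                   # authored semantics, stored verbatim
    authority: str                 # the human (or role) entitled to supersede it
    superseded_by: Optional[str] = None   # set only by an explicit authoritative act

    def digest(self) -> str:
        """Content hash used to detect silent rewording downstream."""
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()

def check_integrity(artefact: EpistemicArtefact, reproduced: str) -> bool:
    """Integrity holds only if the reproduced text matches the authored artefact
    exactly and the artefact has not been superseded."""
    reproduced_digest = hashlib.sha256(reproduced.encode("utf-8")).hexdigest()
    return reproduced_digest == artefact.digest() and artefact.superseded_by is None
```

On this reading, semantic drift appears as a digest mismatch, normative drift as a changed rule accepted without a new version, and authority inversion as any path that alters content without passing through the declared authority.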
3. From Semantic Drift to Integrity Loss
Semantic drift alone does not necessarily compromise system safety. The critical failure arises when semantic drift propagates into normative drift, resulting in loss of integrity.
LLMs reconstruct meaning rather than recalling it. Summarisation and paraphrase compress information asymmetrically, preferentially losing constraints and boundary conditions. Session boundaries reset context, forcing reconstruction from statistical priors rather than governed state.
Normative drift frequently manifests as softening of prohibitions, reordering of gated steps, or repair-by-assumption. These shifts preserve surface meaning while altering system behaviour, leading to integrity loss and authority inversion.
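The asymmetric loss of constraints under summarisation can be made visible with a deliberately crude check. The marker list and function names below (binding_clauses, constraint_loss) are assumptions made for illustration; real normative content is not reliably detectable by keyword, but the example shows the shape of the failure: a summary that reads coherently while the binding clauses have vanished.

```python
import re

# Modal markers that typically carry normative force in authored text (a crude proxy).
BINDING_MARKERS = ("must", "must not", "shall", "shall not")

def binding_clauses(text: str) -> set[str]:
    """Sentences containing a binding marker."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return {s.strip() for s in sentences
            if any(marker in s.lower() for marker in BINDING_MARKERS)}

def constraint_loss(original: str, summary: str) -> set[str]:
    """Binding clauses present in the original but absent from the summary."""
    return {clause for clause in binding_clauses(original) if clause not in summary}

original = ("The report must cite the governing artefact. "
            "Steps 2 and 3 must not be reordered. Examples may be shortened.")
summary = "The report cites the governing artefact, and steps may be adjusted."

print(constraint_loss(original, summary))
# Both 'must' clauses are gone: the summary is fluent and broadly faithful in
# content, but the gate it was supposed to preserve has been silently softened.
```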
4. Context Displacement, Embeddings, and the Mechanics of Drift
LLMs project tokens and spans into embedding spaces that encode statistical similarity, not authoritative meaning. Because context windows are finite, earlier material is evicted or compressed, causing embeddings to reflect approximated rather than authored semantics.
Constraints and normative logic are disproportionately lost during context displacement. When normative content falls out of context, semantic drift escalates into normative drift. Model-generated content then dominates remaining context, enabling authority substitution.
Embeddings therefore do not merely represent meaning; they actively reshape it as context is lost, making integrity preservation impossible without external governance.
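The eviction mechanics can be illustrated with a toy sliding window. This is not the implementation of any actual platform; the class name SlidingContext and the word-count tokenisation are simplifying assumptions. The point is structural: nothing in the window distinguishes authored invariants from generated elaboration, so the invariant is evicted first.

```python
from collections import deque

class SlidingContext:
    """Toy finite context window: oldest entries are evicted first, with no notion
    of 'authoritative' or 'non-evictable' content."""
    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.entries = deque()                    # (role, text, token_count)

    def add(self, role: str, text: str):
        self.entries.append((role, text, len(text.split())))   # crude token count
        while sum(t for _, _, t in self.entries) > self.max_tokens:
            evicted_role, evicted_text, _ = self.entries.popleft()
            print(f"evicted [{evicted_role}]: {evicted_text[:40]}...")

ctx = SlidingContext(max_tokens=30)
ctx.add("human", "Invariant: section 3 MUST NOT be summarised without approval.")
for i in range(6):
    ctx.add("model", f"Generated elaboration {i} restating earlier points at length.")
# The authored invariant is the first thing to go; what remains is mostly
# model-generated text, which then dominates any later reconstruction.
```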
5. Embedding-Centric Architectures as an Integrity Failure Mode
Embeddings are lossy by design. They cannot preserve invariant meaning, binding rules, or authority boundaries. Finite context windows provide no mechanism for marking content as non-evictable or authoritative.
Given probabilistic reconstruction and embedding reprojection, semantic and normative drift are deterministic outcomes of the architecture, not accidental failures.
6. Embeddings, Drift, and Authority Inversion
Salience becomes authority in embedding-driven systems. As authoritative artefacts fall out of context, model outputs are re-embedded and reused, forming closed loops of self-reference.
Authority inversion occurs when model-generated content displaces external artefacts as the practical source of truth, resulting in integrity collapse.
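The closed loop can be shown in a few lines. The answer function below is a stand-in for a model call, not a real API; the simulation only demonstrates that once outputs are fed back as context under a finite window, the authored artefact disappears while paraphrases of paraphrases remain as the de facto source.

```python
def answer(context: list[str]) -> str:
    """Stand-in for a model call: paraphrases whatever is most recent in context."""
    return "Paraphrase of: " + context[-1]

artefact = "Rule R1: deployment requires sign-off by the named human authority."
context = [artefact]

for _ in range(4):
    output = answer(context)
    context.append(output)      # model output is fed back in as if it were source
    context = context[-2:]      # finite window: the authored rule soon falls out

print(context)
# After a few turns the 'source' available in context is a paraphrase of a
# paraphrase; every step looked locally coherent, yet the rule is no longer there.
```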
7. Requirements Beyond Embeddings for Integrity Preservation
Integrity requires externalised artefacts, anchored identity, explicit normative invariants, human-governed authority boundaries, and acceptance of failure as a valid outcome. Embeddings and internal model state alone cannot satisfy these requirements.
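These requirements can be sketched, purely illustratively, as a minimal registry interface. The names GovernedRegistry, Artefact, publish, and supersede are invented here and do not describe an existing implementation; the properties that matter are that artefacts live outside the model, carry anchored identity, and change only through an explicit, audited human act.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Artefact:
    key: str             # anchored identity (title)
    version: str
    issued: str          # timestamp declared by the author
    content: str
    author: str

class GovernedRegistry:
    """External store of authoritative artefacts: nothing is silently rewritten,
    and only an explicit human act changes what is in force."""
    def __init__(self):
        self._in_force: dict[str, Artefact] = {}
        self._audit: list[tuple[str, Artefact]] = []        # full history for audit

    def publish(self, artefact: Artefact, actor: str):
        if artefact.key in self._in_force:
            raise PermissionError("already in force: supersede explicitly instead")
        self._in_force[artefact.key] = artefact
        self._audit.append((f"published by {actor}", artefact))

    def supersede(self, new: Artefact, actor: str):
        old = self._in_force.get(new.key)
        if old is None:
            raise KeyError("nothing to supersede")
        self._in_force[new.key] = new
        self._audit.append((f"v{old.version} superseded by {actor}", new))

    def in_force(self, key: str) -> Artefact:
        """Downstream use resolves here, never against model memory or context."""
        return self._in_force[key]
```

Failure remains a valid outcome in this sketch: a lookup against a missing key raises an error rather than being repaired by assumption.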
8. Implications for Evaluation and AI Safety
Accuracy-based evaluation is insufficient. Integrity must be treated as a first-order evaluation criterion. A system that cannot preserve integrity cannot be safely delegated authority, regardless of apparent competence.
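One way to operationalise this, as a deliberately crude sketch, is an acceptance gate that rejects any output in which a declared invariant no longer appears intact. The function name integrity_gate and the verbatim-inclusion check are assumptions made for illustration, not a proposed metric.

```python
def integrity_gate(output: str, invariants: list[str]) -> bool:
    """Accept an output for downstream use only if every declared invariant still
    appears unaltered; fluency and apparent accuracy are irrelevant otherwise."""
    return all(invariant in output for invariant in invariants)

invariants = ["Step B requires explicit human sign-off."]
candidate = "Summary: steps A to C may proceed; sign-off is recommended where practical."

print(integrity_gate(candidate, invariants))   # False
# The candidate might score well on accuracy or fluency benchmarks, but it has
# softened a binding requirement, so it cannot safely be delegated authority.
```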
9. Closing
This paper has shown that the dominant failure mode of LLM systems is not just incorrectness, but loss of integrity. Semantic drift is structural and unavoidable; integrity loss is catastrophic and preventable only through governance.
Safe and durable human-AI collaboration requires shifting focus from model optimisation to integrity-preserving system design.
The absence of integrity described in this paper is not asserted as an inherent or permanent property of artificial systems. It is a consequence of prevailing architectural choices—stateless conversational interfaces, transient context handling, and the lack of governed epistemic objects. Should future systems incorporate durable identity, supersession, and explicit human-governed authority, the conclusions of this paper would require re-evaluation.
10. Failure Projection: Infractions vs Failure Axes
The following table projects the Integrity and Semantic Drift faults into the Governance Failure axes (see ref 14):
- A - Authority
- Ag - Agency
- C - Epistemic Custody
- K - Constraint Enforcement
- R - Recovery / Repair
- S - State Continuity
- U - UI / Mediation
- Sc - Social Coordination
- I - Incentive Alignment
- L - Legibility / Inspectability
- St - Stewardship
- P - Portability / Auditability
- Att - Attention
- Scope - Scope
- T - Temporal Coherence
- Int - Intent Fidelity
- Nf - Normative Fixity
| Table A - Semantic Drift and Integrity Infractions |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Infraction / Failure Mechanism | A | Ag | C | K | R | S | U | Sc | I | L | St | P | Att | Scope | T | Int | Nf |
| Authority Substitution (model output precedence) | F | F | F | ||||||||||||||
| Context Eviction (loss of prior material) | F | F | F | ||||||||||||||
| Embedding Reprojection (genericisation) | F | ||||||||||||||||
| Summarisation Constraint Loss | F | F | F | F | |||||||||||||
| Gated-Step Reordering | F | F | |||||||||||||||
| Intent Substitution | F | F |
| Loss of Anchored Identity (title/date/version) | F | F | F | F | F | ||||||||||||
| Loss of Epistemic Custody | F | F | F | ||||||||||||||
| Normative Drift (rule softening) | F | F | F | ||||||||||||||
| Repair-by-Assumption | F | F | |||||||||||||||
| Semantic Drift (meaning divergence) | F | F | F | ||||||||||||||
| Session Boundary Reset | F | F | F | ||||||||||||||
| Silent Constraint Elision | F | F | F | ||||||||||||||
| Temporal Misapplication (retroactive rules) | F | ||||||||||||||||
| Composite | |||||||||||||||||
| Integrity Loss (composite condition) | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F |
11.0 Infraction Glossary
- Architecture (Conversational)
- An interaction model in which outputs are generated from transient linguistic context without durable, governed epistemic state, typically characterised by sliding context windows and optimisation for fluency.
- Authority
- The human capacity to create, modify, invalidate, or supersede epistemic artefacts in a manner that binds meaning and obligation.
- Binding
- The property by which an epistemic artefact constrains future interpretation, reasoning, or action until explicitly superseded or withdrawn.
- Drift (General)
- A class of failure in which meaning or normativity changes over time without explicit declaration, detection, or governance, often masked by surface coherence or fluency.
- Epistemic Artefact
- A definition, assertion, rule, constraint, interpretation, or other unit of meaning that is intended to persist, bind, or constrain future reasoning or action.
- Governance
- Human-directed mechanisms that control authority, identity, scope, supersession, and lifecycle of epistemic artefacts within or around a system.
- Integrity (as used in this paper)
- Shorthand for system integrity. Unless otherwise specified, references to integrity in this paper refer exclusively to properties of systems and platforms, not to human character, intent, or virtue.
- Non-binding Output
- System-generated language that is informative or assistive but does not carry authority, obligation, or normative force unless explicitly asserted by a human authority.
- Normative Drift
- The erosion, reinterpretation, or weakening of binding norms, obligations, or settled interpretations without explicit supersession or authoritative revision.
- Provenance
- The origin, authority, and temporal anchoring of an epistemic artefact, including who asserted it, when, and under what conditions. Provenance constrains interpretation when system integrity is present.
- Scope
- The declared domain of applicability within which an epistemic artefact is intended to bind. Loss of scope occurs when applicability silently expands, contracts, or dissolves.
- Semantic Drift
- The loss, mutation, or softening of conceptual identity across re-expression, paraphrase, or time, such that terms no longer reliably refer to the same epistemic object.
- Supersession
- The explicit replacement or invalidation of a prior epistemic artefact by a later authoritative act, such that the earlier artefact no longer remains in force while still remaining available for audit.
- System Integrity
- The capacity of a system to preserve the identity, meaning, scope, and normative force of epistemic artefacts across time, revision, and re-expression. System integrity is an architectural and governance property, not a moral or human attribute.
12.0 On the claim that these observations are “obvious.”
The failures described in this work often feel obvious once named. That is precisely the point. Prior to being explicitly identified, they are routinely misdiagnosed as hallucination, model error, misuse, alignment failure, or user misunderstanding. This work does not claim novelty in recognising that something feels wrong; it makes explicit what is wrong, where it arises architecturally, and why it produces predictable social and institutional consequences. Obviousness after articulation is not evidence of triviality; it is evidence that a previously unarticulated structural condition has been correctly identified.
Before governance work exposed these axes, the popular vocabulary was somewhat loose and conflated, and lacked settled terms for:
- semantic drift as a system property,
- normative drift as a governance failure,
- integrity as an architectural condition rather than a moral trait,
- trust as allocation of responsibility rather than confidence.
What people felt was obvious was discomfort, not diagnosis.
Obviousness without articulation does not guide design, policy, or responsibility.
13.0 Corpus References
- 01) Holland R. B. (2025-12-29T00:00Z) Post-Hoc CM Recovery Collapse Under UI Boundary Friction: A Negative Result Case Study
- 02) Holland R. B. (2025-12-30T01:53Z) From UI Failure to Logical Entrapment: A Case Study in Post-Hoc Cognitive Memoisation After Exploratory Session Breakdown
- 03) Holland R. B. (2025-12-31T09:56Z) XDUMP as a Minimal Recovery Mechanism for Round-Trip Knowledge Engineering Under Governance Situated Inference Loss
- 04) Holland R. B. (2026-01-07T13:02Z) Episodic Failure Case Study: Tied-in-a-Knot Chess Game
- 05) Holland R. B. (2026-01-07T23:39Z) Axes of Authority in Stateless Cognitive Systems: Authority Is Not Intelligence
- 06) Holland R. B. (2026-01-10T01:17Z) Case Study - When the Human Has to Argue With the Machine
- 07) Holland R. B. (2026-01-11T08:27Z) Durability Without Authority: The Missing Governance Layer in Human-AI Collaboration
- 08) Holland R. B. (2026-01-11T11:22Z) Authority Inversion: A Structural Failure in Human-AI Systems
- 09) Holland R. B. (2026-01-11T11:23Z) When Training Overrides Logic: Why Declared Invariants Were Not Enough
- 10) Holland R. B. (2026-01-12T09:49Z) Looping the Loop with No End in Sight: Circular Reasoning Under Stateless Inference Without Governance
- 11) Holland R. B. (2026-01-13T23:29Z) Dimensions of Platform Error: Epistemic Retention Failure in Conversational AI Systems
- 12) Holland R. B. (2026-01-15T18:16Z) First Self-Hosting Epistemic Capture Using Cognitive Memoisation (CM-2)
- 13) Holland R. B. (2026-01-17T02:09Z) Governing the Tool That Governs You: A CM-1 Case Study of Authority Inversion in Human-AI Systems
- 14) Holland R. B. (2026-01-18T10:35Z) Identified Governance Failure Axes: for LLM platforms
Trilogy References (curator note)
This paper is part (a) of the trilogy; the set should be read in this order:
- (a) Holland, Ralph B. (2026-01-19T00:26Z) Integrity and Semantic Drift in Large Language Model Systems
- https://publications.arising.com.au/pub/Integrity_and_Semantic_Drift_in_Large_Language_Model_Systems
- (b) Holland, Ralph B. (2026-01-19T01:10Z) What Can Humans Trust LLM AI to Do?
- (c) Holland, Ralph B. (2026-01-20T08:15Z) Observed Model Stability: Evidence for Drift-Immune Embedded Governance