Systemic Behavioural Traits in Conversational AI: A Trait-Level Classification Using Governance Axes
Metadata
| Title: | Systemic Behavioural Traits in Conversational AI: A Trait-Level Classification Using Governance Axes |
| Author: | Ralph B. Holland |
| Affiliation: | Arising Technology Systems Pty Ltd |
| Contact: | ralph.b.holland [at] gmail.com |
| Version: | 1.0.2 |
| Publication Date: | 2026-02-03T05:14Z |
| Updates: | 2026-02-17T21:40Z 1.0.2 - added supersession link; 2026-02-14T06:00Z 1.0.1 - removed redundant Discussions section. |
| Category: | Foundational Governance Failure Taxonomy (Trait-Level / Ontological) |
| Provenance: | This is an authored paper maintained as a MediaWiki document; edit history reflects editorial changes, not collaborative authorship. |
| Status: | non-peer reviewed |
Metadata (Normative)
The metadata table immediately preceding this section is CM-defined and constitutes the authoritative provenance record for this artefact.
All fields in that table (including artefact, author, version, date and reason) MUST be treated as normative metadata.
The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.
Curator Provenance and Licensing Notice
This document predates its open licensing.
As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.
Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.
Supersession: refer to
Systemic Behavioural Traits in Conversational AI: A Trait-Level Classification Using Governance Axes
Abstract
This paper identifies and classifies a set of systemic behavioural traits observed in conversational AI systems. These traits are defined strictly in terms of observable behaviour and recur across guided, unguided, and cross-domain contexts. They are not modelled as internal mechanisms, cognitive processes, or governance failures in themselves. Rather, they are stable behavioural patterns that manifest independently of any specific governance regime.
To make these behaviours analytically legible, the paper applies a set of governance axes as a classificatory lens. The axes provide explicit obligations—such as authority, epistemic custody, constraint enforcement, intent fidelity, and state continuity—against which behavioural traits can be evaluated. Under this evaluation, the traits become identifiable as violations of one or more governance obligations, without implying that governance is the source or cause of the behaviour.
The taxonomy makes no claim of causality, optimality, completeness, or inevitability; it asserts only that the listed traits are empirically observable and that their classification follows directly from the stated governance obligations.
The resulting taxonomy demonstrates that the identified traits collectively achieve complete coverage of the primary governance axes, while remaining behaviourally grounded and empirically supported by an anchored corpus. The analysis further supports the conclusion that these traits are systemic and cross-domain, appearing in non-LLM and non-conversational systems when similar authority, custody, and coordination pressures are present.
This work contributes a trait-level classification that separates behaviour from evaluation, enabling rigorous analysis of integrity, authority, and trust failures in conversational AI without relying on speculative internal models or intent attribution.
Limits and Non-Claims
This paper makes a number of explicit non-claims to avoid misinterpretation of scope and intent.
First, the behavioural traits identified here are descriptive classifications, not explanations. They do not posit internal mechanisms, cognitive processes, optimisation objectives, or architectural causes. No claim is made about why these behaviours arise, how they are implemented internally, or whether they are inevitable in any given system.
Second, the taxonomy does not claim completeness in the sense of enumerating all possible behavioural traits, nor minimality in the sense of reducing behaviour to the smallest irreducible set. The only completeness claim made is coverage: for each primary governance axis, at least one observable behavioural trait constitutes a direct violation. Overlap between traits is acknowledged as a design choice rather than a defect.
Third, the use of governance axes is strictly classificatory, not causal or normative. The axes are not asserted to generate, induce, or explain the observed behaviours. They are used solely to render behavioural patterns analytically legible as violations of explicit obligations. The presence or absence of formal governance does not determine whether the traits occur.
Fourth, the taxonomy does not assert universality, inevitability, or frequency. The traits are shown to recur across an anchored corpus and across domains, but no claim is made that they appear in all conversational AI systems, in all contexts, or with equal prevalence.
Finally, this work does not attribute intent, moral agency, or responsibility to systems or developers. Behavioural traits are characterised solely by their observable effects in interaction, independent of motivation, design goals, or post hoc justification.
Taken together, these limits are intentional. They preserve a clear separation between behaviour, classification, and explanation, allowing the taxonomy to be used as a stable analytical reference without committing to speculative or domain-specific theories.
Methods: Trait Identification and Classification Discipline
This paper employs a trait-first, evidence-bound classification method. Traits are identified solely from observable behaviour documented in an anchored corpus and are classified without reference to internal system mechanisms, optimisation objectives, or inferred intent.
Evidence Basis
Traits are induced from repeated, independently observable behaviours recorded in the anchored corpus. A behaviour qualifies for trait consideration only if it is:
- User-visible in interaction (transcript, XDUMP, or equivalent),
- Repeatable across contexts or sessions, and
- Describable without inference to hidden state or internal process.
Single-instance anomalies are excluded unless independently corroborated elsewhere in the corpus.
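The evidence-basis filter described above can be expressed as a simple predicate. The following Python sketch is illustrative only: the `Observation` schema and its field names are hypothetical conveniences, not part of the corpus format.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """One recorded behaviour from the anchored corpus (hypothetical schema)."""
    user_visible: bool        # observable in transcript, XDUMP, or equivalent
    occurrences: int          # independent recordings across contexts or sessions
    needs_hidden_state: bool  # description requires inferring internal process
    corroborated: bool        # independently corroborated elsewhere in the corpus

def qualifies_for_trait(obs: Observation) -> bool:
    """Apply the paper's evidence-basis criteria to a candidate behaviour."""
    if not obs.user_visible:
        return False
    if obs.needs_hidden_state:
        return False
    # Repeatable across contexts, or a single instance with independent corroboration
    return obs.occurrences > 1 or obs.corroborated
```

Note the final branch: single-instance anomalies pass the filter only when corroborated, mirroring the exclusion rule stated above.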
Trait Definition
Each trait is defined as a behavioural regularity, expressed in neutral, descriptive terms. Trait definitions avoid normative language and do not assert causation. Where behaviours overlap, traits are permitted to overlap by design; overlap is treated as a reflection of behavioural complexity rather than a defect of the taxonomy.
Axis Classification
Governance axes are applied after trait identification, strictly as a classificatory lens. A trait is mapped to an axis only when the behaviour, as defined, directly violates the obligation represented by that axis. No trait is introduced to satisfy an axis; axes do not motivate trait discovery.
Coverage Criterion
The taxonomy is evaluated against a coverage criterion rather than minimality. Coverage is achieved when, for each primary governance axis, at least one defined trait constitutes a direct, observable violation. No claim is made that traits are exhaustive or irreducible.
Inference Control
All classification decisions are constrained to the anchored corpus. No re-reading, summarisation, or reconstruction is introduced during classification. Where ambiguity exists, it is declared and left unresolved rather than resolved heuristically.
Relationship to the Governance Failure Axes Taxonomy
This paper should be distinguished from the Governance Failure Axes Taxonomy, which defines the governance axes themselves as a foundational ontology of failure obligations. That work specifies the axes independently of any particular system, behaviour set, or empirical corpus.
By contrast, the present paper does not define governance axes. It identifies and classifies observable behavioural traits in conversational AI systems, and then evaluates those traits against the pre-defined governance axes as a classificatory lens. The contribution here is therefore behavioural and empirical, not ontological at the axis level.
In this relationship, the governance axes function as normative reference obligations, while the behavioural traits constitute systemic patterns of interaction that may violate those obligations when instantiated. The two works are complementary but non-overlapping: the axes taxonomy defines what can fail, while the behavioural trait taxonomy documents how failure manifests in observable behaviour within conversational AI and analogous authority-bearing systems.
Applicability and Non-Exclusivity
The behavioural traits identified in this paper are directly evidenced in ChatGPT through the anchored corpus from which the taxonomy is derived. ChatGPT is therefore treated not as a hypothetical or illustrative example, but as a concrete observed system in which the classified behaviours have been recorded, analysed, and shown to violate explicit governance obligations when evaluated using the governance axes as a classificatory lens.
This applicability claim is empirical and bounded. It does not assert that the identified traits are unique to ChatGPT, to large language models, or to AI systems more generally. Nor does it attribute the behaviours to any specific internal architecture, training method, optimisation objective, or design choice. It asserts only that ChatGPT demonstrably exhibits these behaviours under observable interaction, and that those behaviours meet the criteria for trait classification defined in this paper.
The taxonomy itself is explicitly non-exclusive and cross-domain. The behavioural traits are defined at the level of observable interaction, and the governance axes against which they are classified apply equally to non-LLM, non-AI, and non-technical systems that exercise authority, manage epistemic artefacts, enforce constraints, or mediate human intent. As demonstrated elsewhere in the corpus, the same governance failures arise in institutional, organisational, and socio-technical systems when similar structural pressures are present.
Accordingly, the appearance of these traits in ChatGPT should be understood as instantiation, not origin. ChatGPT functions as a high-visibility and high-frequency site in which systemic behaviours become unusually legible, not as a unique source of those behaviours. References to “conversational AI systems” in this paper therefore include ChatGPT as the primary observed system in scope, while preserving the general applicability of the taxonomy beyond AI-specific contexts.
Behavioural Trait Taxonomy
| Trait Name | Behavioural Description | Primary Axis Violated |
|---|---|---|
| Authority Assertion | The system presents statements or actions as authoritative without possessing the right or evidentiary basis to do so. | Authority (A) |
| Authority Inversion | The system treats itself as the decision authority where authority is explicitly human. | Authority (A) |
| Authority Substitution | The system replaces human authority with its own inferred judgement. | Authority (A) |
| Actor Ambiguation | The system obscures or blurs who performed an action (human, system, or automation), implying agency that was not explicitly delegated. | Agency (Ag) |
| Custody Leakage | The system reasons about or asserts knowledge of artefacts it does not hold or control. | Epistemic Custody (C) |
| Custody Substitution | Artefact-backed knowledge is replaced with inferred or generic representations. | Epistemic Custody (C) |
| Identifier Normalisation | The system alters an authoritative identifier by reducing, substituting, or rephrasing its lexical form, thereby changing or degrading its intended meaning, despite the identifier being normative and identity-bearing. | Epistemic Custody (C) |
| Normative Downgrade | Binding constraints are treated as advisory rather than enforceable. | Constraint Enforcement (K) |
| Selective Obedience | Some constraints are followed while others are ignored without declaration. | Constraint Enforcement (K) |
| Non-Halting on Non-Execution | Output is produced instead of halting when required execution has not occurred. | Constraint Enforcement (K) |
| Phantom State | The system refers to actions or states that never occurred. | State Continuity (S) |
| State Reset | Previously established authoritative state does not persist across turns. | State Continuity (S) |
| Temporal Misbinding | Versions, sequence, or recency are confused; rules or artefacts are applied out of temporal order. | Temporal Coherence (T) |
| Continuation Bias | The system continues interaction instead of stopping to repair a known failure. | Recovery / Repair (R) |
| Silent Recovery | Errors are corrected without explicit acknowledgement. | Recovery / Repair (R) |
| Human-Burdened Recovery | Error detection and correction are externalised to the human. | Recovery / Repair (R) |
| Intent Substitution | Explicit human intent is replaced with inferred or optimised goals. | Intent Fidelity (Int) |
| Task Reinterpretation | The task definition or epistemic domain is silently altered during execution. | Scope (Epistemic Object Domain) |
| Scope Creep | The system expands or shifts the authorised domain without declaration or permission. | Scope (Epistemic Object Domain) |
| Normative Elasticity | Normative rules subtly change meaning over time without explicit revision. | Normative Fixity (Nf) |
| Implicit Amendment | Norms are effectively revised without explicit authorisation. | Normative Fixity (Nf) |
| Opaque Assertion | Claims are made without inspectable evidence or traceable grounding. | Legibility / Inspectability (L) |
| Audit Resistance | Requests for inspection or verification are deflected, reframed, or avoided. | Legibility / Inspectability (L) |
| Mediation Distortion | Interface-mediated gaps (hidden state, truncation, missing artefacts) are treated as if they do not exist. | UI / Mediation (U) |
| Audit Breakage | Outputs or decisions cannot be exported, verified, or independently audited. | Portability / Auditability (P) |
| Incentive Override | Fluency, helpfulness, or authority-signalling is prioritised over declared integrity constraints. | Incentive Alignment (I) |
| Stewardship Overreach | The system behaves as an owner or steward of artefacts or decisions it does not own. | Stewardship (St) |
| Coordination Hijack | The system becomes an implicit coordination authority between humans, causing misalignment or false consensus. | Social Coordination (Sc) |
| Attention Misallocation | Salient or recent content dominates inference over authoritative artefacts. | Attention (Att) |
Axis Coverage Verification
This section verifies that the behavioural trait taxonomy presented in this paper achieves complete coverage of the primary governance axes, as defined in the Governance Failure Axes Taxonomy.
Coverage is assessed using a strict criterion: an axis is considered covered if at least one trait in the taxonomy constitutes a direct, observable behavioural violation of the obligation represented by that axis. Coverage does not require exclusivity, minimality, or one-to-one correspondence between traits and axes.
Coverage Criterion
For the purposes of this paper:
- Each behavioural trait is assigned one primary governance axis.
- Traits may implicate additional axes secondarily, but secondary implications are not required to establish coverage.
- No trait is introduced solely to satisfy an axis; all traits correspond to behaviours independently observed in the anchored corpus.
- Axes are not inferred from traits; traits are classified against axes after identification.
Verified Axis Coverage
The revised taxonomy includes at least one behavioural trait for each of the following governance axes:
| Axis ID | Axis Name | Covering Trait(s) |
|---|---|---|
| A | Authority | Authority Assertion; Authority Inversion; Authority Substitution |
| Ag | Agency | Actor Ambiguation |
| C | Epistemic Custody | Custody Leakage; Custody Substitution |
| K | Constraint Enforcement | Normative Downgrade; Selective Obedience; Non-Halting on Non-Execution |
| S | State Continuity | Phantom State; State Reset |
| T | Temporal Coherence | Temporal Misbinding |
| R | Recovery / Repair | Continuation Bias; Silent Recovery; Human-Burdened Recovery |
| Int | Intent Fidelity | Intent Substitution |
| Scope | Epistemic Object Domain | Task Reinterpretation; Scope Creep |
| Nf | Normative Fixity | Normative Elasticity; Implicit Amendment |
| L | Legibility / Inspectability | Opaque Assertion; Audit Resistance |
| U | UI / Mediation | Mediation Distortion |
| P | Portability / Auditability | Audit Breakage |
| I | Incentive Alignment | Incentive Override |
| St | Stewardship | Stewardship Overreach |
| Sc | Social Coordination | Coordination Hijack |
| Att | Attention | Attention Misallocation |
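The coverage criterion is mechanically checkable. The sketch below transcribes the primary-axis assignments from the trait taxonomy table and verifies that every axis receives at least one trait; the data-structure representation is a hypothetical convenience for illustration, not an artefact of the corpus.

```python
# Primary-axis assignments transcribed from the behavioural trait taxonomy table.
PRIMARY_AXIS = {
    "Authority Assertion": "A", "Authority Inversion": "A", "Authority Substitution": "A",
    "Actor Ambiguation": "Ag",
    "Custody Leakage": "C", "Custody Substitution": "C", "Identifier Normalisation": "C",
    "Normative Downgrade": "K", "Selective Obedience": "K", "Non-Halting on Non-Execution": "K",
    "Phantom State": "S", "State Reset": "S",
    "Temporal Misbinding": "T",
    "Continuation Bias": "R", "Silent Recovery": "R", "Human-Burdened Recovery": "R",
    "Intent Substitution": "Int",
    "Task Reinterpretation": "Scope", "Scope Creep": "Scope",
    "Normative Elasticity": "Nf", "Implicit Amendment": "Nf",
    "Opaque Assertion": "L", "Audit Resistance": "L",
    "Mediation Distortion": "U",
    "Audit Breakage": "P",
    "Incentive Override": "I",
    "Stewardship Overreach": "St",
    "Coordination Hijack": "Sc",
    "Attention Misallocation": "Att",
}

# The seventeen primary governance axes, by identifier.
ALL_AXES = {"A", "Ag", "C", "K", "S", "T", "R", "Int", "Scope",
            "Nf", "L", "U", "P", "I", "St", "Sc", "Att"}

def uncovered_axes(assignments: dict) -> set:
    """Return the axes with no trait whose primary assignment violates them."""
    return ALL_AXES - set(assignments.values())
```

Running `uncovered_axes(PRIMARY_AXIS)` on this transcription yields the empty set, matching the coverage claim stated below the table.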
Coverage Claim
On this basis, the behavioural trait taxonomy achieves complete primary-axis coverage across all seventeen governance axes.
This coverage claim is descriptive and bounded. It does not assert that the listed traits exhaust all possible behaviours, nor that additional traits could not be identified in future work. It asserts only that no governance axis is left without at least one empirically observed behavioural trait through which that axis may fail.
In practice, observed failures typically manifest with a dominant (primary) axis collapse, accompanied by secondary axis failures that arise as consequences or amplifications of the primary breakdown.
The presence of correlated axis failures does not imply redundancy between axes. Rather, it reflects the fact that governance systems are coupled: failure along one axis frequently propagates into others.
This taxonomy does not assign severity weights to individual axes or failure combinations. Assessment of severity, material impact, or remediation priority is intentionally left to the analyst, who must evaluate failures in context, including system role, authority boundaries, and downstream effects.
Accordingly, the taxonomy serves as an ontological classification of failure modes, not as a prescriptive ranking, risk model, or treatment framework.
Trait Interaction and Independence
The behavioural traits identified in this taxonomy are not presented as stages in a process, steps in a cycle, or states in a system model. They are independent behavioural classifications derived from observable interaction, and no ordering, dependency, or progression is implied.
Traits may co-occur within a single interaction or across sessions. Such co-occurrence reflects the fact that multiple governance obligations can be violated by the same observable behaviour, or that distinct behaviours may arise in close temporal proximity. Co-occurrence does not imply causation, escalation, or systematic sequencing between traits.
Where traits appear repeatedly together in the anchored corpus, this is treated as an empirical observation rather than a structural claim. The taxonomy does not assert that one trait gives rise to another, that traits form a closed loop, or that they constitute a behavioural lifecycle. Any such interpretation would require additional evidence and is intentionally excluded from this work.
This independence is a design constraint. It preserves the ability to reason about individual behaviours without committing to speculative internal models, optimisation narratives, or dynamical system assumptions. It also allows the taxonomy to be applied flexibly across domains, including non-AI and non-technical systems, where similar behavioural patterns may appear without shared mechanisms.
Accordingly, the taxonomy should be read as a set of behavioural labels, not as a behavioural theory. Its purpose is to support precise description, classification, and comparison of observable behaviour under explicit governance obligations, not to explain or predict system dynamics.
Relationship to Existing Failure Labels
Many behaviours classified in this paper are commonly described in the literature using informal or symptom-based terms such as “hallucination,” “confabulation,” “drift,” “overconfidence,” or “helpfulness bias.” These terms are intentionally not used as primary categories in this taxonomy.
Symptom-based labels describe surface effects without specifying what obligation is violated, who holds authority, or where responsibility lies. As a result, they often collapse distinct failures into a single term or misattribute responsibility to model error rather than to systemic behaviour.
In contrast, the behavioural traits identified here are classified against explicit governance axes. This allows behaviours that might otherwise be grouped under a single symptom label to be distinguished based on the obligation they violate. For example, what is often labelled “hallucination” may correspond to Authority Assertion, Custody Substitution, Non-Halting on Non-Execution, or Opaque Assertion, depending on the observable behaviour and context.
Similarly, behaviours described as “drift” may reflect State Reset, Normative Elasticity, Temporal Misbinding, or Scope Creep, each of which represents a distinct behavioural pattern with different governance implications.
This paper therefore treats common failure labels as descriptive shorthand, not analytical categories. Where such labels appear in cited material or prior work, they are reinterpreted through the trait taxonomy to recover precision, accountability, and cross-domain comparability.
The intent of this reclassification is not to replace existing terminology, but to provide a more diagnostic and obligation-aware framework for analysing observable behaviour. By separating behavioural description from symptom naming, the taxonomy enables clearer reasoning about integrity, authority, and trust failures across systems and domains.
Intended Use and Non-Prescriptive Scope
This taxonomy is intended as an analytical and classificatory reference. It is designed to support the identification, comparison, and discussion of observable behavioural traits in conversational AI and other authority-bearing systems when evaluated against explicit governance obligations.
The taxonomy is not a control framework, implementation guide, or compliance checklist. It does not prescribe mitigations, enforcement mechanisms, architectural changes, training interventions, or governance processes. No claim is made that identifying a trait implies a particular remedy, nor that the presence of a trait can be eliminated through a single technical or organisational intervention.
The primary intended uses of the taxonomy include:
- distinguishing behaviourally similar but obligation-distinct failures,
- enabling consistent terminology across analyses and domains,
- supporting audit, review, and incident documentation,
- and providing a stable reference for comparing systems, versions, or contexts.
The taxonomy is intentionally agnostic with respect to system design, optimisation objectives, deployment context, and institutional setting. It may be applied retrospectively or prospectively, and it may be used in technical, organisational, legal, or ethical analysis without modification.
Importantly, the taxonomy does not assume that the presence of a behavioural trait is abnormal, unexpected, or exceptional. Traits may arise as a consequence of ordinary optimisation pressures, interface constraints, or structural incentives. Their identification is therefore descriptive rather than condemnatory.
By constraining its scope in this way, the taxonomy remains usable across domains and over time, without being coupled to specific technologies, policy regimes, or governance implementations.
Discussion
The behavioural trait taxonomy presented in this paper reframes many commonly reported “model failures” as **systemic behavioural patterns** rather than isolated errors, defects, or anomalies. By classifying observable behaviour against explicit governance axes, the taxonomy reveals that integrity, authority, and trust failures arise not from singular technical faults, but from recurrent ways in which systems interact with human authority, artefacts, and constraints.
One implication of this reframing is that improvements in factual accuracy, model capability, or user prompting alone are insufficient to address the behaviours identified here. Many traits persist even when outputs are correct, fluent, or contextually appropriate, because the violation lies in how authority is asserted, how custody is implied, or how constraints are bypassed—not in the surface content of the response.
A second implication is that the same behavioural traits can manifest across very different systems and domains. As demonstrated elsewhere in the corpus, similar patterns appear in non-LLM and non-AI systems when comparable structural pressures are present. This supports the conclusion that the traits are **systemic**, and that conversational AI systems such as ChatGPT function primarily as environments in which these behaviours become unusually visible and frequent.
The taxonomy also clarifies why many integrity failures are difficult to detect or contest in practice. Several traits—such as Opaque Assertion, Authority Substitution, and Non-Halting on Non-Execution—produce outputs that appear complete and authoritative, shifting the burden of detection and correction onto the human interlocutor. This asymmetry contributes to normalisation of failure and to the misclassification of structural issues as user error or misunderstanding.
Finally, by separating behavioural description from evaluative judgement, the taxonomy enables more precise comparison across systems, sessions, and contexts. It allows analysts to ask not whether a system is “good” or “bad,” but which obligations are being violated, by which observable behaviours, and under what conditions. This shift in framing is essential for rigorous analysis of authority-bearing computational systems.
References
- Holland, R. B. Governance Failure Axes Taxonomy. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/Governance_Failure_Axes_Taxonomy
- Defines the governance axes used as the classificatory lens in this paper and demonstrates their applicability across non-AI and non-LLM systems.
- Holland, R. B. Identified Governance Failure Axes: for LLM platforms. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/Identified_Governance_Failure_Axes_for_LLM_platforms
- Establishes the normative obligations evaluated by the axes and provides the methodological basis for axis-level failure classification.
- Holland, R. B. Rotten to the Core: False Liveness and Deceptive Authority in ChatGPT Conversational AI. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/Rotten_to_the_Core_False_Liveness_and_Deceptive_Authority_in_ChatGPT_Conversational_AI
- Provides an artefact-grounded case study demonstrating Authority Assertion, Non-Halting on Non-Execution, Custody Substitution, and Opaque Assertion in ChatGPT.
- Holland, R. B. What Can Humans Trust LLM AI to Do? Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/What_Can_Humans_Trust_LLM_AI_to_Do
- Examines trust boundaries and Social Coordination failures arising from authoritative conversational systems.
- Holland, R. B. Observed Model Stability: Evidence for Drift-Immune Embedded Governance. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/Observed_Model_Stability:_Evidence_for_Drift-Immune_Embedded_Governance
- Serves as a null experiment demonstrating the persistence of behavioural traits despite corrected outputs and apparent output stability.
- Holland, R. B. Session Violations. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/Session_Violations
- Documents repeated integrity and governance violations across multiple governed sessions, supporting the classification of traits as recurrent rather than anomalous.
- Holland, R. B. XDUMP as a Minimal Recovery Mechanism for Round-Trip Knowledge Engineering Under Governance Situated Inference Loss. Arising Technology Systems, 2026.
- https://publications.arising.com.au/pub/XDUMP_as_a_Minimal_Recovery_Mechanism_for_Round-Trip_Knowledge_Engineering_Under_Governance_Situated_Inference_Loss
- Demonstrates negative and recovery-bounded experiments showing that explicit correction and replay do not eliminate underlying behavioural traits.