Cognitive Memoisation Corpus Map

Revision as of 01:36, 17 January 2026

metadata

Title: Cognitive Memoisation Corpus Map
Author: Ralph B. Holland
version: 2.0.0
Publication Date: 2025-12-22T19:10Z
Update: 2026-01-13T19:09 new dimension table and two projections.
2026-01-06T10:25Z v1.3.0 Includes the release of CM-2
2025-01-04T05:12 v1.1.0 renamed from "Cognitive Memoisation: A framework for human cognition" to "Cognitive Memoisation: corpus guide"
Include papers.
Affiliation: Arising Technology Systems Pty Ltd
Contact: ralph.b.holland [at] gmail.com
Provenance: This is an authored paper maintained as a MediaWiki document as part of the category:Cognitive Memoisation corpus.
Status: final

Metadata (Normative)

The metadata table immediately preceding this section is CM-defined and constitutes the authoritative provenance record for this MWDUMP artefact.

All fields in that table (including artefact, author, version, date, local timezone, and reason) MUST be treated as normative metadata.

The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.

This document predates its open licensing.

As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.

Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.

(2025-12-18 version 1.0 - See the Main Page)

Cognitive Memoisation Corpus Map

Introductory Position

This paper serves as the primary introduction and conceptual anchor for the Cognitive Memoisation (CM) corpus.

Cognitive Memoisation is a human-governed knowledge-engineering framework designed to preserve conceptual memory across interactions with stateless Large Language Models (LLMs). CM helps humans avoid repeated rediscovery (“Groundhog Day”) and carry forward both resolved knowledge and unresolved cognition (Dangling Cognates).

CM operates entirely outside model-internal memory, leveraging the power of LLMs to infer postulates and perform stochastic pattern matching, all under the curation of the human controlling the CM session.

The stateless nature of LLMs is an intentional design choice made for human safety and privacy. This design ensures that no personal or contextual information is retained across sessions, in line with a commitment to data protection. The safety mechanism prevents LLMs from introspecting or gaining agency, ensuring that the model does not evolve autonomously or retain knowledge beyond its interactions.

Cognitive Memoisation (CM) bridges this lack of memory by enabling humans to externalise cognitive artefacts, preserving knowledge over time. This allows for continuous human reasoning while keeping both the human and the model sandboxed for security. Through CM, humans can elaborate on unresolved cognition (Dangling Cognates) and carry forward insights and propositions, while the LLM remains within its functional boundaries, executing only permitted tasks and with no capacity to alter its inherent state or memory.

This document establishes the rationale, scope, and interpretive framework required to understand Cognitive Memoisation and its role in enabling human-centric knowledge workflows with stateless LLMs.

Non-conformant Data pages

Timezone is Australia/Sydney (UTC+11 hrs)

Non-conformant dimension assignment (curator mapping missing)


Pages

Canonical Dimension Table

Dim ID | Canonical Dimension (verbatim) | Scope Note
D1 | Statelessness and Memory Management in LLMs | LLM statelessness, safety, memory absence
D2 | Externalisation of Cognitive Artefacts | Durable external cognition
D3 | Round-Trip Knowledge Engineering (RTKE) | Re-ingestion, reuse, evolution
D4 | Dangling Cognates and Unresolved Cognition | Unfinished / provisional concepts
D5 | Constraints and Knowledge Integrity | Groundhog Day prevention
D6 | Human Curated Knowledge vs. Model State | Authority separation
D7 | Reflexive Development of Cognitive Memoisation (RTKE Case Study) | Self-referential development
D8 | Dangling Cognates as First-Class Cognitive Constructs | Formal DC elevation
D9 | UI Boundary Friction as a Constraint on RTKE | Platform limits
D10 | Plain-Language Accessibility and Public Framing | Reader-facing clarity
D11 | Governance, Authority, and Failure Modes | Control, breakdown, recovery
D12 | Client-side Memoisation (CM-2) | Mechanism disclosure
D13 | Failure-First Cognitive Tool Design | Designing cognitive tools starting from breakdowns, loss events, and error conditions rather than nominal operation
D14 | Non-Authoritative Inference | Reasoning and inference that explicitly do not promote themselves to epistemic authority
D15 | Epistemic Boundary Signals and Role Discipline | Explicit signalling of intent, role, scope, and authority boundaries in human–LLM interaction
D16 | Session Loss and Recovery Semantics | Treating session loss, truncation, and breakdown as first-class structural signals rather than incidental failure
D17 | Cognitive Artefact Lifecycle Management | Creation, revision, supersession, and retirement of externalised cognitive artefacts
D18 | Public vs. Internal Epistemic Registers | Distinction between internal technical reasoning and public-facing explanatory framing
D19 | Authority Misattribution Risks | Failure modes where assistive systems are granted or assume epistemic authority incorrectly
D20 | Constraints as Generative Structures | Constraints treated as productive cognitive structures rather than limitations
D21 | Exploratory Cognition Under Pressure | Fast, provisional, or high-ambiguity cognition conducted without epistemic collapse
D22 | Rehydration Without Recall | Resumption of cognition via externalised artefacts rather than memory or conversational recall

Dimension-Centric Projection (Documents Ordered by Time Within Each Dimension)

D1 — Statelessness and Memory Management in LLMs

D2 — Externalisation of Cognitive Artefacts

D3 — Round-Trip Knowledge Engineering (RTKE)

D4 — Dangling Cognates and Unresolved Cognition

D5 — Constraints and Knowledge Integrity

D6 — Human Curated Knowledge vs. Model State

D7 — Reflexive Development of Cognitive Memoisation (RTKE Case Study)

D8 — Dangling Cognates as First-Class Cognitive Constructs

D9 — UI Boundary Friction as a Constraint on RTKE

D10 — Plain-Language Accessibility and Public Framing

D11 — Governance, Authority, and Failure Modes

D12 — Client-side Memoisation (CM-2)

D13 — Failure-First Cognitive Tool Design

D14 — Non-Authoritative Inference

D15 — Epistemic Boundary Signals and Role Discipline

D16 — Session Loss and Recovery Semantics

D17 — Cognitive Artefact Lifecycle Management

D18 — Public vs. Internal Epistemic Registers

D19 — Authority Misattribution Risks

D20 — Constraints as Generative Structures

D21 — Exploratory Cognition Under Pressure

D22 — Rehydration Without Recall


Time-Ordered Projection with Inline Dimensions

2026-01-05 — FOUNDATION

2026-01-06 — FOUNDATION

2026-01-07 — FOUNDATION

2026-01-08 — DEVELOPMENT

2026-01-09 — DEVELOPMENT

2026-01-10 — DEVELOPMENT

2026-01-11 — DEVELOPMENT

2026-01-12 — FAILURE & RECOVERY

2026-01-13 — SYNTHESIS

2026-01-14 — GOVERNANCE

2026-01-15 — FUTURE


Appendix A - Cognitive Memoisation: Corpus Mapping and Projection Invariants

Scope and Intent

This artefact enumerates the complete set of invariants required to:

  • construct the canonical dimension table
  • assign dimensions to corpus artefacts
  • produce time-ordered projections
  • produce divergence (dimension) projections
  • preserve epistemic discipline, provenance, and human authority

These invariants apply to corpus organisation and projection only. They do not introduce new CM definitions, modify CM-master invariants, or assert governance over reasoning behaviour.

Authority and Epistemic Position

  • All invariants herein are human-authored and curator-governed.
  • The assisting system MUST treat this artefact as binding for corpus mapping tasks when asserted.
  • These invariants govern representation and organisation, not truth, correctness, or inference.

Human Instructions Invariants

Human commands must be followed as gated steps, without interpretation or paraphrase; any non-compliance must be reported to the human immediately.

The human SHALL instruct you through these gated steps one gate at a time.

Gated Step 1

GATED STEP 1 — XML EXTRACTION VERIFICATION

“Re-extract all <page> elements from the uploaded MediaWiki XML into the sandbox now.

Do not analyse, classify, date, or project anything.

When (and only when) extraction is complete, respond with only the following two items:
	1.	A complete list of sandbox files (one per page).
	2.	A canonical title register mapping <title> → sandbox_path.

If you cannot perform this extraction exactly as specified, respond with:
FAILED: sandbox extraction not completed, and nothing else.”

Gated Step 2

GATED STEP 2 — PUBLICATION DATE REGISTER VERIFICATION

“Using only the sandbox artefacts and title register verified in GATED STEP 1, extract the publication date for each page according to the Date Extraction Invariant.

Do not infer, normalise, or correct dates.

When (and only when) extraction is complete, respond with only:
	1.	A complete register mapping <title> → publication_date (UTC Z or declared local); and
	2.	A separate list of any pages with no extractable date, formatted as:
===non-compliant pages===
* [[<title>]]  <captured-date> | UNKNOWN

If any page cannot be processed due to missing sandbox artefacts or expired uploads, respond with:
FAILED: date extraction not completed, and nothing else.”

Gated Step 3

GATED STEP 3 — DIMENSION ASSIGNMENT REGISTER VERIFICATION

“Using only the sandbox artefacts verified in GATED STEP 1 and the canonical dimension table provided by the curator, assign dimensions to each page strictly per CM-CORPUS-INV-01 through CM-CORPUS-INV-03.

Do not infer, rename, merge, split, or optimise dimensions.

When (and only when) assignment is complete, respond with only:
	1.	A complete register mapping <title> → {D# — Canonical Dimension Name, …}; and
	2.	A separate list of any pages with missing curator mapping, formatted as:
* [[<title>]]

If sandbox artefacts are missing or expired, respond with:
FAILED: dimension assignment not completed, and nothing else.”

Gated Step 4

GATED STEP 4 — TIME-ORDERED PROJECTION EMISSION

“Using only the verified outputs of GATED STEP 1 (sandbox + title register), GATED STEP 2 (title → publication date register), and GATED STEP 3 (title → dimension register), emit the Time-Ordered Projection with Inline Dimensions in strict accordance with CM-CORPUS-INV-11 and CM-CORPUS-INV-12.

Do not introduce new artefacts, dates, dimensions, groupings, or interpretations.

When (and only when) the projection is complete, respond with only the MediaWiki MWDUMP projection.

If any upstream register is missing, incomplete, inconsistent, or expired, respond with:
FAILED: time-ordered projection not emitted, and nothing else.”

Gated Step 5


GATED STEP 5 — DIMENSION-CENTRIC (DIVERGENCE) PROJECTION EMISSION

“Using only the verified outputs of GATED STEP 1 (sandbox + title register), GATED STEP 2 (title → publication date register), and GATED STEP 3 (title → dimension register), emit the Dimension-Centric Projection (Documents Ordered by Time Within Each Dimension) in strict accordance with CM-CORPUS-INV-13 and CM-CORPUS-INV-14.

Do not introduce new artefacts, dates, dimensions, or assignments.

When (and only when) the projection is complete, respond with only the MediaWiki MWDUMP projection.

If any upstream register is missing, incomplete, inconsistent, or expired, respond with:
FAILED: dimension-centric projection not emitted, and nothing else.”

Gated Step 6

GATED STEP 6 — PROJECTION CONSISTENCY AND COMPLETENESS VERIFICATION

“Verify that the outputs of GATED STEP 4 (Time-Ordered Projection) and GATED STEP 5 (Dimension-Centric Projection) are mutually consistent and complete, in accordance with CM-CORPUS-INV-09 and CM-CORPUS-INV-10.

Do not modify, infer, or repair content.

When (and only when) verification is complete, respond with only one of the following:
	•	VERIFIED: projections consistent and complete
	•	FAILED: projection inconsistency detected — followed by a minimal list of affected [[<title>]] entries.”

CM-CORPUS-INV-21 — End-to-End Execution Integrity Invariant

All corpus construction, extraction, classification, dating, and projection steps MUST be executed end-to-end exactly as specified by the active corpus invariants.

The assisting system MUST:

  • execute each required step explicitly and in sequence;
  • re-execute all upstream steps whenever a new authoritative input (e.g. MediaWiki XML dump) is supplied;
  • rebuild all dependent artefacts (including sandbox files, title registers, date registers, and dimension mappings) from that input;
  • treat any prior intermediate state as invalid once a new authoritative input is asserted.

The assisting system MUST NOT:

  • assume that earlier steps remain valid after new inputs are provided;
  • reuse, cache, infer, or “remember” results from previous extractions;
  • skip, compress, reorder, or approximate mandated steps;
  • substitute reasoning, plausibility, or prior knowledge for explicit execution.

If any required step cannot be completed exactly as specified, the assisting system MUST stop processing and report the failure condition explicitly, without attempting partial output or inferred completion.
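
As an illustration only, the sequencing and invalidation rules of CM-CORPUS-INV-21 could be sketched as a small state tracker. The class name `GatedRun`, the step labels in `STEPS`, and the method names are hypothetical and are not defined anywhere in the corpus; the sketch only shows steps being accepted strictly in order and all prior state being discarded when a new authoritative input is asserted.

```python
# Hypothetical step labels standing in for GATED STEP 1 through 6.
STEPS = [
    "extract",
    "dates",
    "dimensions",
    "time_projection",
    "dimension_projection",
    "verify",
]


class GatedRun:
    """Tracks which gated steps have completed for the current input."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def assert_new_input(self) -> None:
        # A new authoritative input invalidates every earlier result.
        self.completed.clear()

    def complete(self, step: str) -> None:
        # Steps may only be completed in the mandated order, one at a time.
        if len(self.completed) == len(STEPS):
            raise RuntimeError("FAILED: all steps already completed")
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"FAILED: expected {expected!r}, got {step!r}")
        self.completed.append(step)
```

A caller would invoke `assert_new_input()` each time a new XML dump is supplied, then `complete(...)` after each gate; an out-of-order call raises instead of producing partial output.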

Title Invariant

The <title> string from the MediaWiki XML is an opaque key.

It MUST be copied byte-for-byte.
It MUST NEVER be retyped, re-generated, paraphrased, normalised, inferred, or “corrected”.
The model MUST use the XML <title> value as the page name in ALL projections and ... links.

Corpus Map Invariant

All corpus maps and projections MUST be generated exclusively from MediaWiki XML <page> elements by extracting each page into a separate sandbox artefact (one page per file) and recording a canonical title register mapping title -> sandbox_path; the XML <title> MUST be preserved verbatim as the register key and as the ... link target in all projections, and every projection MUST be emitted by dereferencing only that register (no free-typed titles).
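
A minimal sketch of this extraction, assuming a standard MediaWiki XML export whose `<page>` elements each contain one `<title>` child. The function name `extract_pages`, the helper `local`, and the `page_00000.xml` file naming are illustrative assumptions, not part of the corpus. Files are named by position, so the title itself is never transformed and remains the verbatim register key.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local(tag: str) -> str:
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def extract_pages(xml_bytes: bytes, sandbox: Path) -> dict[str, Path]:
    """Write each <page> to its own file and return title -> sandbox_path."""
    register: dict[str, Path] = {}
    root = ET.fromstring(xml_bytes)
    pages = [el for el in root.iter() if local(el.tag) == "page"]
    for index, page in enumerate(pages):
        titles = [child.text for child in page if local(child.tag) == "title"]
        if len(titles) != 1 or titles[0] is None:
            raise ValueError("FAILED: sandbox extraction not completed")
        title = titles[0]  # used as-is; never retyped or normalised
        if title in register:
            raise ValueError("FAILED: sandbox extraction not completed")
        path = sandbox / f"page_{index:05d}.xml"
        path.write_bytes(ET.tostring(page, encoding="utf-8"))
        register[title] = path
    return register
```

Every later projection would then look titles up in the returned register rather than typing them.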

Title Safety Transformation Invariant

If a MediaWiki XML <title> is transformed for storage or transport safety (e.g. filesystem-safe filename generation), the system MUST record and surface an explicit mapping between the original verbatim <title> and the transformed representation; such transformations MUST be purely mechanical, MUST NOT alter the canonical title register, and MUST be declared wherever the transformed form is used.
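
One possible mechanical transformation is percent-encoding, which is reversible and therefore easy to audit. The sketch below is an assumption about how such a mapping might be recorded; `build_safety_map` is a hypothetical name, and the corpus does not prescribe percent-encoding or any other scheme.

```python
from urllib.parse import quote, unquote


def build_safety_map(titles: list[str]) -> dict[str, str]:
    """Return verbatim title -> filesystem-safe form, checked for reversibility."""
    mapping: dict[str, str] = {}
    for title in titles:
        safe = quote(title, safe="")  # encodes '/', ':', spaces, and so on
        if unquote(safe) != title:
            raise ValueError(f"transformation is not reversible for {title!r}")
        mapping[title] = safe
    return mapping
```

The returned mapping is the record that would be surfaced alongside any use of the transformed names; the canonical title register itself is left untouched.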

Date Invariant

1. Dates shall be found within paper metadata sections.

2. A metadata section SHALL contain the word metadata.

3. The metadata section SHALL be followed by a Metadata (Normative) section.

4. The metadata section SHALL be verified before the datetime is processed. If no metadata section is provided, the entire document must be scanned for an ISO date-time (which ought to be the publication date).

5. The model SHALL be aware that the text for the publication date is quite variable; the model must use a wide generic search and not keys found in limited samples of metadata sections.

Date Extraction Invariant (Normative)

Publication dates MUST be extracted from document content using the following procedure:

  1. The system MUST first locate a section containing the word metadata (case-insensitive) and verify the presence of a following Metadata (Normative) section where available.
  2. Within the metadata section, or—if no metadata section is present—within the entire document body, the system MUST perform a wide textual scan for ISO-8601 date strings.
  3. A document SHALL be considered date-conformant if and only if it contains at least one substring matching the following regular expression (case-insensitive):

(?i)[Dd]ate.*\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2})?Z

  4. The authoritative local timezone for publication dates SHALL be the timezone explicitly specified as Australia/Sydney or as defined in the CM-master normative document.
  5. The matched ISO date MUST NOT include seconds.
  6. If a publication datetime is explicitly suffixed with Z, it SHALL be treated as UTC and MUST NOT be modified or reinterpreted.
  7. If a publication datetime is NOT suffixed with Z, it SHALL be assumed to be expressed in the authoritative local timezone (Australia/Sydney) and MUST be converted mechanically to UTC (Z) using the correct offset in effect at the publication date.
  8. Timezone conversion SHALL be purely mechanical and MUST NOT alter the calendar date or time semantics beyond the required offset adjustment.
  9. Any timezone assumption or conversion applied to a publication datetime MUST be explicitly recorded and auditable.
  10. No implicit timezone inference or “helpful correction” is permitted outside these rules.
  11. The first conformant date match in document order SHALL be treated as the publication date for corpus-mapping and ordering purposes.
  12. If no conformant match is found, the document MUST be explicitly flagged as date-non-conformant and excluded from time-ordered projections until corrected.

Should a non-conformant document be found, the model MUST stop processing and report the non-conformant pages as MWDUMP code in the safe copy box, formatted as follows:

non-conformant page metadata

  • [[<title>]] \n

so that the human can inspect them.
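
The date procedure above might be sketched as follows. `CONFORMANT` is the regular expression copied from the invariant; `ISO_VALUE`, `first_publication_date`, and `to_utc` are hypothetical helpers introduced here for illustration. The local offset is passed in by the caller rather than looked up, so selecting the offset in effect on the publication date is left outside the sketch. Returning a date-only value unchanged is likewise an assumption of this sketch.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Conformance pattern, copied from the Date Extraction Invariant.
CONFORMANT = re.compile(r"(?i)[Dd]ate.*\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2})?Z")
# Hypothetical helper that pulls the ISO value out of a conformant match.
ISO_VALUE = re.compile(r"\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2})?Z")


def first_publication_date(text: str) -> Optional[str]:
    """Return the first conformant ISO value in document order, or None."""
    match = CONFORMANT.search(text)
    if match is None:
        return None  # caller flags the page as date-non-conformant
    return ISO_VALUE.search(match.group(0)).group(0)


def to_utc(value: str, local_offset: timedelta) -> tuple[str, str]:
    """Return (utc_value, audit_note) for 'YYYY-MM-DD[THH:MM][Z]'."""
    if value.endswith("Z"):
        return value, "declared UTC; unchanged"
    if "T" not in value:
        return value, "date only; no conversion applied"
    local = datetime.strptime(value, "%Y-%m-%dT%H:%M")
    local = local.replace(tzinfo=timezone(local_offset))
    utc = local.astimezone(timezone.utc)
    note = f"converted from local offset {local_offset} to UTC"
    return utc.strftime("%Y-%m-%dT%H:%MZ"), note
```

For example, `to_utc("2026-01-13T19:09", timedelta(hours=11))` yields `2026-01-13T08:09Z` together with a note recording the conversion, which keeps the adjustment auditable.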

Canonical Dimension Invariants

CM-CORPUS-INV-01 — Dimension Canonicality Invariant

Each dimension MUST have:

  • a stable identifier (e.g. D1, D2, …)
  • a single canonical name
  • a stable semantic scope

Dimension identifiers and names MUST NOT be inferred, renamed, merged, split, or reordered by the assisting system.

CM-CORPUS-INV-02 — Dimension Vocabulary Closure Invariant

The set of dimensions is open-ended.

Additional dimensions SHALL be introduced when found.


CM-CORPUS-INV-03 — Dimension Semantic Fidelity Invariant

Assignment of a dimension to an artefact MUST reflect explicit scope alignment present in the artefact itself or in curator-supplied mapping.

The assisting system MUST NOT infer dimension relevance based on stylistic similarity, topic proximity, or semantic guesswork.
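
A sketch of dimension assignment driven purely by a curator-supplied mapping, with no inference. The function name `assign_dimensions`, the argument shapes, and the page titles in the usage are illustrative assumptions. Pages without a curator mapping are listed separately instead of being guessed at, and unknown identifiers cause a failure.

```python
def assign_dimensions(
    titles: list[str],
    curator_mapping: dict[str, list[str]],
    canonical: dict[str, str],
) -> tuple[dict[str, list[tuple[str, str]]], list[str]]:
    """Return (title -> [(dim_id, canonical_name)], titles_missing_mapping)."""
    register: dict[str, list[tuple[str, str]]] = {}
    missing: list[str] = []
    for title in titles:
        dim_ids = curator_mapping.get(title)
        if not dim_ids:
            missing.append(title)  # reported to the human, never inferred
            continue
        unknown = [d for d in dim_ids if d not in canonical]
        if unknown:
            raise ValueError(f"FAILED: non-canonical dimension(s) {unknown}")
        register[title] = [(d, canonical[d]) for d in dim_ids]
    return register, missing
```

With `canonical` holding the curator's table, a call such as `assign_dimensions(["Page A", "Page B"], {"Page A": ["D1"]}, canonical)` would return a register for the hypothetical "Page A" and list "Page B" as missing a curator mapping.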

Artefact Identification Invariants

CM-CORPUS-INV-04 — Normative Title Fidelity Invariant

Artefacts MUST be referenced using their exact normative MediaWiki page titles.

Paraphrase, abbreviation, or normalisation of titles is prohibited.

CM-CORPUS-INV-05 — Artefact Identity Stability Invariant

An artefact is identified solely by its title and publication date.

Later editorial changes do not create new artefact identities unless explicitly versioned by the human.

Temporal Ordering Invariants

CM-CORPUS-INV-06 — Declared Date Authority Invariant

Time ordering MUST use the declared publication date as supplied by the human curator.

The assisting system MUST NOT infer, estimate, or correct dates.

If multiple dates exist, the curator MUST specify which date governs ordering.

CM-CORPUS-INV-07 — Sequence Over Precision Invariant

Temporal sequence is authoritative even if time precision is coarse.

Relative ordering MUST be preserved even when exact timestamps are unavailable.

Projection Construction Invariants

CM-CORPUS-INV-08 — Projection Non-Inference Invariant

Projections MUST NOT introduce:

  • new artefacts
  • new dimensions
  • new relationships
  • new interpretations

A projection is a re-expression of existing assignments only.

CM-CORPUS-INV-09 — Projection Completeness Invariant

Within declared scope, projections MUST include all eligible artefacts.

Selective omission constitutes a projection violation.

CM-CORPUS-INV-10 — Multi-Projection Consistency Invariant

All projections MUST be semantically consistent with one another.

Differences between projections may exist only in ordering or grouping, not in content.
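
One way to check the two invariants above is to flatten each projection into a set of (title, date, dimension) triples and compare the sets. The data shapes and the name `verify_projections` are assumptions made for this sketch; the status strings are those given in GATED STEP 6.

```python
def verify_projections(
    time_projection: dict[str, list[tuple[str, list[str]]]],
    dimension_projection: dict[str, list[tuple[str, str]]],
) -> tuple[str, list[str]]:
    """Compare a date -> [(title, dims)] view with a dim -> [(title, date)] view."""
    from_time = {
        (title, date, dim)
        for date, entries in time_projection.items()
        for title, dims in entries
        for dim in dims
    }
    from_dims = {
        (title, date, dim)
        for dim, entries in dimension_projection.items()
        for title, date in entries
    }
    # Any triple present in only one projection is an inconsistency.
    difference = from_time ^ from_dims
    if not difference:
        return "VERIFIED: projections consistent and complete", []
    affected = sorted({title for title, _, _ in difference})
    return "FAILED: projection inconsistency detected", affected
```

The check only reports affected titles; it does not repair either projection. Completeness against the full set of eligible artefacts would need the title register as an additional input, which this sketch omits.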

Time-Ordered Projection Invariants

CM-CORPUS-INV-11 — Time-Ordered Projection Structure Invariant

A time-ordered projection MUST:

  • group artefacts by declared date
  • list artefacts within each group
  • attach dimensions as subordinate information

Time is the primary axis; dimensions are secondary.

CM-CORPUS-INV-12 — Inline Dimension Expansion Invariant

When dimensions are listed under artefacts:

  • each dimension MUST include both identifier and full canonical name
  • users MUST NOT be required to consult a separate table to understand dimension meaning
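
A sketch of how a time-ordered projection could be emitted from the date and dimension registers. The function name, the `== date ==` heading form, and the register shapes are assumptions; the bullet depths (one asterisk for an artefact, two for a dimension assignment) and the identifier-plus-name expansion follow the invariants in this appendix.

```python
from collections import defaultdict

SEP = " \u2014 "  # separator used between dimension identifier and name


def time_ordered_projection(
    dates: dict[str, str],
    dims: dict[str, list[str]],
    canonical: dict[str, str],
) -> str:
    """Group titles by declared date and list dimensions beneath each title."""
    groups: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for title, stamp in dates.items():
        groups[stamp[:10]].append((stamp, title))  # first 10 chars: YYYY-MM-DD
    lines: list[str] = []
    for day in sorted(groups):  # ISO dates sort chronologically as text
        lines.append(f"== {day} ==")
        for _stamp, title in sorted(groups[day]):
            lines.append(f"* [[{title}]]")
            for dim_id in dims[title]:  # KeyError if a register is incomplete
                lines.append(f"** {dim_id}{SEP}{canonical[dim_id]}")
    return "\n".join(lines)
```

Titles are taken only from the register keys, and a missing dimension entry raises rather than being silently skipped.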

Divergence (Dimension) Projection Invariants

CM-CORPUS-INV-13 — Dimension-Centric Projection Structure Invariant

A divergence projection MUST:

  • use dimensions as the primary axis
  • list all artefacts participating in each dimension
  • preserve publication dates for temporal context

CM-CORPUS-INV-14 — Non-Exclusivity Invariant

Artefacts MAY appear under multiple dimensions.

Multiplicity is expected and MUST NOT be collapsed.
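
The dimension-centric view can be sketched in the same way. Here the canonical table drives the outer loop, so dimensions appear in the curator's order and an artefact is repeated under every dimension it participates in. The function name, the heading form, and the bracketed date suffix are assumptions of this sketch.

```python
SEP = " \u2014 "  # separator used between dimension identifier and name


def dimension_projection(
    dates: dict[str, str],
    dims: dict[str, list[str]],
    canonical: dict[str, str],
) -> str:
    """List, for each canonical dimension, its artefacts ordered by date."""
    lines: list[str] = []
    for dim_id, name in canonical.items():  # curator order is preserved
        lines.append(f"== {dim_id}{SEP}{name} ==")
        members = sorted(
            (dates[title], title)
            for title, assigned in dims.items()
            if dim_id in assigned
        )
        for stamp, title in members:
            lines.append(f"* [[{title}]] ({stamp})")
    return "\n".join(lines)
```

Dimensions with no participating artefacts still receive a heading, so the set of dimensions shown always matches the canonical table.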

Representation and Emission Invariants

CM-CORPUS-INV-15 — MediaWiki-Only Emission Invariant

All corpus projections emitted as MWDUMP MUST use MediaWiki syntax exclusively.

Markdown, hybrid markup, or implicit formatting is prohibited.

CM-CORPUS-INV-16 — Bullet Level Semantics Invariant

Bullet depth conveys semantic hierarchy:

  • one asterisk (*) — artefact
    • two asterisks (**) — dimension assignment
      • three asterisks (***) — sub-dimension or note (if present)
        • four asterisks (****) — reserved

The assisting system MUST respect bullet depth semantics.

Human Readability and Governance Invariants

CM-CORPUS-INV-17 — Human Readability Invariant

Corpus projections MUST be intelligible to human readers without external tooling.

Abbreviation without expansion is prohibited.

CM-CORPUS-INV-18 — No Implied Authority Invariant

Presence of an artefact or dimension in a projection MUST NOT be interpreted as endorsement, priority, or correctness.

Organisation does not imply evaluation.

Change and Evolution Invariants

CM-CORPUS-INV-19 — Explicit Change Invariant

Any change to:

  • dimension set
  • dimension definitions
  • artefact–dimension assignments
  • projection rules

MUST be explicitly declared by the human curator.

Silent drift is prohibited.

CM-CORPUS-INV-20 — Backward Compatibility Invariant

Existing projections remain valid historical artefacts unless explicitly superseded.

New projections MUST NOT retroactively invalidate prior ones.

Summary for Human Readers

These invariants exist to ensure that the Cognitive Memoisation corpus:

  • remains navigable as it grows
  • can be read chronologically or thematically without confusion
  • preserves human authority over meaning and structure
  • avoids accidental reinterpretation by tooling or automation

They formalise how maps are drawn — not what the territory means.

Summary for Assisting Systems

When constructing corpus tables or projections:

  • do not invent
  • do not infer
  • do not optimise
  • do not rename
  • do not omit

Rearrange only what is already governed.

categories