Publications Access Graphs: Difference between revisions
Revision as of 10:01, 30 January 2026
CM corpus access graphs
- 2026-01-30
Corpus Projection Invariants (Normative)
Authority and Governance
- The projections are curator-governed and MUST be reproducible from declared inputs alone.
- The assisting system MUST NOT infer, rename, paraphrase, merge, split, or reorder titles beyond the explicit rules stated here.
- The assisting system MUST NOT optimise for visual clarity at the expense of semantic correctness.
- Any deviation from these invariants MUST be explicitly declared by the human curator with a dated update entry.
Authoritative Inputs
- Input A: Hourly rollup TSVs produced by logrollup tooling.
- Input B: Corpus bundle manifest (corpus/manifest.tsv).
- Input C: Host scope fixed to publications.arising.com.au.
- Input D: Full temporal range present in the rollup set (no truncation).
Eligible Resource Set (Corpus Titles)
- The eligible title set MUST be derived exclusively from corpus/manifest.tsv.
- Column 1 of manifest.tsv is the authoritative MediaWiki page title.
- Only titles present in the manifest (after normalisation) are eligible for projection.
- Titles present in the manifest MUST be included in the projection domain even if they receive zero hits in the period.
- Titles not present in the manifest MUST be excluded even if traffic exists.
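The eligibility rules above can be sketched as follows. This is a minimal illustration, assuming the manifest is tab-separated with no header row; the function names `load_eligible_titles` and `restrict_to_manifest` are hypothetical and not part of any existing tooling.

```python
def load_eligible_titles(lines):
    """Collect column 1 of each non-empty manifest line (assumed: no header row)."""
    titles = set()
    for line in lines:
        first = line.rstrip("\n").split("\t")[0].strip()
        if first:
            titles.add(first)
    return titles


def restrict_to_manifest(eligible, hits_by_title):
    """Keep every manifest title (zero hits included); drop everything else."""
    return {title: hits_by_title.get(title, 0) for title in sorted(eligible)}
```

In use, `load_eligible_titles(open("corpus/manifest.tsv", encoding="utf-8"))` would yield the projection domain, and traffic for titles outside it is discarded by `restrict_to_manifest`.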
Path → Title Extraction
- A rollup record contributes to a page only if a title can be extracted by these rules:
  - If path matches /pub/<title>, then <title> is the candidate.
  - If path matches /pub-dir/index.php?<query>, the title MUST be taken from title=<title>.
  - If title= is absent, page=<title> MAY be used.
  - Otherwise, the record MUST NOT be treated as a page hit.
- URL fragments (#…) MUST be removed prior to extraction.
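One way to sketch the extraction rules above is shown below. The function name `extract_candidate_title` is hypothetical, and two points are assumptions: an empty `title=` value is treated as absent, and the candidate is returned still URL-encoded so that decoding happens once, during normalisation.

```python
from urllib.parse import urlsplit


def _raw_query_value(query, key):
    """Return the raw (still URL-encoded) value of key, or None if absent or empty."""
    for pair in query.split("&"):
        name, sep, value = pair.partition("=")
        if name == key and sep and value:
            return value
    return None


def extract_candidate_title(path):
    """Apply the path rules; return an encoded candidate title or None."""
    parts = urlsplit(path)  # urlsplit separates the #fragment from path and query
    if parts.path.startswith("/pub/"):
        return parts.path[len("/pub/"):] or None
    if parts.path == "/pub-dir/index.php":
        # title= takes precedence; page= is only a fallback.
        return (_raw_query_value(parts.query, "title")
                or _raw_query_value(parts.query, "page"))
    return None
```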
Title Normalisation
- URL decoding MUST occur before all other steps.
- Underscores (_) MUST be converted to spaces.
- Unicode en and em dashes (–, —) MUST be converted to ASCII hyphen (-).
- Whitespace runs MUST be collapsed to a single space and trimmed.
- After normalisation, the title MUST exactly match a manifest title to remain eligible.
- Main Page MUST be excluded from this projection.
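The normalisation steps above, applied in the stated order, might look like this. The names `normalise_title` and `eligible_title` are illustrative only; `manifest_titles` is assumed to be a set of manifest titles already in normalised form.

```python
import re
from urllib.parse import unquote


def normalise_title(raw):
    """Decode, then convert underscores and dashes, then collapse whitespace."""
    title = unquote(raw)                       # URL decoding comes first
    title = title.replace("_", " ")            # underscores to spaces
    title = title.replace("\u2013", "-")       # en dash to ASCII hyphen
    title = title.replace("\u2014", "-")       # em dash to ASCII hyphen
    return re.sub(r"\s+", " ", title).strip()  # collapse runs and trim


def eligible_title(raw, manifest_titles):
    """Return the normalised title if it is eligible, otherwise None."""
    title = normalise_title(raw)
    if title == "Main Page" or title not in manifest_titles:
        return None
    return title
```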
Accumulated human_get_ok projection
Noise and Infrastructure Exclusions
- The following MUST be excluded prior to aggregation:
  - Special:, Category:, Category talk:, Talk:, User:, User talk:, File:, Template:, Help:, MediaWiki:
  - /resources/, /pub-dir/load.php, /pub-dir/api.php, /pub-dir/rest.php
  - /robots.txt, /favicon.ico
  - sitemap (any case)
  - Static resources by extension (.png, .jpg, .jpeg, .gif, .svg, .ico, .webp)
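A possible filter for the exclusions above is sketched here. It assumes that namespace names are matched as prefixes of the normalised title, that infrastructure paths are matched as path prefixes, and that extensions are compared case-insensitively; the function name `is_excluded` is hypothetical.

```python
EXCLUDED_NAMESPACES = (
    "Special:", "Category:", "Category talk:", "Talk:", "User:",
    "User talk:", "File:", "Template:", "Help:", "MediaWiki:",
)
EXCLUDED_PATH_PREFIXES = (
    "/resources/", "/pub-dir/load.php", "/pub-dir/api.php",
    "/pub-dir/rest.php", "/robots.txt", "/favicon.ico",
)
STATIC_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", ".webp")


def is_excluded(path, title=None):
    """Return True if the record must be dropped before aggregation."""
    bare = path.split("#", 1)[0].split("?", 1)[0]  # path without fragment or query
    if bare.startswith(EXCLUDED_PATH_PREFIXES):
        return True
    if "sitemap" in path.lower():                  # any case
        return True
    if bare.lower().endswith(STATIC_EXTENSIONS):
        return True
    return title is not None and title.startswith(EXCLUDED_NAMESPACES)
```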
Metric Definition
- The only signal used is human_get_ok.
- Redirects and non-human classifications MUST NOT be included.
- No inference from other status codes or agents is permitted.
Temporal Aggregation
- Hourly buckets MUST be aggregated into daily totals per title.
- Accumulated value per title is defined as:
  - cum_hits(title, day_n) = Σ daily_hits(title, day_0 … day_n)
- Accumulation MUST be monotonic and non-decreasing.
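The aggregation above can be illustrated with a short sketch. The rollup TSV layout is not specified here, so the input is assumed to be already parsed into `(timestamp, title, human_get_ok)` tuples with non-negative counts; `cumulative_hits` is a hypothetical name.

```python
from collections import defaultdict
from datetime import timedelta
from itertools import accumulate


def cumulative_hits(hourly_records, titles):
    """Sum hourly buckets into daily totals, then accumulate per title."""
    daily = defaultdict(int)
    days_seen = set()
    for stamp, title, hits in hourly_records:
        day = stamp.date()
        days_seen.add(day)            # the date range covers all records
        if title in titles:
            daily[(title, day)] += hits
    if not days_seen:
        return [], {}
    first, last = min(days_seen), max(days_seen)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    series = {
        title: list(accumulate(daily.get((title, d), 0) for d in days))
        for title in sorted(titles)
    }
    return days, series
```

Because the daily totals are non-negative, each accumulated series is non-decreasing, and titles with no traffic yield a flat series of zeros.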
Axis and Scale Invariants
- X axis: calendar date from earliest to latest available day.
  - Major ticks every 7 days.
  - Minor ticks every day.
- Date labels MUST be rotated (oblique) for readability.
- Y axis MUST be logarithmic.
- Zero or negative values MUST NOT be plotted on the log axis.
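The tick and log-axis rules above can be prepared without any plotting library, as in this sketch. It assumes major ticks are anchored at the earliest day; `tick_dates` and `log_safe` are hypothetical helper names.

```python
import math
from datetime import timedelta


def tick_dates(first_day, last_day):
    """Return (major, minor): weekly ticks anchored at first_day, and daily ticks."""
    span = (last_day - first_day).days
    minor = [first_day + timedelta(days=i) for i in range(span + 1)]
    return minor[::7], minor


def log_safe(values):
    """Replace zero or negative values with NaN so they are not drawn on a log axis."""
    return [v if v > 0 else math.nan for v in values]
```

With Matplotlib, the results could then be applied via `ax.set_xticks(major)`, `ax.set_xticks(minor, minor=True)` and `ax.set_yscale("log")`; NaN entries leave a gap in the line rather than a plotted point.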
Legend Ordering
- Legend entries MUST be ordered by descending final accumulated human_get_ok.
- Ordering MUST be deterministic and reproducible.
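A sketch of the ordering rule above follows. Ties are not addressed by the rule, so breaking them alphabetically by title is an assumption made here purely to keep the order reproducible; `legend_order` is a hypothetical name.

```python
def legend_order(series):
    """Order titles by descending final accumulated value, then by title."""
    def final_value(title):
        values = series[title]
        return values[-1] if values else 0

    return sorted(series, key=lambda title: (-final_value(title), title))
```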
Visual Disambiguation Invariants
- Each title MUST be visually distinguishable.
- The same colour MAY be reused.
- The same line style MAY be reused.
- The same (colour + line style) pair MUST NOT be reused.
- Markers MAY be omitted or reused but MUST NOT be relied upon as the sole distinguishing feature.
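The pair-uniqueness rule above lends itself to a simple check before rendering. In this sketch, `assignments` is assumed to map each title to a `(colour, line style)` tuple, and `reused_pairs` is a hypothetical name.

```python
from collections import defaultdict


def reused_pairs(assignments):
    """Return each (colour, line style) pair that is shared by more than one title."""
    by_pair = defaultdict(list)
    for title, pair in assignments.items():
        by_pair[pair].append(title)
    return {pair: sorted(titles) for pair, titles in by_pair.items() if len(titles) > 1}
```

An empty result means every title is distinguishable; reusing a colour or a line style on its own is not reported, only a reused pair.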
Rendering Constraints
- Legend MUST be placed outside the plot area on the right.
- Sufficient vertical and horizontal space MUST be reserved to avoid label overlap.
- Line width SHOULD be consistent across series to avoid implied importance.
Interpretive Constraint
- This projection indicates reader entry and navigation behaviour only.
- High lead-in ranking MUST NOT be interpreted as quality, authority, or endorsement.
- Ordering reflects accumulated human access, not epistemic priority.
Periodic Regeneration
- This projection is intended to be regenerated periodically.
- Cross-run comparisons MUST preserve all invariants to allow valid temporal comparison.
- Changes in lead-in dominance (e.g. Plain-Language Summary vs. CM-1 foundation paper) are observational signals only and do not alter corpus structure.
Corpus Lead-In Projection: Deterministic Colour Map
This table provides the visual encoding for the core corpus pages. Titles not included in the Colour Map receive a provisional encoding under the New Page Admission Rule until a Colour Map entry exists.
Colours are drawn from the Matplotlib tab20 palette.
Line styles are assigned to ensure that no (colour + line-style) pair is reused. Legend ordering is governed separately by accumulated human_get_ok.
Corpus Lead-In Projection: Colour-Map Hardening Invariants
This section hardens the visual determinism of the Corpus Lead-In Projection while allowing controlled corpus growth.
Authority
- This Colour Map is **authoritative** for all listed corpus pages.
- The assisting system MUST NOT invent, alter, or substitute colours or line styles for listed pages.
- Visual encoding is a governed property, not a presentation choice.
Binding Rule
- For any page listed in the Deterministic Colour Map table:
  - The assigned (colour index, colour hex, line style) triple MUST be used exactly.
  - Deviation constitutes a projection violation.
Legend Ordering Separation
- Colour assignment and legend ordering are orthogonal.
- Legend ordering MUST continue to follow the accumulated human_get_ok invariant.
- Colour assignment MUST NOT be influenced by hit counts, rank, or ordering.
New Page Admission Rule
- Eligible pages not present in the current Colour Map MUST still appear in a projection.
- New pages MUST be assigned styles in strict sequence order:
  - Iterate line style first, then colour index, exactly as defined in the base palette.
  - Previously assigned pairs MUST NOT be reused.
- The assisting system MUST NOT reshuffle existing assignments to “make space”.
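The admission rule above can be sketched as below, under several loudly stated assumptions: "line style first" is read as the line style varying fastest within each colour index (the other reading would swap the two loops); the base palette is taken to be the 20 tab20 colours crossed with Matplotlib's four named line styles; and new titles are processed alphabetically so that hit counts play no part. The map is keyed by title with `(colour index, line style)` values, and `admit_new_pages` is a hypothetical name.

```python
COLOUR_INDICES = range(20)            # tab20 provides 20 colours
LINE_STYLES = ("-", "--", "-.", ":")  # assumed set of line styles


def admit_new_pages(colour_map, new_titles):
    """Assign the next unused pairs to new titles without touching existing rows."""
    used = set(colour_map.values())
    # Sequence order: line style varies fastest, then colour index (assumed reading).
    free = [(c, s) for c in COLOUR_INDICES for s in LINE_STYLES if (c, s) not in used]
    pending = sorted(set(new_titles) - set(colour_map))
    if len(pending) > len(free):
        raise ValueError("palette exhausted: not enough unused (colour, line style) pairs")
    return dict(zip(pending, free))
```

The function returns only the new, provisional assignments; the existing map is read but never modified, so nothing is reshuffled.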
Provisional Encoding Rule
- Visual assignments for newly admitted pages are **provisional** until recorded.
- A projection that introduces provisional encodings MUST:
  - Emit a warning note in the run metadata, and
  - Produce an updated Colour Map table for curator review.
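The two obligations above might be met along these lines. The shape of the run metadata and of the table rows is not defined here, so the dictionary keys, the `status` column and the name `provisional_report` are all hypothetical.

```python
def provisional_report(colour_map, provisional, run_date):
    """Build run metadata with a warning, plus table rows for curator review."""
    metadata = {"run_date": run_date, "warnings": []}
    if provisional:
        metadata["warnings"].append(
            f"{len(provisional)} provisional colour assignment(s) await curator ratification"
        )
    # Existing rows keep their order; provisional rows are appended after them.
    rows = [(title, c, s, "ratified") for title, (c, s) in colour_map.items()]
    rows += [(title, c, s, "provisional") for title, (c, s) in sorted(provisional.items())]
    return metadata, rows
```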
Curator Ratification
- Only the human curator may ratify new colour assignments.
- Ratification occurs by appending new rows to the Colour Map table with a date stamp.
- Once ratified, assignments become binding for all future projections.
Backward Compatibility
- Previously generated projections remain valid historical artefacts.
- Introduction of new pages MUST NOT retroactively alter the appearance of older projections.
Failure Mode Detection
- If a projection requires more unique (colour, line-style) pairs than the declared palette provides:
  - The assisting system MUST fail explicitly.
  - Silent reuse, substitution, or visual approximation is prohibited.
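An explicit failure of the kind described above can be sketched as a pre-flight check. The default capacity of 20 colours by 4 line styles is an assumption about the declared palette, and both `PaletteExhaustedError` and `check_palette_capacity` are hypothetical names.

```python
class PaletteExhaustedError(RuntimeError):
    """Raised when a projection needs more pairs than the palette provides."""


def check_palette_capacity(series_count, colour_count=20, style_count=4):
    """Return the number of spare pairs, or fail explicitly if there are too few."""
    capacity = colour_count * style_count
    if series_count > capacity:
        raise PaletteExhaustedError(
            f"{series_count} series requested but only {capacity} unique pairs exist"
        )
    return capacity - series_count
```

Raising an exception, rather than cycling back to the first pair, keeps the failure visible to the curator instead of producing an ambiguous plot.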
Rationale (Non-Normative)
- This hardening ensures:
  - Cross-run visual comparability
  - Human recognition of lead-in stability
  - Detectable drift when corpus structure changes
- Visual determinism is treated as part of epistemic governance, not aesthetics.
