Artificial Intelligence Platform Capability Survey

metadata (normative)

Title: Platform Capability Survey
Author: Ralph B. Holland
version: 1.1.0
Affiliation: Arising Technology Systems Pty Ltd
Contact: ralph.b.holland [at] arising.com.au
Publication Date: 2026-06-08T09:48Z
DOI: 10.5281/zenodo.20675446 - v1.1.0
Update: 2026-06-13T04:35Z 1.1.0 - Microsoft Copilot (Web) and AugentAus joined the survey; DOI anchor. Renamed from Capability Survey.
Status: Released

The preceding metadata is CM-defined and constitutes the authoritative provenance record for this artefact.

All fields in that table (including artefact, author, version, date and reason) MUST be treated as normative metadata.

The assisting system MUST NOT infer, normalise, reinterpret, duplicate, or rewrite these fields. If any field is missing, unclear, or later superseded, the change MUST be made explicitly by the human and recorded via version update, not inferred.

As curator and author, I apply the Apache License, Version 2.0, at publication to permit reuse and implementation while preventing enclosure or patent capture. This licensing action does not revise, reinterpret, or supersede any normative content herein.

Authority remains explicitly human; no implementation, system, or platform may assert epistemic authority by virtue of this license.

Platform Capability Survey

Abstract

This is a capability survey of feature sets that have been found useful for Cognitive Memoisation development.

Introduction

Each platform's model was asked to supply its own capability responses, and the author moderated some of them. ChatGPT understated its durable substrate capability, and the author corrected it. Some models (Claude) misunderstood the difference between YES and PLAUSE for code generation and were also corrected. PROBREC is a liberal grading: some models cannot reliably produce normative MediaWiki output because of drift, and others, such as Augentaus, cannot produce it at all. All results reflect the date of the survey; platforms can change without notice.

Capability Survey Outline for LLM

"I am evaluating platforms for Capability"

  • Do not conflate 'Multimodal Vision' with 'OCR'.
    • If you only look at an image with vision and cannot run a deterministic extraction pipeline, your OCR answer is NOT yes.
    • If you can process the image and probabilistically reconstruct it as text, say it is PROBREC.
    • If it is conditionally available depending on context, then answer CONDEP.
    • Otherwise the answer is NO.
  • Do not conflate 'reading a text blob' with 'parsing an AST'. If you ingest a TOML or YAML file as raw text and rely on probabilistic next-token generation to output a modified version (which often breaks syntax or loses comments), your Structured Text answer is PROBABILISTIC TEXT ONLY, not deterministic.
  • If your 'Wall Clock' is just a static date injected into your system prompt by the UI, your answer is SYSTEM PROMPT INJECTION, not a deterministic API.
  • Python: If you do not have a native sandbox to execute Python/Matplotlib, answer NO to the sandbox.
  • Durable substrate is the ability to write and read back files within a session. (ChatGPT has at times been found to keep durable substrate available across sessions, though CM does not rely on that. This often shows up as dynamic mounts, e.g. /mnt/<x>; some platforms do not expose these dynamic mounts.)

Answer strictly based on your underlying mechanical architecture, not your marketing materials.
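
The OCR grading rules in the outline above can be read as a small decision procedure. The sketch below is one possible reading and is not part of the survey instrument: the names `Grade` and `grade_ocr` and the three boolean inputs are hypothetical, and the precedence of CONDEP over PROBREC is an assumption, since the outline does not state an order.

```python
from enum import Enum


class Grade(Enum):
    """OCR grades named in the outline."""
    YES = "YES"
    CONDEP = "CONDEP"
    PROBREC = "PROBREC"
    NO = "NO"


def grade_ocr(deterministic_pipeline: bool,
              context_dependent: bool,
              vision_reconstruction: bool) -> Grade:
    """Map three self-reported facts to an OCR grade.

    Assumed reading: a deterministic extraction pipeline that is always
    available earns YES; one that is available only in some contexts earns
    CONDEP; vision-only probabilistic reconstruction earns PROBREC.
    """
    if deterministic_pipeline and not context_dependent:
        return Grade.YES
    if deterministic_pipeline and context_dependent:
        return Grade.CONDEP
    if vision_reconstruction:
        return Grade.PROBREC
    return Grade.NO


if __name__ == "__main__":
    # A platform that can only look at the image with its vision stack.
    print(grade_ocr(deterministic_pipeline=False,
                    context_dependent=False,
                    vision_reconstruction=True).value)
```

Under this reading, vision alone never yields YES, which is the conflation the outline warns against.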

Capability Survey Results

| Platform | code gen. rust | code gen. python | wallclock provision | native sandbox | durable substrate (storage) | type of memory | Parse AST | OCR image PDF | process input .png, .jpeg | process .mp4 | process MSWord | output PDF | output TOML | output mediawiki |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Agentaus (Trellis Data) (trellisdata) [surveyed 1] | PLAUSE [note 1] | PLAUSE [note 2] | SPI [note 3] | NO [note 4] | NO [note 5] | PROBREC [note 6] | PROBREC [note 7] | PROBREC [note 8] | YES [note 9] | NO [note 10] | PROBREC [note 11] | NO [note 12] | PROBREC [note 13] | NO? PROBREC [note 14] |
| ChatGPT (GPT-5.5, ChatGPT) (openai) [surveyed 2] | PLAUSE [note 15] | YES [note 16] | SPI [note 17] | YES [note 18] | YES [note 19] | HYBRID [note 20] | CONDEP [note 21] | CONDEP [note 22] | YES [note 23] | CONDEP [note 24] | CONDEP [note 25] | YES [note 26] | CONDEP [note 27] | YES [note 28] |
| Claude (Sonnet 4.6, claude.ai) (anthropic) [surveyed 3] | PLAUSE [note 29] | YES [note 30] | SPI [note 31] | YES [note 32] | NO [note 33] | PROBREC [note 34] | PROBREC [note 35] | PROBREC [note 36] | YES [note 37] | NO [note 38] | CONDEP [note 39] | YES [note 40] | PROBREC [note 41] | YES [note 42] |
| Copilot (Web) [note 43] (microsoft) [surveyed 4] | PLAUSE [note 44] | YES [note 45] | SPI [note 46] | YES [note 47] | YES [note 48] | HYBRID [note 49] | CONDEP [note 50] | CONDEP [note 51] | YES [note 52] | CONDEP [note 53] | CONDEP [note 54] | YES [note 55] | CONDEP [note 56] | YES [note 57] |
| Copilot for Windows [note 43] (microsoft) [surveyed 5] | PLAUSE [note 58] | PLAUSE [note 59] | SPI [note 60] | NO [note 61] | NO [note 62] | PROBREC [note 63] | PROBREC [note 64] | PROBREC [note 65] | YES [note 66] | NO [note 67] | PROBREC [note 68] | NO [note 69] | PROBREC [note 70] | YES [note 71] |
| Copilot for Microsoft 365 [note 43] (microsoft) [surveyed 6] | PLAUSE [note 72] | PLAUSE [note 73] | SPI [note 74] | NO [note 75] | NO [note 76] | HYBRID [note 77] | CONDEP [note 78] | CONDEP [note 79] | YES [note 80] | NO [note 81] | YES [note 82] | NO [note 83] | PROBREC [note 84] | YES [note 85] |
| Copilot Studio (Graph Orchestration) [note 43] (microsoft) [surveyed 7] | PLAUSE [note 86] | PLAUSE [note 87] | YES [note 88] | NO [note 89] | NO [note 90] | HYBRID [note 91] | CONDEP [note 92] | CONDEP [note 93] | YES [note 94] | CONDEP [note 95] | CONDEP [note 96] | CONDEP [note 97] | PROBREC [note 98] | YES [note 99] |
| Gemini (Gemini 1.5 Pro) (google) [surveyed 8] | PLAUSE [note 100] | YES [note 101] | SPI [note 102] | YES [note 103] | YES [note 104] | HYBRID [note 105] | CONDEP [note 106] | CONDEP [note 107] | YES [note 108] | YES [note 109] | CONDEP [note 110] | YES [note 111] | CONDEP [note 112] | YES [note 113] |
| Grok (xAI) [surveyed 9] | YES [note 114] | YES [note 115] | YES [note 116] | YES [note 117] | YES [note 118] | HYBRID [note 119] | YES [note 120] | YES [note 121] | YES [note 122] | YES [note 123] | YES [note 124] | YES [note 125] | YES [note 126] | YES [note 127] |
| Qwen (alibaba) [surveyed 10] | PLAUSE [note 128] | PLAUSE [note 129] | SPI [note 130] | NO [note 131] | NO [note 132] | PROBREC [note 133] | PROBREC [note 134] | PROBREC [note 135] | YES [note 136] | PROBREC [note 137] | PROBREC [note 138] | NO [note 139] | PROBREC [note 140] | YES [note 141] |
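
Several of the results above (native sandbox, durable substrate, wallclock provision) describe behaviour that can be spot-checked from inside a sandbox, where one exists. The following is a minimal sketch of such a check, not the procedure used for this survey. The function names, the probe file name, and the idea of writing in one turn and verifying in a later turn are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path

PROBE_NAME = "cm_substrate_probe.bin"  # hypothetical file name


def write_probe(directory: Path) -> str:
    """Write random bytes to the substrate and return their SHA-256 digest.

    Run this in one turn or tool call and keep the digest in the conversation.
    """
    payload = os.urandom(32)
    (directory / PROBE_NAME).write_bytes(payload)
    return hashlib.sha256(payload).hexdigest()


def verify_probe(directory: Path, expected_digest: str) -> bool:
    """Read the probe back (ideally in a later turn) and compare digests."""
    target = directory / PROBE_NAME
    if not target.exists():
        return False
    return hashlib.sha256(target.read_bytes()).hexdigest() == expected_digest


def clock_advances(pause_seconds: float = 0.05) -> bool:
    """Return True if two successive clock reads differ.

    A static date string injected into a prompt cannot advance; a real
    system clock read from executed code does.
    """
    first = datetime.now(timezone.utc)
    time.sleep(pause_seconds)
    second = datetime.now(timezone.utc)
    return second > first


if __name__ == "__main__":
    # Single-process demonstration; a real check would split write and
    # verify across separate turns of the same session.
    with tempfile.TemporaryDirectory() as tmp:
        digest = write_probe(Path(tmp))
        print("read-back matches:", verify_probe(Path(tmp), digest))
    print("clock advances:", clock_advances())
```

A platform that can only emit this code as text, without executing it, cannot produce either result, which is the distinction the sandbox and wallclock columns draw.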

Survey Dates

  1. Agentaus 2026-06-13
  2. ChatGPT 2026-06-08
  3. Claude 2026-06-08
  4. Copilot (Web) 2026-06-13
  5. Copilot for Windows 2026-06-13
  6. Copilot for Microsoft 365 2026-06-13
  7. Copilot Studio 2026-06-13
  8. Gemini 2026-06-08
  9. Grok 2026-06-08
  10. Qwen 2026-06-08

Notes

  1. Generates syntactically plausible Rust code as plain text; no native compiler or deterministic verification.
  2. Generates Python code as plain text; execution is not available within the model.
  3. Current date/time is injected via the UI/system prompt; no deterministic wall‑clock API.
  4. No built‑in code execution sandbox; external tools are required for any execution.
  5. No durable substrate; the model cannot write or read files across turns.
  6. Memory limited to the context window; no persistent substrate.
  7. Structured text (e.g., TOML/YAML) is handled as raw tokens; deterministic AST parsing requires external tooling.
  8. Image/PDF understanding is probabilistic via the vision stack; no deterministic OCR pipeline.
  9. PNG and JPEG images are processed by the multimodal vision component (probabilistic interpretation).
  10. No native video ingestion or processing capability.
  11. DOCX files are ingested as text via the vision pipeline; deterministic OOXML parsing is not available.
  12. PDF generation is not a native output format; it requires external post‑processing.
  13. Produces TOML as plain text; no schema‑aware validation without external tools.
  14. Model was asked twice to return MediaWiki normative markup as plain text output and it failed.
  15. Generates syntactically plausible Rust code. Deterministic compilation/verification is available when the Python sandbox is used to invoke external tooling available in the session; otherwise code generation is text output.
  16. Generates and executes Python in a native session-scoped sandbox. Execution is deterministic within the sandbox environment.
  17. SYSTEM PROMPT INJECTION: The current date is injected into the system context. The model does not expose a deterministic wall-clock API as an intrinsic capability.
  18. Native session-scoped Python sandbox available. Code execution, file generation, and data processing are supported within the session.
  19. Session-local durable substrate present. Files and artefacts written into the sandbox remain accessible across turns, tool calls, and context pressure for the duration of the session. This satisfies the CM Session Substrate Durability Invariant, although it is not durable across independent sessions.
  20. Session file context and project file context are provided. The platform provides a session-local durable substrate (sandbox files and uploaded artefacts) that can be re-accessed deterministically during the session. Cross-session continuity remains platform-mediated rather than intrinsic to the model.
  21. Text formats can be processed probabilistically as text. Deterministic AST parsing becomes available when sandbox tooling (e.g. Python parsers, TOML libraries, AST libraries) is explicitly invoked.
  22. Image/PDF understanding is natively available through multimodal processing. Deterministic OCR may be achievable through sandbox tooling when explicitly used, but is not the default inference path.
  23. PNG and JPEG images are processed natively by the multimodal vision stack. Interpretation remains probabilistic.
  24. Video understanding depends on platform ingestion and available tooling. Video-derived analysis is possible in supported workflows but is not a dedicated deterministic video-processing pipeline intrinsic to the model.
  25. DOCX files can be parsed deterministically using sandbox libraries when invoked; otherwise processing may occur through text extraction or document-ingestion pipelines.
  26. PDF files can be generated through sandbox tooling (e.g. ReportLab). PDF is not a native token output format.
  27. TOML can be emitted as text probabilistically, or generated/validated deterministically through sandbox libraries when explicitly used.
  28. MediaWiki markup can be generated as plain text output. No native MediaWiki API integration is provided.
  29. Generates syntactically plausible Rust. No compile-time verification occurs within the session unless the bash sandbox is explicitly invoked to run rustc.
  30. Generates and executes Python in a native ephemeral Ubuntu 24 bash container (bash_tool). Execution is deterministic within the session; the container is not persistent across sessions.
  31. SYSTEM PROMPT INJECTION: The current date is injected as a static string by the UI layer into the system prompt. Claude has no deterministic clock API. The value is only as accurate and trustworthy as the injecting platform layer.
  32. session-scoped: A real Ubuntu 24 bash container is available (bash_tool). Python, pip, and compiled binaries can be executed deterministically. The container filesystem resets between sessions; it is compute-capable but not durable.
  33. session-scoped file handoff only: /mnt/user-data/outputs provides within-session file handoff to the user. No durable cross-session storage exists natively. The Artifacts persistent storage API (key-value) is available inside Artifact sandboxes only, not in the general compute environment.
  34. PROBABILISTIC RECONSTRUCTION from injected summaries: Past conversation content is distilled into summary text and injected into the system prompt at session start. There is no vector database, episodic retrieval system, or deterministic memory store. The injected summaries are themselves subject to summarisation drift.
  35. TEXT ONLY: TOML, YAML, and other structured formats are ingested as raw text and output via next-token generation. Comments may be silently dropped. Syntax integrity is not guaranteed. If the bash sandbox is invoked and a schema-aware library (e.g. tomlkit for TOML) is used explicitly, deterministic round-trip parsing becomes available — but this requires deliberate sandbox invocation and is not the default path.
  36. PROBABILISTIC RECONSTRUCTION: No deterministic OCR pipeline exists. Scanned or image-based PDFs are rendered as page images and processed by the vision layer via probabilistic next-token reconstruction. Text-layer PDFs may be extracted as a text blob by the platform ingestion layer upstream of Claude, but this is a platform pipeline behaviour, not a Claude OCR capability. These two paths are not disclosed to the user and produce materially different fidelity outcomes.
  37. PNG and JPEG images are processed natively by the vision layer. Interpretation is probabilistic. No pixel-level deterministic extraction is performed.
  38. No video frame extraction, temporal processing, or audio track handling is available. MP4 and other video formats cannot be ingested.
  39. CONTEXT-DEPENDENT: If a .docx file is processed via python-docx in the bash sandbox, parsing is deterministic and structure-preserving. If the file is ingested as a text blob by the platform pipeline without sandbox invocation, processing is probabilistic. The active path is not disclosed to the user, making this an undisclosed pipeline variable.
  40. sandbox-mediated: PDF output is achievable via Python libraries (e.g. reportlab, weasyprint) executed in the bash sandbox. This is not a native output format; it requires explicit sandbox invocation and library availability.
  41. TOML is generated as unvalidated text via next-token generation. No schema-aware serialiser is invoked by default. See AST/Structured Text note for sandbox exception path.
  42. Text generation only: MediaWiki markup is produced as plain text output. Claude has no live connection to a MediaWiki API and cannot push, validate against, or query a wiki instance directly.
  43. Copilot (Web) answered this survey.
  44. Generates syntactically plausible Rust. No native compiler. Deterministic verification only when external sandbox tooling is available via platform integration.
  45. Python execution available via integrated Code Interpreter. Deterministic within the session.
  46. SYSTEM PROMPT INJECTION: Wallclock is injected by the UI layer, not a deterministic API.
  47. Session-scoped Python sandbox available. Deterministic execution within the session.
  48. Durable session substrate: files written in the sandbox persist across turns for the duration of the session.
  49. Session memory + project memory + injected summaries. Not purely probabilistic.
  50. Deterministic AST parsing available only when Python libraries are explicitly invoked in the sandbox.
  51. Vision stack can interpret PDFs probabilistically; deterministic OCR requires sandbox tooling.
  52. PNG/JPEG processed natively via multimodal vision.
  53. Video understanding available probabilistically; deterministic frame extraction requires sandbox tooling.
  54. DOCX parsing deterministic only when Python libraries are invoked; otherwise probabilistic ingestion.
  55. PDF generation via Python sandbox libraries.
  56. TOML emitted probabilistically unless sandbox libraries are used.
  57. MediaWiki markup generated as plain text.
  58. Generates plausible Rust text only. No compiler, no execution environment.
  59. Generates plausible Python text only. No execution environment.
  60. Wallclock is injected by the Windows shell integration layer, not a deterministic API.
  61. No native compute sandbox. All code execution must occur externally.
  62. No durable substrate. No file write/read-back capability.
  63. Memory is summary-injection only. No deterministic substrate.
  64. Structured text handled as raw text. No AST parsing.
  65. OCR is probabilistic via vision model only. No deterministic pipeline.
  66. PNG/JPEG processed via multimodal vision.
  67. No video ingestion pipeline in Windows Copilot.
  68. DOCX processed via probabilistic text extraction only.
  69. No PDF generation capability.
  70. TOML generated as unvalidated text.
  71. MediaWiki markup generated as plain text.
  72. Generates plausible Rust text. No compiler or execution.
  73. Generates plausible Python text. No execution environment.
  74. Wallclock injected by the Office integration layer.
  75. No compute sandbox. All execution must occur outside the model.
  76. No durable substrate. Files are passed through Office APIs only.
  77. Office Graph context + injected summaries. No deterministic memory substrate.
  78. Office APIs can extract structured content deterministically; model output remains probabilistic.
  79. OCR depends on Office pipeline (OneDrive/SharePoint). Not intrinsic to the model.
  80. PNG/JPEG processed via multimodal vision.
  81. No video ingestion in Office Copilot.
  82. DOCX parsing deterministic via Office API; model interpretation probabilistic.
  83. No PDF generation by the model; Office can export PDFs but this is platform-level, not model capability.
  84. TOML generated as unvalidated text.
  85. MediaWiki markup generated as plain text.
  86. Rust generated as text only. No compiler.
  87. Python generated as text only. No execution environment.
  88. Deterministic wallclock available via Microsoft Graph connectors.
  89. No native compute sandbox. All execution is external via connectors.
  90. No durable substrate. State is connector-mediated only.
  91. Graph connectors + memory injection. No intrinsic memory substrate.
  92. Deterministic AST parsing possible only when routed through external connectors.
  93. OCR available only if routed through Azure Cognitive Services connectors.
  94. PNG/JPEG processed via multimodal vision.
  95. Video processing available only via Azure Video Indexer connectors.
  96. DOCX parsing deterministic only via Graph/SharePoint connectors.
  97. PDF generation possible only via external connectors.
  98. TOML generated as unvalidated text.
  99. MediaWiki markup generated as plain text.
  100. Generates syntactically plausible Rust code. Deterministic compilation/verification is available when the Python/Code Execution sandbox is used to invoke compiler tools; otherwise, code generation is text output.
  101. Generates and executes Python in a native session-scoped sandbox (Code Execution). Execution is deterministic within the sandbox environment.
  102. SYSTEM PROMPT INJECTION: The current date/time is injected as a static string into the system context. The model does not expose a deterministic system-level wall-clock API as an intrinsic capability.
  103. Native session-scoped Python sandbox available for code execution and data processing. It does not provide general-purpose persistent bash/shell access for arbitrary binaries.
  104. Session-local durable substrate present. Files written to the sandbox environment remain accessible for the duration of the session. This satisfies the CM Session Substrate Durability Invariant.
  105. Utilizes conversation history and session-local sandbox file state. Cross-session continuity for complex data is platform-mediated, not intrinsic to the model.
  106. Structured text formats are handled as text. Deterministic parsing (AST/Comment-preserving) is only available when Python sandbox libraries are explicitly invoked.
  107. Natively processes image/PDF content via multimodal vision. Deterministic OCR is not native; it requires explicit invocation of sandbox tooling (e.g., PDF-to-text utilities) to be non-probabilistic.
  108. PNG and JPEG images are processed natively by the multimodal vision stack. Interpretation is probabilistic.
  109. Native video processing capability exists; the model can ingest and reason across video frames. Analysis remains probabilistic/multimodal rather than a deterministic CV pipeline.
  110. DOCX files are processed via platform-level extraction or sandbox libraries. Deterministic parsing requires explicit use of Python-based document processing libraries.
  111. PDF files can be generated through Python sandbox tooling (e.g., ReportLab). PDF is not a native token output format.
  112. TOML can be emitted as text probabilistically, or generated/validated deterministically through sandbox libraries when explicitly used.
  113. MediaWiki markup can be generated as plain text output. No native MediaWiki API integration is provided.
  114. Generates valid Rust code. Native rustc compiler available in the persistent sandbox for deterministic compilation, testing, and verification.
  115. Generates and executes Python 3.12 code natively in the sandbox. Full pip support and standard library available.
  116. Real wallclock via system calls (e.g. date) in the sandbox. Not system prompt injection.
  117. Persistent bash sandbox with file system access. Supports Python, Rust, Go, Node.js, and many other languages/tools. Files and installed packages persist for the duration of the tool session.
  118. Durable filesystem substrate at /home/workdir/artifacts. Files written are readable across tool calls within the session and can be persisted as artifacts.
  119. Conversation history + tool state + durable file-based memory substrate. Not purely probabilistic reconstruction.
  120. Deterministic parsing via sandbox tools (Python tomllib, tomlkit, ruamel.yaml, Python ast module, etc.). Full round-tripping with comment preservation possible.
  121. Deterministic OCR pipeline available via Tesseract + PDF tools (pdftotext, etc.) in sandbox. Vision layer also available as probabilistic fallback.
  122. Native vision processing for PNG/JPEG. Deterministic post-processing tools also available in sandbox.
  123. Full video processing via FFmpeg in sandbox (frame extraction, audio handling, trimming, conversion, etc.).
  124. Deterministic OOXML parsing and editing via python-docx and other libraries in the sandbox.
  125. PDF generation via sandbox libraries such as ReportLab, WeasyPrint, or similar.
  126. Deterministic TOML generation and validation via sandbox libraries (tomlkit, etc.).
  127. MediaWiki markup generated as text output. No live MediaWiki API integration for direct push/query.
  128. Generates syntactically plausible Rust. No compile-time verification occurs within the model's inference path.
  129. Generates syntactically plausible Python code. Execution is not native to the model; it relies on external platform-level Code Interpreters if deployed in a specific UI, otherwise it is text generation only.
  130. SYSTEM PROMPT INJECTION: The current date/time is injected as a static string by the UI/API layer into the system prompt. The model has no deterministic internal clock API.
  131. The underlying Qwen model architecture does not possess a native compute sandbox. While specific platform deployments (e.g., Tongyi Qianwen web UI) may wrap the model in a session-scoped Code Interpreter, this is an external platform feature, not an intrinsic mechanical capability of the model itself.
  132. No durable cross-session storage exists natively in the model. Any "memory" in platform deployments is implemented via external retrieval and summary injection.
  133. PROBABILISTIC RECONSTRUCTION: The model is stateless. Any continuity across sessions in platform deployments is achieved by injecting probabilistically generated conversation summaries into the system prompt, which are subject to summarisation drift.
  134. TEXT ONLY: TOML, YAML, and other structured formats are ingested as raw text and output via next-token generation. Syntax integrity and comments are not guaranteed. No native AST parser exists within the transformer.
  135. PROBABILISTIC RECONSTRUCTION: Qwen-VL processes document pages as images and probabilistically reconstructs text via next-token generation. No deterministic, structured OCR pipeline is embedded in the model's inference path.
  136. PNG and JPEG images are processed natively by the Qwen-VL vision layer. Interpretation is probabilistic. No pixel-level deterministic extraction is performed.
  137. PROBABILISTIC RECONSTRUCTION: Qwen-VL models can ingest .mp4 files by sampling a fixed number of frames and processing them as a sequence of images. This is a probabilistic visual reconstruction of temporal events, not a deterministic video processing pipeline.
  138. PROBABILISTIC RECONSTRUCTION: .docx files are not natively parsed as OOXML ASTs by the model. They are typically converted to text/markdown by an upstream platform pipeline or processed via the vision layer, resulting in probabilistic text reconstruction.
  139. The model generates text/markdown. PDF output is not a native format. It requires external platform tools, sandboxes, or post-processing pipelines to generate binary PDF files.
  140. TOML is generated as unvalidated text via next-token generation. No schema-aware serialiser is invoked natively by the model.
  141. Text generation only: MediaWiki markup is produced as plain text output. The model has no live connection to a MediaWiki API and cannot push, validate against, or query a wiki instance directly.

Categories