Cautilus Promise Ledger

This ledger names the current Cautilus promises and cross-cutting rules. The names are the reader-facing addresses. Compact keys mirror those names for packets, tables, and checks.

Model index: Promise Ledger. View guide: How The Views Relate. Evidence state: Evidence State.

Workflow Promises

  • Readiness: Cautilus shows whether the selected repo is ready for a bounded workflow and what setup remains.
  • Claim Discovery: Cautilus turns selected source docs into source-referenced candidate claims and proof-planning work.
  • Behavior Evaluation: Cautilus evaluates intentful behavior across supported dev and app surfaces.
  • Bounded Improvement: Cautilus improves a selected behavior target under explicit budget, protected checks, and held-out evidence.

Workflow Audit Matrix

| promise | key | commitment | user workflow | maintainer evidence routes | cross-cutting rules |
| --- | --- | --- | --- | --- | --- |
| Readiness | promise.readiness | A user can tell whether Cautilus can operate in the selected repo and what setup is missing before spending workflow budget. | Readiness | Readiness And Runtime Status, Adapter And Host Ownership | Evidence Gaps, Vocabulary Consistency, Agent-Human Resumability, Host-Owned Execution |
| Claim Discovery | promise.claim-discovery | A user can scan selected source docs into broad source-referenced candidates and leave curation, likely false negatives, and proof planning visible. | Claim Discovery | Claim Discovery Workflow, Evidence State And Review Artifacts, Binary And Skill Boundary | Reviewable Artifacts, Evidence Gaps, Agent-Human Resumability, Host-Owned Execution |
| Behavior Evaluation | promise.evaluation | A user can evaluate intentful behavior across supported dev and app surfaces when deterministic tests alone do not explain the behavior. | Behavior Evaluation | Evaluation Surfaces And Runners, Reporting And Review Variants, Live Invocation Runtime | Reviewable Artifacts, Packet Freshness, Cost And Proof Freshness, Host-Owned Execution |
| Bounded Improvement | promise.improvement | A user can improve a selected behavior target while preserving intent, explicit budget, protected checks, held-out evidence, and reviewable revision artifacts. | Bounded Improvement | Improvement Loop, Scenario History And Proposal Normalization, Active Run And Workspace Lifecycle | Reviewable Artifacts, Evidence Gaps, Cost And Proof Freshness, Host-Owned Execution |
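Each rule named in the cross-cutting rules column of the matrix should also appear as a row in the Cross-Cutting Rules table. The following is a minimal sketch of that cross-reference check. It assumes the two tables have been copied by hand into plain JavaScript objects; the names `promiseRules`, `knownRules`, and `findUnknownRules` are hypothetical and not part of any Cautilus packet format or CLI.

```javascript
// Hypothetical in-memory copy of the matrix: promise key -> cross-cutting rules it cites.
const promiseRules = {
  "promise.readiness": ["Evidence Gaps", "Vocabulary Consistency", "Agent-Human Resumability", "Host-Owned Execution"],
  "promise.claim-discovery": ["Reviewable Artifacts", "Evidence Gaps", "Agent-Human Resumability", "Host-Owned Execution"],
  "promise.evaluation": ["Reviewable Artifacts", "Packet Freshness", "Cost And Proof Freshness", "Host-Owned Execution"],
  "promise.improvement": ["Reviewable Artifacts", "Evidence Gaps", "Cost And Proof Freshness", "Host-Owned Execution"],
};

// Rule names as listed in the Cross-Cutting Rules table.
const knownRules = new Set([
  "Reviewable Artifacts",
  "Evidence Gaps",
  "Host-Owned Execution",
  "Vocabulary Consistency",
  "Packet Freshness",
  "Cost And Proof Freshness",
  "Agent-Human Resumability",
]);

// Return "promise key: rule name" for every cited rule that has no row of its own.
function findUnknownRules(matrix, known) {
  const unknown = [];
  for (const [key, rules] of Object.entries(matrix)) {
    for (const rule of rules) {
      if (!known.has(rule)) unknown.push(`${key}: ${rule}`);
    }
  }
  return unknown;
}

const unknownRules = findUnknownRules(promiseRules, knownRules);
console.log(unknownRules.length === 0 ? "all rule references resolve" : unknownRules.join("\n"));
```

A real check would presumably read the names from the spec sources rather than from a hand-copied object, so that the check cannot drift from the tables it guards.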

Cross-Cutting Rules

| cross-cutting rule | key | commitment | user attachments | maintainer evidence routes |
| --- | --- | --- | --- | --- |
| Reviewable Artifacts | rule.reviewable-artifacts | Workflow output remains reopenable as machine-readable packets and readable reports. | Reviewable Artifacts | Evidence State And Review Artifacts, Reporting And Review Variants, Active Run And Workspace Lifecycle |
| Evidence Gaps | rule.evidence-gaps | Missing or weak evidence stays visible until the claim is proven, narrowed, deferred, or removed. | Evidence Gaps | Evidence State And Review Artifacts, Improvement Loop, Readiness And Runtime Status |
| Host-Owned Execution | rule.host-owned-execution | Host repos own prompts, models, credentials, runtime wiring, fixtures, and acceptance policy while Cautilus owns generic workflow packets and boundaries. | Host Ownership | Adapter And Host Ownership, Live Invocation Runtime, Binary And Skill Boundary |
| Vocabulary Consistency | rule.vocabulary-consistency | One concept keeps one name across user prose, maintainer specs, packets, tests, and Cautilus Agent guidance. | Readiness, Claim Discovery, Host Ownership | Binary And Skill Boundary, Adapter And Host Ownership, Evidence State And Review Artifacts |
| Packet Freshness | rule.packet-freshness | Readable views show current packets and source artifacts rather than stale copied state. | Reviewable Artifacts, Evidence Gaps, Behavior Evaluation | Evidence State And Review Artifacts, Reporting And Review Variants, Active Run And Workspace Lifecycle |
| Cost And Proof Freshness | rule.cost-and-proof-freshness | Expensive eval and improve evidence is shown honestly as selected, prepared, stale, or newly executed proof. | Behavior Evaluation, Bounded Improvement | Evaluation Surfaces And Runners, Improvement Loop, Scenario History And Proposal Normalization |
| Agent-Human Resumability | rule.agent-human-resumability | A human or agent can resume from durable packets, next actions, source refs, and source-bound review feedback that can become reusable learning evidence instead of chat memory. | Readiness, Claim Discovery, Host Ownership | Evidence State And Review Artifacts, Binary And Skill Boundary, Active Run And Workspace Lifecycle, Reporting And Review Variants |
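In the rule table above, every key is the rule name lowercased, with spaces replaced by hyphens and a `rule.` prefix. The sketch below checks that pattern. The helper `ruleKeyFromName` is hypothetical and is offered only as an illustration of how "compact keys mirror those names" could be verified. It deliberately covers rule keys only, because some promise keys are shortened (for example, Behavior Evaluation maps to `promise.evaluation`).

```javascript
// Hypothetical helper: derive the compact key the rule table appears to use.
function ruleKeyFromName(name) {
  return "rule." + name.trim().toLowerCase().replace(/\s+/g, "-");
}

// Name/key pairs copied from the Cross-Cutting Rules table.
const rules = [
  ["Reviewable Artifacts", "rule.reviewable-artifacts"],
  ["Evidence Gaps", "rule.evidence-gaps"],
  ["Host-Owned Execution", "rule.host-owned-execution"],
  ["Vocabulary Consistency", "rule.vocabulary-consistency"],
  ["Packet Freshness", "rule.packet-freshness"],
  ["Cost And Proof Freshness", "rule.cost-and-proof-freshness"],
  ["Agent-Human Resumability", "rule.agent-human-resumability"],
];

// Collect any row whose listed key differs from the derived one.
const mismatches = rules.filter(([name, key]) => ruleKeyFromName(name) !== key);
console.log(mismatches.length === 0 ? "all rule keys mirror their names" : mismatches);
```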

Ledger Checks

Verify that all primary reading views exist. The paths are relative, so run the command from the directory that contains docs/.
node -e 'const fs = require("node:fs"); for (const path of ["docs/specs/user/index.spec.md", "docs/specs/contracts/index.spec.md", "docs/specs/rules/index.spec.md", "docs/specs/evidence/index.spec.md"]) { if (!fs.existsSync(path)) throw new Error("missing " + path); }'
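The one-liner above stops at the first missing file. As a hedged alternative, the sketch below collects every missing path before reporting. It uses the same four paths; the names `PRIMARY_VIEWS` and `findMissingViews` are hypothetical, and the injectable `exists` parameter is an assumption added only so the function can be exercised without touching the file system.

```javascript
const fs = require("node:fs");

// The four primary reading views named in the ledger check.
const PRIMARY_VIEWS = [
  "docs/specs/user/index.spec.md",
  "docs/specs/contracts/index.spec.md",
  "docs/specs/rules/index.spec.md",
  "docs/specs/evidence/index.spec.md",
];

// Return every path for which the existence probe is false.
// `exists` defaults to fs.existsSync, which resolves relative to the current directory.
function findMissingViews(paths, exists = fs.existsSync) {
  return paths.filter((path) => !exists(path));
}

const missingViews = findMissingViews(PRIMARY_VIEWS);
console.log(
  missingViews.length === 0
    ? "all primary reading views exist"
    : "missing: " + missingViews.join(", ")
);
```

As written, the sketch only prints its result. To make it fail a CI step the way the one-liner does, one option is to set `process.exitCode = 1` when the list is non-empty.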