Cautilus Promise Ledger
This ledger names the current Cautilus promises and cross-cutting rules. The names are the reader-facing addresses. Compact keys mirror those names for packets, tables, and checks.
Model index: Promise Ledger. View guide: How The Views Relate. Evidence state: Evidence State.
Workflow Promises
- Readiness: Cautilus shows whether the selected repo is ready for a bounded workflow and what setup remains.
- Claim Discovery: Cautilus turns selected source docs into source-referenced candidate claims and proof-planning work.
- Behavior Evaluation: Cautilus evaluates intentful behavior across supported dev and app surfaces.
- Bounded Improvement: Cautilus improves a selected behavior target under explicit budget, protected checks, and held-out evidence.
Workflow Audit Matrix
| promise | key | commitment | user workflow | maintainer evidence routes | cross-cutting rules |
|---|---|---|---|---|---|
| Readiness | promise.readiness | A user can tell whether Cautilus can operate in the selected repo and what setup is missing before spending workflow budget. | Readiness | Readiness And Runtime Status, Adapter And Host Ownership | Evidence Gaps, Vocabulary Consistency, Agent-Human Resumability, Host-Owned Execution |
| Claim Discovery | promise.claim-discovery | A user can scan selected source docs into broad source-referenced candidates and leave curation, likely false negatives, and proof planning visible. | Claim Discovery | Claim Discovery Workflow, Evidence State And Review Artifacts, Binary And Skill Boundary | Reviewable Artifacts, Evidence Gaps, Agent-Human Resumability, Host-Owned Execution |
| Behavior Evaluation | promise.evaluation | A user can evaluate intentful behavior across supported dev and app surfaces when deterministic tests alone do not explain the behavior. | Behavior Evaluation | Evaluation Surfaces And Runners, Reporting And Review Variants, Live Invocation Runtime | Reviewable Artifacts, Packet Freshness, Cost And Proof Freshness, Host-Owned Execution |
| Bounded Improvement | promise.improvement | A user can improve a selected behavior target while preserving intent, explicit budget, protected checks, held-out evidence, and reviewable revision artifacts. | Bounded Improvement | Improvement Loop, Scenario History And Proposal Normalization, Active Run And Workspace Lifecycle | Reviewable Artifacts, Evidence Gaps, Cost And Proof Freshness, Host-Owned Execution |
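
The matrix above names cross-cutting rules by their reader-facing names, so one useful consistency check is that every rule a promise cites is actually defined in the Cross-Cutting Rules table. The following is a minimal sketch of such a check, not part of the Cautilus tooling: the data is transcribed by hand from this ledger, and `findUndefinedRules` is a hypothetical helper introduced only for illustration.

```javascript
// Rule names as listed in the Cross-Cutting Rules table of this ledger.
const definedRules = new Set([
  "Reviewable Artifacts",
  "Evidence Gaps",
  "Host-Owned Execution",
  "Vocabulary Consistency",
  "Packet Freshness",
  "Cost And Proof Freshness",
  "Agent-Human Resumability",
]);

// "cross-cutting rules" column of the Workflow Audit Matrix, keyed by promise key.
const promiseRules = {
  "promise.readiness": ["Evidence Gaps", "Vocabulary Consistency", "Agent-Human Resumability", "Host-Owned Execution"],
  "promise.claim-discovery": ["Reviewable Artifacts", "Evidence Gaps", "Agent-Human Resumability", "Host-Owned Execution"],
  "promise.evaluation": ["Reviewable Artifacts", "Packet Freshness", "Cost And Proof Freshness", "Host-Owned Execution"],
  "promise.improvement": ["Reviewable Artifacts", "Evidence Gaps", "Cost And Proof Freshness", "Host-Owned Execution"],
};

// Hypothetical helper: list every "promiseKey -> ruleName" pair whose rule is not defined.
function findUndefinedRules(matrix, rules) {
  const problems = [];
  for (const [promiseKey, ruleNames] of Object.entries(matrix)) {
    for (const name of ruleNames) {
      if (!rules.has(name)) problems.push(`${promiseKey} -> ${name}`);
    }
  }
  return problems;
}

const undefinedRules = findUndefinedRules(promiseRules, definedRules);
if (undefinedRules.length > 0) {
  throw new Error("undefined cross-cutting rules: " + undefinedRules.join(", "));
}
```

With the data as transcribed here, the check passes silently; a misspelled or removed rule name would surface as a thrown error naming the promise that cites it.
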
Cross-Cutting Rules
| cross-cutting rule | key | commitment | user attachments | maintainer evidence routes |
|---|---|---|---|---|
| Reviewable Artifacts | rule.reviewable-artifacts | Workflow output remains reopenable as machine-readable packets and readable reports. | Reviewable Artifacts | Evidence State And Review Artifacts, Reporting And Review Variants, Active Run And Workspace Lifecycle |
| Evidence Gaps | rule.evidence-gaps | Missing or weak evidence stays visible until the claim is proven, narrowed, deferred, or removed. | Evidence Gaps | Evidence State And Review Artifacts, Improvement Loop, Readiness And Runtime Status |
| Host-Owned Execution | rule.host-owned-execution | Host repos own prompts, models, credentials, runtime wiring, fixtures, and acceptance policy while Cautilus owns generic workflow packets and boundaries. | Host Ownership | Adapter And Host Ownership, Live Invocation Runtime, Binary And Skill Boundary |
| Vocabulary Consistency | rule.vocabulary-consistency | One concept keeps one name across user prose, maintainer specs, packets, tests, and Cautilus Agent guidance. | Readiness, Claim Discovery, Host Ownership | Binary And Skill Boundary, Adapter And Host Ownership, Evidence State And Review Artifacts |
| Packet Freshness | rule.packet-freshness | Readable views show current packets and source artifacts rather than stale copied state. | Reviewable Artifacts, Evidence Gaps, Behavior Evaluation | Evidence State And Review Artifacts, Reporting And Review Variants, Active Run And Workspace Lifecycle |
| Cost And Proof Freshness | rule.cost-and-proof-freshness | Expensive eval and improve evidence is shown honestly as selected, prepared, stale, or newly executed proof. | Behavior Evaluation, Bounded Improvement | Evaluation Surfaces And Runners, Improvement Loop, Scenario History And Proposal Normalization |
| Agent-Human Resumability | rule.agent-human-resumability | A human or agent can resume from durable packets, next actions, source refs, and source-bound review feedback that can become reusable learning evidence instead of chat memory. | Readiness, Claim Discovery, Host Ownership | Evidence State And Review Artifacts, Binary And Skill Boundary, Active Run And Workspace Lifecycle, Reporting And Review Variants |
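
The rule keys in this table follow the rule names closely: lowercase the name, join the words with hyphens, and prefix `rule.`. A minimal sketch of that derivation is shown here, assuming the pattern observed in the seven rows above is the intended one. `ruleKey` is a hypothetical helper, not a Cautilus API, and it does not apply to the promise keys, two of which are shorter than their names (`promise.evaluation`, `promise.improvement`).

```javascript
// Hypothetical helper: derive a compact rule key from a reader-facing rule name.
function ruleKey(name) {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into single hyphens
    .replace(/^-|-$/g, ""); // trim a stray leading or trailing hyphen
  return "rule." + slug;
}

// Name and key pairs transcribed from the Cross-Cutting Rules table.
const ledgerRules = [
  ["Reviewable Artifacts", "rule.reviewable-artifacts"],
  ["Evidence Gaps", "rule.evidence-gaps"],
  ["Host-Owned Execution", "rule.host-owned-execution"],
  ["Vocabulary Consistency", "rule.vocabulary-consistency"],
  ["Packet Freshness", "rule.packet-freshness"],
  ["Cost And Proof Freshness", "rule.cost-and-proof-freshness"],
  ["Agent-Human Resumability", "rule.agent-human-resumability"],
];

// Report every row whose listed key differs from the derived key.
const keyMismatches = ledgerRules.filter(([name, key]) => ruleKey(name) !== key);
if (keyMismatches.length > 0) {
  throw new Error("key does not mirror name: " + keyMismatches.map(([name]) => name).join(", "));
}
```
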
Ledger Checks
Verify all primary reading views exist.
node -e 'const fs = require("node:fs"); for (const path of ["docs/specs/user/index.spec.md", "docs/specs/contracts/index.spec.md", "docs/specs/rules/index.spec.md", "docs/specs/evidence/index.spec.md"]) { if (!fs.existsSync(path)) throw new Error("missing " + path); }'
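
The one-liner above stops at the first missing path. When several views are absent at once, it can be more convenient to see them all in one pass. The following is a sketch of that variant under stated assumptions: `missingViews` and its `rootDir` parameter are hypothetical names introduced here, and only the four paths come from the check above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// The four primary reading views named in the ledger check.
const primaryViews = [
  "docs/specs/user/index.spec.md",
  "docs/specs/contracts/index.spec.md",
  "docs/specs/rules/index.spec.md",
  "docs/specs/evidence/index.spec.md",
];

// Hypothetical helper: return every listed path that does not exist under rootDir.
function missingViews(paths, rootDir = ".") {
  return paths.filter((relativePath) => !fs.existsSync(path.join(rootDir, relativePath)));
}

// Example use from the repo root (left commented out so the sketch has no side effects):
//   const missing = missingViews(primaryViews);
//   if (missing.length > 0) throw new Error("missing " + missing.join(", "));
```
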