# Inventory and gates
Mushi v1 was the negative side of the loop: catch what your users
felt break, classify it, dedupe it, optionally draft a fix. v2 adds
the positive side: a declarative `inventory.yaml` that names every
user-facing story, page, and action — plus five CI gates, rolled into
one composite status, that fail the build when the agent's drafts
diverge from the contract.
## The model
```
project ──► story ──► page ──► element ──► action
                                              │
                                              └─► expected_outcome
```

Each action has a trigger (click, type, submit) and an
`expected_outcome` block — the assertions a healthy session would still
make. The synthetic monitor re-walks those expectations on a cron, the
crawler discovers candidate routes the SDK has observed in production,
and the gates fail when reality diverges from the file.
`expected_outcome` is the load-bearing field for the v2 closed loop. It
threads through every stage of the agent pipeline — see
*Spec traceability — read AND write side* below for the full chain.
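A minimal sketch of the shape described above, assembled from the field names that appear on this page (the `signup` example in the prompt block further down). Treat it as illustrative only — the authoritative schema is `@mushi-mushi/inventory-schema`, and the exact nesting and key names here are assumptions:

```yaml
# Illustrative sketch — validate real files against @mushi-mushi/inventory-schema.
project: my-app
stories:
  - id: signup
    title: New user signup
    pages:
      - id: signup
        path: /signup
        elements:
          - id: signup-form
            actions:
              - id: submit
                trigger: submit
                expected_outcome:
                  summary: POST /signup returns 200 and creates a user row
                  status_in: [200, 201]
                  body:
                    - json_path: $.user.id   # assertion: field exists
                  ui:
                    shows_text: Welcome
                    navigates_to: /dashboard
```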
## Five gates
The composite GitHub status `mushi-mushi/gates` rolls up:

1. `mushi-mushi/no-dead-handler` — empty `onClick` / `onSubmit` handlers (and similar) surfaced by `eslint-plugin-mushi-mushi`.
2. `mushi-mushi/no-mock-leak` — faker / `John Doe` arrays in non-test paths.
3. Inventory drift — actions added, removed, or renamed since the last push. Drift requires a changeset entry; the run will fail if the diff is uncommitted.
4. Agentic-failure detection — handlers that used to satisfy `expected_outcome` and no longer do (regression across deploys).
5. Synthetic walk health — the synthetic monitor's last walk against staging.

The composite check is what your branch protection should require — gates 1 + 2 are static; gates 3 + 4 + 5 talk to the Mushi gateway.
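Wiring the gates into CI might look like the sketch below. That `@mushi-mushi/mcp-ci` is the GitHub Action running all five gates comes from this page; the action ref, input names, and secret name are assumptions — check the package's own README for the real interface:

```yaml
# Hypothetical workflow — action ref and input names are assumed, not documented here.
name: mushi-gates
on: [pull_request]
jobs:
  gates:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: mushi-mushi/mcp-ci@v2                    # assumed ref
        with:
          api-key: ${{ secrets.MUSHI_API_KEY }}        # assumed input / secret name
```

With something like this in place, branch protection only needs to require the single `mushi-mushi/gates` status, not the five individual contexts.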
## Discovery — the SDK proposes the inventory
Most teams will never hand-author `inventory.yaml`. Turn on the new
v2.1 SDK option:

```tsx
<MushiProvider config={{
  projectId: '…',
  apiKey: '…',
  capture: { discoverInventory: true },
}}>
```

The SDK quietly observes routes, `data-testid`s, and outbound API paths
in production. Claude reads the stream and drafts an `inventory.yaml`
in the Discovery tab on the User stories surface. You accept,
edit, or skip — the production payload itself never carries user data.
## Synthetic monitor
The monitor re-walks every action's `expected_outcome` against your
staging URL (or any explicit `synthetic_target_url`). Defaults are
fail-closed: write paths (POST / PUT / DELETE actions) are skipped
unless you opt in per-project:

```yaml
synthetic_monitor_allow_mutations: true
```

The opt-in is meant for staging environments that are isolated from production data and have either an idempotency key or a clean rollback.
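The fail-closed default described above reduces to a small decision, sketched here for clarity. The logic is inferred from the prose (skip write methods unless the project opted in); the real implementation lives in the synthetic-monitor Edge Function and the function name is hypothetical:

```typescript
// Sketch of the fail-closed default — logic inferred from the docs, not the real code.
type Method = 'GET' | 'POST' | 'PUT' | 'DELETE';

function shouldWalk(method: Method, allowMutations: boolean): boolean {
  const isWrite = method === 'POST' || method === 'PUT' || method === 'DELETE';
  // Write paths are skipped unless synthetic_monitor_allow_mutations is true.
  return !isWrite || allowMutations;
}
```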
## Surface mode in the graph
The same inventory powers a Surface toggle on the
knowledge graph — every Page, Element,
and Action node from `inventory.yaml` overlaid on the live bug graph.
Dead corners (high-traffic pages with no `expected_outcome`) light up.
## Spec traceability — read AND write side
The v1 question every team asked was “how do you keep agent work tied back to the original spec once implementation starts?” Until the 2026-05-09 release the read side (proposer → ingest → gates → status reconciler → admin UI) was tight, and the write side (report → fix → PR → re-verify) was a U-turn — the worker dropped the inventory pointer the moment dispatch started. That’s now closed end-to-end.
```
report ──► classify-report writes graph_edge (reports_against)
  │
  ▼
inventory Action node ◄─── inventory.yaml (expected_outcome)
  │
  ▼
POST /v1/admin/fixes/dispatch (or A2A /v1/a2a/tasks, MCP dispatch_fix)
  │ body may carry { inventoryActionNodeId } — else worker walks the edge
  ▼
fix_dispatch_jobs.inventory_action_node_id ──► persisted
  │
  ▼
fix-worker assembles FixContext + inventoryAction.expectedOutcome
  │ → Markdown spec block in the LLM prompt
  ▼
validateAgainstSpec (deterministic pre-PR gate)
  │ HARD ERROR if the diff removes a json_path field the contract asserts on
  │ WARN to fix_attempts.spec_validation_warnings if no file references the table / route
  ▼
GitHub PR + fix_attempts row stamped with inventory_action_node_id
  │
  ▼
synthetic_runs queued (status='skipped', error_message='queued_post_pr', action_node_id=…)
  │
  ▼
synthetic-monitor cron drains the queue with priority on its next tick,
runs an HTTP probe, evaluates expected_outcome (status_in + JSONPath assertions),
records a real synthetic_runs row.
  │
  ▼
Status reconciler picks it up → admin UI flips the Action to verified / regressed.
```

What the agent sees in its prompt today:
```markdown
## Inventory Spec Context (whitepaper §2.10 spec-traceability)

This fix was dispatched against a tracked Action in the project's `inventory.yaml`.
The agent and the reviewer MUST keep the diff scoped to making the action work as
specified — do NOT refactor unrelated code or break sibling actions on the same page.

- Action: `signup-form: submit`
- Description: Submit the signup form and create a new user
- Page: `/signup` (id=`signup`)
- User story: New user signup (`signup`)

### Expected outcome contract (success criteria after fix)

- Summary: POST /signup returns 200 and creates a user row
- HTTP status MUST be one of: 200, 201
- Response body assertions:
  - `$.user.id` exists
- Database: `public.users` MUST row_exists
- UI MUST show text containing: "Welcome"
- UI MUST navigate to: `/dashboard`

After the PR merges, the synthetic monitor will probe the action against this
contract. A draft fix that the synthetic monitor will then immediately mark
`regressed` is worse than no fix at all.
```

External orchestrators (Cursor, Claude Code, OpenAI Agents SDK,
LangGraph, CrewAI, A2A v1.0.0 agents) see the same anchor through
the MCP `get_fix_context` tool, the A2A Task `metadata.inventoryActionNodeId`
field, or the dispatch row's column directly. See
*Connecting your orchestrator*.
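For orchestrators hitting the HTTP endpoint directly, the dispatch body can be sketched as below. The route and the optional `inventoryActionNodeId` field come from the pipeline above; everything else (the helper name, `reportId` field, and Bearer-auth header) is an assumption for illustration:

```typescript
// Sketch only — the endpoint and inventoryActionNodeId are documented above;
// the reportId field name and auth scheme are assumed.
interface DispatchBody {
  reportId: string;                 // assumed field name
  inventoryActionNodeId?: string;   // optional: the worker walks the graph edge if omitted
}

function buildDispatchRequest(baseUrl: string, apiKey: string, body: DispatchBody) {
  return {
    url: `${baseUrl}/v1/admin/fixes/dispatch`,
    init: {
      method: 'POST',
      headers: {
        'content-type': 'application/json',
        authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify(body),
    },
  };
}
```

Passing the node id explicitly pins the fix to one Action; omitting it falls back to the `reports_against` edge written by classify-report.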
## Where each link lives
| Link | Where to look |
|---|---|
| `expected_outcome` schema | `@mushi-mushi/inventory-schema` (Zod + JSON Schema, mirrored at `/v1/schemas/expected-outcome.json`) |
| `inventory_action_node_id` columns | Migration `20260509100000_inventory_action_traceability` — `fix_dispatch_jobs` + `fix_attempts` (FK, ON DELETE SET NULL), plus `spec_validation_warnings` JSONB |
| Spec context in the LLM prompt | `renderSpecContext()` in `@mushi-mushi/agents`, mirrored in the fix-worker Edge Function |
| Pre-PR gate | `validateAgainstSpec()` in `@mushi-mushi/agents`, wired into `FixOrchestrator` |
| Post-PR probe | `drainPostPrQueue()` + `evaluateExpectedOutcome()` in the synthetic-monitor Edge Function |
| External orchestrators | MCP tools `dispatch_fix` + `get_fix_context`, A2A `POST /v1/a2a/tasks` |
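The probe's contract check — `status_in` plus JSONPath-style assertions, per the pipeline diagram — can be sketched as a minimal stand-in. This is not the real `evaluateExpectedOutcome()`: the field names mirror the prompt example above, the path walker only handles dotted `$.a.b` paths (full JSONPath is out of scope), and the database / UI assertions are omitted:

```typescript
// Minimal stand-in for evaluateExpectedOutcome() — illustrative only.
interface ExpectedOutcome {
  status_in: number[];
  body_assertions: { json_path: string }[]; // "$.user.id" style; existence checks only
}

// Walk a dotted "$.a.b" path; returns undefined when any segment is missing.
function resolvePath(obj: unknown, jsonPath: string): unknown {
  return jsonPath
    .replace(/^\$\.?/, '')
    .split('.')
    .filter(Boolean)
    .reduce<unknown>((cur, key) => (cur as Record<string, unknown> | undefined)?.[key], obj);
}

function evaluate(outcome: ExpectedOutcome, status: number, body: unknown): boolean {
  if (!outcome.status_in.includes(status)) return false; // HTTP status gate
  return outcome.body_assertions.every(a => resolvePath(body, a.json_path) !== undefined);
}
```

A passing probe would flip the Action to `verified` via the status reconciler; any failed assertion marks it `regressed`.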
## Related packages
- `@mushi-mushi/inventory-schema` — Zod + JSON Schema for `inventory.yaml`.
- `@mushi-mushi/inventory-auth-runner` — bootstrap an authenticated session so the crawler / monitor can reach pages behind a login wall.
- `@mushi-mushi/mcp-ci` — the GitHub Action that runs all five gates, plus `propose`, `discover-api`, `discovery-status`, and `auth-bootstrap` sub-commands.
- `eslint-plugin-mushi-mushi` — `no-dead-handler` and `no-mock-leak`.
- Admin → User stories · Inventory — the in-app surface where you accept proposals and trigger one-off gate runs.