Architecture
The whole system is intentionally boring at the seams: a Hono gateway in front of Supabase, six small specialised functions behind it, and three data substrates (relational + vector + graph) that mirror each other for their respective query shapes. Everything an SDK does is one POST to the gateway; everything a contributor adds is either a route or a function.
Wire-level flow
The four columns below correspond to the four “layers” of the stack — left-to-right is the path a single bug report takes from a user’s device to a merged PR.
| ⓵ SDKs (client) | ⓶ Edge gateway | ⓷ Data | ⓸ External |
|---|---|---|---|
| @mushi-mushi/web + framework adapters | api (Hono router) | Postgres + RLS | Sentry (User-Feedback webhook) |
| @mushi-mushi/react-native | fast-filter (spam triage) | pgvector embeddings | Slack |
| Native iOS / Android / Flutter | classify-report (2-stage LLM) | Apache AGE graph | GitHub (scoped App) |
| @mushi-mushi/capacitor | judge-batch (nightly cron) | blast_radius_cache MV | Langfuse (traces) |
| @mushi-mushi/mcp (agent surface) | fix-orchestrator (sandbox PR) | Supabase Vault (BYOK + plugin secrets) | Marketplace plugins (PagerDuty, Linear, …) |
| | intelligence-report (weekly digest) | | |
| | soc2-evidence (control snapshot) | | |
Edges that matter (everything else flows top-to-bottom inside a column):
- Every SDK posts to `api`; there is no client-direct DB write.
- `api → fast-filter → classify-report` is the canonical ingest path.
- `classify-report → pgvector + Apache AGE + plugins` happens in parallel.
- `judge-batch` reads from and writes back into `classify-report`’s prompts.
- `fix-orchestrator → GitHub` only fires after a human approves the triage.
- `api → Supabase Vault` resolves BYOK secrets per-request, never cached.
- All three LLM-touching functions stream traces to Langfuse.
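The ingest edges above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the `Report` and `Triage` shapes, the `Stages` interface, and the `ingest` function are hypothetical names introduced here, and every stage is injected so the sketch runs without Supabase or an LLM. It assumes that a failing sink (pgvector, Apache AGE, or a plugin) should not block the others.

```typescript
// Hypothetical shapes: the real report and triage payloads are not shown in this document.
type Report = { id: string; text: string };
type Triage = { category: string; severity: string };

// Each stage is injected so the sketch can run without any backing service.
interface Stages {
  fastFilter: (report: Report) => Promise<boolean>; // true = worth classifying
  classify: (report: Report) => Promise<Triage>;
  sinks: Array<(report: Report, triage: Triage) => Promise<void>>; // e.g. pgvector, AGE, plugins
}

type IngestResult =
  | { status: "dropped" }
  | { status: "classified"; triage: Triage; failedSinks: number };

async function ingest(report: Report, stages: Stages): Promise<IngestResult> {
  // Stage 1: cheap spam triage runs before any expensive classification.
  if (!(await stages.fastFilter(report))) {
    return { status: "dropped" };
  }

  // Stage 2: the classifier produces the triage decision.
  const triage = await stages.classify(report);

  // Fan-out: all sinks start at once; each failure is captured instead of
  // rejecting the whole batch (an assumption of this sketch).
  const outcomes = await Promise.all(
    stages.sinks.map((sink) => sink(report, triage).then(() => true, () => false)),
  );
  const failedSinks = outcomes.filter((ok) => !ok).length;

  return { status: "classified", triage, failedSinks };
}
```

Running the sinks through one `Promise.all` keeps the parallel step to a single await, and counting failures rather than throwing lets a caller decide whether to retry an individual mirror.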
Component summary
- Edge gateway (Hono on Supabase Edge Functions) authenticates with API keys (public reports) or JWT (admin), enforces rate limits, and routes to specialised functions.
- `fast-filter` triages high-volume garbage (form spam, duplicate one-liners) with the cheapest model the project’s BYOK plan permits.
- `classify-report` runs the canonical two-stage classifier. Stage 1 tags category/severity/component from text. Stage 2, only if a screenshot is present, runs an air-gapped vision pass that cannot see Stage 1’s prompt (defence against prompt injection via screenshots).
- `judge-batch` is a nightly cron that scores classifier accuracy with a separate judge model, feeding the prompt-A/B framework that promotes new prompt versions automatically when they win statistically.
- `fix-orchestrator` dispatches approved triage decisions to a sandbox (E2B today, Modal/Cloudflare Sandbox SDK adapters available) where an agent, speaking either MCP (`tools/call`) or REST, drafts code and opens a PR via a scoped GitHub App.
- `intelligence-report` generates weekly bug-intelligence digests with optional cross-customer benchmarking (k-anonymity enforced via materialized view).
- `soc2-evidence` snapshots control state for SOC 2 Type 1 readiness.
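To make "win statistically" concrete, here is one way such a promotion check could look. The document does not say which test the prompt-A/B framework uses; this sketch assumes a one-sided two-proportion z-test at a 5% significance level, and `PromptStats`, `zScore`, and `shouldPromote` are hypothetical names.

```typescript
// Judge results for one prompt version (hypothetical shape).
interface PromptStats {
  correct: number; // reports the judge scored as correctly classified
  total: number;   // reports judged under this prompt version
}

// One-sided critical value for alpha = 0.05 (an assumed significance level).
const Z_CRITICAL = 1.645;

// Two-proportion z statistic: how many standard errors the candidate's
// accuracy sits above the baseline's, using the pooled accuracy.
function zScore(candidate: PromptStats, baseline: PromptStats): number {
  if (candidate.total === 0 || baseline.total === 0) return 0;
  const pCandidate = candidate.correct / candidate.total;
  const pBaseline = baseline.correct / baseline.total;
  const pooled =
    (candidate.correct + baseline.correct) / (candidate.total + baseline.total);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / candidate.total + 1 / baseline.total),
  );
  if (standardError === 0) return 0; // both versions all-correct or all-wrong
  return (pCandidate - pBaseline) / standardError;
}

// Promote only when the candidate beats the baseline by more than chance
// would plausibly explain.
function shouldPromote(candidate: PromptStats, baseline: PromptStats): boolean {
  return zScore(candidate, baseline) > Z_CRITICAL;
}
```

For example, a candidate at 460/500 against a baseline at 420/500 clears the threshold, while 425/500 against 420/500 does not, so a small nightly fluctuation would not trigger a promotion under these assumptions.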
Knowledge graph
Reports embed into pgvector for semantic dedup. The same edges are mirrored asynchronously into Apache AGE so customers who care about graph queries (e.g. “find all reports touching the same component within a release window”) get true Cypher.
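Semantic dedup boils down to a nearest-neighbour check on embeddings. In production that comparison would presumably run inside Postgres (pgvector provides a cosine-distance operator, `<=>`); the in-memory sketch below only illustrates the idea. The `StoredReport` shape, the `findDuplicate` helper, and the 0.9 threshold are assumptions, not values taken from Mushi.

```typescript
// Hypothetical stored shape: a report id plus its embedding vector.
interface StoredReport {
  id: string;
  embedding: number[];
}

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / Math.sqrt(normA * normB);
}

// Returns the id of the most similar existing report at or above the
// threshold, or null when the candidate looks new.
function findDuplicate(
  candidate: number[],
  existing: StoredReport[],
  threshold = 0.9, // assumed cut-off; tune against real data
): string | null {
  let bestId: string | null = null;
  let bestScore = threshold;
  for (const report of existing) {
    const score = cosineSimilarity(candidate, report.embedding);
    if (score >= bestScore) {
      bestId = report.id;
      bestScore = score;
    }
  }
  return bestId;
}
```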
A2A Agent Card
Public discovery endpoints at /.well-known/agent-card and /v1/agent-card
expose the agent’s identity, skills, supported A2A version, MCP transport,
and auth requirements. Other agents can negotiate with Mushi without
out-of-band config.
Data residency
Each project pins to a region (us / eu / jp). The gateway returns
307 Temporary Redirect when a request reaches the wrong region, and the
Core SDK transparently follows it (caching the resolved region in
localStorage). The US cluster remains the catalog of record for
plugin marketplace + project metadata.
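A client-side view of the redirect-and-cache behaviour might look like the sketch below. It is not the Core SDK's implementation: the `/v1/reports` path, the `mushi:region:` cache key, and the `postReport` signature are all hypothetical. It relies on standard `fetch` behaviour, where a 307 is followed with the method and body preserved and the final location is exposed through `Response.redirected` and `Response.url`.

```typescript
// Minimal slices of the Web Storage and fetch Response APIs, so the sketch
// can run under test without a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}
interface RedirectAwareResponse {
  status: number;
  redirected: boolean; // true when fetch followed at least one redirect
  url: string;         // final URL after redirects
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<RedirectAwareResponse>;

// Hypothetical path and cache-key format; neither is specified in this document.
const REPORT_PATH = "/v1/reports";
const cacheKey = (projectId: string): string => `mushi:region:${projectId}`;

async function postReport(
  projectId: string,
  payload: unknown,
  defaultOrigin: string,
  store: KeyValueStore,
  fetchImpl: FetchLike,
): Promise<RedirectAwareResponse> {
  // Prefer the region learned from an earlier redirect, if any.
  const origin = store.getItem(cacheKey(projectId)) ?? defaultOrigin;

  const res = await fetchImpl(origin + REPORT_PATH, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  });

  // If the gateway bounced us to another region, remember where we landed
  // so the next request skips the extra round trip.
  if (res.redirected) {
    store.setItem(cacheKey(projectId), new URL(res.url).origin);
  }
  return res;
}
```

In a browser, the global `fetch` and `localStorage` should satisfy `FetchLike` and `KeyValueStore` as written; injecting them keeps the function testable and usable in runtimes without `localStorage`.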
Storage
Per-project storage settings (BYO Storage) let you keep screenshots and intelligence-report PDFs in your own S3 / R2 / GCS / MinIO bucket. Supabase Storage is the default.