
Knowledge graph

Mushi maintains a per-project knowledge graph of reports, components, fixes, and the developers who touched them.

Storage

| Backend | Role | Status |
| --- | --- | --- |
| pgvector | Embedding-based dedup & similarity | Primary, always-on |
| Apache AGE | True graph queries (Cypher, paths) | Parallel-write (Phase 1) |
| blast_radius_cache MV | Per-component blast radius | Refreshed by pg_cron |
| intelligence_benchmarks_mv | Cross-customer (k≥5) benchmark | Refreshed nightly |

Why both pgvector and AGE?

pgvector covers the 95% case (semantic dedup, “show me reports like this one”) at no extra operational cost. AGE handles the remaining 5% that requires true graph traversal, for example “every report on <Checkout/> from the last release that also touched the same Stripe webhook.” AGE is opt-in via the graph_backend project setting. The parallel write keeps both backends in sync, and an age_drift_audit table snapshots any divergence so that it can be repaired.
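
The parallel write and drift audit described above can be sketched in memory. This is an illustration only, not Mushi's implementation: the `DualWriteGraph` class, its method names, and the edge tuples are hypothetical, and plain dicts stand in for the primary store, the AGE graph, and the age_drift_audit table.

```python
from dataclasses import dataclass, field


@dataclass
class DualWriteGraph:
    """Toy stand-in for a primary store plus a parallel-written graph store."""

    primary: dict = field(default_factory=dict)      # stands in for the primary (relational) side
    graph: dict = field(default_factory=dict)        # stands in for the AGE side
    drift_audit: list = field(default_factory=list)  # stands in for age_drift_audit

    def write_edge(self, edge_id: str, edge: tuple, graph_available: bool = True) -> None:
        # The primary write always happens; the graph write may be missed
        # (for example, if the graph backend is temporarily unavailable).
        self.primary[edge_id] = edge
        if graph_available:
            self.graph[edge_id] = edge

    def audit(self) -> list:
        """Snapshot every edge whose two copies differ and return the new findings."""
        findings = []
        for edge_id in sorted(self.primary.keys() | self.graph.keys()):
            p, g = self.primary.get(edge_id), self.graph.get(edge_id)
            if p != g:
                findings.append({"edge_id": edge_id, "primary": p, "graph": g})
        self.drift_audit.extend(findings)
        return findings

    def repair(self, findings: list) -> int:
        """Replay the primary copy over the graph copy; return the number of edges fixed."""
        for finding in findings:
            edge_id = finding["edge_id"]
            if edge_id in self.primary:
                self.graph[edge_id] = self.primary[edge_id]
            else:
                self.graph.pop(edge_id, None)
        return len(findings)
```

In this sketch the primary store is treated as the source of truth during repair, which is an assumption; the source only says that divergence is snapshotted so that it can be repaired.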

Indexing

Embeddings are indexed with HNSW, which is built once a project crosses 50k reports; below that threshold, IVFFlat is the fallback. All columns referenced by row-level security (RLS) policies, especially user_id, are explicitly indexed, because RLS subqueries that match on un-indexed columns can be 100× slower.
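
As a rough sketch of the indexing rule above, the helper below picks an index type from the report count and emits pgvector DDL. The table name `reports`, the column name `embedding`, the use of `vector_cosine_ops`, the `lists = 100` setting, and the exact behavior at the 50k boundary are all assumptions made for illustration; only the HNSW/IVFFlat split and the user_id index come from the text.

```python
HNSW_THRESHOLD = 50_000  # report count at which HNSW is built, per the text above


def embedding_index_ddl(report_count: int, table: str = "reports", column: str = "embedding") -> str:
    """Return pgvector DDL for the index type suggested by the project's size."""
    if report_count >= HNSW_THRESHOLD:
        # HNSW: slower to build, better query speed/recall trade-off on large tables.
        return f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops);"
    # IVFFlat fallback for smaller projects; lists = 100 is an illustrative value.
    return f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) WITH (lists = 100);"


def rls_column_index_ddl(table: str, column: str = "user_id") -> str:
    """Return DDL for a plain B-tree index on a column referenced by an RLS policy."""
    return f"CREATE INDEX IF NOT EXISTS {table}_{column}_idx ON {table} ({column});"


if __name__ == "__main__":
    print(embedding_index_ddl(1_200))
    print(embedding_index_ddl(75_000))
    print(rls_column_index_ddl("reports"))
```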

Bug Ontology

Tags are drawn from the cross-customer Bug Ontology (see Whitepaper §2.6 / Appendix A). Each tag is a versioned, reviewed enum value, which lets us surface “this looks like the same Stripe-3DS issue 11 other customers reported last month” without exposing the other customers’ data.
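
One way to picture how a shared tag can be surfaced without exposing other customers' data is a minimum-customer threshold on the aggregate. The sketch below assumes that the "k≥5" noted for the cross-customer benchmark means "at least 5 distinct customers" and that the same rule applies to tag counts; both are assumptions, and the function name and tag strings are hypothetical.

```python
from collections import defaultdict

K_MIN = 5  # assumed reading of "k≥5": at least 5 distinct customers


def cross_customer_tag_counts(reports, k_min: int = K_MIN) -> dict:
    """Count distinct customers per ontology tag, dropping tags below the threshold.

    `reports` is an iterable of (customer_id, tag) pairs. Only the count is
    returned, never the customer identifiers themselves.
    """
    customers_by_tag = defaultdict(set)
    for customer_id, tag in reports:
        customers_by_tag[tag].add(customer_id)
    return {
        tag: len(customers)
        for tag, customers in customers_by_tag.items()
        if len(customers) >= k_min
    }
```

A tag reported by only two customers is omitted entirely in this sketch, since even a count could reveal too much about a small group.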
