RAG Knowledge Base
The voice agent includes a privacy-first RAG (Retrieval-Augmented Generation) system powered by Turso (Limbo) WASM, using a browser-local database with in-browser query embeddings.
How It Works
- Build-time indexing: Run
bun run build:ragto generate embeddings from all documentation using llama.cpp with Vulkan acceleration - Local embeddings: Uses Transformers.js with Xenova/all-MiniLM-L6-v2 for fast, local semantic query embeddings
- Semantic search: When the user asks a question, the AI first searches the knowledge base for relevant docs
- Grounded responses: The AI formulates answers based on retrieved context, ensuring factual accuracy
Generated Files
When you run bun run build:rag, the system generates pre-computed embedding artifacts that are loaded directly into the browser. These files contain chunked documentation content with corresponding vector embeddings, ready for instant semantic search without any client-side processing.
The build process creates a complete RAG knowledge base including:
rag-index.yml— chunked documentation content with metadata and precomputed vectorsrag-documents.yml— full document content for hierarchical retrieval
These files are served as static assets and loaded into the browser-local Turso database on page initialization.
Key Features
- Zero cloud costs: No API calls or external services for search
- Privacy-preserving: All data stays in the browser
- Offline-capable: Works without internet after initial page load
- Instant responses: ~50ms search latency for 10K+ documents
- Source citations: The AI can reference specific documentation pages
Debug Mode
Set VITE_VOWEL_DEBUG_RAG=true to enable a debug panel that shows:
- Real-time STT (speech-to-text) transcripts
- RAG search queries and retrieved documents
- Confidence scores and source paths
Users can also say "Open the debug chat" to see this panel.
The debug interface provides complete visibility into the RAG pipeline:

The files panel displays all loaded documentation chunks and their metadata:

Agent Skills
haven-local-rag(.agents/skills/haven-local-rag/) - Browser-based semantic search and local RAG pipeline backgroundrag-prebuild(.agents/skills/rag-prebuild/) - Pre-build embeddings withbuild-rag.pyfor production