Vectora
Overview
Traditional AI agents operate in fragmented contexts, generating hallucinations, wasting tokens, and accidentally exposing secrets. Vectora addresses this not as “another chat” but as a Tier 2 Sub-Agent designed exclusively for software engineering: it intercepts calls via MCP, validates security in real time with Guardian, orchestrates multi-hop retrieval via the Context Engine, and delivers structured context to your principal agent (Claude Code, Gemini CLI, Cursor, etc.).
Core Formula:
Functional Agent = Model (Gemini 3 Flash) + [Harness Runtime](concepts/harness-runtime/) + Governed Context (Voyage 4 + MongoDB Atlas)
The Problem Vectora Solves
| Failure in Generic Agents | Practical Impact | How Vectora Mitigates |
|---|---|---|
| Shallow Context | Search for “authentication” returns 50 irrelevant files | Reranker 2.5 filters by real semantic relevance, not raw cosine similarity |
| No Pre-Execution Validation | Dangerous tool calls run before being audited | Harness Runtime intercepts, validates Zod schema, and applies Guardian before execution |
| Lack of Isolation | Data from different projects leaks between sessions | Namespace Isolation via application-level RBAC + mandatory backend filtering |
| Unpredictable Consumption | LLMs overfetch, waste tokens on boilerplate | Context Engine decides scope, applies compaction (head/tail), injects only what’s relevant |
| Fragile Security | Blocklists depend on prompts (jailbreakable) | Hard-Coded Guardian is compiled into runtime, impossible to bypass via prompt |
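To make "mandatory backend filtering" concrete, here is a minimal sketch of how a namespace filter could be forced into every vector query. It builds a plain `$vectorSearch` aggregation stage as data and does not call any driver. The index name, the `embedding` field, and the `buildSearchStage` helper are assumptions for illustration, not Vectora's actual schema.

```typescript
// Shape of a MongoDB Atlas $vectorSearch stage, written out as plain data.
type Filter = Record<string, unknown>;

interface VectorSearchStage {
  $vectorSearch: {
    index: string;
    path: string;
    queryVector: number[];
    numCandidates: number;
    limit: number;
    filter: Filter;
  };
}

// Hypothetical helper: the namespace condition is always part of the filter,
// so a caller cannot issue a search that spans projects.
function buildSearchStage(
  namespace: string,
  queryVector: number[],
  limit: number = 10,
  extraFilters: Filter[] = [],
): VectorSearchStage {
  if (namespace.trim() === "") {
    throw new Error("namespace is required for every search");
  }
  return {
    $vectorSearch: {
      index: "code_chunks_index", // assumed index name
      path: "embedding", // assumed vector field
      queryVector,
      numCandidates: limit * 10, // oversample before taking the top `limit`
      limit,
      // Caller filters are AND-ed with the namespace condition, never replace it.
      filter: { $and: [{ namespace: { $eq: namespace } }, ...extraFilters] },
    },
  };
}
```

Because the namespace condition is added by the builder rather than by the caller, forgetting it is not possible and an empty namespace fails fast instead of silently widening the search.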
The Solution: Sub-Agent Architecture
Vectora is exposed exclusively via MCP. There is no chat CLI, TUI, or direct conversational interface. It operates silently as a governance and context layer:
graph LR
A[Principal Agent] -->|MCP Tool Call| B[Harness Runtime]
B --> C{Guardian + Zod Validation}
C -->|Approved| D[Context Engine]
D --> E[Embed via Voyage 4]
D --> F[Rerank via voyage-rerank-2.5]
E --> G[MongoDB Atlas Vector Search]
F --> G
G --> H[Composed Context + Metrics]
H -->|MCP Response| A
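The ordering in the diagram (validate, then Guardian, then execute) can be sketched as a small gate in front of the tool handler. This is only an illustration under stated assumptions: the `ToolCall` shape, the hand-written `validateArgs` check (a stand-in for a Zod schema), and the suffix list are hypothetical and do not reproduce Vectora's real implementation.

```typescript
// Hypothetical shape of an incoming MCP tool call.
interface ToolCall {
  tool: string;
  namespace: string;
  args: Record<string, unknown>;
}

// A check returns a rejection reason, or null when the call may proceed.
type Check = (call: ToolCall) => string | null;

// Stand-in for schema validation: `query` must be a non-empty string.
const validateArgs: Check = (call) => {
  const query = call.args["query"];
  return typeof query === "string" && query.trim() !== ""
    ? null
    : "schema: `query` must be a non-empty string";
};

// Guardian-style check: refuse paths ending in a blocked suffix.
// The list is fixed in code, so no prompt content can alter it.
const BLOCKED_SUFFIXES: readonly string[] = [".env", ".key", ".pem"];
const guardian: Check = (call) => {
  const path = call.args["path"];
  if (typeof path !== "string") return null;
  const lower = path.toLowerCase();
  return BLOCKED_SUFFIXES.some((suffix) => lower.endsWith(suffix))
    ? `guardian: blocked path ${path}`
    : null;
};

// Runs every check before the handler; the handler never sees a rejected call.
function intercept(call: ToolCall, execute: (call: ToolCall) => string): string {
  for (const check of [validateArgs, guardian]) {
    const reason = check(call);
    if (reason !== null) return `rejected (${reason})`;
  }
  return execute(call);
}
```

The point of the sketch is the control flow: the handler is only reachable after every check has returned `null`, which is what "before execution" means in the table above.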
Core Components
| Module | Responsibility | Documentation |
|---|---|---|
| Harness Runtime | Orchestrates execution, validates schemas, intercepts tool calls, persists state | Infrastructure that connects the LLM to the real world, not a testing framework |
| Context Engine | Decides scope (filesystem vs vector), applies AST parsing, multi-hop compaction | Pipeline Embed → Search → Rerank → Compose → Validate |
| Provider Router | Routes to curated stack, manages BYOK fallback, tracks quota | No generic layers. Official SDKs, stable parsing |
| Tool Executor | Validates args via Zod, executes with exponential retry, sanitizes output | Immutable blocklist applied before any call |
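The Tool Executor row mentions execution with exponential retry. A minimal sketch of that technique follows, assuming a base delay that doubles after each failed attempt. The `withRetry` name, its defaults, and the injectable `sleep` parameter are illustrative choices, not Vectora's API.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Default sleep based on setTimeout; tests can inject a fake to avoid waiting.
const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `task` up to `maxAttempts` times, waiting baseDelayMs * 2^attempt
// between attempts (100, 200, 400, ... with the defaults).
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number = 4,
  baseDelayMs: number = 100,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // No point sleeping after the final attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * Math.pow(2, attempt));
      }
    }
  }
  throw lastError;
}
```

A production version would usually add jitter and retry only on errors known to be transient; both are omitted here to keep the backoff logic visible.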
Curated Stack & Infrastructure
Vectora is not provider-agnostic. We operate with models rigorously calibrated to guarantee metric consistency, parsing stability, and predictable costs:
| Layer | Technology | Why we chose it | Docs |
|---|---|---|---|
| LLM (Inference) | gemini-3-flash | Latency <30ms, stable tool calling, 90% lower cost vs Pro | Gemini 3 |
| Embeddings | voyage-4 | AST-aware, captures functional similarity (validateToken ≈ checkJWT) | Voyage 4 |
| Reranking | voyage-rerank-2.5 | Cross-encoder optimized for code, latency <100ms, +25% precision vs BM25 | Reranker |
| Vector DB + Metadata | MongoDB Atlas | Unified backend (vectors + docs + state + audit), scalable, no ETL | MongoDB Atlas |
| State Persistence | Sessions + AGENTS.md | Working memory between MCP calls, continuity for long-horizon context | State Persistence |
No support for generic fallbacks: Vectora does not integrate OpenAI, Anthropic, OpenRouter, or local models. The calibration of Harness Runtime depends strictly on this stack. For multi-provider setups, use the standard MCP tools available on the market.
Security, Governance & BYOK
Security in Vectora is implemented at the application layer, not delegated to the database:
| Layer | Implementation | Document |
|---|---|---|
| Hard-Coded Guardian | Immutable blocklist (.env, .key, .pem, binaries, lockfiles) executed before any tool call | Guardian |
| Trust Folder | Path validation with fs.realpath + per-namespace/project scope | Trust Folder |
| Application RBAC | Roles (reader, contributor, admin, auditor) validated at runtime | RBAC |
| Mandatory BYOK | GEMINI_API_KEY + VOYAGE_API_KEY are provided by the user on all plans | Free Plan |
| Automatic Fallback | When the managed quota is exhausted, requests route silently to BYOK without interruption | Pro Plan |
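The Trust Folder row relies on `fs.realpath` so that symlinks and `..` segments are resolved before the scope check. A minimal sketch of that idea, using Node's synchronous `realpathSync`, is shown below. The `isInsideTrustFolder` helper and the decision to reject nonexistent paths are assumptions made for the example.

```typescript
import { mkdtempSync, realpathSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import * as path from "node:path";

// Hypothetical helper: true only when `candidate` resolves to a real path
// located inside `trustRoot` after symlinks and `..` segments are resolved.
function isInsideTrustFolder(trustRoot: string, candidate: string): boolean {
  const root = realpathSync(trustRoot);
  let resolved: string;
  try {
    resolved = realpathSync(path.resolve(root, candidate));
  } catch (err) {
    return false; // assumption: paths that do not exist are rejected
  }
  const rel = path.relative(root, resolved);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Usage: build a throwaway trust folder and probe it.
const demoRoot = mkdtempSync(path.join(tmpdir(), "trust-"));
writeFileSync(path.join(demoRoot, "index.ts"), "");
console.log(isInsideTrustFolder(demoRoot, "index.ts")); // true
console.log(isInsideTrustFolder(demoRoot, "../outside.txt")); // false
```

Comparing resolved paths with `path.relative` avoids the classic prefix bug where `/work/app-secrets` passes a naive `startsWith("/work/app")` test.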
Plans & Retention Policy
Vectora operates with a BYOK First model, where the backend (MongoDB Atlas) is managed by Kaffyn on all plans, but API keys belong to the user:
| Plan | Price | Storage | API Quota | Retention | Docs |
|---|---|---|---|---|---|
| Free | $0/month | 512MB total | Pure BYOK | 30 days inactivity = vector index deletion | Free |
| Pro | ~$20/month | 10GB total | 500k tokens + 100k vectors/month | 90 days post-cancellation | Pro |
| Team | $5 base + $15/user/month | 50GB total | Shared pool + per-user BYOK fallback | 180 days post-cancellation | Team |
| Enterprise | Custom | Unlimited (VPC/Dedicated) | Per contract | Custom policy | Overview |
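The quota column implies a routing decision per request: use the managed quota while it lasts, otherwise fall back to the user's own key. A rough sketch of that decision follows. The `Quota` shape, the `chooseKeySource` name, and the idea of counting the request's tokens up front are assumptions, since the document does not describe how usage is metered.

```typescript
type KeySource = "managed" | "byok";

// Hypothetical view of a plan's managed token quota for the current month.
interface Quota {
  limitTokens: number;
  usedTokens: number;
}

// Picks the credential source for one request.
// `quota` is null on plans with no managed quota (pure BYOK).
function chooseKeySource(
  quota: Quota | null,
  requestTokens: number,
  hasByokKey: boolean,
): KeySource {
  if (quota !== null && quota.usedTokens + requestTokens <= quota.limitTokens) {
    return "managed";
  }
  if (hasByokKey) {
    return "byok"; // fallback: no interruption for the caller
  }
  throw new Error("managed quota exhausted and no BYOK key configured");
}
```

For example, with the Pro figure of 500k tokens and 499,000 already used, a 500-token request stays on the managed quota while a 2,000-token request is routed to BYOK.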
Retention Rules: Free accounts inactive for 30 days have their vector index automatically deleted. Metadata is preserved for a further 90 days so it can be exported via vectora export. On downgrade, users are notified of the reduced limits and given 7 days to back up. Details in Retention Policy.
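As a worked example of the Free plan timeline, the sketch below computes both dates from the last activity timestamp. It assumes the 90-day metadata window starts when the index is deleted, which is one reading of the rule above. The `freePlanRetention` function is hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface RetentionDates {
  indexDeletion: Date; // vector index removed
  metadataExpiry: Date; // last day metadata can still be exported
}

// Assumption: metadata is kept for 90 days counted from index deletion.
function freePlanRetention(lastActivity: Date): RetentionDates {
  const indexDeletion = new Date(lastActivity.getTime() + 30 * DAY_MS);
  const metadataExpiry = new Date(indexDeletion.getTime() + 90 * DAY_MS);
  return { indexDeletion, metadataExpiry };
}

const dates = freePlanRetention(new Date("2025-01-01T00:00:00Z"));
console.log(dates.indexDeletion.toISOString()); // 2025-01-31T00:00:00.000Z
console.log(dates.metadataExpiry.toISOString()); // 2025-05-01T00:00:00.000Z
```

Under this reading, an account last active on 1 January loses its index on 31 January and can still export metadata until 1 May.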
Operation Flow (MCP-First)
- Detection: Principal Agent identifies the need for deep context and triggers context_search via MCP.
- Interception: Harness Runtime captures the call, validates the namespace, and applies Guardian.
- Decision: Context Engine chooses scope (filesystem, vector, or hybrid) and applies AST parsing.
- Embed + Rerank: Query is embedded via voyage-4, and raw results are refined by voyage-rerank-2.5.
- Search & Compaction: MongoDB Atlas returns top-N with compaction (head/tail + pointers) to avoid context rot.
- Structured Response: Validated context + metrics are returned to the principal agent, which generates the final user response.
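The head/tail compaction mentioned in the flow above can be illustrated with a short function that keeps the first and last lines of a chunk and replaces the middle with a pointer to the omitted range. The `compactHeadTail` name, the default sizes, and the pointer format are assumptions for this sketch, not Vectora's actual output format.

```typescript
// Keeps the first `head` and last `tail` lines of `text`; the omitted middle
// is replaced by a single pointer line naming the 1-based line range.
function compactHeadTail(text: string, head: number = 20, tail: number = 10): string {
  const lines = text.split("\n");
  if (lines.length <= head + tail) {
    return text; // already small enough, nothing to compact
  }
  const omitted = lines.length - head - tail;
  const firstOmitted = head + 1;
  const lastOmitted = lines.length - tail;
  const pointer = `... [${omitted} lines omitted: L${firstOmitted}-L${lastOmitted}] ...`;
  return lines
    .slice(0, head)
    .concat(pointer, lines.slice(lines.length - tail))
    .join("\n");
}

const sample = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"].join("\n");
console.log(compactHeadTail(sample, 2, 2));
// 1
// 2
// ... [6 lines omitted: L3-L8] ...
// 9
// 10
```

The pointer line lets the principal agent request the omitted range later if it turns out to matter, instead of paying for it on every call.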
Where to Start?
| Category | Document | Description |
|---|---|---|
| Concepts | Sub-Agents | Why Sub-Agent and not passive MCP tools? Active governance vs static functions |
| Harness Runtime | Harness Runtime | Tool Execution, Context Engineering, State Management, Verification Hooks |
| Context & RAG | Context Engine | AST parsing, compaction, multi-hop reasoning, hybrid ranking |
| Reranking | Reranker | Pipeline Embed → Search → Rerank → LLM, precision metrics |
| Models | Gemini 3 · Voyage 4 | Curated stack, BYOK fallback, config schema, per-query costs |
| Backend | MongoDB Atlas | Vector Search, collections, state persistence, multi-tenant isolation |
| Security | Guardian · RBAC | Hard-coded blocklist, Trust Folder, sanitization, per-namespace roles |
| Plans | Overview | Free/Pro/Team, managed quota, automatic fallback, retention policy |
| Integrations | Claude Code · Gemini CLI | MCP configuration, IDE extensions, custom agents |
| Reference | MCP Tools · Config YAML | Tool schema, Zod-validated config.yaml, error codes |
| Contributing | Guidelines | Strict TypeScript, Harness tests first, PRs, public roadmap |
Phrase to remember:
“Vectora doesn’t respond to the user. It delivers governed context to your agent. Managed backend, API under your key, security in the application, your data always yours.”
Navigation Guide
- Getting Started — Installation, BYOK setup, and MCP integration.
- Core Concepts — Understand Sub-Agents, Context Engine, and Reranking.
- Security & Governance — Details on Guardian, Trust Folder, and RBAC.
- Authentication — SSO flows, Unified Identity, and API Keys.
- Models & Providers — Curated stack with Gemini 3 and Voyage AI.
- Backend & Persistence — MongoDB Atlas, Sessions, and State Persistence.
- Integrations — How to use with Claude Code, Gemini CLI, and Cursor.
- Plans & Pricing — Feature comparison and retention policy.
- Technical Reference — MCP tool schema and Config YAML.
- Contributing — Guidelines, code standards, and roadmap.
- FAQ — Troubleshooting and common questions.
- Protocols — MCP Protocol specifications in Vectora.
Part of the Vectora ecosystem · Open Source (MIT) · TypeScript