Vectora

Overview

Traditional AI agents operate in fragmented contexts: they hallucinate, waste tokens, and accidentally expose secrets. Vectora addresses this not as “another chat” but as a Tier 2 Sub-Agent designed exclusively for software engineering. It intercepts calls via MCP (Model Context Protocol), validates security in real time with Guardian, orchestrates multi-hop retrieval via Context Engine, and delivers structured context to your principal agent (Claude Code, Gemini CLI, Cursor, etc.).


Core Formula: Functional Agent = Model (Gemini 3 Flash) + [Harness Runtime](concepts/harness-runtime/) + Governed Context (Voyage 4 + MongoDB Atlas)


The Problem Vectora Solves

| Failure in Generic Agents | Practical Impact | How Vectora Mitigates |
| --- | --- | --- |
| Shallow Context | Search for “authentication” returns 50 irrelevant files | Reranker 2.5 filters by real semantic relevance, not raw cosine similarity |
| No Pre-Execution Validation | Dangerous tool calls run before being audited | Harness Runtime intercepts, validates Zod schema, and applies Guardian before execution |
| Lack of Isolation | Data from different projects leaks between sessions | Namespace Isolation via application-level RBAC + mandatory backend filtering |
| Unpredictable Consumption | LLMs overfetch, waste tokens on boilerplate | Context Engine decides scope, applies compaction (head/tail), injects only what’s relevant |
| Fragile Security | Blocklists depend on prompts (jailbreakable) | Hard-Coded Guardian is compiled into runtime, impossible to bypass via prompt |
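
To make "mandatory backend filtering" concrete, here is a minimal sketch of how an application layer might force every query into the caller's namespace. The `Session` shape, the `namespace` field name, and the `scopeFilter` helper are assumptions for illustration, not Vectora's actual code; only the MongoDB-style `$and` operator is taken as given.

```typescript
// Hypothetical session and filter shapes, for illustration only.
type Filter = Record<string, unknown>;

interface Session {
  userId: string;
  namespace: string;
}

// Wrap the caller's filter in $and so the namespace condition cannot be
// overridden or widened by anything the caller supplies.
function scopeFilter(
  session: Session,
  callerFilter: Filter = {},
): { $and: [Filter, Filter] } {
  return { $and: [{ namespace: session.namespace }, callerFilter] };
}

const scoped = scopeFilter(
  { userId: "u1", namespace: "proj-a" },
  { namespace: "proj-b", language: "ts" }, // attempt to read another project
);
console.log(JSON.stringify(scoped));
```

Because the namespace condition is combined with `$and` rather than merged into the caller's object, a caller-supplied `namespace: "proj-b"` can only narrow the result set to nothing; it cannot reach another project's data.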

The Solution: Sub-Agent Architecture

Vectora is exposed exclusively via MCP. There is no chat CLI, TUI, or direct conversational interface. It operates silently as a governance and context layer:

    graph LR
    A[Principal Agent] -->|MCP Tool Call| B[Harness Runtime]
    B --> C{Guardian + Zod Validation}
    C -->|Approved| D[Context Engine]
    D --> E[Embed via Voyage 4]
    D --> F[Rerank via voyage-rerank-2.5]
    E --> G[MongoDB Atlas Vector Search]
    F --> G
    G --> H[Composed Context + Metrics]
    H -->|MCP Response| A
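
As a rough illustration of the first arrow in the diagram, the sketch below builds the JSON-RPC 2.0 `tools/call` request that an MCP client sends when it invokes a tool. The tool name `context_search` comes from this page; the argument fields (`query`, `namespace`, `topK`) and the helper name are assumptions, since the real tool schema lives in the MCP Tools reference.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical argument shape for context_search (not the documented schema).
interface ContextSearchArgs {
  query: string;
  namespace: string;
  topK: number;
}

function buildContextSearchCall(
  id: number,
  args: ContextSearchArgs,
): ToolCallRequest<ContextSearchArgs> {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "context_search", arguments: args },
  };
}

const request = buildContextSearchCall(1, {
  query: "where is the JWT validated?",
  namespace: "proj-a",
  topK: 5,
});
console.log(JSON.stringify(request, null, 2));
```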
  

Core Components

| Module | Responsibility | Documentation |
| --- | --- | --- |
| Harness Runtime | Orchestrates execution, validates schemas, intercepts tool calls, persists state | Infrastructure that connects the LLM to the real world, not a testing framework |
| Context Engine | Decides scope (filesystem vs vector), applies AST parsing, multi-hop compaction | Pipeline Embed → Search → Rerank → Compose → Validate |
| Provider Router | Routes to curated stack, manages BYOK fallback, tracks quota | No generic layers. Official SDKs, stable parsing |
| Tool Executor | Validates args via Zod, executes with exponential retry, sanitizes output | Immutable blocklist applied before any call |
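
The "exponential retry" attributed to the Tool Executor can be sketched as follows. This is a generic pattern, not Vectora's implementation: the function names, the default of 3 attempts, and the 100 ms base delay are all assumptions.

```typescript
// Delay before the retry that follows a failed attempt (0-based):
// base, 2*base, 4*base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

// Run fn, retrying on any thrown error with exponentially growing delays.
// Rethrows the last error once maxAttempts is exhausted.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: { maxAttempts?: number; baseDelayMs?: number } = {},
): Promise<T> {
  const { maxAttempts = 3, baseDelayMs = 100 } = opts;
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Sleep only if another attempt remains.
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelayMs(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A production version would typically add jitter and retry only on errors known to be transient (timeouts, rate limits), not on validation failures.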

Curated Stack & Infrastructure

Vectora is not provider-agnostic. We operate with models rigorously calibrated to guarantee metric consistency, parsing stability, and predictable costs:

| Layer | Technology | Why we chose it | Docs |
| --- | --- | --- | --- |
| LLM (Inference) | gemini-3-flash | Latency <30ms, stable tool calling, 90% lower cost vs Pro | Gemini 3 |
| Embeddings | voyage-4 | AST-aware, captures functional similarity (validateToken ≈ checkJWT) | Voyage 4 |
| Reranking | voyage-rerank-2.5 | Cross-encoder optimized for code, latency <100ms, +25% precision vs BM25 | Reranker |
| Vector DB + Metadata | MongoDB Atlas | Unified backend (vectors + docs + state + audit), scalable, no ETL | MongoDB Atlas |
| State Persistence | Sessions + AGENTS.md | Working memory between MCP calls, continuity for long-horizon context | State Persistence |


No support for generic fallbacks: Vectora does not integrate OpenAI, Anthropic, OpenRouter, or local models. The calibration of Harness Runtime depends strictly on this stack. For multi-provider setups, use the standard MCP tools available on the market.


Security, Governance & BYOK

Security in Vectora is implemented at the application layer, not delegated to the database:

| Layer | Implementation | Document |
| --- | --- | --- |
| Hard-Coded Guardian | Immutable blocklist (.env, .key, .pem, binaries, lockfiles) executed before any tool call | Guardian |
| Trust Folder | Path validation with fs.realpath + per-namespace/project scope | Trust Folder |
| Application RBAC | Roles (reader, contributor, admin, auditor) validated at runtime | RBAC |
| Mandatory BYOK | GEMINI_API_KEY + VOYAGE_API_KEY are provided by the user on all plans | Free Plan |
| Automatic Fallback | Managed quota exhausts → silently routes to BYOK without interruption | Pro Plan |
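
The first two rows can be sketched in a few lines. This is an illustrative approximation only: the concrete extension and lockfile lists, the function names, and the injectable `resolve` parameter are assumptions, and frozen constants stand in for a blocklist that the page describes as compiled into the runtime.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical blocklist entries; the real list is defined by Guardian.
const BLOCKED_EXTENSIONS: readonly string[] = Object.freeze([
  ".key", ".pem", ".exe", ".dll", ".so", ".bin",
]);
const BLOCKED_BASENAMES: readonly string[] = Object.freeze([
  "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
]);

// True if the file must never be read or returned by a tool call.
function isBlockedByGuardian(filePath: string): boolean {
  const base = path.basename(filePath).toLowerCase();
  if (base === ".env" || base.startsWith(".env.")) return true;
  if (BLOCKED_BASENAMES.includes(base)) return true;
  return BLOCKED_EXTENSIONS.some((ext) => base.endsWith(ext));
}

// True if candidate resolves (symlinks included) to a path inside trustRoot.
// resolve defaults to fs.realpathSync and is injectable for testing.
function isInsideTrustFolder(
  trustRoot: string,
  candidate: string,
  resolve: (p: string) => string = (p) => fs.realpathSync(p),
): boolean {
  let root: string;
  let target: string;
  try {
    root = resolve(trustRoot);
    target = resolve(candidate);
  } catch {
    return false; // unresolvable paths are rejected
  }
  const rel = path.relative(root, target);
  if (rel === "") return true; // the root itself
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}
```

Resolving both paths before comparing them is what defeats symlink escapes: a link inside the project that points at `/etc` resolves outside the root and is rejected.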

Plans & Retention Policy

Vectora operates with a BYOK First model, where the backend (MongoDB Atlas) is managed by Kaffyn on all plans, but API keys belong to the user:

| Plan | Price | Storage | API Quota | Retention | Docs |
| --- | --- | --- | --- | --- | --- |
| Free | $0/month | 512MB total | Pure BYOK | 30 days inactivity = vector index deletion | Free |
| Pro | ~$20/month | 10GB total | 500k tokens + 100k vectors/month | 90 days post-cancellation | Pro |
| Team | $5 base + $15/user/month | 50GB total | Shared pool + per-user BYOK fallback | 180 days post-cancellation | Team |
| Enterprise | Custom | Unlimited (VPC/Dedicated) | Per contract | Custom policy | Overview |
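
The interaction between managed quota and BYOK fallback might look roughly like the sketch below. The type names, the `chooseKeySource` function, and the decision to fall back as soon as a request would exceed the remaining quota are assumptions; the 500k token limit is the Pro figure from the table.

```typescript
type KeySource = "managed" | "byok";

// Managed quota for the current billing period.
// null models a plan with no managed quota at all (Free is "Pure BYOK").
interface QuotaState {
  tokensUsed: number;
  tokenLimit: number;
}

// Pick which API key pays for a request of the given estimated size.
function chooseKeySource(
  quota: QuotaState | null,
  requestedTokens: number,
): KeySource {
  if (quota === null) return "byok";
  const remaining = quota.tokenLimit - quota.tokensUsed;
  return requestedTokens <= remaining ? "managed" : "byok";
}

const proQuota: QuotaState = { tokensUsed: 499900, tokenLimit: 500000 };
console.log(chooseKeySource(proQuota, 50));  // fits in the remaining 100 tokens
console.log(chooseKeySource(proQuota, 200)); // exceeds it, so the user's key is used
console.log(chooseKeySource(null, 50));      // no managed quota on this plan
```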


Retention Rules: Free accounts inactive for 30 days have their vector index automatically deleted. Metadata is preserved for a further 90 days so it can be exported via vectora export. On a downgrade, the user is notified of the reduced limits and given 7 days to back up data. Details in Retention Policy.
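
As a worked example of the Free-plan timeline, the sketch below computes both deadlines from the last activity date. It assumes the 90-day metadata window starts when the vector index is deleted, which is one reading of the rule above; the function and field names are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface FreeRetentionDeadlines {
  indexDeletion: Date;  // vector index removed after 30 days of inactivity
  metadataExpiry: Date; // assumed: 90 days after the index deletion
}

function freeRetentionDeadlines(lastActive: Date): FreeRetentionDeadlines {
  const indexDeletion = new Date(lastActive.getTime() + 30 * DAY_MS);
  const metadataExpiry = new Date(indexDeletion.getTime() + 90 * DAY_MS);
  return { indexDeletion, metadataExpiry };
}

const deadlines = freeRetentionDeadlines(new Date("2025-01-01T00:00:00Z"));
console.log(deadlines.indexDeletion.toISOString());  // 2025-01-31T00:00:00.000Z
console.log(deadlines.metadataExpiry.toISOString()); // 2025-05-01T00:00:00.000Z
```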


Operation Flow (MCP-First)

  1. Detection: Principal Agent identifies need for deep context and triggers context_search via MCP.
  2. Interception: Harness Runtime captures call, validates namespace, applies Guardian.
  3. Decision: Context Engine chooses scope (filesystem, vector, or hybrid) and applies AST parsing.
  4. Embed + Rerank: Query is embedded via voyage-4, raw results are refined by voyage-rerank-2.5.
  5. Search & Compaction: MongoDB Atlas returns top-N with compaction (head/tail + pointers) to avoid context rot.
  6. Structured Response: Validated context + metrics are returned to the principal agent, which generates the final user response.
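
The "head/tail + pointers" compaction in step 5 can be illustrated with a minimal sketch: keep the first and last lines of a long file and replace the middle with a pointer to the omitted range. The function name, the default window sizes, and the pointer format are assumptions, not the Context Engine's actual output format.

```typescript
// Keep the first headLines and last tailLines of a text and replace the
// middle with a single pointer line naming the omitted 1-based line range.
function compactHeadTail(
  source: string,
  headLines = 20,
  tailLines = 10,
): string {
  const lines = source.split("\n");
  if (lines.length <= headLines + tailLines) return source; // nothing to cut

  const omitted = lines.length - headLines - tailLines;
  const firstOmitted = headLines + 1;
  const lastOmitted = lines.length - tailLines;
  const pointer = `... [${omitted} lines omitted: L${firstOmitted}-L${lastOmitted}] ...`;

  return [
    ...lines.slice(0, headLines),
    pointer,
    ...lines.slice(lines.length - tailLines),
  ].join("\n");
}

const tenLines = Array.from({ length: 10 }, (_, i) => `l${i + 1}`).join("\n");
console.log(compactHeadTail(tenLines, 2, 2));
// l1
// l2
// ... [6 lines omitted: L3-L8] ...
// l9
// l10
```

The pointer gives the principal agent enough information to request the omitted range explicitly if it turns out to matter, instead of paying for it on every call.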

Where to Start?

| Category | Document | Description |
| --- | --- | --- |
| Concepts | Sub-Agents | Why Sub-Agent and not passive MCP tools? Active governance vs static functions |
| Harness Runtime | Harness Runtime | Tool Execution, Context Engineering, State Management, Verification Hooks |
| Context & RAG | Context Engine | AST parsing, compaction, multi-hop reasoning, hybrid ranking |
| Reranking | Reranker | Pipeline Embed → Search → Rerank → LLM, precision metrics |
| Models | Gemini 3 · Voyage 4 | Curated stack, BYOK fallback, config schema, per-query costs |
| Backend | MongoDB Atlas | Vector Search, collections, state persistence, multi-tenant isolation |
| Security | Guardian · RBAC | Hard-coded blocklist, Trust Folder, sanitization, per-namespace roles |
| Plans | Overview | Free/Pro/Team, managed quota, automatic fallback, retention policy |
| Integrations | Claude Code · Gemini CLI | MCP configuration, IDE extensions, custom agents |
| Reference | MCP Tools · Config YAML | Tool schema, Zod-validated config.yaml, error codes |
| Contributing | Guidelines | Strict TypeScript, Harness tests first, PRs, public roadmap |

Phrase to remember:
“Vectora doesn’t respond to the user. It delivers governed context to your agent. Managed backend, API under your key, security in the application, your data always yours.”



Part of the Vectora ecosystem · Open Source (MIT) · TypeScript