Reference
Vectora is a modular layered architecture that combines embedding (Voyage), search (Qdrant), reranking (Voyage), and reasoning (Gemini) to provide intelligent and governed context.
┌─────────────────────────────────────────────────────┐
│ IDEs (Claude Code, Cursor, VS Code) │
└────────────────────┬────────────────────────────────┘
│ MCP Protocol
▼
┌─────────────────────────────────────────────────────┐
│ Vectora MCP Server │
│ (search_context, analyze_dependencies, etc) │
└────────┬──────────────────────────────────────────┬─┘
│ │
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Context Engine │ │ Harness Runtime │
│ - Embedding (V4) │ │ - Pre-execution │
│ - Search (HNSW) │ │ - Validation │
│ - Reranking (V2.5) │ │ - Metrics │
└────────┬─────────────┘ └──────────┬───────────┘
│ │
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Guardian Blocklist │ │ RBAC System │
│ - Path isolation │ │ - 5 roles │
│ - Trust folder │ │ - 15 permissions │
│ - Pattern matching │ │ - User management │
└────────┬─────────────┘ └──────────┬───────────┘
│ │
▼──────────────┬───────────────────▼
│
▼
┌──────────────────────┐
│ Qdrant Vector DB │
│ - Collections │
│ - HNSW Index │
│ - Namespaces │
│ - Metadata Filters │
└──────────────────────┘
│
▼
┌──────────────────────┐
│ File Storage │
│ - Trust folder │
│ - Vector index │
│ - Cache (.vectora) │
└──────────────────────┘Layers
Vectora’s architecture is organized into four main layers that ensure everything from the user interface to secure data persistence.
1. Integration Layer (IDEs)
Where the user interacts with Vectora:
- Claude Code: Native MCP
- Cursor: Native MCP
- VS Code: Own extension
- ChatGPT: Custom GPT Plugin
- CLI: Direct commands
2. MCP Server Layer
This layer acts as the communication brain, following the Model Context Protocol open standard to ensure full interoperability.
// MCP Tool example
tool: {
name: "search_context",
inputSchema: { /* JSON Schema */ }
}Converts requests into Context Engine calls.
3. Core Logic Layer
Here resides the system’s intelligence, where context is processed, validated, and governed before reaching the tool executor.
Context Engine
Orchestrates intelligent search:
- Embedding: Text → vector (Voyage 4, 1536D)
- Search: HNSW search in Qdrant (top-100)
- Reranking: Refines top-100 → top-10 (Voyage Rerank 2.5)
- Compaction: Reduces size while maintaining context (head/tail)
- Validation: Harness validates output
Harness Runtime
Protection and validation:
- Pre-execution: Guardian checks, rate limit, preconditions
- Execution: Wrapped tool call with timeout/retry
- Post-execution: Validation, metrics, comparison mode
Guardian Blocklist
Hard-coded security:
- Trust Folder:
/absolute/path/to/srcis the perimeter - Path Isolation: Directory traversal blocked
- Pattern Matching: Regex rules for blocking
- Audit Logging: All attempts recorded
RBAC (Role-Based Access Control)
5 hierarchical levels:
Owner
├─ Edit namespace, manage users
├─ Admin
│ ├─ Configure server, manage keys
│ ├─ Editor
│ │ ├─ Index, search, analyze
│ │ ├─ Viewer
│ │ │ └─ Search only
│ │ └─ Guest
│ │ └─ Limited search (rate limited)15 granular permissions: search, index, delete, configure, etc.
4. Storage Layer
The storage layer ensures that vector indices and metadata are persisted securely and efficiently at a local or distributed level.
Qdrant (Vector Database)
- Collections: One per namespace
- HNSW: Hierarchical Navigable Small World
- Metadata Filtering: Pre-filtering by namespace
- Quantization: Dimensionality reduction (4x faster)
collection: "your-namespace"
vectors:
size: 1536
distance: cosine
hnsw:
m: 16
ef_construct: 200
ef_search: 150Local Storage
- Indexing State:
.vectora/(cache) - Configuration:
vectora.config.yaml - Credentials:
~/.vectora/credentials.enc(encrypted) - AGENTS.md: Agent memory (json-in-frontmatter)
Data Flow
Data flow in Vectora is optimized for ultra-low latency, ensuring that context is retrieved and validated in milliseconds.
Semantic Search
1. User Query
"How to do authentication?"
▼
2. Embedding (Voyage 4)
[0.12, 0.45, ..., 0.67] (1536D)
▼
3. Vector Search (HNSW/Qdrant)
Top-100 chunks by cosine similarity
▼
4. Reranking (Voyage Rerank 2.5)
Refines to top-10 (semantic relevance)
▼
5. Compaction
Head (first lines) + Tail (last lines)
Maintains context, reduces tokens
▼
6. Validation (Harness)
- Output schema
- Security checks
- Metrics captured
▼
7. Response
{chunks: [...], precision: 0.87}
▼
8. To IDE
Claude/Cursor/VS Code receive chunksRate Limiting & SLA
Request → Guardian (check blocklist) →
Rate Limiter (60 req/min free tier) →
Timeout (30s default) →
Retry (3 attempts) →
Circuit Breaker (fail-open after 5 errors)Key Components
| Component | Function | Provider |
|---|---|---|
| Embedding | Convert text→vector | Voyage 4 |
| Vector Store | Store/search vectors | Qdrant |
| Reranking | Refine relevance | Voyage Rerank 2.5 |
| LLM | Reasoning + analysis | Gemini 3 Flash |
| Auth | Token validation | JWT + RBAC |
| Namespace | Logical isolation | Qdrant collections |
| Trust Folder | Path isolation | Guardian |
System Configuration
# vectora.config.yaml
project:
name: "Your Project"
namespace: "your-namespace"
trust_folder: "./src"
providers:
embedding:
name: "voyage"
model: "voyage-4"
reranker:
name: "voyage"
model: "voyage-rerank-2.5"
llm:
name: "gemini"
model: "gemini-3-flash"
context_engine:
strategy: "semantic"
max_depth: 3
timeout_ms: 2000
harness:
enabled: true
pre_execution:
validate_guardian: true
rate_limit_per_minute: 60
post_execution:
validate_output: true
capture_metrics: true
guardian:
rules:
- pattern: "^(src|docs)/"
action: "allow"
- pattern: "\.env.*"
action: "block"
rbac:
roles:
- owner
- admin
- editor
- viewer
- guestPerformance Targets
| Metric | Target | Typical |
|---|---|---|
| Search Latency | <500ms | ~234ms |
| Embedding | <200ms | ~120ms |
| Reranking | <100ms | ~50ms |
| Retrieval Precision | ≥ 0.65 | ~0.78 |
| Tool Accuracy | ≥ 0.95 | ~0.98 |
| Security Events | 0 | 0 |
| Availability | 99.9% | 99.95% |
Scalability
Horizontal
- Multiple Qdrant clusters: For physical isolation
- Load balancing: Between MCP servers
- Read replicas: For large-scale search
Vertical
- Quantization: Reduces size by 4x
- Compaction: Reduces output by 50%
- Caching: Local result in
.vectora/
Security
Defense in Depth
1. Trust Folder (path isolation)
2. Guardian Blocklist (pattern matching)
3. RBAC (user-level permissions)
4. Harness (pre/post execution validation)
5. Audit Logging (who did what)
6. Encryption (API keys, tokens)Data Privacy
- BYOK (Bring Your Own Key): You control the keys
- Local processing: Embeddings are calculated locally
- No data sync: Code never leaves your server
- Audit trail: Complete and immutable
Next: Plans - Free
Part of the Vectora ecosystem · Open Source (MIT) · Contributors