Performance & Benchmarks Test Suite

Vectora must meet all of its performance SLAs, with latency, throughput, resource utilization, and scalability verified through rigorous benchmarks. This suite validates that Vectora scales to 50+ concurrent users and keeps p95 latency below 500ms.

Coverage: 80+ tests | Priority: HIGH

Query Latency Testing

  • p50 latency < 100ms (12 tests)
  • p95 latency < 500ms (12 tests)
  • p99 latency < 1s (10 tests)
  • Warm-up vs. cold start (8 tests)

Target: p95 < 500ms, p99 < 1s
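
The percentile targets above can be checked against a set of recorded query latencies. The following is a minimal sketch, assuming the nearest-rank percentile definition; `Percentile`, `MeetsLatencyTargets`, and `MillisSamples` are hypothetical helper names, not part of Vectora.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// Percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method. It returns 0 for an empty slice.
func Percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// MeetsLatencyTargets reports whether samples satisfy p50 < 100ms,
// p95 < 500ms, and p99 < 1s.
func MeetsLatencyTargets(samples []time.Duration) bool {
	return Percentile(samples, 50) < 100*time.Millisecond &&
		Percentile(samples, 95) < 500*time.Millisecond &&
		Percentile(samples, 99) < time.Second
}

// MillisSamples builds a sample set from millisecond values (demo helper).
func MillisSamples(values ...int) []time.Duration {
	out := make([]time.Duration, len(values))
	for i, v := range values {
		out[i] = time.Duration(v) * time.Millisecond
	}
	return out
}

func main() {
	// Made-up latencies for illustration only, not measured results.
	samples := MillisSamples(40, 55, 62, 70, 85, 90, 120, 180, 320, 950)
	fmt.Println("p50:", Percentile(samples, 50))
	fmt.Println("p95:", Percentile(samples, 95))
	fmt.Println("p99:", Percentile(samples, 99))
	fmt.Println("meets targets:", MeetsLatencyTargets(samples))
}
```

With only ten samples, p95 and p99 both resolve to the slowest request, so a real run would need far more samples for the tail percentiles to be meaningful.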

Embedding Latency

  • Single-query embedding < 2s (8 tests)
  • Batch embeddings < 5s (8 tests)
  • Reranking latency < 1s (8 tests)
  • Cache-hit latency < 50ms (8 tests)

Target: Single < 2s, batch < 5s
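
To illustrate how a cache-hit test might differ from a cold embedding call, here is a hedged sketch. `EmbedFunc` and `CachedEmbedder` are hypothetical stand-ins for whatever embedding client Vectora actually uses, and the 20ms sleep only simulates a model call.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// EmbedFunc is a hypothetical signature for an embedding backend.
type EmbedFunc func(text string) []float32

// CachedEmbedder memoizes embeddings by input text.
type CachedEmbedder struct {
	mu      sync.Mutex
	cache   map[string][]float32
	backend EmbedFunc
	Misses  int // number of calls that reached the backend
}

// NewCachedEmbedder wraps backend with an in-memory cache.
func NewCachedEmbedder(backend EmbedFunc) *CachedEmbedder {
	return &CachedEmbedder{cache: make(map[string][]float32), backend: backend}
}

// Embed returns the cached vector when present and calls the backend otherwise.
// The lock is held during the backend call, which keeps the sketch simple but
// serializes concurrent misses.
func (c *CachedEmbedder) Embed(text string) []float32 {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[text]; ok {
		return v
	}
	c.Misses++
	v := c.backend(text)
	c.cache[text] = v
	return v
}

// Timed runs fn and returns how long it took.
func Timed(fn func()) time.Duration {
	start := time.Now()
	fn()
	return time.Since(start)
}

func main() {
	slow := func(text string) []float32 {
		time.Sleep(20 * time.Millisecond) // simulated model call (assumption)
		return []float32{float32(len(text))}
	}
	e := NewCachedEmbedder(slow)

	cold := Timed(func() { e.Embed("hello") })
	warm := Timed(func() { e.Embed("hello") })

	fmt.Println("cold:", cold, "within 2s:", cold < 2*time.Second)
	fmt.Println("warm:", warm, "within 50ms:", warm < 50*time.Millisecond)
}
```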

Throughput Testing

  • Queries/second > 100 (10 tests)
  • Embeddings/second > 50 (8 tests)
  • Concurrent users > 50 (10 tests)
  • Sustained load testing (8 tests)

Target: > 100 queries/sec sustained
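
A throughput figure such as queries/second comes from dividing completed operations by wall-clock time while several workers run in parallel. The sketch below shows one way to measure it; `RunLoad` and `Result` are hypothetical names, and the sleeping `op` is a placeholder for a real query.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Result summarizes one load run.
type Result struct {
	Completed int
	Elapsed   time.Duration
}

// QPS returns completed operations per second, or 0 if no time elapsed.
func (r Result) QPS() float64 {
	if r.Elapsed <= 0 {
		return 0
	}
	return float64(r.Completed) / r.Elapsed.Seconds()
}

// RunLoad starts `workers` goroutines that each call op `perWorker` times,
// then reports how many calls completed and how long the whole run took.
func RunLoad(workers, perWorker int, op func()) Result {
	var completed int64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				op()
				atomic.AddInt64(&completed, 1)
			}
		}()
	}
	wg.Wait()
	return Result{Completed: int(atomic.LoadInt64(&completed)), Elapsed: time.Since(start)}
}

func main() {
	// 50 simulated users, 20 queries each; the 1ms sleep stands in for a query.
	res := RunLoad(50, 20, func() { time.Sleep(time.Millisecond) })
	fmt.Printf("completed=%d elapsed=%v qps=%.0f meetsTarget=%v\n",
		res.Completed, res.Elapsed, res.QPS(), res.QPS() > 100)
}
```

A sustained-load test would run the same loop for a fixed duration rather than a fixed request count, and check that QPS does not degrade over time.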

Resource Utilization

  • Memory footprint < 500MB (8 tests)
  • CPU usage < 50% (8 tests)
  • Disk I/O optimized (8 tests)
  • Goroutine count stable (8 tests)

Target: < 500MB mem, < 50% CPU
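
Memory and goroutine checks can be built on the Go runtime's own counters. This is a rough sketch, assuming that `runtime.MemStats.Sys` is an acceptable proxy for memory footprint (the OS-level resident set size may differ) and that "500MB" means decimal megabytes; `Snapshot` and the helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Snapshot captures a few runtime counters at one point in time.
type Snapshot struct {
	SysBytes   uint64 // total bytes obtained from the OS by the Go runtime
	HeapBytes  uint64 // bytes of allocated heap objects
	Goroutines int
}

// TakeSnapshot reads the current memory statistics and goroutine count.
func TakeSnapshot() Snapshot {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return Snapshot{
		SysBytes:   ms.Sys,
		HeapBytes:  ms.HeapAlloc,
		Goroutines: runtime.NumGoroutine(),
	}
}

// WithinMemoryBudget reports whether the snapshot is below budgetBytes.
func WithinMemoryBudget(s Snapshot, budgetBytes uint64) bool {
	return s.SysBytes < budgetBytes
}

// GoroutinesStable reports whether the goroutine count changed by at most
// `tolerance` between two snapshots.
func GoroutinesStable(before, after Snapshot, tolerance int) bool {
	diff := after.Goroutines - before.Goroutines
	if diff < 0 {
		diff = -diff
	}
	return diff <= tolerance
}

func main() {
	const budget = 500 * 1000 * 1000 // 500MB, read as decimal megabytes (assumption)

	before := TakeSnapshot()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = make([]byte, 1<<10) // placeholder workload
		}()
	}
	wg.Wait()
	after := TakeSnapshot()

	fmt.Println("memory within budget:", WithinMemoryBudget(after, budget))
	// A small tolerance absorbs goroutines that are still exiting.
	fmt.Println("goroutines stable:", GoroutinesStable(before, after, 2))
}
```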

Scalability Testing

  • 100 concurrent requests (8 tests)
  • 1000 requests/min sustained (8 tests)
  • Large result sets (> 1MB) (8 tests)
  • Connection pooling efficiency (5 tests)

Target: 100+ concurrent, sustained 1000 req/min
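
The two scalability targets translate into a concurrency cap and a request pacing interval: 1000 requests/min is one request every 60ms. The sketch below uses a buffered channel as a semaphore to hold at most 100 requests in flight and records the observed peak; `RunConcurrent`, `RunStats`, and `IntervalFor` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// RunStats reports how many operations finished and the highest number
// observed in flight at the same time.
type RunStats struct {
	Completed int64
	Peak      int64
}

// RunConcurrent runs op `total` times with at most `limit` calls in flight.
func RunConcurrent(total, limit int, op func()) RunStats {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var inFlight, peak, completed int64
	var wg sync.WaitGroup

	for i := 0; i < total; i++ {
		sem <- struct{}{} // blocks while `limit` calls are already running
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }()

			n := atomic.AddInt64(&inFlight, 1)
			for { // raise peak to n unless another goroutine already did
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			op()
			atomic.AddInt64(&completed, 1)
			atomic.AddInt64(&inFlight, -1)
		}()
	}
	wg.Wait()
	return RunStats{
		Completed: atomic.LoadInt64(&completed),
		Peak:      atomic.LoadInt64(&peak),
	}
}

// IntervalFor returns the spacing between requests needed to sustain the
// given rate, or 0 for a non-positive rate.
func IntervalFor(requestsPerMinute int) time.Duration {
	if requestsPerMinute < 1 {
		return 0
	}
	return time.Minute / time.Duration(requestsPerMinute)
}

func main() {
	stats := RunConcurrent(500, 100, func() { time.Sleep(time.Millisecond) })
	fmt.Printf("completed=%d peak=%d\n", stats.Completed, stats.Peak)
	fmt.Println("interval for 1000 req/min:", IntervalFor(1000))
}
```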


Performance SLAs

| Metric | Target |
| --- | --- |
| p50 latency | < 100ms |
| p95 latency | < 500ms |
| p99 latency | < 1s |
| Throughput | > 100 queries/sec |
| Memory | < 500MB |
| CPU | < 50% |
| Concurrent users | > 50 |
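
The SLA table can also be expressed as a single check that lists every violated target for a set of measurements. This is an illustrative sketch only: `Measured` and `Violations` are hypothetical, and "500MB" is read as decimal megabytes, which the source does not specify.

```go
package main

import (
	"fmt"
	"time"
)

// Measured holds the values observed during one benchmark run.
type Measured struct {
	P50, P95, P99   time.Duration
	QueriesPerSec   float64
	MemoryBytes     uint64
	CPUPercent      float64
	ConcurrentUsers int
}

// Violations returns the names of all SLA targets that m fails to meet,
// in the same order as the SLA table. A nil result means every target passed.
func Violations(m Measured) []string {
	var out []string
	check := func(ok bool, name string) {
		if !ok {
			out = append(out, name)
		}
	}
	check(m.P50 < 100*time.Millisecond, "p50 latency")
	check(m.P95 < 500*time.Millisecond, "p95 latency")
	check(m.P99 < time.Second, "p99 latency")
	check(m.QueriesPerSec > 100, "throughput")
	check(m.MemoryBytes < 500*1000*1000, "memory") // decimal MB (assumption)
	check(m.CPUPercent < 50, "cpu")
	check(m.ConcurrentUsers > 50, "concurrent users")
	return out
}

func main() {
	// Made-up measurements for illustration only.
	run := Measured{
		P50:             80 * time.Millisecond,
		P95:             620 * time.Millisecond,
		P99:             900 * time.Millisecond,
		QueriesPerSec:   140,
		MemoryBytes:     310 * 1000 * 1000,
		CPUPercent:      42,
		ConcurrentUsers: 60,
	}
	fmt.Println("violations:", Violations(run))
}
```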
