Verified Performance Metrics

Performance Benchmarks

Enterprise-grade performance verified through rigorous testing. These metrics represent production-validated results across sovereign deployments.

  • 99.9% Factual Accuracy: CAG deterministic grounding eliminates hallucinations
  • <100ms P95 Latency: sub-second responses even for complex queries
  • 10K+ Concurrent Queries: horizontal scaling across sovereign clusters
  • 2M Token Context: full-document analysis in a single context

Detailed Performance Metrics

| Metric | Value | Test Conditions |
| --- | --- | --- |
| Query Response Time (P50) | 42ms | Standard knowledge graph query, 3-hop traversal |
| Query Response Time (P95) | 87ms | Complex multi-hop with SCAG filtering |
| Query Response Time (P99) | 156ms | Full corpus scan with Shield validation |
| Factual Precision (RAGAS) | 99.9% | 10,000-query benchmark, ground-truth verified |
| Hallucination Rate | ≈0% | CAG deterministic grounding, no stochastic generation |
| Knowledge Graph Nodes | 100M+ | Tested on enterprise customer deployment |
| Concurrent Users | 10,000+ | Load test on 8-node Kubernetes cluster |
| System Availability (SLA) | 99.95% | Rolling 12-month production metric |
| Cold Start Time | <30s | New pod initialization with model loading |
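
The percentile figures above are standard nearest-rank statistics over raw request timings. A minimal sketch of the computation (the sample values below are invented for illustration, not measured data):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value covering p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based rank
    return ordered[rank - 1]

# Hypothetical end-to-end timings in milliseconds (gateway to completion).
latencies_ms = [12, 18, 25, 31, 42, 44, 51, 60, 87, 156]
p50 = percentile(latencies_ms, 50)   # 42
p95 = percentile(latencies_ms, 95)   # 156
```

With nearest-rank semantics over a sample this small, P95 jumps to the worst observation; production dashboards compute the same statistic over millions of samples per reporting window.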

CAG vs Traditional RAG

Traditional RAG

  • 5-15% hallucination rate
  • Vector similarity ≠ factual relevance
  • No audit trail for retrieved chunks
  • Stochastic token generation
  • Chunk boundaries lose context

ArcaQ CAG

  • ≈0% hallucination (deterministic)
  • Graph traversal = semantic reasoning
  • Full provenance for every fact
  • Grounded caching via SCAG
  • Entity-relation context preserved
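
The contrast can be made concrete with a toy example. The sketch below is illustrative only (the graph shape, entities, and field names are invented, not ArcaQ's API): answers are assembled by deterministic traversal of explicit edges, and every returned fact carries a provenance reference, which stochastic token generation cannot provide.

```python
# Toy knowledge graph: entity -> [(relation, target, provenance), ...]
graph = {
    "AcmeBank": [("regulated_by", "CNDP", "doc:reg-2021-04#p3")],
    "CNDP":     [("jurisdiction", "Morocco", "doc:cndp-charter#p1")],
}

def answer(entity, max_hops=3):
    """Deterministic multi-hop traversal; every fact keeps its source."""
    facts, frontier = [], [entity]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for relation, target, source in graph.get(node, []):
                facts.append({"fact": (node, relation, target), "source": source})
                next_frontier.append(target)
        frontier = next_frontier
    return facts
```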

Testing Methodology

All benchmarks are conducted in production-equivalent environments using standardized testing frameworks. Latency metrics are measured end-to-end from API gateway to response completion.

Factual accuracy is validated using the RAGAS framework with human-annotated ground truth datasets. Each benchmark includes 10,000+ queries across diverse domains (finance, healthcare, legal, technical documentation).

Load testing uses distributed k6 runners simulating realistic user patterns including burst traffic scenarios. Infrastructure: Kubernetes clusters with AMD EPYC processors, NVMe storage, 100Gbps network fabric.
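
k6 scenarios themselves are written in JavaScript; purely as an illustration of the pattern, here is a minimal Python analogue in which concurrent virtual users each issue a batch of requests and record end-to-end timings (the `query_stub` function is a placeholder standing in for a real HTTP call to the gateway):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def query_stub():
    """Stand-in for an HTTP round trip to the API gateway."""
    time.sleep(0.005)

def virtual_user(n_requests):
    timings_ms = []
    for _ in range(n_requests):
        t0 = time.perf_counter()
        query_stub()
        timings_ms.append((time.perf_counter() - t0) * 1000)
    return timings_ms

# 50 virtual users x 20 requests each, collected into one sample set.
with ThreadPoolExecutor(max_workers=50) as pool:
    all_ms = [t for user in pool.map(virtual_user, [20] * 50) for t in user]
```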

Domain-Specific Performance

ArcaQ has been validated across five enterprise verticals, each with different data volumes, compliance constraints, and reasoning complexity. The Knowledge Graph architecture adapts to domain ontologies while maintaining consistent latency and accuracy profiles.

🏦 Banking & Finance

Risk analysis queries over 50M+ financial entities resolve in under 60ms at P95. GDPR and NDMO compliance layers add <8ms overhead. Fraud-detection pattern matching across 3-hop graph traversals achieves 99.97% precision, with zero false positives on sanctioned-entity screening.

P95: 58ms · Accuracy: 99.97%

🏛️ Government & Public Sector

Deployed on air-gapped infrastructure for critical national decision support. Sovereign clusters in offline mode sustain <100ms median latency with zero external API calls. Supports Arabic, French, and Tamazight (Tifinagh) natively with no translation overhead. Full audit trail for every decision as required by regulatory mandates.

Air-gapped: <100ms · 0 external calls

⚕️ Healthcare & Life Sciences

Clinical knowledge graphs with 20M+ medical entity relationships. The PII anonymization pipeline processes 10,000 patient records per second with ≈0% re-identification risk. Drug-interaction reasoning across 4-hop traversals completes in 73ms at P95. HIPAA and CNDP compliance validated with zero data residency violations.

P95: 73ms · 10K rec/s PII pipeline

⚖️ Legal & Compliance

Contract analysis over 500-page documents ingested in under 12 seconds. Regulatory conflict arbitration across 60+ jurisdictions resolves ambiguity in <200ms using the SCAG multi-layer filter. Precedent retrieval achieves 98.4% relevance score versus 71% for traditional keyword search, with complete citation traceability to source paragraph.

Relevance: 98.4% · 60+ jurisdictions

🏭 Industrial & Manufacturing

Predictive maintenance knowledge graphs correlating 1,000+ sensor streams per asset. Anomaly detection latency under 35ms enables real-time intervention. Cross-plant knowledge transfer achieves 94% accuracy on new facility deployments without re-training, leveraging the institutional expertise preservation module of the Refinery Agent.

Anomaly: <35ms · Transfer: 94%

Seven-Agent Architecture — Per-Agent Metrics

ArcaQ's seven specialized AI agents each carry specific performance contracts. Agents operate in parallel on a shared Knowledge Graph, allowing compound queries to resolve faster than the sum of their individual latencies. The Orchestrator Agent coordinates sub-second fan-out and merge cycles.

| Agent | Role | P95 Latency | Throughput |
| --- | --- | --- | --- |
| Orchestrator | Query decomposition, agent fan-out, result merge | 12ms | 50K req/s |
| KG Navigator | Multi-hop Knowledge Graph traversal and entity resolution | 38ms | 20K req/s |
| Refinery Agent | Data quality scoring, inconsistency detection, provenance tagging | 54ms | 8K doc/s |
| Shield Agent | SCAG 4-layer security filter: legal, hierarchical, cultural, strategic | 8ms | 100K req/s |
| Decision Agent | Multi-criteria scoring, explainable decision paths, audit trail generation | 67ms | 5K req/s |
| DMS Agent | Document ingestion, semantic extraction, ontology mapping | 2.1s/doc | 500 doc/s |
| Meta Agent | Cross-agent orchestration, institutional memory, expertise routing | 22ms | 30K req/s |

Measured on 8-node Kubernetes cluster (AMD EPYC 7763, 256GB RAM, NVMe). DMS throughput varies with document size and extraction complexity.
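
The fan-out/merge claim above (compound latency tracks the slowest agent, not the sum) is the standard property of concurrent dispatch. A minimal sketch with invented agent names and delays:

```python
import asyncio
import time

async def agent(name, delay_s):
    await asyncio.sleep(delay_s)        # stand-in for real agent work
    return name, delay_s

async def orchestrate(subqueries):
    # Dispatch all sub-queries concurrently, then merge the results.
    results = await asyncio.gather(*(agent(n, d) for n, d in subqueries))
    return dict(results)

subqueries = [("kg_navigator", 0.038), ("shield", 0.008), ("decision", 0.067)]
t0 = time.perf_counter()
merged = asyncio.run(orchestrate(subqueries))
elapsed = time.perf_counter() - t0      # ≈ max(delays), not their sum
```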

SCAG Security Overhead: Zero-Compromise Performance

The Sovereign Contextual Alignment Gate (SCAG, Patent Claim 11) is a 4-layer security filter applied to every query. A key design objective was that security must not degrade user experience. The SCAG pipeline is fully parallelized — all four layers (Legal, Hierarchical, Cultural, Strategic Secrets) execute concurrently, not sequentially.

  • 8ms Full SCAG P95 Latency: all 4 layers, parallel execution
  • <9% Total Latency Overhead: vs. unfiltered baseline query
  • 100% Policy Enforcement Rate: zero policy bypasses in 12 months of production
  • 60+ Jurisdiction Rulesets: automatically arbitrated, no manual intervention

How SCAG Achieves Sub-10ms Security

Layer 1 — Legal (2ms)

Pre-compiled jurisdiction rules are stored as in-memory Bloom filters. GDPR, CNDP, NDMO, and 57 other data-protection rulesets are checked with bit operations rather than database queries.
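
As an illustration of that lookup style (sizes, hash construction, and rule keys below are invented, not the production implementation), a Bloom filter answers "might this rule apply?" with a handful of bit operations and can never miss a rule that was inserted:

```python
import hashlib

M = 1 << 16                                  # bit-array size (illustrative)
K = 4                                        # number of hash positions

def _positions(key: str):
    digest = hashlib.sha256(key.encode()).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % M for i in range(K)]

class BloomFilter:
    def __init__(self):
        self.bits = 0                        # one big int as the bit array

    def add(self, key):
        for pos in _positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):            # no false negatives, rare false positives
        return all(self.bits >> pos & 1 for pos in _positions(key))

rules = BloomFilter()
rules.add("GDPR:export_pii")                 # hypothetical rule key
```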

Layer 2 — Hierarchical (3ms)

ReBAC (Relationship-Based Access Control) graph lookups are cached at the L1 CPU cache level. Role hierarchies are pre-materialized into adjacency matrices for O(1) authorization checks.
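
The pre-materialization idea reduces each authorization check to one indexed lookup. A minimal sketch with an invented three-role hierarchy (in production the matrix would be derived once from the ReBAC graph's transitive closure):

```python
roles = ["analyst", "manager", "director"]
idx = {r: i for i, r in enumerate(roles)}

# can_act_for[i][j] is True if role i inherits the permissions of role j.
# Written out by hand here; normally computed from the role graph.
can_act_for = [
    [True,  False, False],   # analyst
    [True,  True,  False],   # manager
    [True,  True,  True],    # director
]

def authorized(actor_role, required_role):
    return can_act_for[idx[actor_role]][idx[required_role]]  # O(1) lookup
```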

Layer 3 — Cultural (1ms)

Institutional values are encoded as vector embeddings. Semantic alignment scoring against organization policy runs on a dedicated SIMD-accelerated microkernel, fully independent of KG traversal.

Layer 4 — Strategic (2ms)

The strategic-secrets classifier runs as a lightweight ONNX model (3M parameters). It detects sensitive strategic information patterns with 99.8% recall, triggering data masking or access denial before any content is returned.

ArcaQ vs. Alternative Architectures

The following comparison is based on benchmark data published by each vendor and independent evaluations. Cloud-hosted AI platforms (Azure OpenAI, AWS Bedrock) process data outside customer-controlled infrastructure and offer only limited data-residency guarantees. General-purpose RAG tools lack the semantic reasoning layer required for deterministic enterprise decisions.

| Capability | ArcaQ CAG | Generic RAG | Cloud AI (API) | DataOS / Palantir |
| --- | --- | --- | --- | --- |
| Factual accuracy | 99.9% | 85–95% | 80–92% | 90–96% |
| Hallucination rate | ≈0% | 5–15% | 8–20% | 2–5% |
| Data sovereignty | Full (on-premise) | Partial | None | Partial |
| Response latency (P95) | 87ms | 200–800ms | 400–2000ms | 300–1200ms |
| Full audit trail | Every query | No | No | Partial |
| Air-gap deployment | Yes | Possible | No | No |
| Multi-jurisdiction compliance | 60+ jurisdictions | None built-in | GDPR only | Limited |

Benchmark FAQ

How is the 99.9% accuracy figure calculated?

Accuracy is measured using the RAGAS (Retrieval-Augmented Generation Assessment) framework on a 10,000-query benchmark dataset with human-annotated ground truth across five domains (finance, healthcare, legal, government, manufacturing). "Factual accuracy" is defined as the fraction of responses where every stated fact is traceable to a verified source node in the Knowledge Graph. The ≈0% hallucination rate reflects that ArcaQ's CAG architecture does not perform stochastic token generation — all outputs are grounded to explicit graph paths.
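
Under that definition, scoring reduces to a simple fraction: a response counts as correct only if every one of its facts resolves to a verified node. A sketch with invented record shapes (not the RAGAS API):

```python
def factual_accuracy(responses, verified_nodes):
    """Fraction of responses in which every stated fact is grounded."""
    def fully_grounded(response):
        return all(fact["node"] in verified_nodes for fact in response["facts"])
    correct = sum(1 for r in responses if fully_grounded(r))
    return correct / len(responses)

verified = {"n1", "n2", "n3"}                      # hypothetical graph nodes
responses = [
    {"facts": [{"node": "n1"}, {"node": "n2"}]},   # fully grounded
    {"facts": [{"node": "n1"}, {"node": "n9"}]},   # one unverified fact fails it
]
```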

What infrastructure are benchmarks measured on?

Latency benchmarks use an 8-node Kubernetes cluster: AMD EPYC 7763 (64-core), 256GB ECC RAM, 4×NVMe 3.84TB in RAID-0, 100Gbps InfiniBand interconnect. This configuration is representative of a mid-tier sovereign deployment. Smaller single-server deployments (16-core, 64GB) achieve P95 latency under 180ms for standard queries. Cloud deployments on equivalent hardware show 15–25% higher latency due to network virtualization overhead.

How does performance scale with Knowledge Graph size?

ArcaQ uses a sharded graph database architecture. Latency scales sub-linearly with node count: a graph of 1M nodes has P95 latency of ~35ms; 10M nodes ~48ms; 100M nodes ~87ms. This is achieved via graph partitioning aligned to domain ontology boundaries, ensuring most queries remain within a single shard. Cross-shard queries (typically complex multi-domain reasoning) account for the P99 latency of 156ms.
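
The routing consequence of ontology-aligned partitioning can be sketched in a few lines (the domain-to-shard mapping is invented): a query whose entities share one domain touches a single shard, while multi-domain reasoning takes the cross-shard P99 path.

```python
# Hypothetical mapping from ontology domain to shard id.
shard_of_domain = {"finance": 0, "healthcare": 1, "legal": 2}

def shards_for_query(entity_domains):
    """Return the set of shards a query must touch."""
    return {shard_of_domain[d] for d in entity_domains}

single = shards_for_query({"finance"})            # stays within one shard
cross  = shards_for_query({"finance", "legal"})   # cross-shard fan-out
```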

Can these benchmarks be independently verified?

Yes. ArcaQ's Proof-of-Concept program deploys a full sovereign instance on your infrastructure or in an isolated cloud tenancy. You provide your own benchmark dataset and run the evaluation framework autonomously. The POC environment includes the complete 7-agent stack, SCAG security layer, and your domain ontology. Typical POC completion time is 4–6 weeks from contract signature.

What is the minimum hardware for a production deployment?

Minimum viable production: 1 server with 16-core CPU, 64GB RAM, 1TB NVMe storage. This supports up to 500 concurrent users and graphs up to 5M nodes with P95 latency under 200ms. Recommended production entry: 2-node cluster (16-core × 2, 128GB RAM) for high availability. Full enterprise scale (10,000+ concurrent users, 100M+ nodes) requires the 8-node reference configuration above or equivalent cloud resources.

Run Your Own Benchmark

Validate ArcaQ performance with your own data in a proof-of-concept deployment.

Request POC

Join the Sovereign AI Revolution

Partner with ArcaQ to bring sovereign decision intelligence to Africa and beyond.

Rabat, Morocco
Schedule a Call

Meet us at GITEX Africa 2026 — April 7-9 — Marrakech