Performance Benchmarks
Enterprise-grade performance verified through rigorous testing. These metrics represent production-validated results across sovereign deployments.
Detailed Performance Metrics
CAG vs Traditional RAG
Traditional RAG
- ❌ 5-15% hallucination rate
- ❌ Vector similarity ≠ factual relevance
- ❌ No audit trail for retrieved chunks
- ❌ Stochastic token generation
- ❌ Chunk boundaries lose context
ArcaQ CAG
- ✓ ≈0% hallucination (deterministic)
- ✓ Graph traversal = semantic reasoning
- ✓ Full provenance for every fact
- ✓ Grounded caching via SCAG
- ✓ Entity-relation context preserved
Testing Methodology
All benchmarks are conducted in production-equivalent environments using standardized testing frameworks. Latency metrics are measured end-to-end from API gateway to response completion.
Factual accuracy is validated using the RAGAS framework with human-annotated ground truth datasets. Each benchmark includes 10,000+ queries across diverse domains (finance, healthcare, legal, technical documentation).
Load testing uses distributed k6 runners simulating realistic user patterns including burst traffic scenarios. Infrastructure: Kubernetes clusters with AMD EPYC processors, NVMe storage, 100Gbps network fabric.
Domain-Specific Performance
ArcaQ has been validated across five enterprise verticals, each with different data volumes, compliance constraints, and reasoning complexity. The Knowledge Graph architecture adapts to domain ontologies while maintaining consistent latency and accuracy profiles.
Banking & Finance
Risk analysis queries over 50M+ financial entities resolved in under 60ms P95. GDPR and NDMO compliance layers add <8ms overhead. Fraud-detection pattern matching across 3-hop graph traversals achieves 99.97% precision; sanctioned-entity screening recorded zero false positives.
Government & Public Sector
Deployed on air-gapped infrastructure for critical national decision support. Sovereign clusters in offline mode sustain <100ms median latency with zero external API calls. Supports Arabic, French, and Tamazight (Tifinagh) natively with no translation overhead. Full audit trail for every decision as required by regulatory mandates.
Healthcare & Life Sciences
Clinical knowledge graphs with 20M+ medical entity relationships. PII anonymisation pipeline processes 10,000 patient records per second with ≈0% re-identification risk. Drug interaction reasoning across 4-hop traversal completes in 73ms P95. HIPAA and CNDP compliance validated with zero data residency violations.
Legal & Compliance
Contract analysis ingests 500-page documents in under 12 seconds. Regulatory conflict arbitration across 60+ jurisdictions resolves ambiguity in <200ms using the SCAG multi-layer filter. Precedent retrieval achieves a 98.4% relevance score versus 71% for traditional keyword search, with complete citation traceability to the source paragraph.
Industrial & Manufacturing
Predictive maintenance knowledge graphs correlating 1,000+ sensor streams per asset. Anomaly detection latency under 35ms enables real-time intervention. Cross-plant knowledge transfer achieves 94% accuracy on new facility deployments without re-training, leveraging the institutional expertise preservation module of the Refinery Agent.
Seven-Agent Architecture — Per-Agent Metrics
ArcaQ's seven specialized AI agents each carry specific performance contracts. Agents operate in parallel on a shared Knowledge Graph, allowing compound queries to resolve faster than the sum of their individual latencies. The Orchestrator Agent coordinates sub-second fan-out and merge cycles.
Measured on an 8-node Kubernetes cluster (AMD EPYC 7763, 256GB RAM, NVMe). DMS throughput varies with document size and extraction complexity.
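The parallel fan-out/merge behavior described above — wall-clock time tracking the slowest agent rather than the sum of all agents — can be illustrated with a minimal concurrency sketch. The agent names and latencies below are invented for illustration; they do not reflect the actual Orchestrator implementation.

```python
# Hedged sketch of orchestrator fan-out/merge: sub-queries are dispatched
# to agents concurrently, so total latency approximates the slowest agent
# (0.08s here), not the sequential sum (0.16s).
import asyncio
import time

async def agent(name, latency_s):
    await asyncio.sleep(latency_s)   # stands in for real agent work
    return f"{name}:ok"

async def orchestrate():
    tasks = [agent("retriever", 0.05), agent("reasoner", 0.08), agent("compliance", 0.03)]
    return await asyncio.gather(*tasks)   # fan-out, then merge

start = time.perf_counter()
results = asyncio.run(orchestrate())
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))
```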
SCAG Security Overhead: Zero-Compromise Performance
The Sovereign Contextual Alignment Gate (SCAG, Patent Claim 11) is a 4-layer security filter applied to every query. A key design objective was that security must not degrade user experience. The SCAG pipeline is fully parallelized — all four layers (Legal, Hierarchical, Cultural, Strategic Secrets) execute concurrently, not sequentially.
How SCAG Achieves Sub-10ms Security
Pre-compiled jurisdiction rules are stored as in-memory Bloom filters. GDPR, CNDP, NDMO, and 57 other data-protection laws are checked with bit operations, not database queries.
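The Bloom-filter idea can be sketched as follows. This is a toy illustration, not the production filter: the hash construction, bit-array size, and rule-key format are all invented. A Bloom filter never misses a compiled rule (no false negatives), so a negative answer skips the database entirely; a positive answer escalates to the exact rule table.

```python
# Toy Bloom filter for prohibited (jurisdiction, data-category) rules.
# Rules are hashed into a bit array at compile time; query-time checks
# are a few hashes plus bit tests, with no database round-trip.
import hashlib

M = 1 << 16   # bit-array size (illustrative)
K = 3         # number of hash probes (illustrative)

def _probes(key: str):
    for i in range(K):
        h = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        yield int.from_bytes(h, "big") % M

def compile_rules(prohibited):
    bits = 0
    for rule in prohibited:
        for p in _probes(rule):
            bits |= 1 << p
    return bits

def maybe_prohibited(bits, rule):
    # False => definitely not a compiled rule (fast allow);
    # True  => possible hit, escalate to the exact rule table.
    return all(bits >> p & 1 for p in _probes(rule))

bits = compile_rules({"GDPR:pii-export", "CNDP:health-transfer"})
print(maybe_prohibited(bits, "GDPR:pii-export"))
```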
ReBAC (Relationship-Based Access Control) graph lookups are cached at the L1 CPU cache level. Role hierarchies are pre-materialized into adjacency matrices for O(1) authorization checks.
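Pre-materialization means the transitive closure of the role hierarchy is computed once at startup, so each authorization check reduces to a single table lookup. The sketch below uses invented role names and Warshall's algorithm; it illustrates the technique, not the actual ReBAC engine.

```python
# Toy pre-materialized role authorization: build the transitive closure
# of the "inherits-from" hierarchy once, then answer every check with
# one matrix lookup (O(1) per check).
roles = ["analyst", "manager", "director", "admin"]
idx = {r: i for i, r in enumerate(roles)}
n = len(roles)

# Reflexive base matrix (a role can always act as itself)
closure = [[i == j for j in range(n)] for i in range(n)]

# Direct inheritance edges (illustrative)
for hi, lo in [("manager", "analyst"), ("director", "manager"), ("admin", "director")]:
    closure[idx[hi]][idx[lo]] = True

# Warshall's algorithm: materialize all indirect inheritance paths
for k in range(n):
    for i in range(n):
        if closure[i][k]:
            for j in range(n):
                closure[i][j] = closure[i][j] or closure[k][j]

def can_act_as(role, target):
    # O(1) authorization check against the pre-materialized matrix
    return closure[idx[role]][idx[target]]

print(can_act_as("admin", "analyst"))     # inherited through two hops
print(can_act_as("analyst", "manager"))   # no upward inheritance
```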
Institutional values are encoded as vector embeddings. Semantic alignment scoring against organization policy runs on a dedicated SIMD-accelerated microkernel, fully independent of KG traversal.
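Alignment scoring of this kind typically reduces to a cosine similarity between a query embedding and pre-computed policy embeddings, with a threshold deciding pass or flag. The vectors and threshold below are toy values chosen for illustration; the real embeddings and scoring kernel are not shown here.

```python
# Minimal cosine-similarity alignment check: queries scoring below the
# policy threshold are flagged for review instead of being answered.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

policy = [0.2, 0.9, 0.1]            # pre-computed policy embedding (toy)
aligned_query = [0.25, 0.85, 0.05]
off_policy_query = [0.9, -0.4, 0.3]

THRESHOLD = 0.8
for q in (aligned_query, off_policy_query):
    score = cosine(q, policy)
    print("pass" if score >= THRESHOLD else "flag", round(score, 3))
```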
The strategic-secrets classifier runs as a lightweight ONNX model (3M parameters). It detects sensitive strategic information patterns with 99.8% recall, triggering data masking or access denial before any content is returned.
ArcaQ vs. Alternative Architectures
The following comparison is based on benchmark data published by each vendor and on independent evaluations. Cloud-hosted AI platforms (Azure OpenAI, AWS Bedrock) aggregate user data and provide no data residency guarantees. General-purpose RAG tools lack the semantic reasoning layer required for deterministic enterprise decisions.
Benchmark FAQ
How is the 99.9% accuracy figure calculated?
Accuracy is measured using the RAGAS (Retrieval-Augmented Generation Assessment) framework on a 10,000-query benchmark dataset with human-annotated ground truth across five domains (finance, healthcare, legal, government, manufacturing). "Factual accuracy" is defined as the fraction of responses where every stated fact is traceable to a verified source node in the Knowledge Graph. The ≈0% hallucination rate reflects that ArcaQ's CAG architecture does not perform stochastic token generation — all outputs are grounded to explicit graph paths.
What infrastructure are benchmarks measured on?
Latency benchmarks use an 8-node Kubernetes cluster: AMD EPYC 7763 (64-core), 256GB ECC RAM, 4×NVMe 3.84TB in RAID-0, 100Gbps InfiniBand interconnect. This configuration is representative of a mid-tier sovereign deployment. Smaller single-server deployments (16-core, 64GB) achieve P95 latency under 180ms for standard queries. Cloud deployments on equivalent hardware show 15–25% higher latency due to network virtualization overhead.
How does performance scale with Knowledge Graph size?
ArcaQ uses a sharded graph database architecture. Latency scales sub-linearly with node count: a graph of 1M nodes has P95 latency of ~35ms; 10M nodes ~48ms; 100M nodes ~87ms. This is achieved via graph partitioning aligned to domain ontology boundaries, ensuring most queries remain within a single shard. Cross-shard queries (typically complex multi-domain reasoning) account for the P99 latency of 156ms.
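The sub-linear scaling described above follows from ontology-aligned partitioning: a query that touches a single domain stays inside one shard, and only multi-domain reasoning pays the cross-shard cost. The sketch below illustrates that routing decision with invented domain names and shard assignments.

```python
# Toy ontology-aligned shard routing: single-domain queries resolve
# within one shard; multi-domain queries trigger a cross-shard plan.
shard_of = {"finance": 0, "healthcare": 1, "legal": 2, "manufacturing": 3}

def route(query_domains):
    shards = {shard_of[d] for d in query_domains}
    plan = "single-shard" if len(shards) == 1 else "cross-shard"
    return plan, sorted(shards)

print(route({"finance"}))            # stays within shard 0
print(route({"finance", "legal"}))   # fans out to shards 0 and 2
```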
Can these benchmarks be independently verified?
Yes. ArcaQ's Proof-of-Concept program deploys a full sovereign instance on your infrastructure or in an isolated cloud tenancy. You provide your own benchmark dataset and run the evaluation framework autonomously. The POC environment includes the complete seven-agent stack, SCAG security layer, and your domain ontology. Typical POC completion time is 4–6 weeks from contract signature.
What is the minimum hardware for a production deployment?
Minimum viable production: 1 server with 16-core CPU, 64GB RAM, 1TB NVMe storage. This supports up to 500 concurrent users and graphs up to 5M nodes with P95 latency under 200ms. Recommended production entry: 2-node cluster (16-core × 2, 128GB RAM) for high availability. Full enterprise scale (10,000+ concurrent users, 100M+ nodes) requires the 8-node reference configuration above or equivalent cloud resources.
Run Your Own Benchmark
Validate ArcaQ performance with your own data in a proof-of-concept deployment.
Request POC