Two Knowledge Worlds, One Platform
In most organizations, knowledge lives in two completely disconnected worlds. The first is institutional: databases with customer records and compliance rules, document repositories holding years of decisions, APIs exposing operational data. It's structured, vast — and almost impossible to navigate without knowing exactly what you're looking for.
The second world is personal: the expert who has read every report and knows which ones actually matter, the analyst whose annotations contain more insight than the document itself, the architect whose mental model spans ten years of technology choices. This knowledge is rich, contextual, irreplaceable — and completely invisible to the organization.
ArcaQ's insight: These two worlds don't just need to coexist — they need to actively enrich each other. Every expert annotation should strengthen the institutional graph. Every company entity should surface relevant personal notes. This is what the dual graph architecture enables.
The Living Encyclopedia: Multi-Domain Named Graphs
The Living Encyclopedia is ArcaQ's company-wide knowledge graph, stored in Apache Jena TDB2. What makes it "living" isn't the volume of data but the architecture: knowledge is organized into SPARQL named graphs, one per domain. Each domain (Compliance, Finance, Customer, Travel, Operations, MLOps, Catalog) lives at its own URI, for example:
http://arcaq.ai/graph/compliance
http://arcaq.ai/graph/finance
http://arcaq.ai/graph/customer
http://arcaq.ai/graph/travel
http://arcaq.ai/graph/operations
http://arcaq.ai/graph/mlops
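As a rough illustration of how per-domain named graphs might be queried, the sketch below builds a SPARQL query restricted to a single domain's graph. Only the URI pattern comes from the list above; the helper names `graph_uri` and `domain_query` are hypothetical and not part of any ArcaQ API.

```python
# Sketch only: helper names are hypothetical, not ArcaQ's API.
GRAPH_BASE = "http://arcaq.ai/graph/"
DOMAINS = {"compliance", "finance", "customer", "travel", "operations", "mlops"}


def graph_uri(domain: str) -> str:
    """Return the named graph URI for a known domain."""
    key = domain.lower()
    if key not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return GRAPH_BASE + key


def domain_query(domain: str, limit: int = 25) -> str:
    """Build a SPARQL query that only reads triples from one domain graph."""
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  GRAPH <{graph_uri(domain)}> {{ ?s ?p ?o }}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(domain_query("Compliance", limit=10))
```

Because the `GRAPH` clause names exactly one graph, triples stored in the other domains never enter the result set.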
Each named graph is populated by the Brain Ingestion pipeline — a connector network that reaches every data source in your organization. The pipeline ingests SQL database schemas from PostgreSQL, MySQL, SQLite, MSSQL and Oracle; document content from PDF, DOCX, XLSX, CSV and Markdown; object storage from S3 and MinIO; REST API responses; MCP protocol resources; and Obsidian vault notes.
For each source document, the pipeline runs two parallel operations: a local LLM (Ollama, running on-premise) extracts RDF subject–predicate–object triples and assigns them to the appropriate domain named graph, and a sentence encoder model generates semantic vectors that are written to Qdrant (collection: arcaq_knowledge). The result is bimodal knowledge: an ontology you can traverse as a graph, plus embeddings you can search by vector similarity.
Key invariant: No data leaves your infrastructure. The LLM runs locally via Ollama. Jena and Qdrant are deployed as Kubernetes pods in your cluster. The entire pipeline can run in an air-gapped environment.
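The two parallel operations can be pictured with a minimal sketch like the one below. It is not the actual pipeline: `extract_triples` and `embed` are placeholder callables standing in for the local LLM and the sentence encoder, and their signatures are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, NamedTuple

# Sketch only: the two callables stand in for the local LLM and the
# sentence encoder; their signatures are assumptions.
Triple = tuple[str, str, str]
TaggedTriple = tuple[str, Triple]  # (named graph URI, triple)


class IngestResult(NamedTuple):
    triples_by_graph: dict[str, list[Triple]]
    vector: list[float]


def ingest_document(
    text: str,
    extract_triples: Callable[[str], list[TaggedTriple]],
    embed: Callable[[str], list[float]],
) -> IngestResult:
    """Run triple extraction and embedding side by side for one document."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        triples_future = pool.submit(extract_triples, text)
        vector_future = pool.submit(embed, text)
        tagged = triples_future.result()
        vector = vector_future.result()

    # Group extracted triples by the domain graph the extractor assigned.
    by_graph: dict[str, list[Triple]] = {}
    for graph, triple in tagged:
        by_graph.setdefault(graph, []).append(triple)
    return IngestResult(by_graph, vector)


# Toy stand-ins so the sketch runs without Ollama or an encoder model.
def fake_extract(text: str) -> list[TaggedTriple]:
    return [("http://arcaq.ai/graph/compliance",
             ("ex:Policy42", "ex:governs", "ex:CustomerData"))]


def fake_embed(text: str) -> list[float]:
    return [float(len(text)), float(text.count(" "))]


result = ingest_document("GDPR policy text", fake_extract, fake_embed)
```

In a real deployment the grouped triples would be written to Jena and the vector to Qdrant; both writes are left out here to keep the sketch free of external dependencies.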
Interactive Domain Explorer
The Living Encyclopedia exposes a D3.js force-directed graph explorer — a real-time visualization of all entities and their connections across domains. Users can filter by domain to focus on a single named graph, click any node to inspect its properties and URI, follow cross-domain edges to trace knowledge that spans multiple departments, and zoom to navigate graphs with thousands of nodes.
Cross-domain edges are detected automatically: when a Compliance entity references a Customer entity, the graph surfaces that connection as a purple dashed edge, a visual signal that knowledge from two separate named graphs has been bridged. The nodes at either end of these edges are the most valuable in the graph: entities that sit at the intersection of multiple domains carry the most organizational context.
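One plausible way to detect such edges is sketched below. It assumes an entity "belongs" to every named graph in which it appears as a subject, and flags any triple whose object belongs to a different graph. That rule and the `ex:` identifiers are assumptions for illustration, not a description of ArcaQ's detector.

```python
# Sketch only: a quad is (subject, predicate, object, graph_uri).
Quad = tuple[str, str, str, str]


def cross_domain_edges(quads: list[Quad]) -> list[Quad]:
    """Return quads whose object is described as a subject in another graph."""
    home_graphs: dict[str, set[str]] = {}
    for s, _p, _o, g in quads:
        home_graphs.setdefault(s, set()).add(g)

    bridges = []
    for s, p, o, g in quads:
        # Graphs other than the current one where the object is a subject.
        other_graphs = home_graphs.get(o, set()) - {g}
        if other_graphs:
            bridges.append((s, p, o, g))
    return bridges


COMPLIANCE = "http://arcaq.ai/graph/compliance"
CUSTOMER = "http://arcaq.ai/graph/customer"

quads = [
    ("ex:RetentionRule", "ex:appliesTo", "ex:Customer17", COMPLIANCE),
    ("ex:RetentionRule", "ex:label", "Retention", COMPLIANCE),
    ("ex:Customer17", "ex:name", "Acme", CUSTOMER),
]
bridges = cross_domain_edges(quads)
```

Here only the `ex:appliesTo` quad qualifies, because its object `ex:Customer17` is described in the Customer graph while the edge itself lives in Compliance.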
The Personal Second Brain: Your Private Knowledge Graph
Every ArcaQ user gets their own isolated knowledge graph, stored at a user-scoped named graph URI:
http://arcaq.ai/personal/{user_id}
This graph is entirely private — no other user can query it, no company-wide analytics touch it. It stores everything the user explicitly adds: personal notes, bookmarks, insights, meeting notes, expert corrections to company knowledge, auto-discovered connections to institutional entities. Each item is an RDF resource with full metadata: type, tags, creation timestamp, source URL, content.
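To make the shape of a stored item concrete, the following sketch builds a SPARQL `INSERT DATA` update scoped to one user's named graph. Only the graph URI pattern comes from the text. The `arcaq:` namespace URI, the property names, and the `/item/{id}` resource pattern are hypothetical.

```python
import re
from datetime import datetime, timezone

# Sketch only: the arcaq: namespace, property names, and item URI pattern
# are hypothetical. Only the personal graph URI pattern is from the text.
PERSONAL_BASE = "http://arcaq.ai/personal/"
SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def insert_item(user_id: str, item_id: str, item_type: str,
                content: str, tags: list[str], created: datetime) -> str:
    """Build a SPARQL INSERT DATA update scoped to one user's graph."""
    for value in (user_id, item_id):
        if not SAFE_ID.fullmatch(value):
            raise ValueError(f"unsafe identifier: {value!r}")
    graph = f"{PERSONAL_BASE}{user_id}"
    subject = f"<{graph}/item/{item_id}>"
    pairs = [
        ("arcaq:type", f'"{escape_literal(item_type)}"'),
        ("arcaq:content", f'"{escape_literal(content)}"'),
        ("arcaq:created", f'"{created.isoformat()}"^^xsd:dateTime'),
    ]
    pairs += [("arcaq:tag", f'"{escape_literal(t)}"') for t in tags]
    body = " ;\n      ".join(f"{p} {o}" for p, o in pairs)
    return (
        "PREFIX arcaq: <http://arcaq.ai/ns#>\n"
        "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
        f"INSERT DATA {{ GRAPH <{graph}> {{\n"
        f"  {subject} {body} .\n"
        "} }"
    )


update = insert_item(
    "u123", "n1", "note", 'Renewal "risk" flagged', ["gdpr", "contracts"],
    datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
)
print(update)
```

Writing every item inside a `GRAPH` block keyed by the user ID is what keeps one user's notes out of another user's queries, provided the query layer only ever targets the caller's own graph.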
Behavior Profile: Understanding How You Think
Beyond storing items, the Personal Second Brain continuously analyzes your knowledge graph to infer a behavior profile. By examining the types, tags, creation times, and frequency of your saved items, it derives:
- Learning style — are you analytical (structured, taxonomy-driven), exploratory (broad, cross-domain), or a knowledge curator (high volume, organized categories)?
- Top topics — which tags appear most in your saved knowledge, weighted by recency
- Active hours — when during the day your knowledge activity peaks
- Items analyzed: how many saved items fall within a configurable time window (default 90 days)
This profile is displayed as a strip at the top of the Second Brain view — a learning style badge, your top tags, and activity metadata. It's not surveillance: only your own saved explicit knowledge drives the profile calculation. Passive browsing is never tracked.
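A profile of this kind could be computed roughly as follows. The sketch covers top topics, peak activity hour, and the item count. The exponential half-life weighting and the dictionary field names are assumptions, since the text does not say how recency is weighted. Learning-style classification is left out because its rules are not described.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Sketch only: the half-life decay and the item field names are assumptions.


def behavior_profile(items, now, window_days=90, half_life_days=30.0, top_n=3):
    """Derive recency-weighted top topics and the peak activity hour."""
    cutoff = now - timedelta(days=window_days)
    recent = [it for it in items if it["created"] >= cutoff]

    tag_weight = Counter()
    hours = Counter()
    for it in recent:
        age_days = (now - it["created"]).total_seconds() / 86400
        weight = 0.5 ** (age_days / half_life_days)  # newer items count more
        for tag in it["tags"]:
            tag_weight[tag] += weight
        hours[it["created"].hour] += 1

    return {
        "items_analyzed": len(recent),
        "top_topics": [tag for tag, _ in tag_weight.most_common(top_n)],
        "peak_hour": hours.most_common(1)[0][0] if hours else None,
    }


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


items = [
    {"created": utc(2024, 6, 29, 9), "tags": ["gdpr", "contracts"]},
    {"created": utc(2024, 6, 20, 9), "tags": ["gdpr"]},
    {"created": utc(2024, 5, 1, 14), "tags": ["pricing"]},
    {"created": utc(2024, 1, 1, 10), "tags": ["legacy"]},  # outside the window
]
profile = behavior_profile(items, now=utc(2024, 6, 30, 12))
```

The input is only the list of items the user saved, which matches the point above that passive browsing plays no part in the calculation.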
The Bridge: Auto-Linking Personal Notes to Company Knowledge
The dual graph architecture delivers full value when personal knowledge connects to institutional knowledge. This is what the auto-linking mechanism does — and it's entirely powered by local LLM inference.
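When you trigger Discover Links, a local Ollama LLM compares your personal notes against company entities and writes an arcaq:autoLinkedTo edge for each match it finds.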
The result is visible immediately in the graph: dashed pink edges connecting your personal nodes to company entities. These edges create navigable paths between your expertise and the institutional knowledge — when you click a personal note, you see which company entities it connects to; when you explore the Living Encyclopedia, nodes connected to your personal graph are highlighted.
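The linking step can be sketched as a comparison loop that emits `arcaq:autoLinkedTo` triples. In the sketch below the `score` callable stands in for the local LLM comparison. The token-overlap scorer, the 0.3 threshold, and the example URIs are placeholders chosen so the code runs on its own, not a description of how ArcaQ ranks candidates.

```python
# Sketch only: `score` stands in for the local LLM comparison; the
# threshold, the toy scorer, and the example URIs are assumptions.


def discover_links(notes, entities, score, threshold=0.3):
    """Return (note, arcaq:autoLinkedTo, entity) triples for strong matches."""
    links = []
    for note_uri, note_text in notes.items():
        for entity_uri, entity_label in entities.items():
            if score(note_text, entity_label) >= threshold:
                links.append((note_uri, "arcaq:autoLinkedTo", entity_uri))
    return links


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets, a toy replacement for an LLM."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


notes = {"http://arcaq.ai/personal/u123/item/n1": "gdpr consent renewal"}
entities = {
    "http://arcaq.ai/graph/compliance/GdprConsent": "gdpr consent policy",
    "http://arcaq.ai/graph/finance/Invoice": "invoice ledger",
}
links = discover_links(notes, entities, token_overlap)
```

Each returned triple corresponds to one dashed edge from a personal node to a company entity.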
"The auto-link isn't a search suggestion — it's a semantic bridge. Your note about a customer issue connects to the Compliance entity that governs it, the Finance entity that tracks its cost, and the Operations entity that owns the resolution. One click, full context."
Enriching the Encyclopedia
The connection works in both directions. When a user marks a knowledge entity as an Expert Correction — a case where their personal expertise contradicts or refines what the ingested data says — that correction is visible in the Living Encyclopedia as an expert-validated annotation. The company graph becomes more accurate over time as the people who know best contribute their corrections.
This is the flywheel: the Living Encyclopedia feeds the Personal Second Brain with institutional context; the Personal Second Brain enriches the Living Encyclopedia with human judgment. Both graphs become more valuable with every interaction.
Real Data — Not Demo Data
A common failure in enterprise knowledge management systems is a polished UI sitting on top of an empty database. ArcaQ is built for production data integrity. The Brain Ingestion pipeline includes completeness gates: a run fails if document coverage drops below 95% or vector coverage drops below 90%. Changed documents are detected by SHA-256 content hashing and re-ingested automatically. Stale vectors are removed from Qdrant, and dead-letter failures are recorded in Redis for inspection.
For SQL databases, the system extracts full table schemas (every column name, type, and nullability) and builds ontology triples from the schema structure. This means your database tables become first-class knowledge entities in the graph. For example, a customer_contracts table with a gdpr_consent_date column automatically generates a Compliance-domain entity that links to your customer entities and your compliance policy entities.
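For SQLite, one of the supported sources, schema extraction along these lines could look like the sketch below. It reads column metadata through SQLite's `pragma_table_info` table-valued function and emits one set of triples per column. The `db:` vocabulary is hypothetical, and the step that assigns an entity to the Compliance domain is not shown because the text does not describe it.

```python
import sqlite3

# Sketch only: the db: vocabulary is hypothetical.


def schema_triples(conn: sqlite3.Connection, table: str):
    """Turn one table's columns into subject-predicate-object triples."""
    # pragma_table_info columns: cid, name, type, notnull, dflt_value, pk
    rows = conn.execute(
        "SELECT * FROM pragma_table_info(?)", (table,)
    ).fetchall()
    if not rows:
        raise ValueError(f"no such table: {table}")

    table_node = f"db:{table}"
    triples = [(table_node, "rdf:type", "db:Table")]
    for _cid, name, sql_type, notnull, _default, _pk in rows:
        column_node = f"db:{table}.{name}"
        triples.append((table_node, "db:hasColumn", column_node))
        triples.append((column_node, "db:sqlType", sql_type))
        triples.append((column_node, "db:nullable", "false" if notnull else "true"))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_contracts ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL,"
    " gdpr_consent_date TEXT)"
)
triples = schema_triples(conn, "customer_contracts")
for triple in triples:
    print(triple)
```

Other engines expose the same information through their own catalogs (for example `information_schema.columns`), so only the metadata query would change per connector.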
The Knowledge Status panel in the Second Brain view shows live counts: how many triples are indexed in Jena, how many vectors are in Qdrant, and how many items are in your personal graph. These figures are recomputed on every refresh, with no cached numbers and no estimated totals.
Next: Role-Based Graph Scoping with SCAG
The current Living Encyclopedia returns company knowledge without access restriction — it's designed for environments where all authenticated users can consult all institutional knowledge. For organizations with stricter data governance, the next release introduces role-based graph scoping through SCAG (Semantic Context-Aware Guard).
SCAG, backed by OpenFGA's relationship-based access control (ReBAC), will allow each user's role to determine which named graphs they can query. A finance analyst will only see the Finance and Compliance named graphs; a customer success manager will see Customer and Operations. Cross-domain edges will respect the intersection of visible graphs. The auto-link mechanism will only propose connections to entities the user is authorized to see.
Architecture invariant: Role-based scoping is applied at the SPARQL query layer — it's not a UI filter you can bypass. Named graph selection is driven by the user's verified role from the Keycloak access token, enforced by SCAG before any triple reaches the application layer.
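Since SCAG is not yet released, the following is only a sketch of what query-layer scoping could look like. The role names and the role-to-graph mapping mirror the two examples given above. The OpenFGA check and Keycloak token handling are not modelled, and treating a cross-domain edge as visible only when both endpoint graphs are visible is one interpretation of "the intersection of visible graphs".

```python
# Sketch only: role names and mapping are illustrative; OpenFGA and
# Keycloak are not modelled here.
GRAPH_BASE = "http://arcaq.ai/graph/"
ROLE_GRAPHS = {
    "finance_analyst": {"finance", "compliance"},
    "customer_success_manager": {"customer", "operations"},
}


def visible_graphs(roles):
    """Return the sorted named graph URIs that the given roles may query."""
    allowed = set()
    for role in roles:
        allowed |= ROLE_GRAPHS.get(role, set())
    return sorted(GRAPH_BASE + domain for domain in allowed)


def scoped_query(roles, limit=25):
    """Build a SPARQL query that can only bind ?g to visible graphs."""
    graphs = visible_graphs(roles)
    if not graphs:
        raise PermissionError("no named graphs visible for these roles")
    values = " ".join(f"<{g}>" for g in graphs)
    return (
        "SELECT ?g ?s ?p ?o WHERE {\n"
        f"  VALUES ?g {{ {values} }}\n"
        "  GRAPH ?g { ?s ?p ?o }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def edge_visible(source_graph, target_graph, roles):
    """A cross-domain edge is shown only if both endpoint graphs are visible."""
    allowed = set(visible_graphs(roles))
    return source_graph in allowed and target_graph in allowed


print(scoped_query(["finance_analyst"]))
```

Because the allowed graphs are fixed in the query's `VALUES` clause before execution, a client cannot widen its view by changing a UI filter.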
ArcaQ vs. Conventional Enterprise Knowledge Tools
Features that define the category difference:
| Capability | ArcaQ | Confluence / Notion | Classic Enterprise Search |
|---|---|---|---|
| Named-graph domain model (SPARQL) | ✓ | ✗ | ✗ |
| Per-user private knowledge graph | ✓ | ~ pages only | ✗ |
| On-premise LLM auto-linking | ✓ | ✗ | ✗ |
| Cross-domain edge discovery | ✓ | ✗ | ✗ |
| SQL schema → knowledge entity auto-ingest | ✓ | ✗ | ~ limited connectors |
| Zero vendor telemetry | ✓ | ✗ | ✗ |
| ReBAC access control at SPARQL layer | → next release | ✗ | ~ row-level only |
Key Takeaways
- The Living Encyclopedia organizes company knowledge as SPARQL named graphs — one per domain — enabling domain-filtered traversal and cross-domain discovery
- The Personal Second Brain gives each user a private RDF graph for notes, bookmarks, insights and expert corrections
- Auto-linking bridges personal notes to company entities using local LLM inference — no data ever leaves your infrastructure
- The behavior profile derives learning style and top topics from saved knowledge — without tracking passive browsing
- SQL databases, REST APIs, S3, files and MCP sources are all indexed into the same graph — schemas become knowledge entities automatically
- Role-based graph scoping via SCAG + OpenFGA is the next release — access enforced at the SPARQL layer, not the UI layer
Frequently Asked Questions
What is the Living Encyclopedia in ArcaQ?
The Living Encyclopedia is ArcaQ's company-wide knowledge graph: a multi-domain RDF graph built automatically from all connected sources. Each domain becomes a SPARQL named graph, and an interactive D3.js explorer lets you filter by domain and follow automatically detected cross-domain edges.
How does the Personal Second Brain connect to the Living Encyclopedia?
Through auto-linking: a local Ollama LLM compares personal notes against company entities and writes arcaq:autoLinkedTo edges. The connection is navigable in both graph views. Expert corrections flow back to the Encyclopedia as validated annotations.
What sources does ArcaQ connect to automatically?
SQL databases (PostgreSQL, MySQL, SQLite, MSSQL, Oracle), REST APIs, file systems (PDF, DOCX, XLSX, CSV, Markdown), AWS S3 / MinIO, MCP protocol, and Obsidian vaults. Each feeds both the Jena knowledge graph and the Qdrant vector index.
Does the system track what I passively read or browse?
No. The Personal Second Brain captures only items you explicitly add. The behavior profile is derived entirely from your saved notes and bookmarks. No passive browsing, no click tracking. Your private graph is only accessible to you.
When will role-based graph access be available?
Role-based graph scoping via SCAG + OpenFGA is the next major release. It will enforce access at the SPARQL named graph query layer — each role can only traverse the named graphs their permissions cover. The auto-link mechanism will respect the same access boundaries.
See Your Knowledge Graph Come Alive
Connect ArcaQ to your databases, files and APIs. In one indexing run, watch your organizational knowledge become a navigable, queryable graph — fully sovereign, fully on-premise.