Second Brain · LLM Auto-Linking

Semantic Connections, Discovered Automatically

The on-premise LLM automatically discovers semantic relationships between your personal notes and the company knowledge graph — running nightly without any manual intervention.

§1 — Nightly pipeline (03:00 UTC)

The personal_autolink_orchestrator Airflow DAG runs every night and processes all active users in five sequential steps.

1. Discover active users

A SPARQL query against Jena finds all named graphs under http://arcaq.com/personal/, so any collaborator with at least one personal note is automatically included in the run.

2. Fetch recent personal notes

For each user, up to AUTOLINK_BATCH_SIZE (default 50) personal notes are retrieved from the user's private graph, each with its label, content, and note type.

3. Fetch company entity sample

Up to 500 company entities (with labels and types) are fetched from the shared knowledge graph as comparison targets. Personal nodes and blank nodes are excluded.

4. LLM semantic analysis

The on-premise LLM (OLLAMA_MODEL, default llama3.2:3b) is queried with pairs of notes and company entities and returns a similarity score for each pair. Only pairs scoring above AUTOLINK_CONFIDENCE (default 0.7) become links.

5. Insert auto-links into personal graph

Discovered links are written as typed arcaq:AutoLink triples into the user's private named graph. Writes are idempotent, and each link carries a timestamp and its confidence score.
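The selection logic in the steps above can be sketched in a few pure functions. This is a minimal illustration, not the DAG's actual code: the query text, the ScoredPair type, and the helper names user_id_from_graph, take_batch, and select_links are all hypothetical, and the strict "greater than" comparison is an assumption based on the wording of step 4.

```python
from dataclasses import dataclass

PERSONAL_PREFIX = "http://arcaq.com/personal/"

# Step 1, assumed query shape: list every named graph under the personal prefix.
DISCOVER_USERS_QUERY = f"""
SELECT DISTINCT ?g WHERE {{
  GRAPH ?g {{ ?s ?p ?o }}
  FILTER(STRSTARTS(STR(?g), "{PERSONAL_PREFIX}"))
}}
"""


@dataclass(frozen=True)
class ScoredPair:
    """One (note, entity) pair with the similarity score returned by the LLM."""
    note_uri: str
    entity_uri: str
    confidence: float


def user_id_from_graph(graph_uri: str) -> str:
    """Extract the user ID from a personal named-graph URI."""
    if not graph_uri.startswith(PERSONAL_PREFIX):
        raise ValueError(f"not a personal graph: {graph_uri}")
    return graph_uri[len(PERSONAL_PREFIX):]


def take_batch(notes: list, batch_size: int = 50) -> list:
    """Step 2: cap the number of notes processed for one user in one run."""
    return notes[:batch_size]


def select_links(pairs: list[ScoredPair], threshold: float = 0.7) -> list[ScoredPair]:
    """Step 4: keep pairs scoring strictly above the threshold (assumed strict)."""
    return [p for p in pairs if p.confidence > threshold]


pairs = [
    ScoredPair("personal:note:abc", "company:concept:SupplyChainRisk", 0.84),
    ScoredPair("personal:note:abc", "company:concept:Payroll", 0.31),
]
links = select_links(pairs)
print([p.entity_uri for p in links])
```

Keeping these steps free of network calls makes the threshold and batch behaviour easy to unit-test separately from Jena and Ollama.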

§2 — RDF auto-link model

Every discovered link is a first-class RDF resource — queryable, deletable, and traceable to the LLM model that created it.

# Named graph: http://arcaq.com/personal/{user_id}
<autolink:xyz789> a arcaq:AutoLink ;
    arcaq:sourceNote   <personal:note:abc> ;
    arcaq:targetEntity <company:concept:SupplyChainRisk> ;
    arcaq:confidence   "0.84"^^xsd:decimal ;
    arcaq:llmModel     "llama3.2:3b" ;
    arcaq:ownedBy      <user:alice> ;
    arcaq:detectedAt   "2026-05-15T03:22:11Z"^^xsd:dateTime ;
    arcaq:dagRun       "personal_autolink_orchestrator__2026-05-15T03:00:00" .
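As a rough sketch, a resource of this shape could be rendered from Python as follows. The helper names autolink_id and render_autolink are hypothetical, and deriving the link ID from a hash of the note and entity URIs is only one assumed way to make writes idempotent; the page does not say how ArcaQ does it. The arcaq: and xsd: prefixes are assumed to be declared by the caller, and string values are not escaped here.

```python
import hashlib
from datetime import datetime, timezone


def autolink_id(note_uri: str, entity_uri: str) -> str:
    """Hypothetical deterministic ID: the same pair always maps to the same link."""
    digest = hashlib.sha1(f"{note_uri}|{entity_uri}".encode("utf-8")).hexdigest()
    return digest[:12]


def render_autolink(
    note_uri: str,
    entity_uri: str,
    confidence: float,
    model: str,
    user: str,
    detected_at: datetime,  # must be timezone-aware
    dag_run: str,
) -> str:
    """Render one arcaq:AutoLink resource as Turtle (prefixes declared elsewhere)."""
    stamp = detected_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return "\n".join([
        f"<autolink:{autolink_id(note_uri, entity_uri)}> a arcaq:AutoLink ;",
        f"    arcaq:sourceNote <{note_uri}> ;",
        f"    arcaq:targetEntity <{entity_uri}> ;",
        f'    arcaq:confidence "{confidence:.2f}"^^xsd:decimal ;',
        f'    arcaq:llmModel "{model}" ;',
        f"    arcaq:ownedBy <user:{user}> ;",
        f'    arcaq:detectedAt "{stamp}"^^xsd:dateTime ;',
        f'    arcaq:dagRun "{dag_run}" .',
    ])


turtle = render_autolink(
    "personal:note:abc",
    "company:concept:SupplyChainRisk",
    0.84,
    "llama3.2:3b",
    "alice",
    datetime(2026, 5, 15, 3, 22, 11, tzinfo=timezone.utc),
    "personal_autolink_orchestrator__2026-05-15T03:00:00",
)
print(turtle)
```

With a deterministic ID, re-running the same DAG over the same note and entity rewrites the same resource instead of creating a duplicate.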

§3 — Configuration

All thresholds and model choices are managed via environment variables — no code change required to tune the linker.

| Variable | Default | Description |
| --- | --- | --- |
| AUTOLINK_CONFIDENCE | 0.7 | Minimum LLM similarity score to create a link (0.0–1.0). Raise for precision, lower for recall. |
| AUTOLINK_BATCH_SIZE | 50 | Max personal notes processed per user per DAG run. Raise for thoroughness, lower for speed. |
| OLLAMA_MODEL | llama3.2:3b | Ollama model used for semantic comparison. Any model installed in your cluster can be used. |
| OLLAMA_BASE_URL | http://ollama:11434 | Internal URL of the Ollama service. Must be reachable from Airflow workers. |
| ARCAQ_API_URL | http://arcaq-api:8000 | ArcaQ API base URL, used to discover users via the API rather than SPARQL when preferred. |
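One way a task might read these variables is sketched below. The AutolinkConfig class and load_config function are hypothetical names for illustration; only the variable names and defaults come from the table, and the range checks are an assumption about what sensible validation could look like.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AutolinkConfig:
    confidence: float
    batch_size: int
    ollama_model: str
    ollama_base_url: str
    arcaq_api_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> AutolinkConfig:
    """Read linker settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env

    confidence = float(env.get("AUTOLINK_CONFIDENCE", "0.7"))
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("AUTOLINK_CONFIDENCE must be between 0.0 and 1.0")

    batch_size = int(env.get("AUTOLINK_BATCH_SIZE", "50"))
    if batch_size < 1:
        raise ValueError("AUTOLINK_BATCH_SIZE must be at least 1")

    return AutolinkConfig(
        confidence=confidence,
        batch_size=batch_size,
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2:3b"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://ollama:11434"),
        arcaq_api_url=env.get("ARCAQ_API_URL", "http://arcaq-api:8000"),
    )


# Passing a plain dict instead of os.environ keeps the example deterministic.
config = load_config({"AUTOLINK_CONFIDENCE": "0.85"})
print(config.confidence, config.batch_size, config.ollama_model)
```

Failing fast on an out-of-range threshold surfaces a misconfiguration at task start rather than as a night of missing or excessive links.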

Let the LLM do the connecting

Stop manually tagging relationships. The auto-linker surfaces unexpected connections across domains — entirely on-premise, with full auditability.