Compare commits

4 commits — graph-awar ... a75b5b93da

| Author | SHA1 | Date |
|---|---|---|
|  | a75b5b93da |  |
|  | d4bfa5f064 |  |
|  | bba0ae887d |  |
|  | bf03d333f9 |  |
.gitignore (vendored, 5 changes)

@@ -1,4 +1,9 @@
.direnv/
.envrc
.env
backend/.env
frontend/node_modules/
frontend/dist/
.npm/
.vite/
data/
README.md (deleted, 67 lines)

@@ -1,67 +0,0 @@
# Large Instanced Ontology Visualizer

An experimental visualizer designed to render and explore massive instanced ontologies (millions of nodes) at interactive performance.

## 🚀 The Core Challenge

Ontologies with millions of instances present a significant rendering challenge for traditional graph visualization tools. This project addresses the problem by:

1. **Selective Rendering:** Only rendering up to a set limit of nodes (e.g., 2 million) at any given time.
2. **Adaptive Sampling:** When zoomed out, it draws a representative spatial sample of the nodes. When zoomed in, the number of nodes within the viewport naturally falls below the rendering limit, allowing 100% detail with no performance degradation.
3. **Spatial Indexing:** A custom quadtree manages millions of points in memory and efficiently determines visibility.
## 🛠 Technical Architecture

### 1. Data Pipeline & AnzoGraph Integration

The project features an automated pipeline to extract and prepare data from an **AnzoGraph** database:

- **SPARQL Extraction:** `scripts/fetch_from_db.ts` connects to AnzoGraph via its SPARQL endpoint. It fetches a seed set of subjects and their related triples, identifying "primary" nodes (objects of `rdf:type`).
- **Graph Classification:** Instances are categorized to distinguish between classes and relationships.
- **Force-Directed Layout:** `scripts/compute_layout.ts` computes 2D positions for the nodes using a **Barnes-Hut**-optimized force-directed simulation, which scales to large graphs.

### 2. Quadtree Spatial Index

To handle millions of nodes without per-frame object allocation:

- **In-place Sorting:** The quadtree (`src/quadtree.ts`) spatially sorts the raw `Float32Array` of positions at build time.
- **Index-Based Access:** Leaves store only index ranges into the sorted array, pointing directly at the data sent to the GPU.
- **Fast Lookups:** Used both for frustum culling and for efficient "find node under cursor" queries.
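The "sort once, draw index ranges" idea can be sketched in plain Python (illustrative only — the real implementation lives in `src/quadtree.ts` and operates on a `Float32Array`; names and thresholds here are assumptions):

```python
import random

def build_leaves(xs, ys, order, lo, hi, depth=0, max_leaf=4, max_depth=8):
    """Recursively partition order[lo:hi] (indices into xs/ys) into quadrants,
    reordering `order` in place so each leaf's points are contiguous.

    Returns a list of (start, end) index ranges into `order`; a renderer can
    upload the data once in this order and draw each leaf as one contiguous range.
    """
    if hi - lo <= max_leaf or depth >= max_depth:
        return [(lo, hi)]
    sel = order[lo:hi]
    cx = sum(xs[i] for i in sel) / len(sel)
    cy = sum(ys[i] for i in sel) / len(sel)
    # 2-bit quadrant key; a stable sort by key makes each child's points contiguous.
    key = lambda i: (xs[i] >= cx) * 2 + (ys[i] >= cy)
    sel.sort(key=key)
    order[lo:hi] = sel
    leaves, start = [], lo
    for quad in range(4):
        count = sum(1 for i in sel if key(i) == quad)
        if count:
            leaves += build_leaves(xs, ys, order, start, start + count,
                                   depth + 1, max_leaf, max_depth)
        start += count
    return leaves

random.seed(42)
pts = 300
xs = [random.random() for _ in range(pts)]
ys = [random.random() for _ in range(pts)]
order = list(range(pts))
leaves = build_leaves(xs, ys, order, 0, pts)
# `order` is now spatially sorted; each (start, end) in `leaves` is one draw range.
```

The production version does the same thing over typed arrays, which is what makes per-frame culling allocation-free.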
### 3. WebGL 2 High-Performance Renderer

The renderer (`src/renderer.ts`) is built for maximum throughput:

- **`WEBGL_multi_draw` Extension:** Batches multiple leaf nodes into single draw calls, minimizing CPU overhead.
- **Zero-Allocation Render Loop:** The frame loop uses pre-allocated typed arrays to prevent GC pauses.
- **Dynamic Level of Detail (LOD):**
  - **Points:** Always visible, with adaptive density based on zoom.
  - **Lines:** Automatically rendered when zoomed in far enough to see individual relationships (< 20k visible nodes).
- **Selection:** Interactively selecting a node highlights its immediate neighbors (incoming/outgoing edges).
## 🚦 Getting Started

### Prerequisites

- Docker and Docker Compose
- Node.js (for local development)

### Deployment

The project includes a `docker-compose.yml` that spins up both the **AnzoGraph** database and the visualizer app.

```bash
# Start the services
docker-compose up -d

# Inside the app container, the following run automatically:
# 1. Fetch data from AnzoGraph (fetch_from_db.ts)
# 2. Compute the 2D layout (compute_layout.ts)
# 3. Start the Vite development server
```

The app will be available at `http://localhost:5173`.
## 🖱 Interactions

- **Drag:** Pan the view.
- **Scroll:** Zoom in/out at the cursor position.
- **Click:** Select a node to see its URI/label and highlight its neighbors.
- **HUD:** Real-time stats on FPS, nodes drawn, and the current sampling ratio.

## TODO

- **Positioning:** Use a better node-positioning algorithm that avoids edge crossings as much as possible while keeping the graph compact.
- **Positioning:** Decide how to handle entities that are both instances and classes.
- **Functionality:** Find every piece of equipment with a specific property, or that participates in a specific process.
- **Functionality:** Find every piece of equipment that is connected to a well.
- **Functionality:** Show every connection within a specified depth.
- **Functionality:** Show every element of a specific class.
backend/Dockerfile (new file, 16 lines)

@@ -0,0 +1,16 @@
FROM python:3.12-slim

WORKDIR /app

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt

COPY app /app/app

EXPOSE 8000

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
backend/app/README.md (new file, 197 lines)

@@ -0,0 +1,197 @@
# Backend App (`backend/app`)

This folder contains the FastAPI backend for `visualizador_instanciados`.

The backend can execute SPARQL queries in two interchangeable ways:

1. **`GRAPH_BACKEND=rdflib`**: parse a Turtle file into an in-memory RDFLib `Graph` and run SPARQL queries locally.
2. **`GRAPH_BACKEND=anzograph`**: run SPARQL queries against an AnzoGraph SPARQL endpoint over HTTP (optionally `LOAD` a TTL file on startup).

Callers (frontend or other clients) interact with a single API surface (`/api/*`) and do not need to know which backend is configured.
## Files

- `main.py`
  - FastAPI app setup, startup/shutdown (`lifespan`), and HTTP endpoints.
- `settings.py`
  - Env-driven configuration (`pydantic-settings`).
- `sparql_engine.py`
  - Backend-agnostic SPARQL execution layer:
    - `RdflibEngine`: `Graph.query(...)` + SPARQL JSON serialization.
    - `AnzoGraphEngine`: HTTP POST to `/sparql` with Basic auth + a readiness gate.
    - `create_sparql_engine(settings)` chooses the engine based on `GRAPH_BACKEND`.
- `graph_export.py`
  - Shared helpers to:
    - build the snapshot SPARQL query used for edge retrieval
    - map SPARQL JSON bindings to `{nodes, edges}`.
- `models.py`
  - Pydantic request/response models:
    - `Node`, `Edge`, `GraphResponse`, `StatsResponse`, etc.
- `rdf_store.py`
  - A local parsed representation (dense IDs + neighbor data) built only when `GRAPH_BACKEND=rdflib`.
  - Used by `/api/nodes`, `/api/edges`, and `rdflib`-mode `/api/stats`.
- `pipelines/graph_snapshot.py`
  - Pipeline used by `/api/graph` to return a `{nodes, edges}` snapshot via SPARQL (works for both RDFLib and AnzoGraph).
- `pipelines/layout_dag_radial.py`
  - DAG layout helpers used by `pipelines/graph_snapshot.py`:
    - cycle detection
    - level-synchronous Kahn layering
    - radial (ring-per-layer) positioning.
- `pipelines/snapshot_service.py`
  - Snapshot cache layer used by `/api/graph` and `/api/stats` so the backend doesn't run expensive SPARQL twice.
- `pipelines/subclass_labels.py`
  - Pipeline to extract `rdfs:subClassOf` entities and an aligned `rdfs:label` list.
## Runtime Flow

On startup (FastAPI lifespan):

1. `create_sparql_engine(settings)` selects and starts a SPARQL engine.
2. The engine is stored at `app.state.sparql`.
3. If `GRAPH_BACKEND=rdflib`, `RDFStore` is also built from the already-loaded RDFLib graph and stored at `app.state.store`.

On shutdown:

- `app.state.sparql.shutdown()` is called; it closes the HTTP client in AnzoGraph mode and is a no-op in RDFLib mode.
## Environment Variables

Most configuration is intended to be provided via container environment variables (see the repo-root `.env` and `docker-compose.yml`).

Core:

- `GRAPH_BACKEND`: `rdflib` or `anzograph`
- `INCLUDE_BNODES`: `true`/`false`
- `CORS_ORIGINS`: comma-separated list or `*`

RDFLib mode:

- `TTL_PATH`: path inside the backend container to a `.ttl` file (example: `/data/o3po.ttl`)
- `MAX_TRIPLES`: optional int; if set, parsing stops after this many triples

Optional import-combining step (runs before the SPARQL engine starts):

- `COMBINE_OWL_IMPORTS_ON_START`: `true` to recursively load `TTL_PATH` (or `COMBINE_ENTRY_LOCATION`) plus its `owl:imports` and write a combined TTL file.
- `COMBINE_ENTRY_LOCATION`: optional override for the entry file/URL to load (defaults to `TTL_PATH`)
- `COMBINE_OUTPUT_LOCATION`: optional explicit output path (defaults to `${dirname(entry)}/${COMBINE_OUTPUT_NAME}`)
- `COMBINE_OUTPUT_NAME`: output filename when `COMBINE_OUTPUT_LOCATION` is not set (default: `combined_ontology.ttl`)
- `COMBINE_FORCE`: `true` to rebuild even if the output file already exists

AnzoGraph mode:

- `SPARQL_HOST`: base host (example: `http://anzograph:8080`)
- `SPARQL_ENDPOINT`: optional full endpoint; if set, overrides `${SPARQL_HOST}/sparql`
- `SPARQL_USER`, `SPARQL_PASS`: Basic auth credentials
- `SPARQL_DATA_FILE`: file URI as seen by the **AnzoGraph container** (example: `file:///opt/shared-files/o3po.ttl`)
- `SPARQL_GRAPH_IRI`: optional graph IRI for `LOAD ... INTO GRAPH <...>`
- `SPARQL_LOAD_ON_START`: `true` to execute `LOAD <SPARQL_DATA_FILE>` during startup
- `SPARQL_CLEAR_ON_START`: `true` to execute `CLEAR ALL` during startup (dangerous)
- `SPARQL_TIMEOUT_S`: request timeout for normal SPARQL requests
- `SPARQL_READY_RETRIES`, `SPARQL_READY_DELAY_S`, `SPARQL_READY_TIMEOUT_S`: readiness-gate parameters
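Putting a few of those together, a minimal AnzoGraph-mode configuration might look like this (all values illustrative; the variable names are the ones documented above):

```bash
# .env (illustrative values)
GRAPH_BACKEND=anzograph
INCLUDE_BNODES=false
CORS_ORIGINS=http://localhost:5173

SPARQL_HOST=http://anzograph:8080
SPARQL_USER=admin
SPARQL_PASS=change-me
SPARQL_DATA_FILE=file:///opt/shared-files/o3po.ttl
SPARQL_LOAD_ON_START=true
```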
## AnzoGraph Readiness Gate

`AnzoGraphEngine` does not assume that "container started" means "SPARQL works".
It waits for a smoke-test POST to succeed:

- Method: `POST ${SPARQL_ENDPOINT}`
- Headers:
  - `Content-Type: application/x-www-form-urlencoded`
  - `Accept: application/sparql-results+json`
  - `Authorization: Basic ...` (if configured)
- Body: `query=ASK WHERE { ?s ?p ?o }`
- Success condition: HTTP 2xx and the response parses as JSON

This matches the behavior described in `docs/anzograph-readiness-julia.md`.
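The same smoke test can be reproduced outside the backend with only the Python standard library (endpoint and credentials below are placeholders, and this is a sketch, not the actual `AnzoGraphEngine` code):

```python
import base64
import urllib.parse
import urllib.request

def build_ask_request(endpoint, user=None, password=None):
    """Build the readiness smoke-test request: a form-encoded SPARQL ASK."""
    body = urllib.parse.urlencode({"query": "ASK WHERE { ?s ?p ?o }"}).encode()
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    if user is not None:
        token = base64.b64encode(f"{user}:{password or ''}".encode()).decode()
        headers["Authorization"] = f"Basic {token}"
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")

req = build_ask_request("http://anzograph:8080/sparql", "admin", "secret")
# Sending it (and checking "2xx + parses as JSON") is left to the caller, e.g.:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       ok = resp.status // 100 == 2 and json.load(resp) is not None
```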
## API Endpoints

- `GET /api/health`
  - Returns `{ "status": "ok" }`.
- `GET /api/stats`
  - Returns counts for the same snapshot used by `/api/graph` (via the snapshot cache).
- `POST /api/sparql`
  - Body: `{ "query": "<SPARQL SELECT/ASK>" }`
  - Returns SPARQL JSON results as-is.
  - Notes:
    - This endpoint is intended for **SELECT/ASK queries returning SPARQL-JSON**.
    - SPARQL UPDATE is not exposed here (AnzoGraph `LOAD`/`CLEAR` are handled internally during startup).
- `GET /api/graph?node_limit=...&edge_limit=...`
  - Returns a graph snapshot as `{ nodes: [...], edges: [...] }`.
  - Implemented as a SPARQL edge query + mapping in `pipelines/graph_snapshot.py`.
- `GET /api/nodes`, `GET /api/edges`
  - Only available when `GRAPH_BACKEND=rdflib` (these use `RDFStore`'s dense ID tables).
## Data Contract

### Node

Returned in `nodes[]` (dense IDs; suitable for indexing into typed arrays):

```json
{
  "id": 0,
  "termType": "uri",
  "iri": "http://example.org/Thing",
  "label": null,
  "x": 0.0,
  "y": 0.0
}
```

- `id`: integer dense node ID used in edges
- `termType`: `"uri"` or `"bnode"`
- `iri`: URI string; blank nodes are normalized to `_:<id>`
- `label`: `rdfs:label` when available (best-effort; prefers English)
- `x`/`y`: world-space coordinates for rendering (currently a radial layered layout derived from `rdfs:subClassOf`)

### Edge

Returned in `edges[]`:

```json
{
  "source": 0,
  "target": 12,
  "predicate": "http://www.w3.org/2000/01/rdf-schema#subClassOf"
}
```

- `source`/`target`: dense node IDs (indexes into `nodes[]`)
- `predicate`: predicate IRI string
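Because the IDs are dense, a client can pack the snapshot straight into flat typed arrays without any remapping; a minimal sketch (field names come from the contract above, the `snapshot` value is hand-written for illustration):

```python
from array import array

def pack_snapshot(snapshot: dict) -> tuple[array, array]:
    """Pack a {nodes, edges} snapshot into flat arrays, relying on dense IDs.

    positions[2*i] / positions[2*i+1] hold node i's x/y (float32-style);
    endpoints holds (source, target) pairs, ready for an index buffer.
    """
    nodes = snapshot["nodes"]
    positions = array("f", bytes(8 * len(nodes)))  # 2 floats per node, zero-filled
    for n in nodes:
        positions[2 * n["id"]] = n["x"] or 0.0
        positions[2 * n["id"] + 1] = n["y"] or 0.0
    endpoints = array("I")  # flat (source, target) pairs
    for e in snapshot["edges"]:
        endpoints.append(e["source"])
        endpoints.append(e["target"])
    return positions, endpoints

snapshot = {
    "nodes": [
        {"id": 0, "termType": "uri", "iri": "http://example.org/Thing", "label": None, "x": 0.0, "y": 0.0},
        {"id": 1, "termType": "uri", "iri": "http://example.org/Sub", "label": None, "x": 1.5, "y": -2.0},
    ],
    "edges": [
        {"source": 1, "target": 0, "predicate": "http://www.w3.org/2000/01/rdf-schema#subClassOf"},
    ],
}
positions, endpoints = pack_snapshot(snapshot)
```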
## Snapshot Query (`/api/graph`)

`/api/graph` currently uses a SPARQL query that returns hierarchy edges (see `edge_retrieval_query` in `graph_export.py`: `rdfs:subClassOf` triples, plus `rdf:type` triples whose object is an `owl:Class`):

- selects bindings as `?s ?p ?o`
- excludes literal objects (`FILTER(!isLiteral(?o))`) for safety
- optionally excludes blank nodes (unless `INCLUDE_BNODES=true`)
- applies `LIMIT edge_limit`

The result bindings are mapped to dense node IDs (first-seen order) and returned to the caller.

`/api/graph` also returns `meta` with snapshot counts and engine info so the frontend doesn't need to call `/api/stats`.

If a cycle is detected in the returned `rdfs:subClassOf` snapshot, `/api/graph` returns HTTP 422 (the layout requires a DAG).
## Pipelines

### `pipelines/graph_snapshot.py`

`fetch_graph_snapshot(...)` is the main "export graph" pipeline used by `/api/graph`.

### `pipelines/subclass_labels.py`

`extract_subclass_entities_and_labels(...)`:

1. Queries all `rdfs:subClassOf` triples.
2. Builds a unique set of subjects and objects, then converts it to a deterministic list.
3. Queries `rdfs:label` for those entities and returns aligned lists:
   - `entities[i]` corresponds to `labels[i]`.
## Notes / Tradeoffs

- `/api/graph` returns only nodes that appear in the returned edge result set. Nodes not referenced by those edges will not be present.
- RDFLib and AnzoGraph may differ in supported SPARQL features (vendor extensions, inference, performance), but the API surface is the same.
- `rdf_store.py` is currently only needed for `/api/nodes`, `/api/edges`, and rdflib-mode `/api/stats`. If you don't use those endpoints, it can be removed later.
backend/app/__init__.py (new file, 1 line)

@@ -0,0 +1 @@
backend/app/graph_export.py (new file, 102 lines)

@@ -0,0 +1,102 @@
from __future__ import annotations

from typing import Any


def edge_retrieval_query(*, edge_limit: int, include_bnodes: bool) -> str:
    bnode_filter = "" if include_bnodes else "FILTER(!isBlank(?s) && !isBlank(?o))"

    return f"""
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>

    SELECT ?s ?p ?o
    WHERE {{
        {{
            VALUES ?p {{ rdf:type }}
            ?s ?p ?o .
            ?o rdf:type owl:Class .
        }}
        UNION
        {{
            VALUES ?p {{ rdfs:subClassOf }}
            ?s ?p ?o .
        }}
        FILTER(!isLiteral(?o))
        {bnode_filter}
    }}
    LIMIT {edge_limit}
    """


def graph_from_sparql_bindings(
    bindings: list[dict[str, Any]],
    *,
    node_limit: int,
    include_bnodes: bool,
) -> tuple[list[dict[str, object]], list[dict[str, object]]]:
    """
    Convert SPARQL JSON results bindings into:
      nodes: [{id, termType, iri, label}]
      edges: [{source, target, predicate}]

    IDs are assigned densely (0..N-1) based on first occurrence in bindings.
    """

    node_id_by_key: dict[tuple[str, str], int] = {}
    node_meta: list[tuple[str, str]] = []  # (termType, iri)
    out_edges: list[dict[str, object]] = []

    def term_to_key_and_iri(term: dict[str, Any]) -> tuple[tuple[str, str], tuple[str, str]] | None:
        t = term.get("type")
        v = term.get("value")
        if not t or v is None:
            return None
        if t == "literal":
            return None
        if t == "bnode":
            if not include_bnodes:
                return None
            # SPARQL JSON uses bnode identifiers without the "_:" prefix; we normalize to "_:id".
            return (("bnode", str(v)), ("bnode", f"_:{v}"))
        # Default to "uri".
        return (("uri", str(v)), ("uri", str(v)))

    def get_or_add(term: dict[str, Any]) -> int | None:
        out = term_to_key_and_iri(term)
        if out is None:
            return None
        key, meta = out
        existing = node_id_by_key.get(key)
        if existing is not None:
            return existing
        if len(node_meta) >= node_limit:
            return None
        nid = len(node_meta)
        node_id_by_key[key] = nid
        node_meta.append(meta)
        return nid

    for b in bindings:
        s_term = b.get("s") or {}
        o_term = b.get("o") or {}
        p_term = b.get("p") or {}

        sid = get_or_add(s_term)
        oid = get_or_add(o_term)
        if sid is None or oid is None:
            continue

        pred = p_term.get("value")
        if not pred:
            continue

        out_edges.append({"source": sid, "target": oid, "predicate": str(pred)})

    out_nodes = [
        {"id": i, "termType": term_type, "iri": iri, "label": None}
        for i, (term_type, iri) in enumerate(node_meta)
    ]

    return out_nodes, out_edges
backend/app/main.py (new file, 172 lines)

@@ -0,0 +1,172 @@
from __future__ import annotations

from contextlib import asynccontextmanager
import logging
import asyncio

from fastapi import FastAPI, HTTPException, Query
from fastapi.middleware.cors import CORSMiddleware

from .models import (
    EdgesResponse,
    GraphResponse,
    NeighborsRequest,
    NeighborsResponse,
    NodesResponse,
    SparqlQueryRequest,
    StatsResponse,
)
from .pipelines.layout_dag_radial import CycleError
from .pipelines.owl_imports_combiner import (
    build_combined_graph,
    output_location_to_path,
    resolve_output_location,
    serialize_graph_to_ttl,
)
from .pipelines.selection_neighbors import fetch_neighbor_ids_for_selection
from .pipelines.snapshot_service import GraphSnapshotService
from .rdf_store import RDFStore
from .sparql_engine import RdflibEngine, SparqlEngine, create_sparql_engine
from .settings import Settings


settings = Settings()
logger = logging.getLogger(__name__)


@asynccontextmanager
async def lifespan(app: FastAPI):
    rdflib_preloaded_graph = None

    if settings.combine_owl_imports_on_start:
        entry_location = settings.combine_entry_location or settings.ttl_path
        output_location = resolve_output_location(
            entry_location,
            output_location=settings.combine_output_location,
            output_name=settings.combine_output_name,
        )

        output_path = output_location_to_path(output_location)
        if output_path.exists() and not settings.combine_force:
            logger.info("Skipping combine step (output exists): %s", output_location)
        else:
            rdflib_preloaded_graph = await asyncio.to_thread(build_combined_graph, entry_location)
            logger.info("Finished combining imports; serializing to: %s", output_location)
            await asyncio.to_thread(serialize_graph_to_ttl, rdflib_preloaded_graph, output_location)

        if settings.graph_backend == "rdflib":
            settings.ttl_path = str(output_path)

    sparql: SparqlEngine = create_sparql_engine(settings, rdflib_graph=rdflib_preloaded_graph)
    await sparql.startup()
    app.state.sparql = sparql
    app.state.snapshot_service = GraphSnapshotService(sparql=sparql, settings=settings)

    # Only build node/edge tables when running in rdflib mode.
    if settings.graph_backend == "rdflib":
        assert isinstance(sparql, RdflibEngine)
        if sparql.graph is None:
            raise RuntimeError("rdflib graph failed to load")

        store = RDFStore(
            ttl_path=settings.ttl_path,
            include_bnodes=settings.include_bnodes,
            max_triples=settings.max_triples,
        )
        store.load(sparql.graph)
        app.state.store = store

    yield

    await sparql.shutdown()


app = FastAPI(title="visualizador_instanciados backend", lifespan=lifespan)

cors_origins = settings.cors_origin_list()
app.add_middleware(
    CORSMiddleware,
    allow_origins=cors_origins,
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)


@app.get("/api/health")
def health() -> dict[str, str]:
    return {"status": "ok"}


@app.get("/api/stats", response_model=StatsResponse)
async def stats() -> StatsResponse:
    # Stats reflect exactly what we send to the frontend (/api/graph), not global graph size.
    svc: GraphSnapshotService = app.state.snapshot_service
    try:
        snap = await svc.get(node_limit=50_000, edge_limit=100_000)
    except CycleError as e:
        raise HTTPException(status_code=422, detail=str(e)) from None
    meta = snap.meta
    return StatsResponse(
        backend=meta.backend if meta else app.state.sparql.name,
        ttl_path=meta.ttl_path if meta and meta.ttl_path else settings.ttl_path,
        sparql_endpoint=meta.sparql_endpoint if meta else None,
        parsed_triples=len(snap.edges),
        nodes=len(snap.nodes),
        edges=len(snap.edges),
    )


@app.post("/api/sparql")
async def sparql_query(req: SparqlQueryRequest) -> dict:
    sparql: SparqlEngine = app.state.sparql
    data = await sparql.query_json(req.query)
    return data


@app.post("/api/neighbors", response_model=NeighborsResponse)
async def neighbors(req: NeighborsRequest) -> NeighborsResponse:
    svc: GraphSnapshotService = app.state.snapshot_service
    snap = await svc.get(node_limit=req.node_limit, edge_limit=req.edge_limit)
    sparql: SparqlEngine = app.state.sparql
    neighbor_ids = await fetch_neighbor_ids_for_selection(
        sparql,
        snapshot=snap,
        selected_ids=req.selected_ids,
        include_bnodes=settings.include_bnodes,
    )
    return NeighborsResponse(selected_ids=req.selected_ids, neighbor_ids=neighbor_ids)


@app.get("/api/nodes", response_model=NodesResponse)
def nodes(
    limit: int = Query(default=10_000, ge=1, le=200_000),
    offset: int = Query(default=0, ge=0),
) -> NodesResponse:
    if settings.graph_backend != "rdflib":
        raise HTTPException(status_code=501, detail="GET /api/nodes is only supported in GRAPH_BACKEND=rdflib mode")
    store: RDFStore = app.state.store
    return NodesResponse(total=store.node_count, nodes=store.node_slice(offset=offset, limit=limit))


@app.get("/api/edges", response_model=EdgesResponse)
def edges(
    limit: int = Query(default=50_000, ge=1, le=500_000),
    offset: int = Query(default=0, ge=0),
) -> EdgesResponse:
    if settings.graph_backend != "rdflib":
        raise HTTPException(status_code=501, detail="GET /api/edges is only supported in GRAPH_BACKEND=rdflib mode")
    store: RDFStore = app.state.store
    return EdgesResponse(total=store.edge_count, edges=store.edge_slice(offset=offset, limit=limit))


@app.get("/api/graph", response_model=GraphResponse)
async def graph(
    node_limit: int = Query(default=50_000, ge=1, le=200_000),
    edge_limit: int = Query(default=100_000, ge=1, le=500_000),
) -> GraphResponse:
    svc: GraphSnapshotService = app.state.snapshot_service
    try:
        return await svc.get(node_limit=node_limit, edge_limit=edge_limit)
    except CycleError as e:
        raise HTTPException(status_code=422, detail=str(e)) from None
backend/app/models.py (new file, 69 lines)

@@ -0,0 +1,69 @@
from __future__ import annotations

from pydantic import BaseModel


class Node(BaseModel):
    id: int
    termType: str  # "uri" | "bnode"
    iri: str
    label: str | None = None
    # Optional because /api/nodes (RDFStore) doesn't currently provide positions.
    x: float | None = None
    y: float | None = None


class Edge(BaseModel):
    source: int
    target: int
    predicate: str


class StatsResponse(BaseModel):
    backend: str
    ttl_path: str
    sparql_endpoint: str | None = None
    parsed_triples: int
    nodes: int
    edges: int


class NodesResponse(BaseModel):
    total: int
    nodes: list[Node]


class EdgesResponse(BaseModel):
    total: int
    edges: list[Edge]


class GraphResponse(BaseModel):
    class Meta(BaseModel):
        backend: str
        ttl_path: str | None = None
        sparql_endpoint: str | None = None
        include_bnodes: bool
        node_limit: int
        edge_limit: int
        nodes: int
        edges: int

    nodes: list[Node]
    edges: list[Edge]
    meta: Meta | None = None


class SparqlQueryRequest(BaseModel):
    query: str


class NeighborsRequest(BaseModel):
    selected_ids: list[int]
    node_limit: int = 50_000
    edge_limit: int = 100_000


class NeighborsResponse(BaseModel):
    selected_ids: list[int]
    neighbor_ids: list[int]
backend/app/pipelines/__init__.py (new file, 1 line)

@@ -0,0 +1 @@
backend/app/pipelines/graph_snapshot.py (new file, 148 lines)

@@ -0,0 +1,148 @@
from __future__ import annotations
|
||||
|
||||
from typing import Any
|
||||
|
||||
from ..graph_export import edge_retrieval_query, graph_from_sparql_bindings
|
||||
from ..models import GraphResponse
|
||||
from ..sparql_engine import SparqlEngine
|
||||
from ..settings import Settings
|
||||
from .layout_dag_radial import CycleError, level_synchronous_kahn_layers, radial_positions_from_layers
|
||||
|
||||
|
||||
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
|
||||
|
||||
|
||||
def _bindings(res: dict[str, Any]) -> list[dict[str, Any]]:
|
||||
return (((res.get("results") or {}).get("bindings")) or [])
|
||||
|
||||
|
||||
def _label_score(label_binding: dict[str, Any]) -> int:
|
||||
# Prefer English, then no-language, then anything else.
|
||||
lang = (label_binding.get("xml:lang") or "").lower()
|
||||
if lang == "en":
|
||||
return 3
|
||||
if lang == "":
|
||||
return 2
|
||||
return 1
|
||||
|
||||
|
||||
async def _fetch_rdfs_labels_for_iris(
|
||||
sparql: SparqlEngine,
|
||||
iris: list[str],
|
||||
*,
|
||||
batch_size: int = 500,
|
||||
) -> dict[str, str]:
|
||||
best: dict[str, tuple[int, str]] = {}
|
||||
|
||||
for i in range(0, len(iris), batch_size):
|
||||
batch = iris[i : i + batch_size]
|
||||
values = " ".join(f"<{u}>" for u in batch)
|
||||
q = f"""
|
||||
SELECT ?s ?label
|
||||
WHERE {{
|
||||
VALUES ?s {{ {values} }}
|
||||
?s <{RDFS_LABEL}> ?label .
|
||||
}}
|
||||
"""
|
||||
res = await sparql.query_json(q)
|
||||
for b in _bindings(res):
|
||||
s = (b.get("s") or {}).get("value")
|
||||
label_term = b.get("label") or {}
|
||||
if not s or label_term.get("type") != "literal":
|
||||
continue
|
||||
label_value = label_term.get("value")
|
||||
if label_value is None:
|
||||
continue
|
||||
score = _label_score(label_term)
|
||||
prev = best.get(s)
|
||||
if prev is None or score > prev[0]:
|
||||
best[s] = (score, str(label_value))
|
||||
|
||||
return {iri: lbl for iri, (_, lbl) in best.items()}
|
||||
|
||||
|
||||
async def fetch_graph_snapshot(
|
||||
sparql: SparqlEngine,
|
||||
*,
|
||||
settings: Settings,
|
||||
node_limit: int,
|
||||
edge_limit: int,
|
||||
) -> GraphResponse:
|
||||
"""
|
||||
Fetch a graph snapshot (nodes + edges) via SPARQL, independent of whether the
|
||||
underlying engine is RDFLib or AnzoGraph.
|
||||
"""
|
||||
edges_q = edge_retrieval_query(edge_limit=edge_limit, include_bnodes=settings.include_bnodes)
|
||||
res = await sparql.query_json(edges_q)
|
||||
bindings = (((res.get("results") or {}).get("bindings")) or [])
|
||||
nodes, edges = graph_from_sparql_bindings(
|
||||
bindings,
|
||||
node_limit=node_limit,
|
||||
include_bnodes=settings.include_bnodes,
|
||||
)
|
||||
|
||||
# Add positions so the frontend doesn't need to run a layout.
|
||||
#
|
||||
# We are exporting only rdfs:subClassOf triples. In the exported edges:
|
||||
# source = subclass, target = superclass
|
||||
# For hierarchical layout we invert edges to:
|
||||
# superclass -> subclass
|
||||
hier_edges: list[tuple[int, int]] = []
|
||||
for e in edges:
|
||||
s = e.get("source")
|
||||
t = e.get("target")
|
||||
try:
|
||||
sid = int(s) # subclass
|
||||
            tid = int(t)  # superclass
        except Exception:
            continue
        hier_edges.append((tid, sid))

    try:
        layers = level_synchronous_kahn_layers(node_count=len(nodes), edges=hier_edges)
    except CycleError as e:
        # Add a small URI sample to aid debugging.
        sample: list[str] = []
        for nid in e.remaining_node_ids[:20]:
            try:
                sample.append(str(nodes[nid].get("iri")))
            except Exception:
                continue
        raise CycleError(
            processed=e.processed,
            total=e.total,
            remaining_node_ids=e.remaining_node_ids,
            remaining_iri_sample=sample or None,
        ) from None

    # Deterministic order within each ring/layer for stable layouts.
    id_to_iri = [str(n.get("iri", "")) for n in nodes]
    for layer in layers:
        layer.sort(key=lambda nid: id_to_iri[nid])

    xs, ys = radial_positions_from_layers(node_count=len(nodes), layers=layers)
    for i, node in enumerate(nodes):
        node["x"] = float(xs[i])
        node["y"] = float(ys[i])

    # Attach labels for URI nodes (blank nodes remain label-less).
    uri_nodes = [n for n in nodes if n.get("termType") == "uri"]
    if uri_nodes:
        iris = [str(n["iri"]) for n in uri_nodes if isinstance(n.get("iri"), str)]
        label_by_iri = await _fetch_rdfs_labels_for_iris(sparql, iris)
        for n in uri_nodes:
            iri = n.get("iri")
            if isinstance(iri, str) and iri in label_by_iri:
                n["label"] = label_by_iri[iri]

    meta = GraphResponse.Meta(
        backend=sparql.name,
        ttl_path=settings.ttl_path if settings.graph_backend == "rdflib" else None,
        sparql_endpoint=settings.effective_sparql_endpoint() if settings.graph_backend == "anzograph" else None,
        include_bnodes=settings.include_bnodes,
        node_limit=node_limit,
        edge_limit=edge_limit,
        nodes=len(nodes),
        edges=len(edges),
    )
    return GraphResponse(nodes=nodes, edges=edges, meta=meta)
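The ring placement that the tail above delegates to `radial_positions_from_layers` can be sketched inline. This is a hypothetical mini-run with made-up layers and node ids, and it omits the module's golden-angle offset for brevity:

```python
import math

# Three layers of class ids mapped to concentric rings; radius grows
# with the layer index, scaled so the outermost ring stays inside max_r.
layers = [[0], [1, 2], [3, 4, 5]]
max_r = 5000.0
denom = float(len(layers) + 1)
xs, ys = [0.0] * 6, [0.0] * 6
for li, layer in enumerate(layers):
    r = ((li + 1) / denom) * max_r
    step = 2.0 * math.pi / len(layer)  # even angular spacing within a ring
    for j, nid in enumerate(layer):
        xs[nid] = r * math.cos(step * j)
        ys[nid] = r * math.sin(step * j)

print([round(math.hypot(x, y)) for x, y in zip(xs, ys)])
# [1250, 2500, 2500, 3750, 3750, 3750]
```

Each node's distance from the origin depends only on its layer, which is what makes the class hierarchy read as concentric rings.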
141
backend/app/pipelines/layout_dag_radial.py
Normal file
@@ -0,0 +1,141 @@
from __future__ import annotations

import math
from collections import deque
from typing import Iterable, Sequence


class CycleError(RuntimeError):
    """
    Raised when the requested layout requires a DAG, but a cycle is detected.

    `remaining_node_ids` are the node ids that still had indegree > 0 after Kahn.
    """

    def __init__(
        self,
        *,
        processed: int,
        total: int,
        remaining_node_ids: list[int],
        remaining_iri_sample: list[str] | None = None,
    ) -> None:
        self.processed = int(processed)
        self.total = int(total)
        self.remaining_node_ids = remaining_node_ids
        self.remaining_iri_sample = remaining_iri_sample

        msg = f"Cycle detected in subClassOf graph (processed {self.processed}/{self.total} nodes)."
        if remaining_iri_sample:
            msg += f" Example nodes: {', '.join(remaining_iri_sample)}"
        super().__init__(msg)


def level_synchronous_kahn_layers(
    *,
    node_count: int,
    edges: Iterable[tuple[int, int]],
) -> list[list[int]]:
    """
    Level-synchronous Kahn's algorithm:
    - process the entire current queue as one batch (one layer)
    - only then enqueue newly-unlocked nodes for the next batch

    `edges` are directed (u -> v).
    """
    n = int(node_count)
    if n <= 0:
        return []

    adj: list[list[int]] = [[] for _ in range(n)]
    indeg = [0] * n

    for u, v in edges:
        if u == v:
            # Self-loops don't help layout and would trivially violate DAG-ness.
            continue
        if not (0 <= u < n and 0 <= v < n):
            continue
        adj[u].append(v)
        indeg[v] += 1

    q: deque[int] = deque(i for i, d in enumerate(indeg) if d == 0)
    layers: list[list[int]] = []

    processed = 0
    while q:
        # Consume the full current queue as a single layer.
        layer = list(q)
        q.clear()
        layers.append(layer)

        for u in layer:
            processed += 1
            for v in adj[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    q.append(v)

    if processed != n:
        remaining = [i for i, d in enumerate(indeg) if d > 0]
        raise CycleError(processed=processed, total=n, remaining_node_ids=remaining)

    return layers


def radial_positions_from_layers(
    *,
    node_count: int,
    layers: Sequence[Sequence[int]],
    max_r: float = 5000.0,
) -> tuple[list[float], list[float]]:
    """
    Assign node positions in concentric rings (one ring per layer).

    - radius increases with layer index
    - nodes within a layer are placed evenly by angle
    - each ring gets a "golden-angle" rotation to reduce spoke artifacts
    """
    n = int(node_count)
    if n <= 0:
        return ([], [])

    xs = [0.0] * n
    ys = [0.0] * n
    if not layers:
        return (xs, ys)

    two_pi = 2.0 * math.pi
    golden = math.pi * (3.0 - math.sqrt(5.0))

    layer_count = len(layers)
    denom = float(layer_count + 1)

    for li, layer in enumerate(layers):
        m = len(layer)
        if m <= 0:
            continue

        # Keep everything within ~[-max_r, max_r] like the previous spiral layout.
        r = ((li + 1) / denom) * max_r

        # Rotate each layer deterministically to avoid radial spokes aligning.
        offset = (li * golden) % two_pi

        if m == 1:
            nid = int(layer[0])
            if 0 <= nid < n:
                xs[nid] = r * math.cos(offset)
                ys[nid] = r * math.sin(offset)
            continue

        step = two_pi / float(m)
        for j, raw_id in enumerate(layer):
            nid = int(raw_id)
            if not (0 <= nid < n):
                continue
            t = offset + step * float(j)
            xs[nid] = r * math.cos(t)
            ys[nid] = r * math.sin(t)

    return (xs, ys)
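To see what the layering produces, here is a condensed standalone copy of `level_synchronous_kahn_layers` run on a small, invented class hierarchy (edges point superclass → subclass, matching `hier_edges` in the snapshot code):

```python
from collections import deque

def kahn_layers(node_count: int, edges: list[tuple[int, int]]) -> list[list[int]]:
    # Condensed copy of level_synchronous_kahn_layers: each while-iteration
    # drains the whole queue, so every batch becomes one concentric ring.
    adj = [[] for _ in range(node_count)]
    indeg = [0] * node_count
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    q = deque(i for i, d in enumerate(indeg) if d == 0)
    layers = []
    while q:
        layer = list(q)
        q.clear()
        layers.append(layer)
        for u in layer:
            for v in adj[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    q.append(v)
    return layers

# 0 = root class, 1/2 = mid-level classes, 3/4 = leaves.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4)]
print(kahn_layers(5, edges))  # [[0], [1, 2], [3, 4]]
```

Note that node 3 lands in the last layer even though it is reachable from layer 1: a node is only emitted once *all* of its superclasses have been placed, which is exactly the level-synchronous behavior the docstring describes.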
30
backend/app/pipelines/layout_spiral.py
Normal file
@@ -0,0 +1,30 @@
from __future__ import annotations

import math


def spiral_positions(n: int, *, max_r: float = 5000.0) -> tuple[list[float], list[float]]:
    """
    Deterministic "sunflower" (golden-angle) spiral layout.

    This is intentionally simple and stable across runs:
    - angle increments by the golden angle to avoid radial spokes
    - radius grows with sqrt(i) to keep density roughly uniform over area
    """
    if n <= 0:
        return ([], [])

    xs = [0.0] * n
    ys = [0.0] * n

    golden = math.pi * (3.0 - math.sqrt(5.0))
    denom = float(max(1, n - 1))

    for i in range(n):
        t = i * golden
        r = math.sqrt(i / denom) * max_r
        xs[i] = r * math.cos(t)
        ys[i] = r * math.sin(t)

    return xs, ys
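Two properties the docstring claims — the outermost point sits at `max_r`, and density is roughly uniform over area — can be checked with a standalone copy of the function:

```python
import math

def spiral_positions(n, max_r=5000.0):
    # Standalone copy of the sunflower layout, for experimentation.
    golden = math.pi * (3.0 - math.sqrt(5.0))
    denom = float(max(1, n - 1))
    xs, ys = [0.0] * n, [0.0] * n
    for i in range(n):
        t = i * golden
        r = math.sqrt(i / denom) * max_r
        xs[i], ys[i] = r * math.cos(t), r * math.sin(t)
    return xs, ys

xs, ys = spiral_positions(10_000)
radii = [math.hypot(x, y) for x, y in zip(xs, ys)]
print(round(max(radii), 3))  # 5000.0 — the last point lands exactly on max_r

# Uniform density over area means exactly half the points fall inside
# the circle of radius max_r / sqrt(2) (half the total area).
inner = sum(r <= 5000.0 / math.sqrt(2) for r in radii)
print(inner)  # 5000
```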
96
backend/app/pipelines/owl_imports_combiner.py
Normal file
@@ -0,0 +1,96 @@
from __future__ import annotations

import logging
import os
from pathlib import Path
from urllib.parse import unquote, urlparse

from rdflib import Graph
from rdflib.namespace import OWL


logger = logging.getLogger(__name__)


def _is_http_url(location: str) -> bool:
    scheme = urlparse(location).scheme.lower()
    return scheme in {"http", "https"}


def _is_file_uri(location: str) -> bool:
    return urlparse(location).scheme.lower() == "file"


def _file_uri_to_path(location: str) -> Path:
    u = urlparse(location)
    if u.scheme.lower() != "file":
        raise ValueError(f"Not a file:// URI: {location!r}")
    return Path(unquote(u.path))


def resolve_output_location(
    entry_location: str,
    *,
    output_location: str | None,
    output_name: str,
) -> str:
    if output_location:
        return output_location

    if _is_http_url(entry_location):
        raise ValueError(
            "COMBINE_ENTRY_LOCATION points to an http(s) URL; set COMBINE_OUTPUT_LOCATION to a writable file path."
        )

    entry_path = _file_uri_to_path(entry_location) if _is_file_uri(entry_location) else Path(entry_location)
    return str(entry_path.parent / output_name)


def _output_destination_to_path(output_location: str) -> Path:
    if _is_file_uri(output_location):
        return _file_uri_to_path(output_location)
    if _is_http_url(output_location):
        raise ValueError("Output location must be a local file path (or file:// URI), not http(s).")
    return Path(output_location)


def output_location_to_path(output_location: str) -> Path:
    return _output_destination_to_path(output_location)


def build_combined_graph(entry_location: str) -> Graph:
    """
    Recursively loads an RDF document (file path, file:// URI, or http(s) URL) and its
    owl:imports into a single in-memory graph.
    """
    combined_graph = Graph()
    visited_locations: set[str] = set()

    def resolve_imports(location: str) -> None:
        if location in visited_locations:
            return
        visited_locations.add(location)

        logger.info("Loading ontology: %s", location)
        try:
            combined_graph.parse(location=location)
        except Exception as e:
            logger.warning("Failed to load %s (%s)", location, e)
            return

        imports = [str(o) for _, _, o in combined_graph.triples((None, OWL.imports, None))]
        for imported_location in imports:
            if imported_location not in visited_locations:
                resolve_imports(imported_location)

    resolve_imports(entry_location)
    return combined_graph


def serialize_graph_to_ttl(graph: Graph, output_location: str) -> None:
    output_path = _output_destination_to_path(output_location)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    tmp_path = output_path.with_suffix(output_path.suffix + ".tmp")
    graph.serialize(destination=str(tmp_path), format="turtle")
    os.replace(str(tmp_path), str(output_path))
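The output-location resolution above can be exercised with a trimmed standalone copy (stdlib only; the paths are illustrative and POSIX-style):

```python
from pathlib import Path
from urllib.parse import unquote, urlparse

def resolve_output_location(entry_location, *, output_location, output_name):
    # Trimmed copy of the module's helper: an explicit output wins; an
    # http(s) entry has no natural local directory; otherwise the output
    # goes next to the entry document.
    scheme = urlparse(entry_location).scheme.lower()
    if output_location:
        return output_location
    if scheme in {"http", "https"}:
        raise ValueError("set COMBINE_OUTPUT_LOCATION to a writable file path")
    if scheme == "file":
        entry_path = Path(unquote(urlparse(entry_location).path))
    else:
        entry_path = Path(entry_location)
    return str(entry_path.parent / output_name)

# A file:// entry URI places the output next to the entry document:
print(resolve_output_location(
    "file:///data/ontologies/core.ttl",
    output_location=None,
    output_name="combined_ontology.ttl",
))  # /data/ontologies/combined_ontology.ttl
```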
137
backend/app/pipelines/selection_neighbors.py
Normal file
@@ -0,0 +1,137 @@
from __future__ import annotations

from typing import Any, Iterable

from ..models import GraphResponse, Node
from ..sparql_engine import SparqlEngine


def _values_term(node: Node) -> str | None:
    iri = node.iri
    if node.termType == "uri":
        return f"<{iri}>"
    if node.termType == "bnode":
        if iri.startswith("_:"):
            return iri
        return f"_:{iri}"
    return None


def selection_neighbors_query(*, selected_nodes: Iterable[Node], include_bnodes: bool) -> str:
    values_terms: list[str] = []
    for n in selected_nodes:
        t = _values_term(n)
        if t is None:
            continue
        values_terms.append(t)

    if not values_terms:
        # Caller should avoid running this query when selection is empty, but keep this safe.
        return "SELECT ?nbr WHERE { FILTER(false) }"

    bnode_filter = "" if include_bnodes else "FILTER(!isBlank(?nbr))"
    values = " ".join(values_terms)

    # Neighbors are defined as any node directly connected by rdf:type (to owl:Class)
    # or rdfs:subClassOf, in either direction (treating edges as undirected).
    return f"""
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>

    SELECT DISTINCT ?nbr
    WHERE {{
        VALUES ?sel {{ {values} }}
        {{
            ?sel rdf:type ?o .
            ?o rdf:type owl:Class .
            BIND(?o AS ?nbr)
        }}
        UNION
        {{
            ?s rdf:type ?sel .
            ?sel rdf:type owl:Class .
            BIND(?s AS ?nbr)
        }}
        UNION
        {{
            ?sel rdfs:subClassOf ?o .
            BIND(?o AS ?nbr)
        }}
        UNION
        {{
            ?s rdfs:subClassOf ?sel .
            BIND(?s AS ?nbr)
        }}
        FILTER(!isLiteral(?nbr))
        FILTER(?nbr != ?sel)
        {bnode_filter}
    }}
    """


def _bindings(res: dict[str, Any]) -> list[dict[str, Any]]:
    return (((res.get("results") or {}).get("bindings")) or [])


def _term_key(term: dict[str, Any], *, include_bnodes: bool) -> tuple[str, str] | None:
    t = term.get("type")
    v = term.get("value")
    if not t or v is None:
        return None
    if t == "literal":
        return None
    if t == "bnode":
        if not include_bnodes:
            return None
        return ("bnode", f"_:{v}")
    return ("uri", str(v))


async def fetch_neighbor_ids_for_selection(
    sparql: SparqlEngine,
    *,
    snapshot: GraphResponse,
    selected_ids: list[int],
    include_bnodes: bool,
) -> list[int]:
    id_to_node: dict[int, Node] = {n.id: n for n in snapshot.nodes}

    selected_nodes: list[Node] = []
    selected_id_set: set[int] = set()
    for nid in selected_ids:
        if not isinstance(nid, int):
            continue
        n = id_to_node.get(nid)
        if n is None:
            continue
        if n.termType == "bnode" and not include_bnodes:
            continue
        selected_nodes.append(n)
        selected_id_set.add(nid)

    if not selected_nodes:
        return []

    key_to_id: dict[tuple[str, str], int] = {}
    for n in snapshot.nodes:
        key_to_id[(n.termType, n.iri)] = n.id

    q = selection_neighbors_query(selected_nodes=selected_nodes, include_bnodes=include_bnodes)
    res = await sparql.query_json(q)

    neighbor_ids: set[int] = set()
    for b in _bindings(res):
        nbr_term = b.get("nbr") or {}
        key = _term_key(nbr_term, include_bnodes=include_bnodes)
        if key is None:
            continue
        nid = key_to_id.get(key)
        if nid is None:
            continue
        if nid in selected_id_set:
            continue
        neighbor_ids.add(nid)

    # Stable ordering for consistent frontend behavior.
    return sorted(neighbor_ids)
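How selected nodes turn into SPARQL `VALUES` terms can be shown with a minimal stand-in for the backend's `Node` model (a hypothetical dataclass, not the real Pydantic model; the IRIs are invented):

```python
from dataclasses import dataclass

@dataclass
class Node:
    # Minimal stand-in for the backend's Node model.
    id: int
    termType: str
    iri: str

def values_term(node):
    # Mirrors _values_term: URIs are wrapped in <...>, blank nodes get a
    # _: prefix (unless they already carry one); anything else is skipped.
    if node.termType == "uri":
        return f"<{node.iri}>"
    if node.termType == "bnode":
        return node.iri if node.iri.startswith("_:") else f"_:{node.iri}"
    return None

sel = [Node(0, "uri", "http://example.org/Cat"), Node(1, "bnode", "_:b0")]
print(" ".join(values_term(n) for n in sel))
# <http://example.org/Cat> _:b0
```

The joined string is exactly what `selection_neighbors_query` splices into its `VALUES ?sel { ... }` clause.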
63
backend/app/pipelines/snapshot_service.py
Normal file
@@ -0,0 +1,63 @@
from __future__ import annotations

import asyncio
from dataclasses import dataclass

from ..models import GraphResponse
from ..sparql_engine import SparqlEngine
from ..settings import Settings
from .graph_snapshot import fetch_graph_snapshot


@dataclass(frozen=True)
class SnapshotKey:
    node_limit: int
    edge_limit: int
    include_bnodes: bool


class GraphSnapshotService:
    """
    Caches graph snapshots so the backend doesn't re-run expensive SPARQL for stats/graph.
    """

    def __init__(self, *, sparql: SparqlEngine, settings: Settings):
        self._sparql = sparql
        self._settings = settings

        self._cache: dict[SnapshotKey, GraphResponse] = {}
        self._locks: dict[SnapshotKey, asyncio.Lock] = {}
        self._global_lock = asyncio.Lock()

    async def get(self, *, node_limit: int, edge_limit: int) -> GraphResponse:
        key = SnapshotKey(
            node_limit=node_limit,
            edge_limit=edge_limit,
            include_bnodes=self._settings.include_bnodes,
        )

        cached = self._cache.get(key)
        if cached is not None:
            return cached

        # Create/get a per-key lock under a global lock to avoid races.
        async with self._global_lock:
            lock = self._locks.get(key)
            if lock is None:
                lock = asyncio.Lock()
                self._locks[key] = lock

        async with lock:
            cached2 = self._cache.get(key)
            if cached2 is not None:
                return cached2

            snapshot = await fetch_graph_snapshot(
                self._sparql,
                settings=self._settings,
                node_limit=node_limit,
                edge_limit=edge_limit,
            )
            self._cache[key] = snapshot
            return snapshot
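The double-checked, per-key-locked caching used by `GraphSnapshotService` can be sketched with a simulated fetch (the class name, key, and sleep are made up for the demo):

```python
import asyncio

class CachedService:
    # Minimal sketch of the per-key lock + double-checked cache pattern:
    # concurrent requests for the same key trigger exactly one fetch.
    def __init__(self):
        self._cache = {}
        self._locks = {}
        self._global_lock = asyncio.Lock()
        self.fetch_calls = 0

    async def _fetch(self, key):
        self.fetch_calls += 1
        await asyncio.sleep(0.01)  # pretend this is an expensive SPARQL query
        return f"snapshot:{key}"

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]
        async with self._global_lock:
            lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            if key in self._cache:  # re-check after acquiring the lock
                return self._cache[key]
            self._cache[key] = await self._fetch(key)
            return self._cache[key]

async def main():
    svc = CachedService()
    results = await asyncio.gather(*(svc.get("k") for _ in range(10)))
    print(svc.fetch_calls, results[0])  # 1 snapshot:k

asyncio.run(main())
```

Without the re-check inside the lock, every coroutine that queued up behind the first one would re-run the fetch after acquiring the lock.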
153
backend/app/pipelines/subclass_labels.py
Normal file
@@ -0,0 +1,153 @@
from __future__ import annotations

from typing import Any

from ..sparql_engine import SparqlEngine

RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def _bindings(res: dict[str, Any]) -> list[dict[str, Any]]:
    return (((res.get("results") or {}).get("bindings")) or [])


def _term_key(term: dict[str, Any]) -> tuple[str, str] | None:
    t = term.get("type")
    v = term.get("value")
    if not t or v is None:
        return None
    if t == "literal":
        return None
    if t == "bnode":
        return ("bnode", str(v))
    return ("uri", str(v))


def _key_to_entity_string(key: tuple[str, str]) -> str:
    t, v = key
    if t == "bnode":
        return f"_:{v}"
    return v


def _label_score(binding: dict[str, Any]) -> int:
    """
    Higher is better.
    Prefer English, then no-language, then anything else.
    """
    lang = (binding.get("xml:lang") or "").lower()
    if lang == "en":
        return 3
    if lang == "":
        return 2
    return 1


async def extract_subclass_entities_and_labels(
    sparql: SparqlEngine,
    *,
    include_bnodes: bool,
    label_batch_size: int = 500,
) -> tuple[list[str], list[str | None]]:
    """
    Pipeline:
    1) Query all rdfs:subClassOf triples.
    2) Build a unique set of entity terms from subjects+objects, convert to list.
    3) Fetch rdfs:label for those entities and return an aligned labels list.

    Returns:
        entities: list[str] (IRI or "_:bnodeId")
        labels: list[str|None], aligned with entities
    """

    subclass_q = f"""
    SELECT ?s ?o
    WHERE {{
        ?s <{RDFS_SUBCLASS_OF}> ?o .
        FILTER(!isLiteral(?o))
        {"FILTER(!isBlank(?s) && !isBlank(?o))" if not include_bnodes else ""}
    }}
    """
    res = await sparql.query_json(subclass_q)

    entity_keys: set[tuple[str, str]] = set()
    for b in _bindings(res):
        sk = _term_key(b.get("s") or {})
        ok = _term_key(b.get("o") or {})
        if sk is not None and (include_bnodes or sk[0] != "bnode"):
            entity_keys.add(sk)
        if ok is not None and (include_bnodes or ok[0] != "bnode"):
            entity_keys.add(ok)

    # Deterministic ordering.
    entity_key_list = sorted(entity_keys, key=lambda k: (k[0], k[1]))
    entities = [_key_to_entity_string(k) for k in entity_key_list]

    # Build label map keyed by term key.
    best_label_by_key: dict[tuple[str, str], tuple[int, str]] = {}

    # URIs can be batch-queried via VALUES.
    uri_values = [v for (t, v) in entity_key_list if t == "uri"]
    for i in range(0, len(uri_values), label_batch_size):
        batch = uri_values[i : i + label_batch_size]
        values = " ".join(f"<{u}>" for u in batch)
        labels_q = f"""
        SELECT ?s ?label
        WHERE {{
            VALUES ?s {{ {values} }}
            ?s <{RDFS_LABEL}> ?label .
        }}
        """
        lres = await sparql.query_json(labels_q)
        for b in _bindings(lres):
            sk = _term_key(b.get("s") or {})
            if sk is None or sk[0] != "uri":
                continue
            label_term = b.get("label") or {}
            if label_term.get("type") != "literal":
                continue
            label_value = label_term.get("value")
            if label_value is None:
                continue

            score = _label_score(label_term)
            prev = best_label_by_key.get(sk)
            if prev is None or score > prev[0]:
                best_label_by_key[sk] = (score, str(label_value))

    # Blank nodes can't reliably be addressed by ID across queries, but if enabled we can still
    # fetch all bnode labels and filter locally.
    if include_bnodes:
        bnode_keys = {k for k in entity_key_list if k[0] == "bnode"}
        if bnode_keys:
            bnode_labels_q = f"""
            SELECT ?s ?label
            WHERE {{
                ?s <{RDFS_LABEL}> ?label .
                FILTER(isBlank(?s))
            }}
            """
            blres = await sparql.query_json(bnode_labels_q)
            for b in _bindings(blres):
                sk = _term_key(b.get("s") or {})
                if sk is None or sk not in bnode_keys:
                    continue
                label_term = b.get("label") or {}
                if label_term.get("type") != "literal":
                    continue
                label_value = label_term.get("value")
                if label_value is None:
                    continue
                score = _label_score(label_term)
                prev = best_label_by_key.get(sk)
                if prev is None or score > prev[0]:
                    best_label_by_key[sk] = (score, str(label_value))

    labels: list[str | None] = []
    for k in entity_key_list:
        item = best_label_by_key.get(k)
        labels.append(item[1] if item else None)

    return entities, labels
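The label-preference logic in `_label_score` picks among competing `rdfs:label` bindings like so (the example bindings are invented, in the SPARQL JSON results shape):

```python
def label_score(binding):
    # Mirrors _label_score: English beats language-less beats everything else.
    lang = (binding.get("xml:lang") or "").lower()
    if lang == "en":
        return 3
    if lang == "":
        return 2
    return 1

bindings = [
    {"type": "literal", "value": "Katze", "xml:lang": "de"},
    {"type": "literal", "value": "cat"},               # no language tag
    {"type": "literal", "value": "Cat", "xml:lang": "en"},
]
best = max(bindings, key=label_score)
print(best["value"])  # Cat
```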
150
backend/app/rdf_store.py
Normal file
@@ -0,0 +1,150 @@
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

from rdflib import BNode, Graph, Literal, URIRef
from rdflib.namespace import RDFS, SKOS


LABEL_PREDICATES = {RDFS.label, SKOS.prefLabel, SKOS.altLabel}


@dataclass(frozen=True)
class EdgeRow:
    source: int
    target: int
    predicate: str


class RDFStore:
    def __init__(self, *, ttl_path: str, include_bnodes: bool, max_triples: int | None):
        self.ttl_path = ttl_path
        self.include_bnodes = include_bnodes
        self.max_triples = max_triples

        self.graph: Graph | None = None

        self._id_by_term: dict[Any, int] = {}
        self._term_by_id: list[Any] = []

        self._labels_by_id: dict[int, str] = {}
        self._edges: list[EdgeRow] = []
        self._parsed_triples = 0

    def _term_allowed(self, term: Any) -> bool:
        if isinstance(term, Literal):
            return False
        if isinstance(term, BNode) and not self.include_bnodes:
            return False
        return isinstance(term, (URIRef, BNode))

    def _get_id(self, term: Any) -> int | None:
        if not self._term_allowed(term):
            return None
        existing = self._id_by_term.get(term)
        if existing is not None:
            return existing
        nid = len(self._term_by_id)
        self._id_by_term[term] = nid
        self._term_by_id.append(term)
        return nid

    def _term_type(self, term: Any) -> str:
        if isinstance(term, BNode):
            return "bnode"
        return "uri"

    def _term_iri(self, term: Any) -> str:
        if isinstance(term, BNode):
            return f"_:{term}"
        return str(term)

    def load(self, graph: Graph | None = None) -> None:
        g = graph or Graph()
        if graph is None:
            g.parse(self.ttl_path, format="turtle")
        self.graph = g

        self._id_by_term.clear()
        self._term_by_id.clear()
        self._labels_by_id.clear()
        self._edges.clear()

        parsed = 0
        for (s, p, o) in g:
            parsed += 1
            if self.max_triples is not None and parsed > self.max_triples:
                break

            # Capture labels but do not emit them as edges.
            if p in LABEL_PREDICATES and isinstance(o, Literal):
                sid = self._get_id(s)
                if sid is not None and sid not in self._labels_by_id:
                    self._labels_by_id[sid] = str(o)
                continue

            sid = self._get_id(s)
            oid = self._get_id(o)
            if sid is None or oid is None:
                continue

            self._edges.append(EdgeRow(source=sid, target=oid, predicate=str(p)))

        self._parsed_triples = parsed

    @property
    def parsed_triples(self) -> int:
        return self._parsed_triples

    @property
    def node_count(self) -> int:
        return len(self._term_by_id)

    @property
    def edge_count(self) -> int:
        return len(self._edges)

    def node_slice(self, *, offset: int, limit: int) -> list[dict[str, Any]]:
        end = min(self.node_count, offset + limit)
        out: list[dict[str, Any]] = []
        for nid in range(offset, end):
            term = self._term_by_id[nid]
            out.append(
                {
                    "id": nid,
                    "termType": self._term_type(term),
                    "iri": self._term_iri(term),
                    "label": self._labels_by_id.get(nid),
                }
            )
        return out

    def edge_slice(self, *, offset: int, limit: int) -> list[dict[str, Any]]:
        end = min(self.edge_count, offset + limit)
        out: list[dict[str, Any]] = []
        for row in self._edges[offset:end]:
            out.append(
                {
                    "source": row.source,
                    "target": row.target,
                    "predicate": row.predicate,
                }
            )
        return out

    def edges_within_nodes(self, *, max_node_id_exclusive: int, limit: int) -> list[dict[str, Any]]:
        out: list[dict[str, Any]] = []
        for row in self._edges:
            if row.source >= max_node_id_exclusive or row.target >= max_node_id_exclusive:
                continue
            out.append(
                {
                    "source": row.source,
                    "target": row.target,
                    "predicate": row.predicate,
                }
            )
            if len(out) >= limit:
                break
        return out
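`RDFStore`'s term interning — mapping each distinct RDF term to a dense integer id in first-seen order, so edges can reference ints instead of repeating strings — reduces to this standalone sketch (the terms are invented):

```python
class Interner:
    # Sketch of RDFStore._get_id: first-seen terms get ids 0, 1, 2, ...;
    # repeated terms return their existing id.
    def __init__(self):
        self._id_by_term = {}
        self._term_by_id = []

    def get_id(self, term):
        existing = self._id_by_term.get(term)
        if existing is not None:
            return existing
        nid = len(self._term_by_id)
        self._id_by_term[term] = nid
        self._term_by_id.append(term)
        return nid

intern = Interner()
ids = [intern.get_id(t) for t in ["ex:Cat", "ex:Animal", "ex:Cat"]]
print(ids)  # [0, 1, 0]
```

The dense ids are also what makes `edges_within_nodes` cheap: "edge is inside the first N nodes" becomes a pair of integer comparisons.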
58
backend/app/settings.py
Normal file
@@ -0,0 +1,58 @@
from __future__ import annotations

from typing import Literal

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Which graph engine executes SPARQL queries.
    # - rdflib: parse TTL locally and query in-memory
    # - anzograph: query a remote AnzoGraph SPARQL endpoint (optionally LOAD on startup)
    graph_backend: Literal["rdflib", "anzograph"] = Field(default="rdflib", alias="GRAPH_BACKEND")

    ttl_path: str = Field(default="/data/o3po.ttl", alias="TTL_PATH")
    include_bnodes: bool = Field(default=False, alias="INCLUDE_BNODES")
    max_triples: int | None = Field(default=None, alias="MAX_TRIPLES")

    # Optional: Combine owl:imports into a single TTL file on backend startup.
    combine_owl_imports_on_start: bool = Field(default=False, alias="COMBINE_OWL_IMPORTS_ON_START")
    combine_entry_location: str | None = Field(default=None, alias="COMBINE_ENTRY_LOCATION")
    combine_output_location: str | None = Field(default=None, alias="COMBINE_OUTPUT_LOCATION")
    combine_output_name: str = Field(default="combined_ontology.ttl", alias="COMBINE_OUTPUT_NAME")
    combine_force: bool = Field(default=False, alias="COMBINE_FORCE")

    # AnzoGraph / SPARQL endpoint configuration
    sparql_host: str = Field(default="http://anzograph:8080", alias="SPARQL_HOST")
    # If not set, the backend uses `${SPARQL_HOST}/sparql`.
    sparql_endpoint: str | None = Field(default=None, alias="SPARQL_ENDPOINT")
    sparql_user: str | None = Field(default=None, alias="SPARQL_USER")
    sparql_pass: str | None = Field(default=None, alias="SPARQL_PASS")

    # File URI as seen by the AnzoGraph container (used with SPARQL `LOAD`).
    # Example: file:///opt/shared-files/o3po.ttl
    sparql_data_file: str | None = Field(default=None, alias="SPARQL_DATA_FILE")
    sparql_graph_iri: str | None = Field(default=None, alias="SPARQL_GRAPH_IRI")
    sparql_load_on_start: bool = Field(default=False, alias="SPARQL_LOAD_ON_START")
    sparql_clear_on_start: bool = Field(default=False, alias="SPARQL_CLEAR_ON_START")

    sparql_timeout_s: float = Field(default=300.0, alias="SPARQL_TIMEOUT_S")
    sparql_ready_retries: int = Field(default=30, alias="SPARQL_READY_RETRIES")
    sparql_ready_delay_s: float = Field(default=4.0, alias="SPARQL_READY_DELAY_S")
    sparql_ready_timeout_s: float = Field(default=10.0, alias="SPARQL_READY_TIMEOUT_S")

    # Comma-separated, or "*" (default).
    cors_origins: str = Field(default="*", alias="CORS_ORIGINS")

    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    def cors_origin_list(self) -> list[str]:
        if self.cors_origins.strip() == "*":
            return ["*"]
        return [o.strip() for o in self.cors_origins.split(",") if o.strip()]

    def effective_sparql_endpoint(self) -> str:
        if self.sparql_endpoint and self.sparql_endpoint.strip():
            return self.sparql_endpoint.strip()
        return self.sparql_host.rstrip("/") + "/sparql"
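The two derived helpers on `Settings` are pure string logic, so they can be tried standalone without pydantic (the hosts and origins below are illustrative):

```python
def effective_sparql_endpoint(sparql_host, sparql_endpoint=None):
    # Mirrors Settings.effective_sparql_endpoint: an explicit SPARQL_ENDPOINT
    # wins; otherwise /sparql is appended to SPARQL_HOST.
    if sparql_endpoint and sparql_endpoint.strip():
        return sparql_endpoint.strip()
    return sparql_host.rstrip("/") + "/sparql"

def cors_origin_list(cors_origins):
    # Mirrors Settings.cors_origin_list: "*" means allow-all, otherwise
    # split on commas and drop empty entries.
    if cors_origins.strip() == "*":
        return ["*"]
    return [o.strip() for o in cors_origins.split(",") if o.strip()]

print(effective_sparql_endpoint("http://anzograph:8080/"))
# http://anzograph:8080/sparql
print(cors_origin_list("http://localhost:5173, http://localhost:4173"))
# ['http://localhost:5173', 'http://localhost:4173']
```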
177
backend/app/sparql_engine.py
Normal file
@@ -0,0 +1,177 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import base64
|
||||
import json
|
||||
from typing import Any, Protocol
|
||||
|
||||
import httpx
|
||||
from rdflib import Graph
|
||||
|
||||
from .settings import Settings
|
||||
|
||||
|
||||
class SparqlEngine(Protocol):
|
||||
name: str
|
||||
|
||||
async def startup(self) -> None: ...
|
||||
|
||||
async def shutdown(self) -> None: ...
|
||||
|
||||
async def query_json(self, query: str) -> dict[str, Any]: ...
|
||||
|
||||
|
||||
class RdflibEngine:
|
||||
name = "rdflib"
|
||||
|
||||
def __init__(self, *, ttl_path: str, graph: Graph | None = None):
|
||||
self.ttl_path = ttl_path
|
||||
self.graph: Graph | None = graph
|
||||
|
||||
async def startup(self) -> None:
|
||||
if self.graph is not None:
|
||||
return
|
||||
g = Graph()
|
||||
g.parse(self.ttl_path, format="turtle")
|
||||
self.graph = g
|
||||
|
||||
async def shutdown(self) -> None:
|
||||
# Nothing to close for in-memory rdflib graph.
|
||||
return None
|
||||
|
||||
async def query_json(self, query: str) -> dict[str, Any]:
|
||||
if self.graph is None:
|
||||
raise RuntimeError("RdflibEngine not started")
|
||||
|
||||
result = self.graph.query(query)
|
||||
payload = result.serialize(format="json")
|
||||
if isinstance(payload, bytes):
|
||||
payload = payload.decode("utf-8")
|
||||
return json.loads(payload)
|
||||
|
||||
|
||||
class AnzoGraphEngine:
|
||||
name = "anzograph"
|
||||
|
||||
def __init__(self, *, settings: Settings):
|
||||
self.endpoint = settings.effective_sparql_endpoint()
|
||||
self.timeout_s = settings.sparql_timeout_s
|
||||
self.ready_retries = settings.sparql_ready_retries
|
||||
self.ready_delay_s = settings.sparql_ready_delay_s
|
||||
self.ready_timeout_s = settings.sparql_ready_timeout_s
|
||||
|
||||
self.user = settings.sparql_user
|
||||
self.password = settings.sparql_pass
|
||||
self.data_file = settings.sparql_data_file
|
||||
self.graph_iri = settings.sparql_graph_iri
|
||||
self.load_on_start = settings.sparql_load_on_start
|
||||
self.clear_on_start = settings.sparql_clear_on_start
|
||||
|
||||
        self._client: httpx.AsyncClient | None = None
        self._auth_header = self._build_auth_header(self.user, self.password)

    @staticmethod
    def _build_auth_header(user: str | None, password: str | None) -> str | None:
        if not user or not password:
            return None
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        return f"Basic {token}"

    async def startup(self) -> None:
        self._client = httpx.AsyncClient(timeout=self.timeout_s)

        await self._wait_ready()

        if self.clear_on_start:
            await self._update("CLEAR ALL")
            await self._wait_ready()

        if self.load_on_start:
            if not self.data_file:
                raise RuntimeError("SPARQL_LOAD_ON_START=true but SPARQL_DATA_FILE is not set")

            if self.graph_iri:
                await self._update(f"LOAD <{self.data_file}> INTO GRAPH <{self.graph_iri}>")
            else:
                await self._update(f"LOAD <{self.data_file}>")

            # AnzoGraph may still be indexing after LOAD.
            await self._wait_ready()

    async def shutdown(self) -> None:
        if self._client is not None:
            await self._client.aclose()
            self._client = None

    async def query_json(self, query: str) -> dict[str, Any]:
        if self._client is None:
            raise RuntimeError("AnzoGraphEngine not started")

        headers = {
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        }
        if self._auth_header:
            headers["Authorization"] = self._auth_header

        # AnzoGraph expects x-www-form-urlencoded with `query=...`.
        resp = await self._client.post(
            self.endpoint,
            headers=headers,
            data={"query": query},
        )
        resp.raise_for_status()
        return resp.json()

    async def _update(self, update: str) -> None:
        if self._client is None:
            raise RuntimeError("AnzoGraphEngine not started")

        headers = {
            "Content-Type": "application/sparql-update",
            "Accept": "application/json",
        }
        if self._auth_header:
            headers["Authorization"] = self._auth_header

        resp = await self._client.post(self.endpoint, headers=headers, content=update)
        resp.raise_for_status()

    async def _wait_ready(self) -> None:
        if self._client is None:
            raise RuntimeError("AnzoGraphEngine not started")

        # Match the repo's Julia readiness gate: real SPARQL POST + valid JSON parse.
        headers = {
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        }
        if self._auth_header:
            headers["Authorization"] = self._auth_header

        last_err: Exception | None = None
        for _ in range(self.ready_retries):
            try:
                resp = await self._client.post(
                    self.endpoint,
                    headers=headers,
                    data={"query": "ASK WHERE { ?s ?p ?o }"},
                    timeout=self.ready_timeout_s,
                )
                resp.raise_for_status()
                # Ensure it's JSON, not HTML/text during boot.
                resp.json()
                return
            except Exception as e:
                last_err = e
                await asyncio.sleep(self.ready_delay_s)

        raise RuntimeError(f"AnzoGraph not ready at {self.endpoint}") from last_err


def create_sparql_engine(settings: Settings, *, rdflib_graph: Graph | None = None) -> SparqlEngine:
    if settings.graph_backend == "rdflib":
        return RdflibEngine(ttl_path=settings.ttl_path, graph=rdflib_graph)
    if settings.graph_backend == "anzograph":
        return AnzoGraphEngine(settings=settings)
    raise RuntimeError(f"Unsupported GRAPH_BACKEND={settings.graph_backend!r}")
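For context, `query_json()` returns the W3C SPARQL 1.1 `application/sparql-results+json` payload as-is; callers must unpack the `head.vars` / `results.bindings` structure themselves. A minimal sketch of that unpacking, with an invented example binding (the IRI and label below are not from the repo):

```python
# Shape of the JSON that query_json() returns for a SELECT query
# (SPARQL 1.1 Query Results JSON Format). Sample bindings are made up.
payload = {
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/n1"},
         "label": {"type": "literal", "value": "Node 1"}},
    ]},
}

# Flatten each binding into a plain dict of variable -> lexical value.
rows = [
    {var: b[var]["value"] for var in payload["head"]["vars"] if var in b}
    for b in payload["results"]["bindings"]
]
print(rows)  # [{'s': 'http://example.org/n1', 'label': 'Node 1'}]
```

Unbound variables are simply absent from a binding, which is why the `if var in b` guard is needed.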
5
backend/requirements.txt
Normal file
@@ -0,0 +1,5 @@
fastapi
uvicorn[standard]
rdflib
pydantic-settings
httpx
@@ -1,17 +1,55 @@
services:
  app:
    build: .
    depends_on:
      - anzograph
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - GRAPH_BACKEND=${GRAPH_BACKEND:-rdflib}
      - TTL_PATH=${TTL_PATH:-/data/o3po.ttl}
      - INCLUDE_BNODES=${INCLUDE_BNODES:-false}
      - MAX_TRIPLES
      - CORS_ORIGINS=${CORS_ORIGINS:-http://localhost:5173}
      - SPARQL_HOST=${SPARQL_HOST:-http://anzograph:8080}
      - SPARQL_ENDPOINT
      - SPARQL_USER=${SPARQL_USER:-admin}
      - SPARQL_PASS=${SPARQL_PASS:-Passw0rd1}
      - SPARQL_DATA_FILE=${SPARQL_DATA_FILE:-file:///opt/shared-files/o3po.ttl}
      - SPARQL_GRAPH_IRI
      - SPARQL_LOAD_ON_START=${SPARQL_LOAD_ON_START:-false}
      - SPARQL_CLEAR_ON_START=${SPARQL_CLEAR_ON_START:-false}
      - SPARQL_TIMEOUT_S=${SPARQL_TIMEOUT_S:-300}
      - SPARQL_READY_RETRIES=${SPARQL_READY_RETRIES:-30}
      - SPARQL_READY_DELAY_S=${SPARQL_READY_DELAY_S:-4}
      - SPARQL_READY_TIMEOUT_S=${SPARQL_READY_TIMEOUT_S:-10}
      - COMBINE_OWL_IMPORTS_ON_START=${COMBINE_OWL_IMPORTS_ON_START:-false}
      - COMBINE_ENTRY_LOCATION
      - COMBINE_OUTPUT_LOCATION
      - COMBINE_OUTPUT_NAME
      - COMBINE_FORCE=${COMBINE_FORCE:-false}
    volumes:
      - ./backend:/app
      - ./data:/data:Z
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/api/health').read()"]
      interval: 5s
      timeout: 3s
      retries: 60

  frontend:
    build: ./frontend
    ports:
      - "5173:5173"
    env_file:
      - .env
    command: sh -c "npm run layout && npm run dev -- --host"
    environment:
      - VITE_BACKEND_URL=${VITE_BACKEND_URL:-http://backend:8000}
    volumes:
      - .:/app:Z
      - ./frontend:/app
      - /app/node_modules

    depends_on:
      - backend
    # Docker Compose v1 doesn't support depends_on:condition. Do an explicit wait here.
    command: sh -c "until wget -qO- http://backend:8000/api/health >/dev/null 2>&1; do echo 'waiting for backend...'; sleep 1; done; npm run dev -- --host --port 5173"

  anzograph:
    image: cambridgesemantics/anzograph:latest
    container_name: anzograph
@@ -20,4 +58,3 @@ services:
      - "8443:8443"
    volumes:
      - ./data:/opt/shared-files:Z
@@ -2,6 +2,8 @@ FROM node:lts-alpine

WORKDIR /app

EXPOSE 5173

# Copy dependency definitions
COPY package*.json ./

@@ -11,8 +13,5 @@ RUN npm install
# Copy the rest of the source code
COPY . .

# Expose the standard Vite port
EXPOSE 5173

# Compute layout, then start the dev server with --host for external access
CMD ["sh", "-c", "npm run dev -- --host"]
# Start the dev server with --host for external access
CMD ["npm", "run", "dev", "--", "--host", "--port", "5173"]
@@ -7,7 +7,7 @@
    "dev": "vite",
    "build": "vite build",
    "preview": "vite preview",
    "layout": "npx tsx scripts/fetch_from_db.ts && npx tsx scripts/compute_layout.ts"
    "layout": "tsx scripts/compute_layout.ts"
  },
  "dependencies": {
    "@webgpu/types": "^0.1.69",
100000
frontend/public/edges.csv
Normal file
File diff suppressed because it is too large
Load Diff
100001
frontend/public/node_positions.csv
Normal file
File diff suppressed because it is too large
Load Diff
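Although the two generated CSVs are too large to show, their schemas are fixed by the writer in `frontend/scripts/compute_layout.ts`: `node_positions.csv` has a `vertex,x,y` header and `edges.csv` has a `source,target` header, both keyed by integer vertex IDs. A small sketch of consuming that format (the sample rows below are invented):

```python
# Parse the two CSV schemas written by compute_layout.ts.
# The data rows here are made-up examples, not real output.
import csv
import io

node_positions = "vertex,x,y\n0,0,0\n1,120.5,-33.2\n"
edges = "source,target\n1,0\n"

nodes = list(csv.DictReader(io.StringIO(node_positions)))
links = list(csv.DictReader(io.StringIO(edges)))
print(nodes[1]["x"], links[0]["source"])  # 120.5 1
```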
354
frontend/scripts/compute_layout.ts
Normal file
@@ -0,0 +1,354 @@
#!/usr/bin/env npx tsx
/**
 * Tree-Aware Force Layout
 *
 * Generates a random tree (via generate_tree), computes a radial tree layout,
 * then applies gentle force refinement and writes node_positions.csv.
 *
 * Usage: npm run layout
 */

import { writeFileSync } from "fs";
import { join, dirname } from "path";
import { fileURLToPath } from "url";
import { generateTree } from "./generate_tree.js";

const __dirname = dirname(fileURLToPath(import.meta.url));
const PUBLIC_DIR = join(__dirname, "..", "public");

// ══════════════════════════════════════════════════════════
// Configuration
// ══════════════════════════════════════════════════════════

const ENABLE_FORCE_SIM = true; // Set to false to skip force simulation
const ITERATIONS = 100;        // Force iterations (gentle)
const REPULSION_K = 80;        // Repulsion strength (1% of original 8000)
const EDGE_LENGTH = 120;       // Desired edge rest length
const ATTRACTION_K = 0.0002;   // Spring stiffness for edges (1% of original 0.02)
const THETA = 0.7;             // Barnes-Hut accuracy
const INITIAL_MAX_DISP = 15;   // Starting max displacement
const COOLING = 0.998;         // Very slow cooling per iteration
const MIN_DIST = 0.5;
const PRINT_EVERY = 10;        // Print progress every N iterations

// Scale radius so the tree is nicely spread
const RADIUS_PER_DEPTH = EDGE_LENGTH * 1.2;

// ── Special nodes with longer parent-edges ──
// Add vertex IDs here to give them longer edges to their parent.
// These nodes (and all their descendants) will be pushed further out.
const LONG_EDGE_NODES = new Set<number>([
  // e.g. 42, 99, 150
]);
const LONG_EDGE_MULTIPLIER = 3.0; // How many times longer than normal

// ══════════════════════════════════════════════════════════
// Generate tree (in-memory)
// ══════════════════════════════════════════════════════════

const { root, nodeCount: N, childrenOf, parentOf } = generateTree();

const nodeIds: number[] = [];
for (let i = 0; i < N; i++) nodeIds.push(i);

// Dense index mapping (identity since IDs are 0..N-1)
const idToIdx = new Map<number, number>();
for (let i = 0; i < N; i++) idToIdx.set(i, i);

// Edge list as index pairs (child, parent)
const edges: Array<[number, number]> = [];
for (const [child, parent] of parentOf) {
  edges.push([child, parent]);
}

// Per-node neighbor list (for edge traversal)
const neighbors: number[][] = Array.from({ length: N }, () => []);
for (const [a, b] of edges) {
  neighbors[a].push(b);
  neighbors[b].push(a);
}

console.log(`Tree: ${N} nodes, ${edges.length} edges, root=${root}`);

// ══════════════════════════════════════════════════════════
// Step 1: Radial tree layout (generous spacing, no crossings)
// ══════════════════════════════════════════════════════════

const x = new Float64Array(N);
const y = new Float64Array(N);
const depth = new Uint32Array(N);
const nodeRadius = new Float64Array(N); // cumulative radius from root

// Compute subtree sizes
const subtreeSize = new Uint32Array(N).fill(1);
{
  const rootIdx = idToIdx.get(root)!;
  const stack: Array<{ idx: number; phase: "enter" | "exit" }> = [
    { idx: rootIdx, phase: "enter" },
  ];
  while (stack.length > 0) {
    const { idx, phase } = stack.pop()!;
    if (phase === "enter") {
      stack.push({ idx, phase: "exit" });
      const kids = childrenOf.get(nodeIds[idx]);
      if (kids) {
        for (const kid of kids) {
          stack.push({ idx: idToIdx.get(kid)!, phase: "enter" });
        }
      }
    } else {
      const kids = childrenOf.get(nodeIds[idx]);
      if (kids) {
        for (const kid of kids) {
          subtreeSize[idx] += subtreeSize[idToIdx.get(kid)!];
        }
      }
    }
  }
}

// Compute depths & max depth
let maxDepth = 0;
{
  const rootIdx = idToIdx.get(root)!;
  const stack: Array<{ idx: number; d: number }> = [{ idx: rootIdx, d: 0 }];
  while (stack.length > 0) {
    const { idx, d } = stack.pop()!;
    depth[idx] = d;
    if (d > maxDepth) maxDepth = d;
    const kids = childrenOf.get(nodeIds[idx]);
    if (kids) {
      for (const kid of kids) {
        stack.push({ idx: idToIdx.get(kid)!, d: d + 1 });
      }
    }
  }
}

// BFS radial assignment (cumulative radii to support per-edge lengths)
{
  const rootIdx = idToIdx.get(root)!;
  x[rootIdx] = 0;
  y[rootIdx] = 0;
  nodeRadius[rootIdx] = 0;

  interface Entry {
    idx: number;
    d: number;
    aStart: number;
    aEnd: number;
  }

  const queue: Entry[] = [{ idx: rootIdx, d: 0, aStart: 0, aEnd: 2 * Math.PI }];
  let head = 0;

  while (head < queue.length) {
    const { idx, d, aStart, aEnd } = queue[head++];
    const kids = childrenOf.get(nodeIds[idx]);
    if (!kids || kids.length === 0) continue;

    // Sort children by subtree size (largest sectors together for balance)
    const sortedKids = [...kids].sort(
      (a, b) => (subtreeSize[idToIdx.get(b)!]) - (subtreeSize[idToIdx.get(a)!])
    );

    const totalWeight = sortedKids.reduce(
      (s, k) => s + subtreeSize[idToIdx.get(k)!], 0
    );

    let angle = aStart;
    for (const kid of sortedKids) {
      const kidIdx = idToIdx.get(kid)!;
      const w = subtreeSize[kidIdx];
      const sector = (w / totalWeight) * (aEnd - aStart);
      const mid = angle + sector / 2;

      // Cumulative radius: parent's radius + edge step (longer for special nodes)
      const step = LONG_EDGE_NODES.has(kid)
        ? RADIUS_PER_DEPTH * LONG_EDGE_MULTIPLIER
        : RADIUS_PER_DEPTH;
      const r = nodeRadius[idx] + step;
      nodeRadius[kidIdx] = r;

      x[kidIdx] = r * Math.cos(mid);
      y[kidIdx] = r * Math.sin(mid);

      queue.push({ idx: kidIdx, d: d + 1, aStart: angle, aEnd: angle + sector });
      angle += sector;
    }
  }
}

console.log(`Radial layout done (depth=${maxDepth}, radius_step=${RADIUS_PER_DEPTH})`);

// ══════════════════════════════════════════════════════════
// Step 2: Gentle force refinement (preserves non-crossing)
// ══════════════════════════════════════════════════════════

// Barnes-Hut quadtree for repulsion
interface BHNode {
  cx: number; cy: number;
  mass: number;
  size: number;
  children: (BHNode | null)[];
  bodyIdx: number;
}

function buildBHTree(): BHNode {
  let minX = Infinity, maxX = -Infinity, minY = Infinity, maxY = -Infinity;
  for (let i = 0; i < N; i++) {
    if (x[i] < minX) minX = x[i];
    if (x[i] > maxX) maxX = x[i];
    if (y[i] < minY) minY = y[i];
    if (y[i] > maxY) maxY = y[i];
  }
  const size = Math.max(maxX - minX, maxY - minY, 1) * 1.01;
  const cx = (minX + maxX) / 2;
  const cy = (minY + maxY) / 2;

  const root: BHNode = {
    cx: 0, cy: 0, mass: 0, size,
    children: [null, null, null, null], bodyIdx: -1,
  };

  for (let i = 0; i < N; i++) {
    insert(root, i, cx, cy, size);
  }
  return root;
}

function insert(node: BHNode, idx: number, ncx: number, ncy: number, ns: number): void {
  if (node.mass === 0) {
    node.bodyIdx = idx;
    node.cx = x[idx]; node.cy = y[idx];
    node.mass = 1;
    return;
  }
  if (node.bodyIdx >= 0) {
    const old = node.bodyIdx;
    node.bodyIdx = -1;
    putInQuadrant(node, old, ncx, ncy, ns);
  }
  putInQuadrant(node, idx, ncx, ncy, ns);
  const tm = node.mass + 1;
  node.cx = (node.cx * node.mass + x[idx]) / tm;
  node.cy = (node.cy * node.mass + y[idx]) / tm;
  node.mass = tm;
}

function putInQuadrant(node: BHNode, idx: number, ncx: number, ncy: number, ns: number): void {
  const hs = ns / 2;
  const qx = x[idx] >= ncx ? 1 : 0;
  const qy = y[idx] >= ncy ? 1 : 0;
  const q = qy * 2 + qx;
  const ccx = ncx + (qx ? hs / 2 : -hs / 2);
  const ccy = ncy + (qy ? hs / 2 : -hs / 2);
  if (!node.children[q]) {
    node.children[q] = {
      cx: 0, cy: 0, mass: 0, size: hs,
      children: [null, null, null, null], bodyIdx: -1,
    };
  }
  insert(node.children[q]!, idx, ccx, ccy, hs);
}

function repulse(node: BHNode, idx: number, fx: Float64Array, fy: Float64Array): void {
  if (node.mass === 0 || node.bodyIdx === idx) return;
  const dx = x[idx] - node.cx;
  const dy = y[idx] - node.cy;
  const d2 = dx * dx + dy * dy;
  const d = Math.sqrt(d2) || MIN_DIST;

  if (node.bodyIdx >= 0 || (node.size / d) < THETA) {
    const f = REPULSION_K * node.mass / (d2 + MIN_DIST);
    fx[idx] += (dx / d) * f;
    fy[idx] += (dy / d) * f;
    return;
  }
  for (const c of node.children) {
    if (c) repulse(c, idx, fx, fy);
  }
}

// ── Force simulation ──
if (ENABLE_FORCE_SIM) {
  console.log(`Applying gentle forces (${ITERATIONS} steps, 1% strength)...`);
  const t0 = performance.now();
  let maxDisp = INITIAL_MAX_DISP;

  for (let iter = 0; iter < ITERATIONS; iter++) {
    const fx = new Float64Array(N);
    const fy = new Float64Array(N);

    // 1. Repulsion
    const tree = buildBHTree();
    for (let i = 0; i < N; i++) {
      repulse(tree, i, fx, fy);
    }

    // 2. Edge attraction (spring toward per-edge rest length)
    for (const [a, b] of edges) {
      const dx = x[b] - x[a];
      const dy = y[b] - y[a];
      const d = Math.sqrt(dx * dx + dy * dy) || MIN_DIST;
      const aId = nodeIds[a], bId = nodeIds[b];
      const isLong = LONG_EDGE_NODES.has(aId) || LONG_EDGE_NODES.has(bId);
      const restLen = isLong ? EDGE_LENGTH * LONG_EDGE_MULTIPLIER : EDGE_LENGTH;
      const displacement = d - restLen;
      const f = ATTRACTION_K * displacement;
      const ux = dx / d, uy = dy / d;
      fx[a] += ux * f;
      fy[a] += uy * f;
      fx[b] -= ux * f;
      fy[b] -= uy * f;
    }

    // 3. Apply forces with displacement cap (cooling reduces it over time)
    for (let i = 0; i < N; i++) {
      const mag = Math.sqrt(fx[i] * fx[i] + fy[i] * fy[i]);
      if (mag > 0) {
        const cap = Math.min(maxDisp, mag) / mag;
        x[i] += fx[i] * cap;
        y[i] += fy[i] * cap;
      }
    }

    // 4. Cool down
    maxDisp *= COOLING;

    if ((iter + 1) % PRINT_EVERY === 0) {
      let totalForce = 0;
      for (let i = 0; i < N; i++) totalForce += Math.sqrt(fx[i] * fx[i] + fy[i] * fy[i]);
      console.log(`  iter ${iter + 1}/${ITERATIONS} max_disp=${maxDisp.toFixed(2)} avg_force=${(totalForce / N).toFixed(2)}`);
    }
  }

  const elapsed = performance.now() - t0;
  console.log(`Force simulation done in ${(elapsed / 1000).toFixed(1)}s`);
} else {
  console.log("Force simulation SKIPPED (ENABLE_FORCE_SIM = false)");
}

// ══════════════════════════════════════════════════════════
// Write output
// ══════════════════════════════════════════════════════════

// Write node positions
const outLines: string[] = ["vertex,x,y"];
for (let i = 0; i < N; i++) {
  outLines.push(`${nodeIds[i]},${x[i]},${y[i]}`);
}

const outPath = join(PUBLIC_DIR, "node_positions.csv");
writeFileSync(outPath, outLines.join("\n") + "\n");
console.log(`Wrote ${N} positions to ${outPath}`);

// Write edges (so the renderer can draw them)
const edgeLines: string[] = ["source,target"];
for (const [child, parent] of parentOf) {
  edgeLines.push(`${child},${parent}`);
}

const edgesPath = join(PUBLIC_DIR, "edges.csv");
writeFileSync(edgesPath, edgeLines.join("\n") + "\n");
console.log(`Wrote ${edges.length} edges to ${edgesPath}`);
61
frontend/scripts/generate_tree.ts
Normal file
@@ -0,0 +1,61 @@
/**
 * Random Tree Generator
 *
 * Generates a random tree with 1–MAX_CHILDREN children per node.
 * Exports a function that returns the tree data in memory.
 */

// ══════════════════════════════════════════════════════════
// Configuration
// ══════════════════════════════════════════════════════════

const TARGET_NODES = 100000; // Approximate number of nodes to generate
const MAX_CHILDREN = 3;      // Each node gets 1..MAX_CHILDREN children

// ══════════════════════════════════════════════════════════
// Tree data types
// ══════════════════════════════════════════════════════════

export interface TreeData {
  root: number;
  nodeCount: number;
  childrenOf: Map<number, number[]>;
  parentOf: Map<number, number>;
}

// ══════════════════════════════════════════════════════════
// Generator
// ══════════════════════════════════════════════════════════

export function generateTree(): TreeData {
  const childrenOf = new Map<number, number[]>();
  const parentOf = new Map<number, number>();

  const root = 0;
  let nextId = 1;
  const queue: number[] = [root];
  let head = 0;

  while (head < queue.length && nextId < TARGET_NODES) {
    const parent = queue[head++];
    const nKids = 1 + Math.floor(Math.random() * MAX_CHILDREN); // 1..MAX_CHILDREN

    const kids: number[] = [];
    for (let c = 0; c < nKids && nextId < TARGET_NODES; c++) {
      const child = nextId++;
      kids.push(child);
      parentOf.set(child, parent);
      queue.push(child);
    }
    childrenOf.set(parent, kids);
  }

  console.log(`Generated tree: ${nextId} nodes, ${parentOf.size} edges, root=${root}`);

  return {
    root,
    nodeCount: nextId,
    childrenOf,
    parentOf,
  };
}
@@ -1,12 +1,26 @@
|
||||
import { useEffect, useRef, useState } from "react";
|
||||
import { Renderer } from "./renderer";
|
||||
|
||||
function sleep(ms: number): Promise<void> {
|
||||
return new Promise((r) => setTimeout(r, ms));
|
||||
}
|
||||
|
||||
type GraphMeta = {
|
||||
backend?: string;
|
||||
ttl_path?: string | null;
|
||||
sparql_endpoint?: string | null;
|
||||
include_bnodes?: boolean;
|
||||
node_limit?: number;
|
||||
edge_limit?: number;
|
||||
nodes?: number;
|
||||
edges?: number;
|
||||
};
|
||||
|
||||
export default function App() {
|
||||
const canvasRef = useRef<HTMLCanvasElement>(null);
|
||||
const rendererRef = useRef<Renderer | null>(null);
|
||||
const [status, setStatus] = useState("Loading node positions…");
|
||||
const [status, setStatus] = useState("Waiting for backend…");
|
||||
const [nodeCount, setNodeCount] = useState(0);
|
||||
const uriMapRef = useRef<Map<number, { uri: string; label: string; isPrimary: boolean }>>(new Map());
|
||||
const [stats, setStats] = useState({
|
||||
fps: 0,
|
||||
drawn: 0,
|
||||
@@ -15,11 +29,15 @@ export default function App() {
|
||||
ptSize: 0,
|
||||
});
|
||||
const [error, setError] = useState("");
|
||||
const [hoveredNode, setHoveredNode] = useState<{ x: number; y: number; screenX: number; screenY: number; index?: number } | null>(null);
|
||||
const [hoveredNode, setHoveredNode] = useState<{ x: number; y: number; screenX: number; screenY: number; label?: string; iri?: string } | null>(null);
|
||||
const [selectedNodes, setSelectedNodes] = useState<Set<number>>(new Set());
|
||||
const [backendStats, setBackendStats] = useState<{ nodes: number; edges: number; backend?: string } | null>(null);
|
||||
const graphMetaRef = useRef<GraphMeta | null>(null);
|
||||
const neighborsReqIdRef = useRef(0);
|
||||
|
||||
// Store mouse position in a ref so it can be accessed in render loop without re-renders
|
||||
const mousePos = useRef({ x: 0, y: 0 });
|
||||
const nodesRef = useRef<any[]>([]);
|
||||
|
||||
useEffect(() => {
|
||||
const canvas = canvasRef.current;
|
||||
@@ -36,87 +54,82 @@ export default function App() {
|
||||
|
||||
let cancelled = false;
|
||||
|
||||
// Fetch CSVs, parse, and init renderer
|
||||
(async () => {
|
||||
try {
|
||||
setStatus("Fetching data files…");
|
||||
const [nodesResponse, primaryEdgesResponse, secondaryEdgesResponse, uriMapResponse] = await Promise.all([
|
||||
fetch("/node_positions.csv"),
|
||||
fetch("/primary_edges.csv"),
|
||||
fetch("/secondary_edges.csv"),
|
||||
fetch("/uri_map.csv"),
|
||||
]);
|
||||
if (!nodesResponse.ok) throw new Error(`Failed to fetch nodes: ${nodesResponse.status}`);
|
||||
if (!primaryEdgesResponse.ok) throw new Error(`Failed to fetch primary edges: ${primaryEdgesResponse.status}`);
|
||||
if (!secondaryEdgesResponse.ok) throw new Error(`Failed to fetch secondary edges: ${secondaryEdgesResponse.status}`);
|
||||
// Wait for backend (docker-compose also gates startup via healthcheck, but this
|
||||
// handles running the frontend standalone).
|
||||
const deadline = performance.now() + 180_000;
|
||||
let attempt = 0;
|
||||
while (performance.now() < deadline) {
|
||||
attempt++;
|
||||
setStatus(`Waiting for backend… (attempt ${attempt})`);
|
||||
try {
|
||||
const res = await fetch("/api/health");
|
||||
if (res.ok) break;
|
||||
} catch {
|
||||
// ignore and retry
|
||||
}
|
||||
await sleep(1000);
|
||||
if (cancelled) return;
|
||||
}
|
||||
|
||||
const [nodesText, primaryEdgesText, secondaryEdgesText, uriMapText] = await Promise.all([
|
||||
nodesResponse.text(),
|
||||
primaryEdgesResponse.text(),
|
||||
secondaryEdgesResponse.text(),
|
||||
uriMapResponse.ok ? uriMapResponse.text() : Promise.resolve(""),
|
||||
]);
|
||||
setStatus("Fetching graph…");
|
||||
const graphRes = await fetch("/api/graph");
|
||||
if (!graphRes.ok) throw new Error(`Failed to fetch graph: ${graphRes.status}`);
|
||||
const graph = await graphRes.json();
|
||||
if (cancelled) return;
|
||||
|
||||
setStatus("Parsing positions…");
|
||||
const nodeLines = nodesText.split("\n").slice(1).filter(l => l.trim().length > 0);
|
||||
const count = nodeLines.length;
|
||||
const nodes = Array.isArray(graph.nodes) ? graph.nodes : [];
|
||||
const edges = Array.isArray(graph.edges) ? graph.edges : [];
|
||||
const meta = graph.meta || null;
|
||||
const count = nodes.length;
|
||||
|
||||
nodesRef.current = nodes;
|
||||
graphMetaRef.current = meta && typeof meta === "object" ? (meta as GraphMeta) : null;
|
||||
|
||||
// Build positions from backend-provided node coordinates.
|
||||
setStatus("Preparing buffers…");
|
||||
const xs = new Float32Array(count);
|
||||
const ys = new Float32Array(count);
|
||||
for (let i = 0; i < count; i++) {
|
||||
const nx = nodes[i]?.x;
|
||||
const ny = nodes[i]?.y;
|
||||
xs[i] = typeof nx === "number" ? nx : 0;
|
||||
ys[i] = typeof ny === "number" ? ny : 0;
|
||||
}
|
||||
const vertexIds = new Uint32Array(count);
|
||||
for (let i = 0; i < count; i++) {
|
||||
const parts = nodeLines[i].split(",");
|
||||
vertexIds[i] = parseInt(parts[0], 10);
|
||||
xs[i] = parseFloat(parts[1]);
|
||||
ys[i] = parseFloat(parts[2]);
|
||||
const id = nodes[i]?.id;
|
||||
vertexIds[i] = typeof id === "number" ? id >>> 0 : i;
|
||||
}
|
||||
|
||||
setStatus("Parsing edges…");
|
||||
const pLines = primaryEdgesText.split("\n").slice(1).filter(l => l.trim().length > 0);
|
||||
const sLines = secondaryEdgesText.split("\n").slice(1).filter(l => l.trim().length > 0);
|
||||
|
||||
const totalEdges = pLines.length + sLines.length;
|
||||
const edgeData = new Uint32Array(totalEdges * 2);
|
||||
|
||||
let idx = 0;
|
||||
// Parse primary
|
||||
for (let i = 0; i < pLines.length; i++) {
|
||||
const parts = pLines[i].split(",");
|
||||
edgeData[idx++] = parseInt(parts[0], 10);
|
||||
edgeData[idx++] = parseInt(parts[1], 10);
|
||||
}
|
||||
// Parse secondary
|
||||
for (let i = 0; i < sLines.length; i++) {
|
||||
const parts = sLines[i].split(",");
|
||||
edgeData[idx++] = parseInt(parts[0], 10);
|
||||
edgeData[idx++] = parseInt(parts[1], 10);
|
||||
// Build edges as vertex-id pairs.
|
||||
const edgeData = new Uint32Array(edges.length * 2);
|
||||
for (let i = 0; i < edges.length; i++) {
|
||||
const s = edges[i]?.source;
|
||||
const t = edges[i]?.target;
|
||||
edgeData[i * 2] = typeof s === "number" ? s >>> 0 : 0;
|
||||
edgeData[i * 2 + 1] = typeof t === "number" ? t >>> 0 : 0;
|
||||
}
|
||||
|
||||
// Parse URI map if available
|
||||
if (uriMapText) {
|
||||
const uriLines = uriMapText.split("\n").slice(1).filter(l => l.trim().length > 0);
|
||||
for (const line of uriLines) {
|
||||
const parts = line.split(",");
|
||||
if (parts.length >= 4) {
|
||||
const id = parseInt(parts[0], 10);
|
||||
const uri = parts[1];
|
||||
const label = parts[2];
|
||||
const isPrimary = parts[3].trim() === "1";
|
||||
uriMapRef.current.set(id, { uri, label, isPrimary });
|
||||
}
|
||||
}
|
||||
// Use /api/graph meta; don't do a second expensive backend call.
|
||||
if (meta && typeof meta.nodes === "number" && typeof meta.edges === "number") {
|
||||
setBackendStats({
|
||||
nodes: meta.nodes,
|
||||
edges: meta.edges,
|
||||
backend: typeof meta.backend === "string" ? meta.backend : undefined,
|
||||
});
|
||||
} else {
|
||||
setBackendStats({ nodes: nodes.length, edges: edges.length });
|
||||
}
|
||||
|
||||
if (cancelled) return;
|
||||
|
||||
setStatus("Building spatial index…");
|
||||
await new Promise(r => setTimeout(r, 0));
|
||||
await new Promise((r) => setTimeout(r, 0));
|
||||
|
||||
const buildMs = renderer.init(xs, ys, vertexIds, edgeData);
|
||||
setNodeCount(renderer.getNodeCount());
|
||||
setStatus("");
|
||||
console.log(`Init complete: ${count.toLocaleString()} nodes, ${totalEdges.toLocaleString()} edges in ${buildMs.toFixed(0)}ms`);
|
||||
console.log(`Init complete: ${count.toLocaleString()} nodes, ${edges.length.toLocaleString()} edges in ${buildMs.toFixed(0)}ms`);
|
||||
} catch (e) {
|
||||
if (!cancelled) {
|
||||
setError(e instanceof Error ? e.message : String(e));
|
||||
@@ -200,9 +213,18 @@ export default function App() {
|
||||
frameCount++;
|
||||
|
||||
// Find hovered node using quadtree
|
||||
const nodeResult = renderer.findNodeIndexAt(mousePos.current.x, mousePos.current.y);
|
||||
if (nodeResult) {
|
||||
setHoveredNode({ x: nodeResult.x, y: nodeResult.y, screenX: mousePos.current.x, screenY: mousePos.current.y, index: nodeResult.index });
|
||||
const hit = renderer.findNodeIndexAt(mousePos.current.x, mousePos.current.y);
|
||||
if (hit) {
|
||||
const origIdx = renderer.sortedIndexToOriginalIndex(hit.index);
|
||||
const meta = origIdx === null ? null : nodesRef.current[origIdx];
|
||||
setHoveredNode({
|
||||
x: hit.x,
|
||||
y: hit.y,
|
||||
screenX: mousePos.current.x,
|
||||
screenY: mousePos.current.y,
|
||||
label: meta && typeof meta.label === "string" ? meta.label : undefined,
|
||||
iri: meta && typeof meta.iri === "string" ? meta.iri : undefined,
|
||||
});
|
||||
} else {
|
||||
setHoveredNode(null);
|
||||
}
|
||||
@@ -238,9 +260,72 @@ export default function App() {
|
||||
|
||||
// Sync selection state to renderer
|
||||
useEffect(() => {
|
||||
if (rendererRef.current) {
|
||||
rendererRef.current.updateSelection(selectedNodes);
|
||||
const renderer = rendererRef.current;
|
||||
if (!renderer) return;

// Optimistically reflect selection immediately; neighbors will be filled in by backend.
renderer.updateSelection(selectedNodes, new Set());

// Invalidate any in-flight neighbor request for the previous selection.
const reqId = ++neighborsReqIdRef.current;

// Convert selected sorted indices to backend node IDs (graph-export dense IDs).
const selectedIds: number[] = [];
for (const sortedIdx of selectedNodes) {
  const origIdx = renderer.sortedIndexToOriginalIndex(sortedIdx);
  if (origIdx === null) continue;
  const nodeId = nodesRef.current?.[origIdx]?.id;
  if (typeof nodeId === "number") selectedIds.push(nodeId);
}

if (selectedIds.length === 0) {
  return;
}

// Always send the full current selection list; backend returns the merged neighbor set.
const ctrl = new AbortController();

(async () => {
  try {
    const meta = graphMetaRef.current;
    const body = {
      selected_ids: selectedIds,
      node_limit: typeof meta?.node_limit === "number" ? meta.node_limit : undefined,
      edge_limit: typeof meta?.edge_limit === "number" ? meta.edge_limit : undefined,
    };

    const res = await fetch("/api/neighbors", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
      signal: ctrl.signal,
    });
    if (!res.ok) throw new Error(`POST /api/neighbors failed: ${res.status}`);
    const data = await res.json();
    if (ctrl.signal.aborted) return;
    if (reqId !== neighborsReqIdRef.current) return;

    const neighborIds: unknown = data?.neighbor_ids;
    const neighborSorted = new Set<number>();
    if (Array.isArray(neighborIds)) {
      for (const id of neighborIds) {
        if (typeof id !== "number") continue;
        const sorted = renderer.vertexIdToSortedIndexOrNull(id);
        if (sorted === null) continue;
        if (!selectedNodes.has(sorted)) neighborSorted.add(sorted);
      }
    }

    renderer.updateSelection(selectedNodes, neighborSorted);
  } catch (e) {
    if (ctrl.signal.aborted) return;
    console.warn(e);
    // Keep the UI usable even if neighbors fail to load.
    renderer.updateSelection(selectedNodes, new Set());
  }
})();

return () => ctrl.abort();
}, [selectedNodes]);
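The effect above guards against out-of-order responses with a monotonically increasing request id (`neighborsReqIdRef`) combined with an `AbortController`. The core of that pattern can be sketched in isolation — the helper and its names are ours, not part of the app:

```typescript
// Hypothetical helper isolating the request-id pattern: each caller takes a
// fresh id, and only the most recent id is allowed to commit its result.
function makeLatestGuard() {
  let latest = 0;
  return {
    // Start a new request; any earlier id becomes stale.
    begin(): number {
      return ++latest;
    },
    // Check whether a given id is still the most recent one.
    isCurrent(id: number): boolean {
      return id === latest;
    },
  };
}

const guard = makeLatestGuard();
const first = guard.begin();
const second = guard.begin(); // supersedes `first`
console.log(guard.isCurrent(first));  // false
console.log(guard.isCurrent(second)); // true
```

The `AbortController` cancels the network transfer, while the id check additionally rejects responses that completed before the abort was issued — both are needed to avoid a stale response overwriting newer state.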
return (
@@ -312,6 +397,11 @@ export default function App() {
        <div>Zoom: {stats.zoom < 0.01 ? stats.zoom.toExponential(2) : stats.zoom.toFixed(2)} px/unit</div>
        <div>Pt Size: {stats.ptSize.toFixed(1)}px</div>
        <div style={{ color: "#f80" }}>Selected: {selectedNodes.size}</div>
        {backendStats && (
          <div style={{ color: "#8f8" }}>
            Backend{backendStats.backend ? ` (${backendStats.backend})` : ""}: {backendStats.nodes.toLocaleString()} nodes, {backendStats.edges.toLocaleString()} edges
          </div>
        )}
      </div>
      <div
        style={{
@@ -349,22 +439,12 @@ export default function App() {
          boxShadow: "0 2px 8px rgba(0,0,0,0.5)",
        }}
      >
        {(() => {
          if (hoveredNode.index !== undefined && rendererRef.current) {
            const vertexId = rendererRef.current.getVertexId(hoveredNode.index);
            const info = vertexId !== undefined ? uriMapRef.current.get(vertexId) : undefined;
            if (info) {
              return (
                <>
                  <div style={{ fontWeight: "bold", marginBottom: 2 }}>{info.label}</div>
                  <div style={{ fontSize: "10px", color: "#8cf", wordBreak: "break-all", maxWidth: 400 }}>{info.uri}</div>
                  {info.isPrimary && <div style={{ color: "#ff0", fontSize: "10px", marginTop: 2 }}>⭐ Primary (rdf:type)</div>}
                </>
              );
            }
          }
          return <>({hoveredNode.x.toFixed(2)}, {hoveredNode.y.toFixed(2)})</>;
        })()}
        <div style={{ color: "#0ff" }}>
          {hoveredNode.label || hoveredNode.iri || "(unknown)"}
        </div>
        <div style={{ color: "#688" }}>
          ({hoveredNode.x.toFixed(2)}, {hoveredNode.y.toFixed(2)})
        </div>
      </div>
    )}
  </>
@@ -80,10 +80,11 @@ export class Renderer {
  // Data
  private leaves: Leaf[] = [];
  private sorted: Float32Array = new Float32Array(0);
  // order[sortedIdx] = originalIdx (original ordering matches input arrays)
  private sortedToOriginal: Uint32Array = new Uint32Array(0);
  private vertexIdToSortedIndex: Map<number, number> = new Map();
  private nodeCount = 0;
  private edgeCount = 0;
  private neighborMap: Map<number, number[]> = new Map();
  private sortedToVertexId: Uint32Array = new Uint32Array(0);
  private leafEdgeStarts: Uint32Array = new Uint32Array(0);
  private leafEdgeCounts: Uint32Array = new Uint32Array(0);
  private maxPtSize = 256;
@@ -203,6 +204,7 @@ export class Renderer {
    const { sorted, leaves, order } = buildSpatialIndex(xs, ys);
    this.leaves = leaves;
    this.sorted = sorted;
    this.sortedToOriginal = order;

    // Pre-allocate arrays for render loop (zero-allocation rendering)
    this.visibleLeafIndices = new Uint32Array(leaves.length);
@@ -214,12 +216,6 @@ export class Renderer {
    gl.bufferData(gl.ARRAY_BUFFER, sorted, gl.STATIC_DRAW);
    gl.bindVertexArray(null);

    // Build sorted index → vertex ID mapping for hover lookups
    this.sortedToVertexId = new Uint32Array(count);
    for (let i = 0; i < count; i++) {
      this.sortedToVertexId[i] = vertexIds[order[i]];
    }

    // Build vertex ID → original input index mapping
    const vertexIdToOriginal = new Map<number, number>();
    for (let i = 0; i < count; i++) {
@@ -233,6 +229,13 @@ export class Renderer {
      originalToSorted[order[i]] = i;
    }

    // Build vertex ID → sorted index mapping (used by backend-driven neighbor highlighting)
    const vertexIdToSortedIndex = new Map<number, number>();
    for (let i = 0; i < count; i++) {
      vertexIdToSortedIndex.set(vertexIds[i], originalToSorted[i]);
    }
    this.vertexIdToSortedIndex = vertexIdToSortedIndex;

    // Remap edges from vertex IDs to sorted indices
    const lineIndices = new Uint32Array(edgeCount * 2);
    let validEdges = 0;
@@ -248,18 +251,6 @@ export class Renderer {
    }
    this.edgeCount = validEdges;

    // Build per-node neighbor list from edges for selection queries
    const neighborMap = new Map<number, number[]>();
    for (let i = 0; i < validEdges; i++) {
      const src = lineIndices[i * 2];
      const dst = lineIndices[i * 2 + 1];
      if (!neighborMap.has(src)) neighborMap.set(src, []);
      neighborMap.get(src)!.push(dst);
      if (!neighborMap.has(dst)) neighborMap.set(dst, []);
      neighborMap.get(dst)!.push(src);
    }
    this.neighborMap = neighborMap;

    // Build per-leaf edge index for efficient visible-only edge drawing
    // Find which leaf each sorted index belongs to
    const nodeToLeaf = new Uint32Array(count);
@@ -339,12 +330,25 @@ export class Renderer {
  }

  /**
   * Get the original vertex ID for a given sorted index.
   * Useful for looking up URI labels from the URI map.
   * Map a sorted buffer index (what findNodeIndexAt returns) back to the original
   * index in the input arrays used to initialize the renderer.
   */
  getVertexId(sortedIndex: number): number | undefined {
    if (sortedIndex < 0 || sortedIndex >= this.sortedToVertexId.length) return undefined;
    return this.sortedToVertexId[sortedIndex];
  sortedIndexToOriginalIndex(sortedIndex: number): number | null {
    if (
      sortedIndex < 0 ||
      sortedIndex >= this.sortedToOriginal.length
    ) {
      return null;
    }
    return this.sortedToOriginal[sortedIndex];
  }

  /**
   * Convert a backend node ID (node.id from /api/graph) to a sorted index used by the renderer.
   */
  vertexIdToSortedIndexOrNull(vertexId: number): number | null {
    const idx = this.vertexIdToSortedIndex.get(vertexId);
    return typeof idx === "number" ? idx : null;
  }

  /**
@@ -428,10 +432,10 @@ export class Renderer {

  /**
   * Update the selection buffer with the given set of node indices.
   * Also computes neighbors of selected nodes.
   * Call this whenever React's selection state changes.
   * Neighbor indices are provided by the backend (SPARQL query) and uploaded separately.
   * Call this whenever selection or backend neighbor results change.
   */
  updateSelection(selectedIndices: Set<number>): void {
  updateSelection(selectedIndices: Set<number>, neighborIndices: Set<number> = new Set()): void {
    const gl = this.gl;

    // Upload selected indices
@@ -441,23 +445,11 @@ export class Renderer {
    gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, indices, gl.DYNAMIC_DRAW);
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, null);

    // Compute neighbors of selected nodes (excluding already selected)
    const neighborSet = new Set<number>();
    for (const nodeIdx of selectedIndices) {
      const nodeNeighbors = this.neighborMap.get(nodeIdx);
      if (!nodeNeighbors) continue;
      for (const n of nodeNeighbors) {
        if (!selectedIndices.has(n)) {
          neighborSet.add(n);
        }
      }
    }

    // Upload neighbor indices
    const neighborIndices = new Uint32Array(neighborSet);
    this.neighborCount = neighborIndices.length;
    const neighborIndexArray = new Uint32Array(neighborIndices);
    this.neighborCount = neighborIndexArray.length;
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, this.neighborIbo);
    gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, neighborIndices, gl.DYNAMIC_DRAW);
    gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, neighborIndexArray, gl.DYNAMIC_DRAW);
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, null);
  }
@@ -16,4 +16,10 @@ export default defineConfig({
      "@": path.resolve(__dirname, "src"),
    },
  },
  server: {
    proxy: {
      // Backend is reachable as http://backend:8000 inside docker-compose; localhost outside.
      "/api": process.env.VITE_BACKEND_URL || "http://localhost:8000",
    },
  },
});
@@ -1 +0,0 @@
vertex,x,y
@@ -1 +0,0 @@
source,target
@@ -1 +0,0 @@
source,target
@@ -1 +0,0 @@
id,uri,label,isPrimary
@@ -1,376 +0,0 @@
#!/usr/bin/env npx tsx
/**
 * Graph Layout
 *
 * Computes a 2D layout for a general graph (not necessarily a tree).
 *
 * - Primary nodes (from primary_edges.csv) are placed first in a radial layout
 * - Remaining nodes are placed near their connected primary neighbors
 * - Barnes-Hut force simulation relaxes the layout
 *
 * Reads: primary_edges.csv, secondary_edges.csv
 * Writes: node_positions.csv
 *
 * Usage: npx tsx scripts/compute_layout.ts
 */

import { writeFileSync, readFileSync, existsSync } from "fs";
import { join, dirname } from "path";
import { fileURLToPath } from "url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const PUBLIC_DIR = join(__dirname, "..", "public");

// ══════════════════════════════════════════════════════════
// Configuration
// ══════════════════════════════════════════════════════════

const ITERATIONS = 200;        // Force iterations
const REPULSION_K = 200;       // Repulsion strength
const EDGE_LENGTH = 80;        // Desired edge rest length
const ATTRACTION_K = 0.005;    // Spring stiffness for edges
const INITIAL_MAX_DISP = 20;   // Starting max displacement
const COOLING = 0.995;         // Cooling per iteration
const MIN_DIST = 0.5;
const PRINT_EVERY = 20;        // Print progress every N iterations
const BH_THETA = 0.8;          // Barnes-Hut opening angle

// Primary node radial placement
const PRIMARY_RADIUS = 300;    // Radius for primary node ring

// ══════════════════════════════════════════════════════════
// Read edge data from CSVs
// ══════════════════════════════════════════════════════════

const primaryPath = join(PUBLIC_DIR, "primary_edges.csv");
const secondaryPath = join(PUBLIC_DIR, "secondary_edges.csv");

if (!existsSync(primaryPath) || !existsSync(secondaryPath)) {
  console.error(`Error: Missing input files!`);
  console.error(` Expected: ${primaryPath}`);
  console.error(` Expected: ${secondaryPath}`);
  console.error(` Run 'npx tsx scripts/fetch_from_db.ts' first.`);
  process.exit(1);
}

function parseEdges(path: string): Array<[number, number]> {
  const content = readFileSync(path, "utf-8");
  const lines = content.trim().split("\n");
  const edges: Array<[number, number]> = [];
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i].trim();
    if (!line) continue;
    const [src, tgt] = line.split(",").map(Number);
    if (!isNaN(src) && !isNaN(tgt)) {
      edges.push([src, tgt]);
    }
  }
  return edges;
}

const primaryEdges = parseEdges(primaryPath);
const secondaryEdges = parseEdges(secondaryPath);
const allEdges = [...primaryEdges, ...secondaryEdges];

// ══════════════════════════════════════════════════════════
// Build adjacency
// ══════════════════════════════════════════════════════════

const allNodes = new Set<number>();
const primaryNodes = new Set<number>();
const neighbors = new Map<number, Set<number>>();

function addNeighbor(a: number, b: number) {
  if (!neighbors.has(a)) neighbors.set(a, new Set());
  neighbors.get(a)!.add(b);
  if (!neighbors.has(b)) neighbors.set(b, new Set());
  neighbors.get(b)!.add(a);
}

for (const [src, dst] of primaryEdges) {
  allNodes.add(src);
  allNodes.add(dst);
  primaryNodes.add(src);
  primaryNodes.add(dst);
  addNeighbor(src, dst);
}

for (const [src, dst] of secondaryEdges) {
  allNodes.add(src);
  allNodes.add(dst);
  addNeighbor(src, dst);
}

const N = allNodes.size;
const nodeIds = Array.from(allNodes).sort((a, b) => a - b);
const idToIdx = new Map<number, number>();
nodeIds.forEach((id, idx) => idToIdx.set(id, idx));

console.log(
  `Read graph: ${N} nodes, ${allEdges.length} edges (P=${primaryEdges.length}, S=${secondaryEdges.length})`
);
console.log(`Primary nodes: ${primaryNodes.size}`);

// ══════════════════════════════════════════════════════════
// Initial placement
// ══════════════════════════════════════════════════════════

const x = new Float64Array(N);
const y = new Float64Array(N);

// Step 1: Place primary nodes in a radial layout
const primaryArr = Array.from(primaryNodes).sort((a, b) => a - b);
const angleStep = (2 * Math.PI) / Math.max(1, primaryArr.length);
const radius = PRIMARY_RADIUS * Math.max(1, Math.sqrt(primaryArr.length / 10));

for (let i = 0; i < primaryArr.length; i++) {
  const idx = idToIdx.get(primaryArr[i])!;
  const angle = i * angleStep;
  x[idx] = radius * Math.cos(angle);
  y[idx] = radius * Math.sin(angle);
}

console.log(`Placed ${primaryArr.length} primary nodes in radial layout (r=${radius.toFixed(0)})`);

// Step 2: Place remaining nodes near their connected neighbors
// BFS from already-placed nodes
const placed = new Set<number>(primaryNodes);
const queue: number[] = [...primaryArr];
let head = 0;

while (head < queue.length) {
  const nodeId = queue[head++];
  const nodeNeighbors = neighbors.get(nodeId);
  if (!nodeNeighbors) continue;

  for (const nbId of nodeNeighbors) {
    if (placed.has(nbId)) continue;
    placed.add(nbId);

    // Place near this neighbor with some jitter
    const parentIdx = idToIdx.get(nodeId)!;
    const childIdx = idToIdx.get(nbId)!;
    const jitterAngle = Math.random() * 2 * Math.PI;
    const jitterDist = EDGE_LENGTH * (0.5 + Math.random() * 0.5);
    x[childIdx] = x[parentIdx] + jitterDist * Math.cos(jitterAngle);
    y[childIdx] = y[parentIdx] + jitterDist * Math.sin(jitterAngle);

    queue.push(nbId);
  }
}

// Handle disconnected nodes (place randomly)
for (const id of nodeIds) {
  if (!placed.has(id)) {
    const idx = idToIdx.get(id)!;
    const angle = Math.random() * 2 * Math.PI;
    const r = radius * (1 + Math.random());
    x[idx] = r * Math.cos(angle);
    y[idx] = r * Math.sin(angle);
    placed.add(id);
  }
}

console.log(`Initial placement complete: ${placed.size} nodes`);

// ══════════════════════════════════════════════════════════
// Force-directed layout with Barnes-Hut
// ══════════════════════════════════════════════════════════

console.log(`Running force simulation (${ITERATIONS} iterations, ${N} nodes, ${allEdges.length} edges)...`);

const t0 = performance.now();
let maxDisp = INITIAL_MAX_DISP;

for (let iter = 0; iter < ITERATIONS; iter++) {
  const bhRoot = buildBHTree(x, y, N);
  const fx = new Float64Array(N);
  const fy = new Float64Array(N);

  // 1. Repulsion via Barnes-Hut
  for (let i = 0; i < N; i++) {
    calcBHForce(bhRoot, x[i], y[i], fx, fy, i, BH_THETA, x, y);
  }

  // 2. Edge attraction (spring force)
  for (const [aId, bId] of allEdges) {
    const a = idToIdx.get(aId)!;
    const b = idToIdx.get(bId)!;
    const dx = x[b] - x[a];
    const dy = y[b] - y[a];
    const d = Math.sqrt(dx * dx + dy * dy) || MIN_DIST;
    const displacement = d - EDGE_LENGTH;
    const f = ATTRACTION_K * displacement;
    const ux = dx / d;
    const uy = dy / d;
    fx[a] += ux * f;
    fy[a] += uy * f;
    fx[b] -= ux * f;
    fy[b] -= uy * f;
  }

  // 3. Apply forces with displacement capping
  let totalForce = 0;
  for (let i = 0; i < N; i++) {
    const mag = Math.sqrt(fx[i] * fx[i] + fy[i] * fy[i]);
    totalForce += mag;
    if (mag > 0) {
      const cap = Math.min(maxDisp, mag) / mag;
      x[i] += fx[i] * cap;
      y[i] += fy[i] * cap;
    }
  }

  maxDisp *= COOLING;

  if ((iter + 1) % PRINT_EVERY === 0 || iter === 0) {
    console.log(
      ` iter ${iter + 1}/${ITERATIONS} max_disp=${maxDisp.toFixed(2)} avg_force=${(totalForce / N).toFixed(2)}`
    );
  }
}

const elapsed = performance.now() - t0;
console.log(`Force simulation done in ${(elapsed / 1000).toFixed(1)}s`);

// ══════════════════════════════════════════════════════════
// Write output
// ══════════════════════════════════════════════════════════

const outLines: string[] = ["vertex,x,y"];
for (let i = 0; i < N; i++) {
  outLines.push(`${nodeIds[i]},${x[i]},${y[i]}`);
}

const outPath = join(PUBLIC_DIR, "node_positions.csv");
writeFileSync(outPath, outLines.join("\n") + "\n");
console.log(`Wrote ${N} positions to ${outPath}`);
console.log(`Layout complete.`);

// ══════════════════════════════════════════════════════════
// Barnes-Hut Helpers
// ══════════════════════════════════════════════════════════

interface BHNode {
  mass: number;
  cx: number;
  cy: number;
  minX: number;
  maxX: number;
  minY: number;
  maxY: number;
  children?: BHNode[];
  pointIdx?: number;
}

function buildBHTree(x: Float64Array, y: Float64Array, n: number): BHNode {
  let minX = Infinity, maxX = -Infinity, minY = Infinity, maxY = -Infinity;
  for (let i = 0; i < n; i++) {
    if (x[i] < minX) minX = x[i];
    if (x[i] > maxX) maxX = x[i];
    if (y[i] < minY) minY = y[i];
    if (y[i] > maxY) maxY = y[i];
  }
  const cx = (minX + maxX) / 2;
  const cy = (minY + maxY) / 2;
  const halfDim = Math.max(maxX - minX, maxY - minY) / 2 + 0.01;

  const root: BHNode = {
    mass: 0, cx: 0, cy: 0,
    minX: cx - halfDim, maxX: cx + halfDim,
    minY: cy - halfDim, maxY: cy + halfDim,
  };

  for (let i = 0; i < n; i++) {
    insertBH(root, i, x[i], y[i], x, y);
  }
  calcBHMass(root, x, y);
  return root;
}

function insertBH(node: BHNode, idx: number, px: number, py: number, x: Float64Array, y: Float64Array) {
  if (!node.children && node.pointIdx === undefined) {
    node.pointIdx = idx;
    return;
  }

  if (!node.children && node.pointIdx !== undefined) {
    const oldIdx = node.pointIdx;
    node.pointIdx = undefined;
    subdivideBH(node);
    insertBH(node, oldIdx, x[oldIdx], y[oldIdx], x, y);
  }

  if (node.children) {
    const mx = (node.minX + node.maxX) / 2;
    const my = (node.minY + node.maxY) / 2;
    let q = 0;
    if (px > mx) q += 1;
    if (py > my) q += 2;
    insertBH(node.children[q], idx, px, py, x, y);
  }
}

function subdivideBH(node: BHNode) {
  const mx = (node.minX + node.maxX) / 2;
  const my = (node.minY + node.maxY) / 2;
  node.children = [
    { mass: 0, cx: 0, cy: 0, minX: node.minX, maxX: mx, minY: node.minY, maxY: my },
    { mass: 0, cx: 0, cy: 0, minX: mx, maxX: node.maxX, minY: node.minY, maxY: my },
    { mass: 0, cx: 0, cy: 0, minX: node.minX, maxX: mx, minY: my, maxY: node.maxY },
    { mass: 0, cx: 0, cy: 0, minX: mx, maxX: node.maxX, minY: my, maxY: node.maxY },
  ];
}

function calcBHMass(node: BHNode, x: Float64Array, y: Float64Array) {
  if (node.pointIdx !== undefined) {
    node.mass = 1;
    node.cx = x[node.pointIdx];
    node.cy = y[node.pointIdx];
    return;
  }
  if (node.children) {
    let m = 0, sx = 0, sy = 0;
    for (const c of node.children) {
      calcBHMass(c, x, y);
      m += c.mass;
      sx += c.cx * c.mass;
      sy += c.cy * c.mass;
    }
    node.mass = m;
    if (m > 0) {
      node.cx = sx / m;
      node.cy = sy / m;
    } else {
      node.cx = (node.minX + node.maxX) / 2;
      node.cy = (node.minY + node.maxY) / 2;
    }
  }
}

function calcBHForce(
  node: BHNode,
  px: number, py: number,
  fx: Float64Array, fy: Float64Array,
  idx: number, theta: number,
  x: Float64Array, y: Float64Array,
) {
  const dx = px - node.cx;
  const dy = py - node.cy;
  const d2 = dx * dx + dy * dy;
  const dist = Math.sqrt(d2);
  const width = node.maxX - node.minX;

  if (width / dist < theta || !node.children) {
    if (node.mass > 0 && node.pointIdx !== idx) {
      const dEff = Math.max(dist, MIN_DIST);
      const f = (REPULSION_K * node.mass) / (dEff * dEff);
      fx[idx] += (dx / dEff) * f;
      fy[idx] += (dy / dEff) * f;
    }
  } else {
    for (const c of node.children) {
      calcBHForce(c, px, py, fx, fy, idx, theta, x, y);
    }
  }
}
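The deleted layout script above approximates pairwise repulsion with a Barnes-Hut quadtree. For intuition (and as a ground truth when tuning `BH_THETA`, which trades accuracy for speed), here is a minimal O(n²) direct-sum version of the same inverse-square force law — the constants mirror `REPULSION_K = 200` and `MIN_DIST = 0.5` from the script, but the function itself is our sketch, not part of the repo:

```typescript
// Direct O(n²) repulsion: every pair of points pushes apart with
// f = repulsionK / d², clamped below at minDist to avoid blow-ups.
function directRepulsion(
  x: Float64Array,
  y: Float64Array,
  repulsionK = 200,
  minDist = 0.5,
): { fx: Float64Array; fy: Float64Array } {
  const n = x.length;
  const fx = new Float64Array(n);
  const fy = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      if (i === j) continue;
      const dx = x[i] - x[j];
      const dy = y[i] - y[j];
      const d = Math.max(Math.sqrt(dx * dx + dy * dy), minDist);
      const f = repulsionK / (d * d);
      fx[i] += (dx / d) * f; // unit vector away from j, scaled by f
      fy[i] += (dy / d) * f;
    }
  }
  return { fx, fy };
}

// Two points 10 units apart repel along x with |f| = 200 / 10² = 2.
const { fx } = directRepulsion(new Float64Array([0, 10]), new Float64Array([0, 0]));
// fx[0] = -2, fx[1] = +2
```

Barnes-Hut replaces the inner loop with a quadtree descent: distant clusters are treated as a single point mass at their center of mass whenever `width / dist < theta`, cutting the cost from O(n²) to roughly O(n log n).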
@@ -1,390 +0,0 @@
|
||||
#!/usr/bin/env npx tsx
|
||||
/**
|
||||
* Fetch RDF Data from AnzoGraph DB
|
||||
*
|
||||
* 1. Query the first 1000 distinct subject URIs
|
||||
* 2. Fetch all triples where those URIs appear as subject or object
|
||||
* 3. Identify primary nodes (objects of rdf:type)
|
||||
* 4. Write primary_edges.csv, secondary_edges.csv, and uri_map.csv
|
||||
*
|
||||
* Usage: npx tsx scripts/fetch_from_db.ts [--host http://localhost:8080]
|
||||
*/
|
||||
|
||||
import { writeFileSync } from "fs";
|
||||
import { join, dirname } from "path";
|
||||
import { fileURLToPath } from "url";
|
||||
|
||||
const __dirname = dirname(fileURLToPath(import.meta.url));
|
||||
const PUBLIC_DIR = join(__dirname, "..", "public");
|
||||
|
||||
// ══════════════════════════════════════════════════════════
|
||||
// Configuration
|
||||
// ══════════════════════════════════════════════════════════
|
||||
|
||||
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
|
||||
const BATCH_SIZE = 100; // URIs per VALUES batch query
|
||||
const MAX_RETRIES = 30; // Wait up to ~120s for AnzoGraph to start
|
||||
const RETRY_DELAY_MS = 4000;
|
||||
|
||||
// Path to TTL file inside the AnzoGraph container (mapped via docker-compose volume)
|
||||
const DATA_FILE = process.env.SPARQL_DATA_FILE || "file:///opt/shared-files/vkg-materialized.ttl";
|
||||
|
||||
// Parse --host flag, default to http://localhost:8080
|
||||
function getEndpoint(): string {
|
||||
const hostIdx = process.argv.indexOf("--host");
|
||||
if (hostIdx !== -1 && process.argv[hostIdx + 1]) {
|
||||
return process.argv[hostIdx + 1];
|
||||
}
|
||||
// Inside Docker, use service name; otherwise localhost
|
||||
return process.env.SPARQL_HOST || "http://localhost:8080";
|
||||
}
|
||||
|
||||
const SPARQL_ENDPOINT = `${getEndpoint()}/sparql`;
|
||||
|
||||
// Auth credentials (AnzoGraph defaults)
|
||||
const SPARQL_USER = process.env.SPARQL_USER || "admin";
|
||||
const SPARQL_PASS = process.env.SPARQL_PASS || "Passw0rd1";
|
||||
const AUTH_HEADER = "Basic " + Buffer.from(`${SPARQL_USER}:${SPARQL_PASS}`).toString("base64");
|
||||
|
||||
// ══════════════════════════════════════════════════════════
|
||||
// SPARQL helpers
|
||||
// ══════════════════════════════════════════════════════════
|
||||
|
||||
interface SparqlBinding {
|
||||
[key: string]: { type: string; value: string };
|
||||
}
|
||||
|
||||
function sleep(ms: number): Promise<void> {
|
||||
return new Promise((resolve) => setTimeout(resolve, ms));
|
||||
}
|
||||
|
||||
async function sparqlQuery(query: string, retries = 5): Promise<SparqlBinding[]> {
|
||||
for (let attempt = 1; attempt <= retries; attempt++) {
|
||||
const controller = new AbortController();
|
||||
const timeout = setTimeout(() => controller.abort(), 300_000); // 5 min timeout
|
||||
|
||||
try {
|
||||
const t0 = performance.now();
|
||||
const response = await fetch(SPARQL_ENDPOINT, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
"Content-Type": "application/x-www-form-urlencoded",
|
||||
"Accept": "application/sparql-results+json",
|
||||
"Authorization": AUTH_HEADER,
|
||||
},
|
||||
body: "query=" + encodeURIComponent(query),
|
||||
signal: controller.signal,
|
||||
});
|
||||
const t1 = performance.now();
|
||||
console.log(` [sparql] response status=${response.status} in ${((t1 - t0) / 1000).toFixed(1)}s`);
|
||||
|
||||
if (!response.ok) {
|
||||
const text = await response.text();
|
||||
throw new Error(`SPARQL query failed (${response.status}): ${text}`);
|
||||
}
|
||||
|
||||
const text = await response.text();
|
||||
const t2 = performance.now();
|
||||
console.log(` [sparql] body read (${(text.length / 1024).toFixed(0)} KB) in ${((t2 - t1) / 1000).toFixed(1)}s`);
|
||||
|
||||
const json = JSON.parse(text);
|
||||
return json.results.bindings;
|
||||
} catch (err: any) {
|
||||
clearTimeout(timeout);
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
const isTransient = msg.includes("fetch failed") || msg.includes("Timeout") || msg.includes("ABORT") || msg.includes("abort");
|
||||
if (isTransient && attempt < retries) {
|
||||
console.log(` [sparql] transient error (attempt ${attempt}/${retries}): ${msg.substring(0, 100)}`);
|
||||
console.log(` [sparql] retrying in 10s (AnzoGraph may still be indexing after LOAD)...`);
|
||||
await sleep(10_000);
|
||||
continue;
|
||||
}
|
||||
throw err;
|
||||
} finally {
|
||||
clearTimeout(timeout);
|
||||
}
|
||||
}
|
||||
throw new Error("sparqlQuery: should not reach here");
|
||||
}
|
||||
|
||||
async function waitForAnzoGraph(): Promise<void> {
|
||||
console.log(`Waiting for AnzoGraph at ${SPARQL_ENDPOINT}...`);
|
||||
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
|
||||
try {
|
||||
const response = await fetch(SPARQL_ENDPOINT, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
"Content-Type": "application/x-www-form-urlencoded",
|
||||
"Accept": "application/sparql-results+json",
|
||||
"Authorization": AUTH_HEADER,
|
||||
},
|
||||
body: "query=" + encodeURIComponent("ASK WHERE { ?s ?p ?o }"),
|
||||
});
|
||||
const text = await response.text();
|
||||
// Verify it's actual JSON (not a plain-text error from a half-started engine)
|
||||
JSON.parse(text);
|
||||
console.log(` AnzoGraph is ready (attempt ${attempt})`);
|
||||
return;
|
||||
} catch (err: any) {
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
console.log(` Attempt ${attempt}/${MAX_RETRIES}: ${msg.substring(0, 100)}`);
|
||||
if (attempt < MAX_RETRIES) {
|
||||
await sleep(RETRY_DELAY_MS);
|
||||
}
|
||||
}
|
||||
}
|
||||
throw new Error(`AnzoGraph not available after ${MAX_RETRIES} attempts`);
|
||||
}
|
||||
|
||||
async function sparqlUpdate(update: string): Promise<string> {
|
||||
const response = await fetch(SPARQL_ENDPOINT, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
"Content-Type": "application/sparql-update",
|
||||
"Accept": "application/json",
|
||||
"Authorization": AUTH_HEADER,
|
||||
},
|
||||
body: update,
|
||||
});
|
||||
const text = await response.text();
|
||||
if (!response.ok) {
|
||||
throw new Error(`SPARQL update failed (${response.status}): ${text}`);
|
||||
}
|
||||
return text;
|
||||
}
|
||||
|
||||
async function loadData(): Promise<void> {
|
||||
console.log(`Loading data from ${DATA_FILE}...`);
|
||||
const t0 = performance.now();
|
||||
const result = await sparqlUpdate(`LOAD <${DATA_FILE}>`);
|
||||
const elapsed = ((performance.now() - t0) / 1000).toFixed(1);
|
||||
console.log(` Load complete in ${elapsed}s: ${result.substring(0, 200)}`);
|
||||
}
|
||||
|
||||
// ══════════════════════════════════════════════════════════
|
||||
// Step 1: Fetch seed URIs
|
||||
// ══════════════════════════════════════════════════════════
|
||||
|
||||
async function fetchSeedURIs(): Promise<string[]> {
|
||||
console.log("Querying first 1000 distinct subject URIs...");
|
||||
const t0 = performance.now();
|
||||
const query = `
|
||||
SELECT DISTINCT ?s
|
||||
WHERE { ?s ?p ?o }
|
||||
LIMIT 1000
|
||||
`;
|
||||
const bindings = await sparqlQuery(query);
|
||||
const elapsed = ((performance.now() - t0) / 1000).toFixed(1);
|
||||
const uris = bindings.map((b) => b.s.value);
|
||||
console.log(` Got ${uris.length} seed URIs in ${elapsed}s`);
|
||||
return uris;
|
||||
}
|
||||
|
||||
// ══════════════════════════════════════════════════════════
|
||||
// Step 2: Fetch all triples involving seed URIs
|
||||
// ══════════════════════════════════════════════════════════
|
||||
|
||||
interface Triple {
|
||||
s: string;
|
||||
p: string;
|
||||
o: string;
|
||||
oType: string; // "uri" or "literal"
|
||||
}
|
||||
|
||||
async function fetchTriples(seedURIs: string[]): Promise<Triple[]> {
|
||||
console.log(`Fetching triples for ${seedURIs.length} seed URIs (batch size: ${BATCH_SIZE})...`);
|
||||
const allTriples: Triple[] = [];
|
||||
|
||||
for (let i = 0; i < seedURIs.length; i += BATCH_SIZE) {
|
||||
const batch = seedURIs.slice(i, i + BATCH_SIZE);
|
||||
const valuesClause = batch.map((u) => `<${u}>`).join(" ");
|
||||
|
||||
const query = `
|
||||
SELECT ?s ?p ?o
|
||||
WHERE {
|
||||
VALUES ?uri { ${valuesClause} }
|
||||
{
|
||||
?uri ?p ?o .
|
||||
BIND(?uri AS ?s)
|
||||
}
|
||||
UNION
|
||||
{
|
||||
?s ?p ?uri .
|
||||
BIND(?uri AS ?o)
|
||||
}
|
||||
}
|
||||
`;
|
||||
const bindings = await sparqlQuery(query);
|
||||
for (const b of bindings) {
|
||||
allTriples.push({
|
||||
s: b.s.value,
|
||||
p: b.p.value,
|
||||
o: b.o.value,
|
||||
oType: b.o.type,
|
||||
});
|
||||
}
|
||||
|
||||
const progress = Math.min(i + BATCH_SIZE, seedURIs.length);
|
||||
process.stdout.write(`\r Fetched triples: batch ${Math.ceil(progress / BATCH_SIZE)}/${Math.ceil(seedURIs.length / BATCH_SIZE)} (${allTriples.length} triples so far)`);
|
||||
}
|
||||
console.log(`\n Total triples: ${allTriples.length}`);
|
||||
return allTriples;
|
||||
}
|
||||
|
||||
// ══════════════════════════════════════════════════════════
// Step 3: Build graph data
// ══════════════════════════════════════════════════════════

interface GraphData {
  nodeURIs: string[];                     // All unique URIs (subjects & objects that are URIs)
  uriToId: Map<string, number>;
  primaryNodeIds: Set<number>;            // Nodes that are objects of rdf:type
  edges: Array<[number, number]>;         // [source, target] as numeric IDs
  primaryEdges: Array<[number, number]>;
  secondaryEdges: Array<[number, number]>;
}

function buildGraphData(triples: Triple[]): GraphData {
  console.log("Building graph data...");

  // Collect all unique URI nodes (skip literal objects)
  const uriSet = new Set<string>();
  for (const t of triples) {
    uriSet.add(t.s);
    if (t.oType === "uri") {
      uriSet.add(t.o);
    }
  }

  // Assign numeric IDs
  const nodeURIs = Array.from(uriSet).sort();
  const uriToId = new Map<string, number>();
  nodeURIs.forEach((uri, idx) => uriToId.set(uri, idx));

  // Identify primary nodes: objects of rdf:type triples
  const primaryNodeIds = new Set<number>();
  for (const t of triples) {
    if (t.p === RDF_TYPE && t.oType === "uri") {
      const objId = uriToId.get(t.o);
      if (objId !== undefined) {
        primaryNodeIds.add(objId);
      }
    }
  }

  // Build edges (only between URI nodes, skip literal objects)
  const edgeSet = new Set<string>();
  const edges: Array<[number, number]> = [];
  for (const t of triples) {
    if (t.oType !== "uri") continue;
    const srcId = uriToId.get(t.s);
    const dstId = uriToId.get(t.o);
    if (srcId === undefined || dstId === undefined) continue;
    if (srcId === dstId) continue; // Skip self-loops

    const key = `${srcId},${dstId}`;
    if (edgeSet.has(key)) continue; // Deduplicate
    edgeSet.add(key);
    edges.push([srcId, dstId]);
  }

  // Classify edges into primary (touches a primary node) and secondary
  const primaryEdges: Array<[number, number]> = [];
  const secondaryEdges: Array<[number, number]> = [];
  for (const [src, dst] of edges) {
    if (primaryNodeIds.has(src) || primaryNodeIds.has(dst)) {
      primaryEdges.push([src, dst]);
    } else {
      secondaryEdges.push([src, dst]);
    }
  }

  console.log(` Nodes: ${nodeURIs.length}`);
  console.log(` Primary nodes (rdf:type objects): ${primaryNodeIds.size}`);
  console.log(` Edges: ${edges.length} (primary: ${primaryEdges.length}, secondary: ${secondaryEdges.length})`);

  return { nodeURIs, uriToId, primaryNodeIds, edges, primaryEdges, secondaryEdges };
}
// ══════════════════════════════════════════════════════════
// Step 4: Write CSV files
// ══════════════════════════════════════════════════════════

function extractLabel(uri: string): string {
  // Extract local name: after # or last /
  const hashIdx = uri.lastIndexOf("#");
  if (hashIdx !== -1) return uri.substring(hashIdx + 1);
  const slashIdx = uri.lastIndexOf("/");
  if (slashIdx !== -1) return uri.substring(slashIdx + 1);
  return uri;
}

function writeCSVs(data: GraphData): void {
  // Write primary_edges.csv
  const pLines = ["source,target"];
  for (const [src, dst] of data.primaryEdges) {
    pLines.push(`${src},${dst}`);
  }
  const pPath = join(PUBLIC_DIR, "primary_edges.csv");
  writeFileSync(pPath, pLines.join("\n") + "\n");
  console.log(`Wrote ${data.primaryEdges.length} primary edges to ${pPath}`);

  // Write secondary_edges.csv
  const sLines = ["source,target"];
  for (const [src, dst] of data.secondaryEdges) {
    sLines.push(`${src},${dst}`);
  }
  const sPath = join(PUBLIC_DIR, "secondary_edges.csv");
  writeFileSync(sPath, sLines.join("\n") + "\n");
  console.log(`Wrote ${data.secondaryEdges.length} secondary edges to ${sPath}`);

  // Write uri_map.csv (id,uri,label,isPrimary)
  const uLines = ["id,uri,label,isPrimary"];
  for (let i = 0; i < data.nodeURIs.length; i++) {
    const uri = data.nodeURIs[i];
    const label = extractLabel(uri);
    const isPrimary = data.primaryNodeIds.has(i) ? "1" : "0";
    // Escape commas in URIs by quoting
    const safeUri = uri.includes(",") ? `"${uri}"` : uri;
    const safeLabel = label.includes(",") ? `"${label}"` : label;
    uLines.push(`${i},${safeUri},${safeLabel},${isPrimary}`);
  }
  const uPath = join(PUBLIC_DIR, "uri_map.csv");
  writeFileSync(uPath, uLines.join("\n") + "\n");
  console.log(`Wrote ${data.nodeURIs.length} URI mappings to ${uPath}`);
}
// ══════════════════════════════════════════════════════════
// Main
// ══════════════════════════════════════════════════════════

async function main() {
  console.log(`SPARQL endpoint: ${SPARQL_ENDPOINT}`);
  const t0 = performance.now();

  await waitForAnzoGraph();
  await loadData();

  // Smoke test: simplest possible query to verify connectivity
  console.log("Smoke test: SELECT ?s ?p ?o LIMIT 3...");
  const smokeT0 = performance.now();
  const smokeResult = await sparqlQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 3");
  const smokeElapsed = ((performance.now() - smokeT0) / 1000).toFixed(1);
  console.log(` Smoke test OK: ${smokeResult.length} results in ${smokeElapsed}s`);
  if (smokeResult.length > 0) {
    console.log(` First triple: ${smokeResult[0].s.value} ${smokeResult[0].p.value} ${smokeResult[0].o.value}`);
  }

  const seedURIs = await fetchSeedURIs();
  const triples = await fetchTriples(seedURIs);
  const graphData = buildGraphData(triples);
  writeCSVs(graphData);

  const elapsed = ((performance.now() - t0) / 1000).toFixed(1);
  console.log(`\nDone in ${elapsed}s`);
}

main().catch((err) => {
  console.error("Fatal error:", err);
  process.exit(1);
});
@@ -1,132 +0,0 @@
/**
 * Random Tree Generator
 *
 * Generates a random tree with 1–MAX_CHILDREN children per node.
 * Splits edges into primary (depth ≤ PRIMARY_DEPTH) and secondary.
 *
 * Usage: npx tsx scripts/generate_tree.ts
 */

import { writeFileSync } from "fs";
import { join, dirname } from "path";
import { fileURLToPath } from "url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const PUBLIC_DIR = join(__dirname, "..", "public");

// ══════════════════════════════════════════════════════════
// Configuration
// ══════════════════════════════════════════════════════════

const TARGET_NODES = 10000;  // Approximate number of nodes to generate
const MAX_CHILDREN = 4;      // Each node gets 1..MAX_CHILDREN children
const PRIMARY_DEPTH = 4;     // Nodes at depth ≤ this form the primary skeleton

// ══════════════════════════════════════════════════════════
// Tree data types
// ══════════════════════════════════════════════════════════

export interface TreeData {
  root: number;
  nodeCount: number;
  childrenOf: Map<number, number[]>;
  parentOf: Map<number, number>;
  depthOf: Map<number, number>;
  primaryNodes: Set<number>;               // all nodes at depth ≤ PRIMARY_DEPTH
  primaryEdges: Array<[number, number]>;   // [child, parent] edges within primary
  secondaryEdges: Array<[number, number]>; // remaining edges
}

// ══════════════════════════════════════════════════════════
// Generator
// ══════════════════════════════════════════════════════════

export function generateTree(): TreeData {
  const childrenOf = new Map<number, number[]>();
  const parentOf = new Map<number, number>();
  const depthOf = new Map<number, number>();

  const root = 0;
  depthOf.set(root, 0);
  let nextId = 1;
  const queue: number[] = [root];
  let head = 0;

  while (head < queue.length && nextId < TARGET_NODES) {
    const parent = queue[head++];
    const parentDepth = depthOf.get(parent)!;
    const nKids = 1 + Math.floor(Math.random() * MAX_CHILDREN); // 1..MAX_CHILDREN

    const kids: number[] = [];
    for (let c = 0; c < nKids && nextId < TARGET_NODES; c++) {
      const child = nextId++;
      kids.push(child);
      parentOf.set(child, parent);
      depthOf.set(child, parentDepth + 1);
      queue.push(child);
    }
    childrenOf.set(parent, kids);
  }

  // Classify edges and nodes by depth
  const primaryNodes = new Set<number>();
  const primaryEdges: Array<[number, number]> = [];
  const secondaryEdges: Array<[number, number]> = [];

  // Root is always primary
  primaryNodes.add(root);

  for (const [child, parent] of parentOf) {
    const childDepth = depthOf.get(child)!;
    if (childDepth <= PRIMARY_DEPTH) {
      primaryNodes.add(child);
      primaryNodes.add(parent);
      primaryEdges.push([child, parent]);
    } else {
      secondaryEdges.push([child, parent]);
    }
  }

  console.log(
    `Generated tree: ${nextId} nodes, ` +
    `${primaryEdges.length} primary edges (depth ≤ ${PRIMARY_DEPTH}), ` +
    `${secondaryEdges.length} secondary edges`
  );

  return {
    root,
    nodeCount: nextId,
    childrenOf,
    parentOf,
    depthOf,
    primaryNodes,
    primaryEdges,
    secondaryEdges,
  };
}
// ══════════════════════════════════════════════════════════
// Run if executed directly
// ══════════════════════════════════════════════════════════

if (import.meta.url === `file://${process.argv[1]}`) {
  const data = generateTree();

  // Write primary_edges.csv
  const pLines: string[] = ["source,target"];
  for (const [child, parent] of data.primaryEdges) {
    pLines.push(`${child},${parent}`);
  }
  const pPath = join(PUBLIC_DIR, "primary_edges.csv");
  writeFileSync(pPath, pLines.join("\n") + "\n");
  console.log(`Wrote ${data.primaryEdges.length} edges to ${pPath}`);

  // Write secondary_edges.csv
  const sLines: string[] = ["source,target"];
  for (const [child, parent] of data.secondaryEdges) {
    sLines.push(`${child},${parent}`);
  }
  const sPath = join(PUBLIC_DIR, "secondary_edges.csv");
  writeFileSync(sPath, sLines.join("\n") + "\n");
  console.log(`Wrote ${data.secondaryEdges.length} edges to ${sPath}`);
}