Backend App (backend/app)

This folder contains the FastAPI backend for visualizador_instanciados.

The backend executes SPARQL queries against an AnzoGraph SPARQL endpoint over HTTP, and can optionally run a LOAD of a TTL file on startup.

Files

  • main.py
    • FastAPI app setup, startup/shutdown (lifespan), and HTTP endpoints.
  • settings.py
    • Env-driven configuration (pydantic-settings).
  • sparql_engine.py
    • SPARQL execution layer:
      • AnzoGraphEngine: HTTP POST to /sparql with Basic auth + readiness gate.
    • create_sparql_engine(settings) creates the engine.
  • graph_export.py
    • Shared helpers to:
      • build the snapshot SPARQL query used for edge retrieval
      • map SPARQL JSON bindings to {nodes, edges}.
  • models.py
    • Pydantic response/request models:
      • Node, Edge, GraphResponse, StatsResponse, etc.
  • pipelines/graph_snapshot.py
    • Pipeline used by /api/graph to return a {nodes, edges} snapshot via SPARQL.
  • pipelines/layout_dag_radial.py
    • DAG layout helpers used by pipelines/graph_snapshot.py:
      • cycle detection
      • level-synchronous Kahn layering
      • radial (ring-per-layer) positioning.
  • pipelines/snapshot_service.py
    • Snapshot cache layer used by /api/graph and /api/stats so the backend does not run the expensive snapshot SPARQL query twice.
  • pipelines/subclass_labels.py
    • Pipeline to extract the entities appearing in rdfs:subClassOf triples, together with an aligned list of their rdfs:label values.

Runtime Flow

On startup (FastAPI lifespan):

  1. create_sparql_engine(settings) selects and starts a SPARQL engine.
  2. The engine is stored at app.state.sparql.

On shutdown:

  • app.state.sparql.shutdown() is called to close the HTTP client.
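
The startup/shutdown wiring above can be sketched with a lifespan context manager. FastAPI accepts such an async context manager via `FastAPI(lifespan=lifespan)`; the engine class and factory below are stand-ins for the real create_sparql_engine and AnzoGraphEngine, not their actual code.

```python
from contextlib import asynccontextmanager

class FakeEngine:
    """Hypothetical stand-in for AnzoGraphEngine."""
    def __init__(self) -> None:
        self.closed = False

    async def shutdown(self) -> None:
        # In the real engine this closes the underlying HTTP client.
        self.closed = True

def create_sparql_engine(settings: dict) -> FakeEngine:
    # Stand-in for the real factory in sparql_engine.py.
    return FakeEngine()

@asynccontextmanager
async def lifespan(app):
    # Startup: build the engine and park it on app.state for request handlers.
    app.state.sparql = create_sparql_engine({})
    try:
        yield
    finally:
        # Shutdown: close the HTTP client even if the app exits with an error.
        await app.state.sparql.shutdown()
```

Request handlers then reach the engine via `request.app.state.sparql`.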

Environment Variables

Most configuration is intended to be provided via container environment variables (see repo root .env and docker-compose.yml).

Core:

  • INCLUDE_BNODES: true/false
  • CORS_ORIGINS: comma-separated list or *
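
As an illustration, CORS_ORIGINS parsing might look like the sketch below. The real parsing lives in settings.py (pydantic-settings); the function name and exact behavior here are assumptions.

```python
def parse_cors_origins(raw: str) -> list[str]:
    """Parse CORS_ORIGINS: either "*" or a comma-separated origin list.
    Illustrative only; the actual settings use pydantic-settings."""
    raw = raw.strip()
    if raw == "*":
        return ["*"]
    # Split on commas, trimming whitespace and dropping empty entries.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```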

Optional import-combining step (separate container):

The repo's owl_imports_combiner Docker service can be used to recursively load a Turtle file (or URL) plus its owl:imports into a single combined TTL output.

  • COMBINE_OWL_IMPORTS_ON_START: true to run the combiner container on startup (no-op when false)
  • COMBINE_ENTRY_LOCATION: entry file/URL to load (falls back to TTL_PATH if not set)
  • COMBINE_OUTPUT_LOCATION: output path for the combined TTL (defaults to ${dirname(entry)}/${COMBINE_OUTPUT_NAME})
  • COMBINE_OUTPUT_NAME: output filename when COMBINE_OUTPUT_LOCATION is not set (default: combined_ontology.ttl)
  • COMBINE_FORCE: true to rebuild even if the output file already exists

AnzoGraph mode:

  • SPARQL_HOST: base host (example: http://anzograph:8080)
  • SPARQL_ENDPOINT: optional full endpoint; if set, overrides ${SPARQL_HOST}/sparql
  • SPARQL_USER, SPARQL_PASS: Basic auth credentials
  • SPARQL_DATA_FILE: file URI as seen by the AnzoGraph container (example: file:///opt/shared-files/o3po.ttl)
  • SPARQL_GRAPH_IRI: optional graph IRI for LOAD ... INTO GRAPH <...>
  • SPARQL_LOAD_ON_START: true to execute LOAD <SPARQL_DATA_FILE> during startup
  • SPARQL_CLEAR_ON_START: true to execute CLEAR ALL during startup (dangerous)
  • SPARQL_TIMEOUT_S: request timeout for normal SPARQL requests
  • SPARQL_READY_RETRIES, SPARQL_READY_DELAY_S, SPARQL_READY_TIMEOUT_S: readiness gate parameters

AnzoGraph Readiness Gate

AnzoGraphEngine does not assume that "container started" means "SPARQL works". Instead, it waits for a smoke-test POST to succeed:

  • Method: POST ${SPARQL_ENDPOINT}
  • Headers:
    • Content-Type: application/x-www-form-urlencoded
    • Accept: application/sparql-results+json
    • Authorization: Basic ... (if configured)
  • Body: query=ASK WHERE { ?s ?p ?o }
  • Success condition: HTTP 2xx and response parses as JSON

This matches the behavior described in docs/anzograph-readiness-julia.md.
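
The retry loop can be sketched with the HTTP transport injected as a callable, so the gate logic is visible without a live endpoint. The function name, retry defaults, and `post` signature are illustrative; the real engine uses its own HTTP client (httpx or similar).

```python
import json
import time
from typing import Callable, Tuple

ASK_BODY = "query=ASK WHERE { ?s ?p ?o }"

def wait_until_ready(
    post: Callable[[str], Tuple[int, str]],  # returns (status_code, body_text)
    retries: int = 30,
    delay_s: float = 2.0,
) -> bool:
    """Retry the smoke-test POST until it returns 2xx *and* valid JSON.
    Illustrative sketch of the readiness gate described above."""
    for attempt in range(retries):
        try:
            status, body = post(ASK_BODY)
            if 200 <= status < 300:
                json.loads(body)  # raises if the endpoint is not serving JSON yet
                return True
        except Exception:
            pass  # connection refused / bad JSON: AnzoGraph is still booting
        if attempt < retries - 1:
            time.sleep(delay_s)
    return False
```

Requiring the body to parse as JSON (not just a 2xx status) guards against the window where the HTTP port is open but the SPARQL service is not yet answering.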

API Endpoints

  • GET /api/health
    • Returns { "status": "ok" }.
  • GET /api/stats
    • Returns counts for the same snapshot used by /api/graph (via the snapshot cache).
  • POST /api/sparql
    • Body: { "query": "<SPARQL SELECT/ASK>" }
    • Returns SPARQL JSON results as-is.
    • Notes:
      • This endpoint is intended for SELECT/ASK returning SPARQL-JSON.
      • SPARQL UPDATE is not exposed here (AnzoGraph LOAD/CLEAR are handled internally during startup).
  • GET /api/graph?node_limit=...&edge_limit=...
    • Returns a graph snapshot as { nodes: [...], edges: [...] }.
    • Implemented as a SPARQL edge query + mapping in pipelines/graph_snapshot.py.

Data Contract

Node

Returned in nodes[] (dense IDs; suitable for indexing in typed arrays):

{
  "id": 0,
  "termType": "uri",
  "iri": "http://example.org/Thing",
  "label": null,
  "x": 0.0,
  "y": 0.0
}
  • id: integer dense node ID used in edges
  • termType: "uri" or "bnode"
  • iri: URI string; blank nodes are normalized to _:<id>
  • label: rdfs:label when available (best-effort; prefers English)
  • x/y: world-space coordinates for rendering (currently a radial layered layout derived from rdfs:subClassOf)

Edge

Returned in edges[]:

{
  "source": 0,
  "target": 12,
  "predicate": "http://www.w3.org/2000/01/rdf-schema#subClassOf"
}
  • source/target: dense node IDs (indexes into nodes[])
  • predicate: predicate IRI string

Snapshot Query (/api/graph)

/api/graph currently uses a SPARQL query that returns only rdfs:subClassOf edges:

  • selects bindings as ?s ?p ?o (with ?p bound to rdfs:subClassOf)
  • excludes literal objects (FILTER(!isLiteral(?o))) for safety
  • optionally excludes blank nodes (unless INCLUDE_BNODES=true)
  • applies LIMIT edge_limit
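
A query with those properties might be assembled as below. This is a sketch under assumptions: the real builder lives in graph_export.py, and the function name and exact query text are illustrative.

```python
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

def build_snapshot_query(edge_limit: int, include_bnodes: bool) -> str:
    """Assemble the subClassOf edge query described above (illustrative)."""
    filters = ["FILTER(!isLiteral(?o))"]  # never emit literal targets
    if not include_bnodes:
        filters.append("FILTER(!isBlank(?s) && !isBlank(?o))")
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  BIND(<{RDFS_SUBCLASSOF}> AS ?p)\n"
        "  ?s ?p ?o .\n"
        + "".join(f"  {f}\n" for f in filters)
        + "}\n"
        f"LIMIT {edge_limit}\n"
    )
```

The BIND keeps ?p in the projection (matching the ?s ?p ?o binding shape) while restricting matches to rdfs:subClassOf.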

The result bindings are mapped to dense node IDs (first-seen order) and returned to the caller.
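
The first-seen-order mapping can be sketched like this. It consumes SPARQL-JSON bindings (each term a dict with "type" and "value"); the function name is illustrative, the real mapping lives in graph_export.py.

```python
def map_bindings_to_graph(bindings: list[dict]) -> dict:
    """Map SPARQL-JSON ?s ?p ?o bindings to dense node IDs, assigned in
    first-seen order. Illustrative version of the mapping in graph_export.py."""
    node_ids: dict[str, int] = {}
    nodes: list[dict] = []
    edges: list[dict] = []

    def intern(term: dict) -> int:
        # Blank nodes are normalized to _:<id>; URIs keep their value.
        key = term["value"] if term["type"] == "uri" else f"_:{term['value']}"
        if key not in node_ids:
            node_ids[key] = len(nodes)
            nodes.append({"id": len(nodes), "termType": term["type"],
                          "iri": key, "label": None, "x": 0.0, "y": 0.0})
        return node_ids[key]

    for b in bindings:
        edges.append({"source": intern(b["s"]),
                      "target": intern(b["o"]),
                      "predicate": b["p"]["value"]})
    return {"nodes": nodes, "edges": edges}
```

First-seen ordering makes the node IDs deterministic for a given binding order, which is what lets the frontend index straight into typed arrays.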

/api/graph also returns meta with snapshot counts and engine info so the frontend doesn't need to call /api/stats.

If a cycle is detected in the returned rdfs:subClassOf snapshot, /api/graph returns HTTP 422 (layout requires a DAG).
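
The layering and cycle check described for pipelines/layout_dag_radial.py can be sketched as Kahn's algorithm run level by level, with a ring-per-layer placement on top. Function names, the ring spacing, and the longest-path layer assignment are illustrative assumptions, not the module's actual code.

```python
import math
from collections import deque

def kahn_layers(num_nodes: int, edges: list[tuple[int, int]]) -> list[int]:
    """Level-synchronous Kahn layering. Returns one layer index per node,
    or raises ValueError on a cycle (surfaced by /api/graph as HTTP 422)."""
    out: list[list[int]] = [[] for _ in range(num_nodes)]
    indeg = [0] * num_nodes
    for src, dst in edges:
        out[src].append(dst)
        indeg[dst] += 1
    layer = [0] * num_nodes
    queue = deque(i for i in range(num_nodes) if indeg[i] == 0)
    seen = 0
    while queue:
        for _ in range(len(queue)):  # process one whole level at a time
            u = queue.popleft()
            seen += 1
            for v in out[u]:
                layer[v] = max(layer[v], layer[u] + 1)
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
    if seen != num_nodes:
        raise ValueError("cycle detected: layout requires a DAG")
    return layer

def radial_positions(layer: list[int], ring_step: float = 100.0):
    """Ring-per-layer placement: layer 0 at the center, radius grows per layer,
    nodes spread evenly around each ring."""
    by_layer: dict[int, list[int]] = {}
    for node, lvl in enumerate(layer):
        by_layer.setdefault(lvl, []).append(node)
    pos = [(0.0, 0.0)] * len(layer)
    for lvl, members in by_layer.items():
        r = lvl * ring_step
        for k, node in enumerate(members):
            theta = 2 * math.pi * k / len(members)
            pos[node] = (r * math.cos(theta), r * math.sin(theta))
    return pos
```

Kahn's algorithm doubles as the cycle detector: if any node still has unresolved in-degree when the queue drains, the edge set is not a DAG.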

Pipelines

pipelines/graph_snapshot.py

fetch_graph_snapshot(...) is the main "export graph" pipeline used by /api/graph.

pipelines/subclass_labels.py

extract_subclass_entities_and_labels(...):

  1. Queries all rdfs:subClassOf triples.
  2. Builds a unique set of subjects+objects, then converts it to a deterministic list.
  3. Queries rdfs:label for those entities and returns aligned lists:
    • entities[i] corresponds to labels[i].
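
The dedup-then-align steps can be sketched as two small functions. The sorted() dedup is one way to get a deterministic list (the real pipeline may order differently), and the binding variable names "entity"/"label" are assumptions about the label query's projection.

```python
def unique_entities(subclass_pairs: list[tuple[str, str]]) -> list[str]:
    """Subjects + objects of all subClassOf triples, deduplicated into a
    deterministic list (sorted here; the real ordering may differ)."""
    seen: set[str] = set()
    for s, o in subclass_pairs:
        seen.add(s)
        seen.add(o)
    return sorted(seen)

def align_labels(entities: list[str], label_bindings: list[dict]) -> list:
    """Given the entity list and rdfs:label query bindings, return labels
    positioned so that entities[i] corresponds to labels[i] (None when the
    entity has no label)."""
    by_entity = {b["entity"]["value"]: b["label"]["value"]
                 for b in label_bindings}
    return [by_entity.get(e) for e in entities]
```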

Notes / Tradeoffs

  • /api/graph returns only nodes that appear in the returned edge result set. Nodes not referenced by those edges will not be present.
  • AnzoGraph SPARQL feature support (inference, extensions, performance) is vendor-specific.