backend: support external SPARQL and named-graph snapshots

This commit is contained in:
Oxy8
2026-04-06 13:36:08 -03:00
parent 696844f341
commit 44c1d3eaa6
25 changed files with 1695 additions and 243 deletions

View File

@@ -27,10 +27,14 @@ Important variables:
- `DEFAULT_NODE_LIMIT`, `DEFAULT_EDGE_LIMIT`
- `MAX_NODE_LIMIT`, `MAX_EDGE_LIMIT`
- SPARQL connectivity:
- `SPARQL_SOURCE_MODE` (`local` or `external`)
- `SPARQL_HOST` (default `http://anzograph:8080`) or `SPARQL_ENDPOINT`
- `EXTERNAL_SPARQL_ENDPOINT` for external AnzoGraph access
- `KEYCLOAK_TOKEN_ENDPOINT`, `KEYCLOAK_CLIENT_ID`, `KEYCLOAK_USERNAME`, `KEYCLOAK_PASSWORD`, `KEYCLOAK_SCOPE`
- `SPARQL_USER`, `SPARQL_PASS`
- External mode fetches a bearer token from Keycloak at startup, sends `Authorization: Bearer ...` to `EXTERNAL_SPARQL_ENDPOINT`, and refreshes once on `401 Unauthorized: Jwt is expired`
- Startup behavior:
- `SPARQL_LOAD_ON_START`, `SPARQL_CLEAR_ON_START`
- `SPARQL_LOAD_ON_START`
- `SPARQL_DATA_FILE` (typically `file:///opt/shared-files/<file>.ttl`)
- Other:
- `INCLUDE_BNODES` (include blank nodes in snapshots)
@@ -68,9 +72,9 @@ Stored under `backend_go/graph_queries/` and listed by `GET /api/graph_queries`.
Built-in modes:
- `default` `rdf:type` (to `owl:Class`) + `rdfs:subClassOf`
- `hierarchy` `rdfs:subClassOf` only
- `types` `rdf:type` (to `owl:Class`) only
- `default` `rdf:type` + `rdfs:subClassOf`
- `hierarchy` `rdfs:subClassOf` + `rdf:type`
- `types` `rdf:type` only
To add a new mode:
@@ -94,5 +98,6 @@ To add a new mode:
## Performance notes
- Memory usage is dominated by the cached snapshot (`[]Node`, `[]Edge`) and the temporary SPARQL JSON unmarshalling step.
- Memory usage is dominated by the cached snapshot (`[]Node`, `[]Edge`) and large SPARQL result sets.
- The backend streams SPARQL JSON bindings for snapshot edge batches to reduce decode overhead.
- Tune `DEFAULT_NODE_LIMIT`/`DEFAULT_EDGE_LIMIT` first if memory is too high.