visualizador_instanciados/docs/anzograph-readiness-julia.md
2026-03-02 16:27:28 -03:00

# Waiting for AnzoGraph readiness from Julia (how this repo does it)
This repo runs a Julia pipeline (`julia/main.jl`) against an AnzoGraph SPARQL endpoint. The key problem is that **“container started” ≠ “SPARQL endpoint is ready to accept queries”**.
So, before the Julia code does anything that depends on SPARQL (like `LOAD <...>` or large `SELECT`s), it explicitly **waits until AnzoGraph is actually responding to a real SPARQL POST request with valid JSON results**.
This document explains the exact mechanism used here, why it works, and gives copy/paste-ready patterns you can transfer to another project.
---
## 1) Where the waiting happens (pipeline control flow)
In `julia/main.jl`, the entrypoint calls:
```julia
# Step 1: Wait for AnzoGraph
wait_for_anzograph()
# Step 2: Load TTL file
result = sparql_update("LOAD <$SPARQL_DATA_FILE>")
```
So the “await” is not a Julia `Task`/`async` wait; it is a **blocking retry loop** that only returns when it can successfully execute a small SPARQL query.
Reference: `julia/main.jl` defines `wait_for_anzograph()` and calls it from `main()`.
---
## 2) Why this is needed even with Docker Compose `depends_on`
This repo's `docker-compose.yml` includes an AnzoGraph `healthcheck`:
```yaml
anzograph:
  healthcheck:
    test: ["CMD-SHELL", "curl -f http://localhost:8080/sparql || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 30
    start_period: 60s
```
However, `julia-layout` currently depends on `anzograph` with:
```yaml
depends_on:
  anzograph:
    condition: service_started
```
Meaning:
- Compose will ensure the **container process has started**.
- Compose does **not** guarantee the AnzoGraph HTTP/SPARQL endpoint is ready (unless you use `service_healthy`, and even then a “healthy GET” is not always equivalent to “SPARQL POST works with auth + JSON”).
So the Julia code includes its own readiness gate to prevent failures like:
- TCP connection refused (port not open yet)
- HTTP endpoint reachable but not fully initialized
- Non-JSON/HTML error responses while the service is still booting
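For comparison, the stricter Compose dependency mentioned above could look like the fragment below. This is a sketch, not this repo's current configuration: it assumes the `julia-layout` service and the `anzograph` healthcheck shown earlier, and only swaps the condition to `service_healthy`.

```yaml
# Sketch only: this repo currently uses `condition: service_started`.
julia-layout:
  depends_on:
    anzograph:
      condition: service_healthy  # start only after the healthcheck passes
```

Even with this in place, the healthcheck is a plain `curl` GET, so the in-app wait still covers the POST + auth + JSON path that the pipeline actually uses.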
---
## 3) What “ready” means in this repo
In this repo, “AnzoGraph is ready” means:
1. An HTTP `POST` to `${SPARQL_HOST}/sparql` succeeds, with headers:
   - `Content-Type: application/x-www-form-urlencoded`
   - `Accept: application/sparql-results+json`
   - `Authorization: Basic ...`
2. The body parses as SPARQL JSON results (`application/sparql-results+json`)
It does **not** strictly mean:
- Your dataset is already loaded
- The loaded data is fully indexed (that can matter in some systems after `LOAD`)
This repo uses readiness as a **“SPARQL endpoint is alive and speaking the protocol”** check.
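The "speaking the protocol" part of this definition can be tested in isolation from the network. The sketch below is a minimal illustration, assuming the `JSON` package; `looks_like_sparql_json` is a hypothetical helper that does not exist in `julia/main.jl`. It accepts both the `SELECT` shape (`results.bindings`) and the `ASK` shape (a top-level `boolean`).

```julia
using JSON

# Hypothetical helper (not part of julia/main.jl): returns true when `body`
# parses as SPARQL JSON results, for either a SELECT or an ASK response.
function looks_like_sparql_json(body::AbstractString)::Bool
    parsed = try
        JSON.parse(body)
    catch
        return false  # HTML error pages, truncated bodies, etc.
    end
    parsed isa AbstractDict || return false
    # ASK responses carry a top-level boolean.
    if haskey(parsed, "boolean")
        return parsed["boolean"] isa Bool
    end
    # SELECT responses carry results.bindings (possibly empty).
    results = get(parsed, "results", nothing)
    results isa AbstractDict || return false
    return get(results, "bindings", nothing) isa AbstractVector
end

looks_like_sparql_json("""{"head":{"vars":["s"]},"results":{"bindings":[]}}""")  # true
looks_like_sparql_json("<html>503 Service Unavailable</html>")                  # false
```

An empty `bindings` array still counts as valid, which matches the point above that readiness does not imply the dataset is loaded.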
---
## 4) The actual Julia implementation (as in `julia/main.jl`)
### 4.1 Configuration (endpoint + auth)
The Julia script builds endpoint and auth from environment variables:
```julia
const SPARQL_HOST = get(ENV, "SPARQL_HOST", "http://localhost:8080")
const SPARQL_ENDPOINT = "$SPARQL_HOST/sparql"
const SPARQL_USER = get(ENV, "SPARQL_USER", "admin")
const SPARQL_PASS = get(ENV, "SPARQL_PASS", "Passw0rd1")
const AUTH_HEADER = "Basic " * base64encode("$SPARQL_USER:$SPARQL_PASS")
```
In Docker Compose for this repo, the Julia container overrides `SPARQL_HOST` to use the service DNS name:
```yaml
environment:
- SPARQL_HOST=http://anzograph:8080
```
### 4.2 The smoke query used for readiness
This is the query used in the wait loop:
```julia
const SMOKE_TEST_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 3"
```
Notes:
- It's intentionally small (`LIMIT 3`) to keep the readiness check cheap.
- It returns *some* bindings when data exists, but **even an empty dataset can still return a valid empty result set**. The code treats “valid response” as ready.
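If you want a readiness check whose result does not depend on any data being present, an `ASK` query is also common. Note that an `ASK` response carries a top-level `boolean` field instead of `results.bindings`, so the result parsing has to accept that shape (the `sparql_query` helper in 4.3 indexes `json["results"]["bindings"]` and would throw on it):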
```sparql
ASK WHERE { ?s ?p ?o }
```
### 4.3 SPARQL query function (request + minimal retry)
`sparql_query(query; retries=...)` is a generic helper that makes SPARQL POST requests:
```julia
function sparql_query(query::String; retries::Int=5)::SparqlResult
    for attempt in 1:retries
        try
            response = HTTP.post(
                SPARQL_ENDPOINT,
                [
                    "Content-Type" => "application/x-www-form-urlencoded",
                    "Accept" => "application/sparql-results+json",
                    "Authorization" => AUTH_HEADER
                ];
                body = "query=" * HTTP.URIs.escapeuri(query)
            )
            if response.status == 200
                json = JSON.parse(String(response.body))
                return SparqlResult(json["results"]["bindings"])
            elseif response.status >= 500 && attempt < retries
                sleep(10)
                continue
            else
                error("SPARQL query failed with status $(response.status)")
            end
        catch e
            if attempt < retries
                sleep(10)
                continue
            end
            rethrow(e)
        end
    end
    error("SPARQL query failed after $retries attempts")
end
```
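Important behaviors to preserve when transferring:
- It uses **POST** (not GET) to the SPARQL endpoint.
- It requires a **200** response and successfully parses SPARQL JSON results.
- It retries on:
  - `>= 500` server errors (with HTTP.jl's default `status_exception = true`, an error status is thrown as an exception before the status check runs, so in practice these are retried through the `catch` branch)
  - network / protocol / parsing errors (caught exceptions)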
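The status handling in `sparql_query` can be read as a small decision table. The sketch below isolates it as a pure function so it can be checked without a server; `classify_status` is a hypothetical helper, not something defined in `julia/main.jl`. Keep in mind that HTTP.jl's default `status_exception = true` raises on error statuses before such a check can run, so this decision is only reached when the request passes `status_exception = false`.

```julia
# Hypothetical helper mirroring the status branches of `sparql_query`.
# Returns :ok, :retry, or :fail for a given HTTP status and attempt number.
function classify_status(status::Integer, attempt::Integer, retries::Integer)::Symbol
    if status == 200
        return :ok          # parse the body as SPARQL JSON results
    elseif status >= 500 && attempt < retries
        return :retry       # server-side error with attempts left: sleep and retry
    else
        return :fail        # 4xx, or 5xx on the last attempt: raise an error
    end
end

classify_status(200, 1, 5)  # → :ok
classify_status(503, 1, 5)  # → :retry
classify_status(503, 5, 5)  # → :fail
classify_status(404, 1, 5)  # → :fail
```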
### 4.4 The readiness gate: `wait_for_anzograph`
This is the “await until ready” logic:
```julia
function wait_for_anzograph(max_retries::Int=30)::Bool
    println("Waiting for AnzoGraph at $SPARQL_ENDPOINT...")
    for attempt in 1:max_retries
        try
            smoke_result = sparql_query(SMOKE_TEST_QUERY; retries=1)
            println(" AnzoGraph is ready (attempt $attempt, smoke rows=$(length(smoke_result.bindings)))")
            return true
        catch e
            println(" Attempt $attempt/$max_retries: $(typeof(e))")
            sleep(4)
        end
    end
    error("AnzoGraph not available after $max_retries attempts")
end
```
Why it calls `sparql_query(...; retries=1)`:
- It makes each outer “readiness attempt” a **single** request.
- The outer loop controls cadence (`sleep(4)`) and total wait time.
- This avoids “nested retry loops” (inner sleeps + outer sleeps) that can make waits much longer than intended.
Time bound in the current implementation:
- `max_retries = 30`
- `sleep(4)` between attempts
- Roughly 120 seconds of sleeping in the worst case (30 × 4 s), plus the time spent in the requests themselves.
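The bound above, and the cost of the nested-retry case it avoids, can be checked with a little arithmetic. The sketch below is illustrative only: `worst_case_sleep_s` is a hypothetical helper, and it counts sleep time only, not request time.

```julia
# Worst-case time spent sleeping (request time excluded) for an outer wait
# loop that calls an inner helper with its own retries.
function worst_case_sleep_s(outer_attempts::Integer, outer_sleep_s::Real;
                            inner_retries::Integer = 1, inner_sleep_s::Real = 0)
    # The inner helper sleeps after every failed attempt except its last one.
    inner_total = (inner_retries - 1) * inner_sleep_s
    # The outer loop sleeps after every failed attempt, including the last.
    return outer_attempts * (inner_total + outer_sleep_s)
end

worst_case_sleep_s(30, 4)                                         # → 120
worst_case_sleep_s(30, 4; inner_retries = 5, inner_sleep_s = 10)  # → 1320
```

With the helper's default of `retries=5` and `sleep(10)`, the same 30 outer attempts could sleep for 22 minutes instead of 2, which is why the wait loop passes `retries=1`.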
---
## 5) What failures cause it to keep waiting
`wait_for_anzograph()` catches any exception thrown by `sparql_query()` and retries. In practice, that includes:
- **Connection errors** (DNS not ready, connection refused, etc.)
- **Timeouts** (if HTTP request takes too long and the library throws)
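- **Non-200 HTTP statuses**, raised either by `error(...)` in `sparql_query` or, with HTTP.jl's default `status_exception = true`, by the request call itself as an `HTTP.StatusError`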
- **Non-JSON / unexpected JSON** responses causing `JSON.parse(...)` to throw
That last point is a big reason a “real SPARQL request + parse” is stronger than just “ping the port”.
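Because all of these failure modes arrive as exceptions, the log line is the only place they can be told apart. The wait loop shown earlier prints just `typeof(e)`; a slightly richer message is sketched below using only Base functions. `failure_summary` is a hypothetical helper, not part of this repo.

```julia
# Hypothetical helper: one-line description of a caught exception, combining
# its type name with the first line of its rendered error message.
function failure_summary(e)::String
    msg = sprint(showerror, e)            # same text Julia would print for `e`
    first_line = first(split(msg, '\n'))  # keep logs to a single line
    return "$(nameof(typeof(e))): $first_line"
end

failure_summary(ErrorException("SPARQL query failed with status 503"))
# → "ErrorException: SPARQL query failed with status 503"
```

Inside the `catch e` branch this could replace `$(typeof(e))` in the `println` call, which makes a refused connection and an HTML error page distinguishable in the output.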
---
## 6) Transferable, self-contained version (recommended pattern)
If you want to reuse this in another project, it's usually easier to:
- avoid globals,
- make endpoint/auth explicit,
- use a **time-based timeout** instead of `max_retries` (more robust),
- add request timeouts so the wait loop can't hang forever on a single request.
Below is a drop-in module you can copy into your project.
```julia
module AnzoGraphReady

using HTTP
using JSON
using Base64
using Dates

struct SparqlResult
    bindings::Vector{Dict{String, Any}}
end

function basic_auth_header(user::AbstractString, pass::AbstractString)::String
    return "Basic " * base64encode("$(user):$(pass)")
end

function sparql_query(
    endpoint::AbstractString,
    auth_header::AbstractString,
    query::AbstractString;
    retries::Int = 1,
    retry_sleep_s::Real = 2,
    request_timeout_s::Real = 15,
)::SparqlResult
    for attempt in 1:retries
        try
            response = HTTP.post(
                String(endpoint),
                [
                    "Content-Type" => "application/x-www-form-urlencoded",
                    "Accept" => "application/sparql-results+json",
                    "Authorization" => auth_header,
                ];
                body = "query=" * HTTP.URIs.escapeuri(String(query)),
                readtimeout = request_timeout_s,
            )
            if response.status != 200
                error("SPARQL query failed with status $(response.status)")
            end
            parsed = JSON.parse(String(response.body))
            bindings = get(get(parsed, "results", Dict()), "bindings", Any[])
            return SparqlResult(Vector{Dict{String, Any}}(bindings))
        catch e
            if attempt < retries
                sleep(retry_sleep_s)
                continue
            end
            rethrow(e)
        end
    end
    error("sparql_query: unreachable")
end

"""
Wait until AnzoGraph responds to a real SPARQL POST with parseable JSON.
This is the direct analog of this repo's `wait_for_anzograph()`, but with:
- a time-based timeout (`timeout`)
- a request timeout per attempt (`request_timeout_s`)
- simple exponential backoff
"""
function wait_for_anzograph(
    endpoint::AbstractString,
    auth_header::AbstractString;
    timeout::Period = Minute(3),
    initial_delay_s::Real = 0.5,
    max_delay_s::Real = 5.0,
    request_timeout_s::Real = 10.0,
    query::AbstractString = "ASK WHERE { ?s ?p ?o }",
)::Nothing
    deadline = now() + timeout
    delay_s = initial_delay_s
    while now() < deadline
        try
            # A single attempt: if it succeeds, we declare "ready".
            sparql_query(
                endpoint,
                auth_header,
                query;
                retries = 1,
                request_timeout_s = request_timeout_s,
            )
            return
        catch
            sleep(delay_s)
            delay_s = min(max_delay_s, delay_s * 1.5)
        end
    end
    error("AnzoGraph not available before timeout=$(timeout)")
end

end # module
```
Typical usage (matching this repo's environment variables):
```julia
using Dates  # provides `Minute` for the timeout below
using .AnzoGraphReady
sparql_host = get(ENV, "SPARQL_HOST", "http://localhost:8080")
endpoint = "$(sparql_host)/sparql"
user = get(ENV, "SPARQL_USER", "admin")
pass = get(ENV, "SPARQL_PASS", "Passw0rd1")
auth = AnzoGraphReady.basic_auth_header(user, pass)
AnzoGraphReady.wait_for_anzograph(endpoint, auth; timeout=Minute(5))
# Now it is safe to LOAD / query.
```
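To get a feel for the cadence the backoff in `wait_for_anzograph` produces, the rule `delay_s = min(max_delay_s, delay_s * 1.5)` can be unrolled on its own. The sketch below is illustrative; `backoff_schedule` is a hypothetical helper that is not part of the module above, and its defaults simply copy the module's defaults.

```julia
# Hypothetical helper: the sleep durations produced by the backoff rule
# `delay = min(max_delay, delay * factor)` for the first `n` failed attempts.
function backoff_schedule(n::Integer; initial_delay_s::Real = 0.5,
                          max_delay_s::Real = 5.0, factor::Real = 1.5)
    delays = Float64[]
    delay_s = float(initial_delay_s)
    for _ in 1:n
        push!(delays, delay_s)                         # sleep used after this failure
        delay_s = min(max_delay_s, delay_s * factor)   # grow, but cap at the maximum
    end
    return delays
end

backoff_schedule(8)
# → [0.5, 0.75, 1.125, 1.6875, 2.53125, 3.796875, 5.0, 5.0]
```

The delay reaches the 5-second cap on the seventh failure, so a service that comes up quickly is detected within a second or so, while a slow start settles into one probe roughly every 5 seconds plus request time.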
---
## 7) Optional: waiting for “data is ready” after `LOAD`
Some systems accept `LOAD` but need time before results show up reliably (indexing / transaction visibility).
If you run into that in your other project, add a second gate after `LOAD`, for example:
1) load, then
2) poll a query that must be true after load (e.g., “triple count > 0”, or a known IRI exists).
Example “post-load gate”:
```julia
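post_load_query = """
SELECT (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
"""
# Poll (bounded; 30 tries and a 2 s pause are arbitrary choices) until the triple count is positive.
for attempt in 1:30
    res = AnzoGraphReady.sparql_query(endpoint, auth, post_load_query; retries=1)
    # SPARQL JSON returns the count as a string in bindings[1]["n"]["value"].
    parse(Int, res.bindings[1]["n"]["value"]) > 0 && break
    attempt == 30 && error("dataset still empty after LOAD")
    sleep(2)
end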
```
(This repo does not currently enforce “non-empty”; it only enforces “SPARQL is working”.)
---
## 8) Practical checklist when transferring to another project
- Make readiness checks hit the **real SPARQL POST** path you will use in production.
- Require a **valid JSON parse**, not just “port open”.
- Add **per-request timeouts**, so a single hung request cannot hang the whole pipeline.
- Prefer **time-based overall timeout** for predictable behavior in CI.
- Keep the query **cheap** (`ASK` or `LIMIT 1/3`).
- If you use Docker Compose healthchecks, consider also using `depends_on: condition: service_healthy`, but still keep the in-app wait as a safety net (it's closer to the real contract your code needs).