visualizador_instanciados/docs/anzograph-readiness-julia.md
2026-03-02 16:27:28 -03:00


Waiting for AnzoGraph readiness from Julia (how this repo does it)

This repo runs a Julia pipeline (julia/main.jl) against an AnzoGraph SPARQL endpoint. The key problem is that “container started” ≠ “SPARQL endpoint is ready to accept queries”.

So, before the Julia code does anything that depends on SPARQL (like LOAD <...> or large SELECTs), it explicitly waits until AnzoGraph is actually responding to a real SPARQL POST request with valid JSON results.

This document explains the exact mechanism used here, why it works, and gives copy/paste-ready patterns you can transfer to another project.


1) Where the waiting happens (pipeline control flow)

In julia/main.jl, the entrypoint calls:

# Step 1: Wait for AnzoGraph
wait_for_anzograph()

# Step 2: Load TTL file
result = sparql_update("LOAD <$SPARQL_DATA_FILE>")

So the “await” is not a Julia Task/async wait; it is a blocking retry loop that only returns when it can successfully execute a small SPARQL query.

Reference: julia/main.jl defines wait_for_anzograph() and calls it from main().


2) Why this is needed even with Docker Compose depends_on

This repo's docker-compose.yml includes an AnzoGraph healthcheck:

anzograph:
  healthcheck:
    test: ["CMD-SHELL", "curl -f http://localhost:8080/sparql || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 30
    start_period: 60s

However, julia-layout currently depends on anzograph with:

depends_on:
  anzograph:
    condition: service_started

Meaning:

  • Compose will ensure the container process has started.
  • Compose does not guarantee the AnzoGraph HTTP/SPARQL endpoint is ready (unless you use service_healthy, and even then a “healthy GET” is not always equivalent to “SPARQL POST works with auth + JSON”).

So the Julia code includes its own readiness gate to prevent failures like:

  • TCP connection refused (port not open yet)
  • HTTP endpoint reachable but not fully initialized
  • Non-JSON/HTML error responses while the service is still booting

3) What “ready” means in this repo

In this repo, “AnzoGraph is ready” means:

  1. An HTTP POST to ${SPARQL_HOST}/sparql succeeds, with headers:
    • Content-Type: application/x-www-form-urlencoded
    • Accept: application/sparql-results+json
    • Authorization: Basic ...
  2. The body parses as SPARQL JSON results (application/sparql-results+json)

It does not strictly mean:

  • Your dataset is already loaded
  • The loaded data is fully indexed (that can matter in some systems after LOAD)

This repo uses readiness as a “SPARQL endpoint is alive and speaking the protocol” check.
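
As a rough illustration of the "speaking the protocol" part, the check below inspects an already-parsed JSON document and accepts the two shapes defined by the SPARQL 1.1 JSON results format: results.bindings for SELECT and a top-level boolean for ASK. The name looks_like_sparql_results is hypothetical and does not exist in julia/main.jl; this is a sketch of the idea, not the repo's code.

```julia
# Hypothetical helper (not part of julia/main.jl): decide whether an
# already-parsed JSON document has the shape of a SPARQL JSON results response.
function looks_like_sparql_results(parsed)::Bool
    parsed isa AbstractDict || return false
    # ASK responses carry a top-level "boolean" member.
    if get(parsed, "boolean", nothing) isa Bool
        return true
    end
    # SELECT responses carry results.bindings as an array (possibly empty).
    results = get(parsed, "results", nothing)
    results isa AbstractDict || return false
    return get(results, "bindings", nothing) isa AbstractVector
end

# An empty result set still counts as "ready"; an unrelated JSON body does not.
empty_select = Dict("head" => Dict("vars" => ["s"]), "results" => Dict("bindings" => Any[]))
println(looks_like_sparql_results(empty_select))
println(looks_like_sparql_results(Dict("error" => "still starting")))
```

A stricter gate could also require the response Content-Type to match; that is left out here to keep the sketch small.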


4) The actual Julia implementation (as in julia/main.jl)

4.1 Configuration (endpoint + auth)

The Julia script builds endpoint and auth from environment variables:

const SPARQL_HOST = get(ENV, "SPARQL_HOST", "http://localhost:8080")
const SPARQL_ENDPOINT = "$SPARQL_HOST/sparql"
const SPARQL_USER = get(ENV, "SPARQL_USER", "admin")
const SPARQL_PASS = get(ENV, "SPARQL_PASS", "Passw0rd1")
const AUTH_HEADER = "Basic " * base64encode("$SPARQL_USER:$SPARQL_PASS")
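
If the same settings are needed without globals, one possible shape is a small function that takes an environment-like dictionary. The name sparql_settings is an assumption for illustration; the defaults are the ones shown above. Stripping a trailing slash from SPARQL_HOST is an extra precaution added in this sketch, so that a value like http://anzograph:8080/ does not produce //sparql.

```julia
using Base64

# Hypothetical helper: build the endpoint URL and Basic auth header from an
# environment-like dictionary, with the same defaults as the constants above.
function sparql_settings(env::AbstractDict = ENV)
    # Drop trailing slashes so the joined URL never contains "//sparql".
    host = rstrip(get(env, "SPARQL_HOST", "http://localhost:8080"), '/')
    user = get(env, "SPARQL_USER", "admin")
    pass = get(env, "SPARQL_PASS", "Passw0rd1")
    return (
        endpoint = string(host, "/sparql"),
        auth_header = "Basic " * base64encode("$user:$pass"),
    )
end

settings = sparql_settings(Dict("SPARQL_HOST" => "http://anzograph:8080/"))
println(settings.endpoint)
```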

In Docker Compose for this repo, the Julia container overrides SPARQL_HOST to use the service DNS name:

environment:
  - SPARQL_HOST=http://anzograph:8080

4.2 The smoke query used for readiness

This is the query used in the wait loop:

const SMOKE_TEST_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 3"

Notes:

  • It's intentionally small (LIMIT 3) to keep the readiness check cheap.
  • It returns some bindings when data exists, but even an empty dataset can still return a valid empty result set. The code treats “valid response” as ready.

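If you want a readiness check that does not depend on any data being present, an ASK query is also common. Note that an ASK response carries a top-level "boolean" member instead of results.bindings, so a helper that reads json["results"]["bindings"] (as sparql_query in 4.3 does) would need to handle that shape: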

ASK WHERE { ?s ?p ?o }

4.3 SPARQL query function (request + minimal retry)

sparql_query(query; retries=...) is a generic helper that makes SPARQL POST requests:

function sparql_query(query::String; retries::Int=5)::SparqlResult
    for attempt in 1:retries
        try
            response = HTTP.post(
                SPARQL_ENDPOINT,
                [
                    "Content-Type" => "application/x-www-form-urlencoded",
                    "Accept" => "application/sparql-results+json",
                    "Authorization" => AUTH_HEADER
                ];
                body = "query=" * HTTP.URIs.escapeuri(query)
            )

            if response.status == 200
                json = JSON.parse(String(response.body))
                return SparqlResult(json["results"]["bindings"])
            elseif response.status >= 500 && attempt < retries
                sleep(10)
                continue
            else
                error("SPARQL query failed with status $(response.status)")
            end
        catch e
            if attempt < retries
                sleep(10)
                continue
            end
            rethrow(e)
        end
    end
    error("SPARQL query failed after $retries attempts")
end

Important behaviors to preserve when transferring:

  • It uses POST (not GET) to the SPARQL endpoint.
  • It requires a 200 response and successfully parses SPARQL JSON results.
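  • It retries on any exception raised inside the try block while attempts remain:
    • network / protocol / parsing errors (caught exceptions)
    • non-200 statuses, including 4xx, because error(...) is called inside the try and is caught by the same catch
  • With HTTP.jl's default status_exception=true, HTTP.post itself throws for status >= 300, so the explicit >= 500 branch only runs if that default is changed.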
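
In the function above, error(...) is raised inside the try block, so the catch also retries client errors such as 401. If you would rather fail fast on those when transferring the pattern, a retry helper with an explicit policy is one option. Everything in this sketch (SparqlStatusError, is_retryable, with_retries) is hypothetical and not part of this repo; the choice to treat 4xx as non-retryable is an assumption.

```julia
# Hypothetical exception type carrying the HTTP status, so the retry policy
# can tell client errors from server errors.
struct SparqlStatusError <: Exception
    status::Int
end

# Assumption: 4xx responses (bad credentials, malformed query) will not fix
# themselves, so every other kind of failure is treated as retryable.
is_retryable(e) = !(e isa SparqlStatusError && 400 <= e.status < 500)

# Run `f()` up to `retries` times, sleeping `sleep_s` seconds between attempts.
# Rethrows immediately on a non-retryable failure or on the last attempt.
function with_retries(f; retries::Int = 5, sleep_s::Real = 10)
    for attempt in 1:retries
        try
            return f()
        catch e
            (attempt < retries && is_retryable(e)) || rethrow()
            sleep(sleep_s)
        end
    end
end
```

The request itself would go inside the function passed to with_retries, throwing SparqlStatusError(status) for any non-200 response.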

4.4 The readiness gate: wait_for_anzograph

This is the “await until ready” logic:

function wait_for_anzograph(max_retries::Int=30)::Bool
    println("Waiting for AnzoGraph at $SPARQL_ENDPOINT...")

    for attempt in 1:max_retries
        try
            smoke_result = sparql_query(SMOKE_TEST_QUERY; retries=1)
            println("  AnzoGraph is ready (attempt $attempt, smoke rows=$(length(smoke_result.bindings)))")
            return true
        catch e
            println("  Attempt $attempt/$max_retries: $(typeof(e))")
            sleep(4)
        end
    end

    error("AnzoGraph not available after $max_retries attempts")
end

Why it calls sparql_query(...; retries=1):

  • It makes each outer “readiness attempt” a single request.
  • The outer loop controls cadence (sleep(4)) and total wait time.
  • This avoids “nested retry loops” (inner sleeps + outer sleeps) that can make waits much longer than intended.

Time bound in the current implementation:

  • max_retries = 30
  • sleep(4) between attempts
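  • About 120 seconds of sleeping in the worst case (30 × 4 s), plus the time spent in each request. The request has no explicit read timeout here, so a hung request can stretch this further.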
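
The effect of nested retries can be worked out with a small calculation. The helper name worst_case_sleep_s is made up for this sketch; the numbers are the ones used in this repo (outer loop: 30 attempts with sleep(4); inner sparql_query default: 5 attempts with sleep(10) between them). Request time is ignored.

```julia
# Worst-case seconds spent sleeping when every request fails.
# The inner loop sleeps only between its own attempts (inner_retries - 1 times);
# the outer loop sleeps after every failed attempt.
function worst_case_sleep_s(; outer_retries::Int, outer_sleep_s::Real,
                              inner_retries::Int = 1, inner_sleep_s::Real = 0)
    inner = (inner_retries - 1) * inner_sleep_s
    return outer_retries * (inner + outer_sleep_s)
end

# retries=1 inside the gate, as in wait_for_anzograph: 30 * 4 = 120 s
println(worst_case_sleep_s(outer_retries = 30, outer_sleep_s = 4))

# If the gate used the inner default of retries=5: 30 * (4 * 10 + 4) = 1320 s
println(worst_case_sleep_s(outer_retries = 30, outer_sleep_s = 4,
                           inner_retries = 5, inner_sleep_s = 10))
```

So leaving the inner default in place would turn a two-minute bound into roughly 22 minutes of sleeping.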

5) What failures cause it to keep waiting

wait_for_anzograph() catches any exception thrown by sparql_query() and retries. In practice, that includes:

  • Connection errors (DNS not ready, connection refused, etc.)
  • Timeouts (if HTTP request takes too long and the library throws)
  • Non-200 HTTP statuses that cause error(...)
  • Non-JSON / unexpected JSON responses causing JSON.parse(...) to throw

That last point is a big reason a “real SPARQL request + parse” is stronger than just “ping the port”.
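
One side effect of the current logging is that the loop prints only typeof(e), so every error(...) failure shows up as ErrorException regardless of the status code behind it. A possible refinement, sketched here with a hypothetical describe_failure helper, is to include the first line of the exception message as well.

```julia
# Hypothetical helper: one-line description of a caught exception, combining
# its type with the first line of its `showerror` text.
function describe_failure(e)::String
    msg = sprint(showerror, e)
    first_line = first(split(msg, '\n'))
    return string(typeof(e), ": ", first_line)
end

# Example with the kind of exception sparql_query raises for a bad status.
println(describe_failure(ErrorException("SPARQL query failed with status 401")))
```

Inside the wait loop, this would replace $(typeof(e)) in the println call.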


6) A reusable version for another project

If you want to reuse this in another project, it's usually easier to:

  • avoid globals,
  • make endpoint/auth explicit,
  • use a time-based timeout instead of max_retries (more robust),
  • add request timeouts so the wait loop can't hang forever on a single request.

Below is a drop-in module you can copy into your project.

module AnzoGraphReady

using HTTP
using JSON
using Base64
using Dates

struct SparqlResult
    bindings::Vector{Dict{String, Any}}
end

function basic_auth_header(user::AbstractString, pass::AbstractString)::String
    return "Basic " * base64encode("$(user):$(pass)")
end

function sparql_query(
    endpoint::AbstractString,
    auth_header::AbstractString,
    query::AbstractString;
    retries::Int = 1,
    retry_sleep_s::Real = 2,
    request_timeout_s::Real = 15,
)::SparqlResult
    for attempt in 1:retries
        try
            response = HTTP.post(
                String(endpoint),
                [
                    "Content-Type" => "application/x-www-form-urlencoded",
                    "Accept" => "application/sparql-results+json",
                    "Authorization" => auth_header,
                ];
                body = "query=" * HTTP.URIs.escapeuri(String(query)),
                readtimeout = ceil(Int, request_timeout_s),  # HTTP.jl expects whole seconds (Int)
            )

            if response.status != 200
                error("SPARQL query failed with status $(response.status)")
            end

            parsed = JSON.parse(String(response.body))
            bindings = get(get(parsed, "results", Dict()), "bindings", Any[])
            return SparqlResult(Vector{Dict{String, Any}}(bindings))
        catch e
            if attempt < retries
                sleep(retry_sleep_s)
                continue
            end
            rethrow(e)
        end
    end
    error("sparql_query: unreachable")
end

"""
Wait until AnzoGraph responds to a real SPARQL POST with parseable JSON.

This is the direct analog of this repo's `wait_for_anzograph()`, but with:
- a time-based timeout (`timeout`)
- a request timeout per attempt (`request_timeout_s`)
- simple exponential backoff
"""
function wait_for_anzograph(
    endpoint::AbstractString,
    auth_header::AbstractString;
    timeout::Period = Minute(3),
    initial_delay_s::Real = 0.5,
    max_delay_s::Real = 5.0,
    request_timeout_s::Real = 10.0,
    query::AbstractString = "ASK WHERE { ?s ?p ?o }",
)::Nothing
    deadline = now() + timeout
    delay_s = initial_delay_s

    while now() < deadline
        try
            # A single attempt: if it succeeds, we declare "ready".
            sparql_query(
                endpoint,
                auth_header,
                query;
                retries = 1,
                request_timeout_s = request_timeout_s,
            )
            return
        catch
            sleep(delay_s)
            delay_s = min(max_delay_s, delay_s * 1.5)
        end
    end

    error("AnzoGraph not available before timeout=$(timeout)")
end

end # module
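
To see what the backoff in wait_for_anzograph does in practice, the delays can be listed without making any requests. The function backoff_schedule below is only an illustration (it is not part of the module); it mirrors the update delay_s = min(max_delay_s, delay_s * 1.5) with the module's defaults.

```julia
# Delays (in seconds) slept after each failed attempt, using the same update
# rule as the wait loop: multiply by `factor`, then cap at `max_delay_s`.
function backoff_schedule(n::Int; initial_delay_s::Real = 0.5,
                          max_delay_s::Real = 5.0, factor::Real = 1.5)
    delays = Float64[]
    delay_s = float(initial_delay_s)
    for _ in 1:n
        push!(delays, delay_s)
        delay_s = min(max_delay_s, delay_s * factor)
    end
    return delays
end

# First eight delays: 0.5, 0.75, 1.125, 1.6875, 2.53125, 3.796875, 5.0, 5.0
println(backoff_schedule(8))
```

With these defaults the cap is reached on the seventh failed attempt, after which the loop polls every 5 seconds until the deadline.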

Typical usage (matching this repo's environment variables):

using Dates  # provides Minute, which the module does not export
using .AnzoGraphReady

sparql_host = get(ENV, "SPARQL_HOST", "http://localhost:8080")
endpoint = "$(sparql_host)/sparql"
user = get(ENV, "SPARQL_USER", "admin")
pass = get(ENV, "SPARQL_PASS", "Passw0rd1")

auth = AnzoGraphReady.basic_auth_header(user, pass)
AnzoGraphReady.wait_for_anzograph(endpoint, auth; timeout=Minute(5))

# Now it is safe to LOAD / query.

7) Optional: waiting for “data is ready” after LOAD

Some systems accept LOAD but need time before results show up reliably (indexing / transaction visibility). If you run into that in your other project, add a second gate after LOAD, for example:

  1. load, then
  2. poll a query that must be true after load (e.g., “triple count > 0”, or a known IRI exists).

Example “post-load gate”:

post_load_query = """
SELECT (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
"""

res = AnzoGraphReady.sparql_query(endpoint, auth, post_load_query; retries=1)
# Parse `?n` out of bindings and require it to be > 0; retry until it is.

(This repo does not currently enforce “non-empty”; it only enforces “SPARQL is working”.)
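
The "parse ?n and retry" step could look roughly like this. Both function names are hypothetical. The sketch assumes the standard SPARQL JSON binding shape, where each variable maps to an object with a "value" string, and it takes the query function as an argument so it can be tried without a running endpoint.

```julia
# Extract the integer bound to ?n from SPARQL JSON bindings, where each
# binding looks like Dict("n" => Dict("type" => "literal", "value" => "42")).
function count_from_bindings(bindings)::Int
    isempty(bindings) && return 0
    return parse(Int, bindings[1]["n"]["value"])
end

# Poll `fetch_bindings()` until the count is positive or `timeout_s` elapses.
# `fetch_bindings` is supplied by the caller, for example:
#   () -> AnzoGraphReady.sparql_query(endpoint, auth, post_load_query).bindings
function wait_for_data(fetch_bindings; timeout_s::Real = 60, poll_s::Real = 2)::Int
    deadline = time() + timeout_s
    while true
        n = try
            count_from_bindings(fetch_bindings())
        catch
            0  # treat request or parse failures as "not ready yet"
        end
        n > 0 && return n
        time() + poll_s > deadline && error("no data visible after $(timeout_s)s")
        sleep(poll_s)
    end
end
```

Swallowing every failure as "not ready yet" keeps the loop simple; a stricter version might rethrow errors that cannot resolve on their own, such as authentication failures.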


8) Practical checklist when transferring to another project

  • Make readiness checks hit the real SPARQL POST path you will use in production.
  • Require a valid JSON parse, not just “port open”.
  • Add per-request timeouts, so a single hung request cannot hang the whole pipeline.
  • Prefer time-based overall timeout for predictable behavior in CI.
  • Keep the query cheap (ASK or LIMIT 1/3).
  • If you use Docker Compose healthchecks, consider also using depends_on: condition: service_healthy, but still keep the in-app wait as a safety net (it's closer to the real contract your code needs).