Large Instanced Ontology Visualizer

An experimental visualizer designed to render and explore massive instanced ontologies (millions of nodes) with interactive performance.

🚀 The Core Challenge

Ontologies with millions of instances are difficult for traditional graph visualization tools to render interactively. This project addresses the problem with three techniques:

Selective Rendering: Only rendering up to a set limit of nodes (e.g., 2 million) at any given time.
Adaptive Sampling: When zoomed out, it draws a representative spatial sample of the nodes. When zoomed in, the number of nodes within the viewport naturally falls below the rendering limit, so every node in view is drawn without sampling and the per-frame cost stays within the same budget.
Spatial Indexing: Using a custom Quadtree to manage millions of points in memory and efficiently determine visibility.
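
As a rough illustration of how the rendering limit and the sampling ratio relate, here is a minimal sketch. The helper names `samplingRatio` and `leafDrawCount` are hypothetical and not taken from the project source; the actual sampling strategy may differ.

```typescript
// Hypothetical helper: fraction of the visible nodes that fits in the budget.
// Returns 1 when everything in view can be drawn (the zoomed-in case).
function samplingRatio(visibleCount: number, maxDrawn: number): number {
  if (visibleCount <= 0) return 1;
  return Math.min(1, maxDrawn / visibleCount);
}

// Hypothetical helper: how many points to draw from one spatial cell.
// At least one point is kept per non-empty cell so sparse regions stay
// visible; this means the total can slightly exceed the budget.
function leafDrawCount(leafCount: number, ratio: number): number {
  if (leafCount <= 0) return 0;
  return Math.max(1, Math.min(leafCount, Math.round(leafCount * ratio)));
}
```

With a limit of 2 million and 4 million nodes in view, the ratio is 0.5, so roughly every second node of each cell is drawn. Once the viewport holds fewer nodes than the limit, the ratio is 1 and nothing is dropped.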

🛠 Technical Architecture

1. Data Pipeline & AnzoGraph Integration

The project features an automated pipeline to extract and prepare data from an AnzoGraph DB:

SPARQL Extraction: scripts/fetch_from_db.ts connects to AnzoGraph via its SPARQL endpoint. It fetches a seed set of subjects and their related triples, identifying "primary" nodes (objects of rdf:type).
Graph Classification: Instances are categorized to distinguish between classes and relationships.
Force-Directed Layout: scripts/compute_layout.ts calculates 2D positions for the nodes using a Barnes-Hut optimized force-directed simulation, ensuring scalability for large graphs.
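
The extraction step could look roughly like the sketch below. It is not the content of scripts/fetch_from_db.ts: the function names, the query shape, and the seed limit are assumptions made for illustration. The request format follows the standard SPARQL 1.1 Protocol (direct POST of the query), which is assumed to be accepted by the AnzoGraph endpoint.

```typescript
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// Hypothetical query builder: select up to `seedLimit` typed subjects, then
// return every triple whose subject is one of those seeds.
function buildSeedQuery(seedLimit: number): string {
  if (!Number.isInteger(seedLimit) || seedLimit <= 0) {
    throw new RangeError("seedLimit must be a positive integer");
  }
  return [
    "SELECT ?s ?p ?o WHERE {",
    `  { SELECT DISTINCT ?s WHERE { ?s <${RDF_TYPE}> ?class } LIMIT ${seedLimit} }`,
    "  ?s ?p ?o .",
    "}",
  ].join("\n");
}

// Request options for a direct POST as defined by the SPARQL 1.1 Protocol.
// The result format asked for is SPARQL JSON results.
function buildSparqlRequest(query: string) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/sparql-query",
      Accept: "application/sparql-results+json",
    },
    body: query,
  };
}
```

The returned object can be passed as the second argument to `fetch`, with the endpoint URL of the AnzoGraph container as the first. That URL depends on the deployment and is not specified here.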

2. Quadtree Spatial Index

To handle millions of nodes without per-frame object allocation:

In-place Sorting: The Quadtree (src/quadtree.ts) spatially sorts the raw Float32Array of positions at build-time.
Index-Based Access: Leaves store only the index ranges into the sorted array, pointing directly to the data sent to the GPU.
Fast Lookups: Used for both frustum culling and efficient "find node under cursor" calculations.
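
The idea of sorting the position buffer in place and storing only index ranges can be sketched as follows. This is a simplified illustration, not the code in src/quadtree.ts: the `QuadNode` shape, the function names, the interleaved [x0, y0, x1, y1, ...] layout, and the depth cap are all assumptions.

```typescript
interface QuadNode {
  minX: number; minY: number; maxX: number; maxY: number;
  start: number;               // index of the first point in the sorted array
  count: number;               // number of points in this subtree
  children: QuadNode[] | null; // null for leaves
}

// Swap points i and j in an interleaved [x0, y0, x1, y1, ...] array.
function swapPoints(pos: Float32Array, i: number, j: number): void {
  const x = pos[2 * i], y = pos[2 * i + 1];
  pos[2 * i] = pos[2 * j]; pos[2 * i + 1] = pos[2 * j + 1];
  pos[2 * j] = x; pos[2 * j + 1] = y;
}

// Move points whose coordinate on `axis` (0 = x, 1 = y) is below `split` to
// the front of [start, end). Returns the index of the first point >= split.
function partition(pos: Float32Array, start: number, end: number,
                   axis: number, split: number): number {
  let i = start;
  for (let j = start; j < end; j++) {
    if (pos[2 * j + axis] < split) { swapPoints(pos, i, j); i++; }
  }
  return i;
}

// Recursively reorder `pos` so that every node owns one contiguous range.
function build(pos: Float32Array, start: number, end: number,
               minX: number, minY: number, maxX: number, maxY: number,
               leafSize: number, depth = 0): QuadNode {
  const node: QuadNode = { minX, minY, maxX, maxY, start, count: end - start, children: null };
  if (end - start <= leafSize || depth >= 16) return node; // depth cap is arbitrary
  const midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
  const xSplit = partition(pos, start, end, 0, midX);
  const yLow = partition(pos, start, xSplit, 1, midY);
  const yHigh = partition(pos, xSplit, end, 1, midY);
  node.children = [
    build(pos, start, yLow, minX, minY, midX, midY, leafSize, depth + 1),
    build(pos, yLow, xSplit, minX, midY, midX, maxY, leafSize, depth + 1),
    build(pos, xSplit, yHigh, midX, minY, maxX, midY, leafSize, depth + 1),
    build(pos, yHigh, end, midX, midY, maxX, maxY, leafSize, depth + 1),
  ];
  return node;
}

// Collect the non-empty leaves whose bounds overlap the viewport rectangle.
function queryLeaves(node: QuadNode, vMinX: number, vMinY: number,
                     vMaxX: number, vMaxY: number, out: QuadNode[]): void {
  if (node.count === 0) return;
  if (node.maxX < vMinX || node.minX > vMaxX ||
      node.maxY < vMinY || node.minY > vMaxY) return;
  if (node.children === null) { out.push(node); return; }
  for (const child of node.children) queryLeaves(child, vMinX, vMinY, vMaxX, vMaxY, out);
}
```

Because each leaf is just a `start`/`count` pair into the same buffer that is uploaded to the GPU, a visible leaf translates directly into a draw range. Any per-node side data (IDs, labels) would need to be permuted alongside the positions, which this sketch leaves out. A zero-allocation frame loop would also write results into pre-allocated storage rather than pushing onto `out`.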

3. WebGL 2 High-Performance Renderer

The renderer (src/renderer.ts) is built for maximum throughput:

WEBGL_multi_draw Extension: Batches multiple leaf nodes into single draw calls, minimizing CPU overhead.
Zero-Allocation Render Loop: The frame loop uses pre-allocated typed arrays to prevent GC pauses.
Dynamic Level of Detail (LOD):
- Points: Always visible, with adaptive density based on zoom.
- Lines: Automatically rendered when zoomed in deep enough to see individual relationships (< 20k visible nodes).
- Selection: Interactive selection of nodes highlights immediate neighbors (incoming/outgoing edges).
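
To make the batching and LOD rules above more concrete, here is a hedged sketch of how visible leaf ranges could be packed into pre-allocated draw lists. The names `LeafRange`, `fillDrawLists`, `shouldDrawLines`, and `MAX_DRAWS` are hypothetical; only the 20k threshold comes from the description above. Merging adjacent ranges as shown is only valid when leaves are drawn at full detail.

```typescript
interface LeafRange { start: number; count: number; }

// Allocated once and reused every frame, so the render loop creates no garbage.
const MAX_DRAWS = 4096; // arbitrary capacity for this sketch
const firsts = new Int32Array(MAX_DRAWS);
const counts = new Int32Array(MAX_DRAWS);

// Pack leaf ranges into the two lists, merging ranges that are contiguous in
// the sorted buffer. Returns the number of draw entries written.
function fillDrawLists(leaves: readonly LeafRange[],
                       firstsOut: Int32Array, countsOut: Int32Array): number {
  let n = 0;
  for (let i = 0; i < leaves.length; i++) {
    const leaf = leaves[i];
    if (leaf.count === 0) continue;
    if (n > 0 && firstsOut[n - 1] + countsOut[n - 1] === leaf.start) {
      countsOut[n - 1] += leaf.count; // extend the previous entry
    } else {
      if (n === firstsOut.length) break; // lists are full; drop the rest
      firstsOut[n] = leaf.start;
      countsOut[n] = leaf.count;
      n++;
    }
  }
  return n;
}

// Lines are only drawn once few enough nodes are in view.
const LINE_NODE_THRESHOLD = 20_000;
function shouldDrawLines(visibleNodes: number): boolean {
  return visibleNodes < LINE_NODE_THRESHOLD;
}
```

Assuming `gl.getExtension("WEBGL_multi_draw")` returns a non-null `ext`, the lists can then be submitted in one call: `ext.multiDrawArraysWEBGL(gl.POINTS, firsts, 0, counts, 0, drawCount)`, where `drawCount` is the value returned by `fillDrawLists`.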

🚦 Getting Started

Prerequisites

Docker and Docker Compose
Node.js (for local development)

Deployment

The project includes a docker-compose.yml that spins up both the AnzoGraph database and the visualizer app.

# Start the services
docker-compose up -d

# Inside the app container, the following will run automatically:
# 1. Fetch data from AnzoGraph (fetch_from_db.ts)
# 2. Compute the 2D layout (compute_layout.ts)
# 3. Start the Vite development server

The app will be available at http://localhost:5173.

🖱 Interactions

Drag: Pan the view.
Scroll: Zoom in/out at the cursor position.
Click: Select a node to see its URI/Label and highlight its neighbors.
HUD: Real-time stats on FPS, nodes drawn, and current sampling ratio.
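
Zooming at the cursor position means the world point under the cursor stays fixed while the scale changes. A minimal sketch of that math, assuming a hypothetical `View` with a uniform scale and a screen-space offset (screen = world * scale + offset), which may not match the project's actual camera representation:

```typescript
interface View { scale: number; offsetX: number; offsetY: number; }

// Convert a screen position to world coordinates under the given view.
function screenToWorld(view: View, sx: number, sy: number): [number, number] {
  return [(sx - view.offsetX) / view.scale, (sy - view.offsetY) / view.scale];
}

// Scale the view by `factor` while keeping the world point under the cursor
// at the same screen position.
function zoomAt(view: View, cursorX: number, cursorY: number, factor: number): View {
  return {
    scale: view.scale * factor,
    offsetX: cursorX - (cursorX - view.offsetX) * factor,
    offsetY: cursorY - (cursorY - view.offsetY) * factor,
  };
}
```

A wheel handler would call `zoomAt` with a factor above 1 to zoom in and below 1 to zoom out. Panning by dragging only adds the pointer delta to `offsetX` and `offsetY`.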

TODO

Positioning: Use a better algorithm to position nodes, minimizing edge crossings while keeping the graph compact.
Positioning: Decide how to handle entities that are both instances and classes.
Functionality: Find every piece of equipment that has a specific property or participates in a specific process.
Functionality: Find every piece of equipment that is connected to a well.
Functionality: Show every connection within a specified depth.
Functionality: Show every element of a specific class.
