# Graph Transport Alternatives

## Purpose

This document compares alternatives to the current `/api/graph` transport format with two goals:

1. reduce the cost of building, transferring, and decoding very large graph payloads
2. move the frontend transport shape closer to the renderer/GPU input shape while preserving all data that the current frontend and backend pipeline still need

This analysis is based on the current repo state plus official documentation for browser fetch/streaming and the candidate transport formats.
## Executive Summary

The current bottleneck is not the renderer's typed-array path. It is the browser's need to fully materialize a huge JSON object graph before the renderer ever runs.

The best candidates for this repo are:

1. **Custom binary columnar payload**
   - Best fit for the current renderer.
   - Lowest decode overhead.
   - Most direct path from backend memory to frontend typed arrays.
   - Requires custom protocol/versioning work.

2. **Apache Arrow IPC**
   - Best off-the-shelf columnar binary format.
   - Very good fit for typed-array-heavy rendering.
   - Strong option if you want a standard format instead of inventing one.
   - Heavier conceptual/tooling footprint than a custom binary envelope.

3. **Columnar JSON**
   - Easiest migration.
   - Better than today's row-oriented JSON.
   - Still fundamentally JSON, so it does not remove the browser's JSON parse/object-materialization cost.

4. **NDJSON / streamed chunked JSON**
   - Good if progressiveness matters.
   - Better than one giant monolithic JSON document.
   - Still weaker than a binary/columnar format for this renderer.

The strongest overall recommendation is:

- **Long-term**: custom binary columnar payload or Arrow IPC
- **Low-risk interim**: columnar JSON, possibly with chunking/streaming

Not recommended as the primary solution for this repo:

- row-oriented MessagePack
- Protocol Buffers as one giant message
## Verified Current Pipeline

### Backend side

The backend builds a `GraphResponse` and caches it in memory:

- `backend_go/models.go`
- `backend_go/snapshot_service.go`
- `backend_go/graph_snapshot.go`

The response shape is:

```go
type GraphResponse struct {
    Nodes         []Node
    Edges         []Edge
    RouteSegments []RouteSegment
    Meta          *GraphMeta
}
```

and it is currently written as one JSON document with:

```go
json.NewEncoder(w).Encode(v)
```

in `backend_go/http_helpers.go`.

### Frontend side

The frontend currently does:

1. `fetch("/api/graph?...")`
2. `await graphRes.json()`
3. read `graph.nodes`, `graph.edges`, `graph.route_segments`, `graph.meta`
4. build:
   - `Float32Array xs`
   - `Float32Array ys`
   - `Uint32Array vertexIds`
   - `Uint32Array edgeData`
   - `Float32Array routeLineVertices`
5. call `renderer.init(xs, ys, vertexIds, edgeData, routeLineVertices)`

Relevant files:

- `frontend/src/App.tsx`
- `frontend/src/renderer.ts`

This means the current browser path is:

- wire bytes
- JSON text/body handling
- JS arrays of node/edge objects
- typed arrays
- renderer-side typed arrays/maps/GPU buffers

The expensive part happens before step 4.
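The conversion in step 4 amounts to copying fields out of row objects into typed arrays. A minimal sketch of that step follows; the interface names and the interleaved `[source, target, ...]` packing of `edgeData` are assumptions for illustration, and the real code lives in `frontend/src/App.tsx`:

```typescript
// Hypothetical minimal version of the current row-objects-to-typed-arrays step.
// Field names mirror the wire format; the concrete shapes are assumptions.
interface GraphNode { id: number; x: number; y: number }
interface GraphEdge { source: number; target: number }

function buildRenderArrays(nodes: GraphNode[], edges: GraphEdge[]) {
  const xs = new Float32Array(nodes.length);
  const ys = new Float32Array(nodes.length);
  const vertexIds = new Uint32Array(nodes.length);
  for (let i = 0; i < nodes.length; i++) {
    xs[i] = nodes[i].x;
    ys[i] = nodes[i].y;
    vertexIds[i] = nodes[i].id;
  }
  // Assumed packing: interleaved [source, target, source, target, ...].
  const edgeData = new Uint32Array(edges.length * 2);
  for (let i = 0; i < edges.length; i++) {
    edgeData[i * 2] = edges[i].source;
    edgeData[i * 2 + 1] = edges[i].target;
  }
  return { xs, ys, vertexIds, edgeData };
}
```

The loops themselves are cheap; the cost is that `nodes` and `edges` must already exist as millions of JS objects before this function can run.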
## Verified Data Access Audit

This section verifies every field currently produced by the backend and whether it is actually needed by the frontend transport.

### Main graph response fields

| Field | Produced in backend | Used by frontend? | Where used | Required on wire for current UX? | Notes |
| --- | --- | --- | --- | --- | --- |
| `nodes[].id` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Used to build `vertexIds`, and to map selected renderer indices back to backend IDs for selection queries. |
| `nodes[].x` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Used to build `xs`. |
| `nodes[].y` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Used to build `ys`. |
| `nodes[].iri` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes, if keeping current hover UX | Used for hover tooltip text. |
| `nodes[].label` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes, if keeping current hover UX | Used for hover tooltip text. |
| `nodes[].termType` | `backend_go/models.go` | No | none in `frontend/src` | No | Still needed internally by the backend snapshot/selection index. |
| `edges[].source` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Used to build `edgeData`. |
| `edges[].target` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Used to build `edgeData`. |
| `edges[].predicate_id` | `backend_go/models.go` | No main-graph use | none in `frontend/src/App.tsx` | No | Still needed internally by the backend snapshot and hierarchy layout preparation. |
| `route_segments[].points` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes, when route segments are present | Used to build `routeLineVertices`. |
| `route_segments[].edge_index` | `backend_go/models.go` | Not used after parsing | `graphRouteSegmentArray` validation only | No | Could be dropped from the frontend transport if route lines are pre-flattened. |
| `route_segments[].kind` | `backend_go/models.go` | Not used after parsing | `graphRouteSegmentArray` validation only | No | Could be dropped from the frontend transport if route lines are pre-flattened. |
| `meta.backend` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Displayed in overlay. |
| `meta.nodes` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Displayed in overlay. |
| `meta.edges` | `backend_go/models.go` | Yes | `frontend/src/App.tsx` | Yes | Displayed in overlay. |
| `meta.graph_query_id` | `backend_go/models.go` | Yes | `frontend/src/selection_queries/api.ts` | Yes | Sent back on selection endpoints. |
| `meta.node_limit` | `backend_go/models.go` | Yes | `frontend/src/selection_queries/api.ts` | Yes | Sent back on selection endpoints. |
| `meta.edge_limit` | `backend_go/models.go` | Yes | `frontend/src/selection_queries/api.ts` | Yes | Sent back on selection endpoints. |
| `meta.ttl_path` | `backend_go/models.go` | No | none in `frontend/src` | No | The frontend type declares it, but the current UI does not use it. |
| `meta.sparql_endpoint` | `backend_go/models.go` | No | none in `frontend/src` | No | Not used by the current UI. |
| `meta.include_bnodes` | `backend_go/models.go` | No | none in `frontend/src` | No | Not used by the current UI. |
| `meta.layout_engine` | `backend_go/models.go` | No | none in `frontend/src` | No | Not used by the current UI. |
| `meta.layout_root_iri` | `backend_go/models.go` | No | none in `frontend/src` | No | Not used by the current UI. |
| `meta.predicates` | `backend_go/models.go` | No | none in `frontend/src` | No | Still used internally by backend selection/hierarchy logic. |

### Backend-internal fields that do not need to stay in the frontend transport

This is the most important audit result.

The backend currently reuses one struct for:

- the internal cached snapshot
- the HTTP response payload

That is convenient, but it means the frontend receives fields that only the backend needs.

Verified internal-only dependencies:

- `snapshot.Nodes[].TermType` is used in `backend_go/selection_query.go` to build the selection index.
- `snapshot.Meta.Predicates` is used in `backend_go/selection_query.go`.
- `Edge.PredicateID` is used internally for hierarchy layout preparation in `backend_go/hierarchy_layout_bridge.go`.

The frontend does **not** need those fields for current behavior.

### What the frontend actually needs

For the current graph view, the hot path can be reduced to:

- `vertexIds[]`
- `xs[]`
- `ys[]`
- `edgeSources[]`
- `edgeTargets[]`
- `routeLineVertices[]` or an equivalent route geometry
- `label[]` and `iri[]` indexed by node
- `meta.backend`
- `meta.nodes`
- `meta.edges`
- `meta.graph_query_id`
- `meta.node_limit`
- `meta.edge_limit`

That is much closer to a columnar or binary payload than to the current array-of-objects JSON.
## Why the Current JSON Path Hurts

`Response.json()` is not just a lightweight decode helper. MDN states that `Response.json()` reads the stream to completion and resolves with the result of parsing the body text as JSON into a JavaScript object.

That matters here because the current payload is row-oriented:

- millions of node objects
- millions of edge objects

Even though the renderer ultimately wants typed arrays, the browser must first create all of those JS objects.

This is exactly the part that can stall or run out of memory before `renderer.init(...)` starts.
## Alternatives

### 1. Columnar JSON

#### Idea

Keep JSON, but change the schema from row-oriented objects:

```json
{
  "nodes": [{ "id": 1, "x": 0.1, "y": 0.2, ... }],
  "edges": [{ "source": 1, "target": 2, ... }]
}
```

to column-oriented arrays:

```json
{
  "vertex_ids": [...],
  "xs": [...],
  "ys": [...],
  "edge_sources": [...],
  "edge_targets": [...],
  "node_labels": [...],
  "node_iris": [...],
  "route_line_vertices": [...],
  "meta": { ... }
}
```

#### Pros

- easiest migration from the current API contract
- no schema compiler
- easy to debug with ordinary tooling
- much closer to what the renderer already consumes
- avoids creating per-edge objects in frontend application code

#### Cons

- still goes through JSON parsing
- still materializes JS arrays before typed arrays are built
- huge numeric arrays in JSON are still text, not binary
- string columns are still ordinary JS strings

#### Fit for current pipeline

Good.

No current frontend feature would be lost if the payload includes:

- ids/xs/ys/edge sources/targets
- labels/iris
- route line vertices or an equivalent
- the small subset of meta fields currently used

#### Overall assessment

Best low-risk intermediate step.

It is clearly better than today's row-oriented JSON, but it is not the endgame if the goal is to remove the parse bottleneck for 1 GB+ payloads.
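With a columnar schema, the frontend decode collapses into a handful of typed-array constructions with no per-node or per-edge objects. A sketch, assuming the key names from the proposed schema above:

```typescript
// Hypothetical columnar payload shape as it would arrive from /api/graph.
interface ColumnarGraph {
  vertex_ids: number[];
  xs: number[];
  ys: number[];
  edge_sources: number[];
  edge_targets: number[];
}

// Build renderer inputs directly from columns; JSON.parse still runs,
// but no row objects are ever materialized by application code.
function decodeColumnar(g: ColumnarGraph) {
  return {
    vertexIds: Uint32Array.from(g.vertex_ids),
    xs: Float32Array.from(g.xs),
    ys: Float32Array.from(g.ys),
    edgeSources: Uint32Array.from(g.edge_sources),
    edgeTargets: Uint32Array.from(g.edge_targets),
  };
}
```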
### 2. NDJSON / Chunked JSON

#### Idea

Change the backend to stream multiple JSON records instead of one giant JSON object.

Examples:

- one line per chunk of nodes/edges
- one line for metadata
- one line per route-segment chunk

NDJSON is explicitly designed for transporting multiple JSON texts in a stream protocol.

#### Pros

- can start processing before the whole payload arrives
- better observability and progress reporting
- easier cancellation/retry semantics
- avoids one monolithic `Response.json()` boundary

#### Cons

- record-per-edge NDJSON would still create far too many JS objects
- to be worth it here, it should be **chunked columnar NDJSON**, not row NDJSON
- the frontend load path must become stream-based
- the renderer currently expects all arrays at once

#### Fit for current pipeline

Moderate.

It can preserve all current information, but it does not by itself satisfy the "final representation should look like GPU inputs" goal unless each chunk is already columnar.

#### Best shape if chosen

Not:

- one JSON object per edge
- one JSON object per node

Better:

- one NDJSON record for metadata
- then NDJSON records where each record contains columnar chunks:
  - `vertex_ids_chunk`
  - `xs_chunk`
  - `ys_chunk`
  - `edge_sources_chunk`
  - `edge_targets_chunk`

#### Overall assessment

Viable, but only attractive if progressiveness is a major goal. On its own, it is weaker than columnar binary formats for this renderer.
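The chunked-columnar shape can be sketched on the client side as an accumulator over NDJSON records; the chunk key names follow the list above and are assumptions, and a real implementation would read `response.body` through a `TextDecoderStream` and split on newlines incrementally rather than taking a pre-split array:

```typescript
// Accumulates columnar NDJSON chunk records into growing plain arrays,
// then finalizes them as typed arrays once the stream ends.
interface ChunkRecord {
  vertex_ids_chunk?: number[];
  xs_chunk?: number[];
  ys_chunk?: number[];
}

function accumulateChunks(lines: string[]) {
  const vertexIds: number[] = [];
  const xs: number[] = [];
  const ys: number[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue; // tolerate a trailing newline
    const rec: ChunkRecord = JSON.parse(line);
    if (rec.vertex_ids_chunk) vertexIds.push(...rec.vertex_ids_chunk);
    if (rec.xs_chunk) xs.push(...rec.xs_chunk);
    if (rec.ys_chunk) ys.push(...rec.ys_chunk);
  }
  // One final copy into the typed arrays the renderer consumes.
  return {
    vertexIds: Uint32Array.from(vertexIds),
    xs: Float32Array.from(xs),
    ys: Float32Array.from(ys),
  };
}
```

Each record stays small enough that the per-record `JSON.parse` cost is bounded, which is exactly what the monolithic document lacks.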
### 3. MessagePack

#### Idea

Use a compact binary encoding instead of JSON.

The official JavaScript implementation supports:

- `encode`
- `decode`
- `decodeAsync(stream)`
- `decodeArrayStream(stream)`
- `decodeMultiStream(stream)`

and even custom extension types for faster handling of large `Float32Array` payloads.

#### Pros

- smaller payload than JSON
- binary transport
- async and stream-capable decoding APIs exist
- mature JS library

#### Cons

- if you keep the current row-oriented schema, you still get one huge object graph after decode
- therefore MessagePack alone does not remove the fundamental object-allocation problem
- custom extension types improve the typed-array cases, but at that point you are already halfway to designing a custom binary protocol

#### Fit for current pipeline

Moderate.

It can preserve all current information easily.

But if the schema remains object-heavy, the browser still ends up with millions of JS objects.

#### Overall assessment

Useful if paired with a **columnar** schema. Not compelling as a first move if the schema stays row-oriented.
### 4. Apache Arrow IPC

#### Idea

Use Arrow's columnar binary format and Arrow JS support.

Arrow JS provides:

- `tableFromIPC(...)`
- support for `fetch(...)`
- typed-array-backed vectors
- dictionary-encoded strings
- a columnar memory model explicitly meant for efficient processing and movement of large in-memory data

#### Pros

- strongest off-the-shelf fit for typed-array-oriented rendering
- columnar by design
- binary rather than textual
- supports large numeric columns very naturally
- supports dictionary encoding for repeated strings such as labels or IRIs
- much closer to the renderer/GPU input shape than JSON objects

#### Cons

- larger conceptual/tooling jump than columnar JSON
- route segments are nested and variable-length; representing them cleanly needs design work
- frontend code becomes Arrow-aware unless the decode is hidden behind an adapter
- the backend must serialize Arrow on the Go side or produce Arrow-compatible IPC

#### Fit for current pipeline

Very good.

The current frontend needs can be represented as columns:

- `vertex_ids: uint32`
- `xs: float32`
- `ys: float32`
- `edge_sources: uint32`
- `edge_targets: uint32`
- `labels: utf8` or dictionary-encoded utf8
- `iris: utf8` or dictionary-encoded utf8

Route geometry should probably not stay as nested route-segment objects. It would fit better as:

- a pre-flattened `route_line_vertices` float column/buffer
- or a second Arrow table dedicated to line segments

#### Overall assessment

One of the two best solutions for this repo.

If you want a standard format instead of inventing one, Arrow is the most attractive candidate.
### 5. FlatBuffers

#### Idea

Use a schema-defined binary format designed for direct access without unpacking/parsing.

FlatBuffers explicitly advertises:

- access to serialized data without parsing/unpacking
- memory efficiency and speed
- forwards/backwards compatibility

#### Pros

- very strong memory-efficiency story
- schema evolution support
- no full parse/unpack step of the kind JSON requires
- can model both scalars and more complex structures

#### Cons

- requires a schema, a compiler, and generated bindings
- JavaScript integration is more manual than JSON or Arrow
- ergonomics in app code are not as simple as arrays/objects
- strings and nested route structures are supported, but the developer experience is more specialized

#### Fit for current pipeline

Good, technically.

It can preserve all current information and remove the giant object-graph parse step.

However, compared with Arrow or a custom binary envelope, it is a less natural conceptual fit for a renderer whose hot path is already columnar/typed-array-based.

#### Overall assessment

A strong technical option, but probably not the most ergonomic one for this specific frontend.
### 6. Protocol Buffers

#### Idea

Use a schema-defined binary format with generated bindings.

#### Pros

- compact binary encoding
- schema/versioning
- mature ecosystem

#### Cons

- the official docs describe protobuf as a good fit for typed structured messages up to a few megabytes
- the same docs warn that large data can require loading entire messages into memory and can cause multiple copies
- large repeated numeric arrays are not protobuf's sweet spot
- still not especially close to the renderer's typed-array model

#### Fit for current pipeline

Poor for this specific payload size and shape.

#### Overall assessment

Not recommended for this main graph transport.
### 7. Custom Binary Typed-Array Envelope

#### Idea

Define a transport specifically around what the renderer and hover/selection pipeline need.

Example structure:

- a small fixed header or small JSON header:
  - version
  - counts
  - offsets/lengths
  - meta subset
- then raw binary buffers:
  - `vertex_ids`
  - `xs`
  - `ys`
  - `edge_sources`
  - `edge_targets`
  - `route_line_vertices`
  - string dictionary / offsets for `label` and `iri`

#### Pros

- closest possible fit to the current renderer
- no schema compiler required
- no row-object materialization
- easiest path to zero-copy or near-zero-copy arrays on the frontend
- easiest path to worker transfer via `ArrayBuffer`
- can cleanly separate hot render data from cold metadata

#### Cons

- a custom protocol to design, version, validate, and document
- less tooling/interoperability than Arrow
- backend and frontend both need careful binary codecs

#### Fit for current pipeline

Excellent.

You can preserve all current behavior while sending only the data the frontend actually uses.

#### Overall assessment

The best performance-oriented fit if you are comfortable owning a custom format.
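To make the "small JSON header + raw buffers" idea concrete, here is a round-trip sketch of one possible layout. Everything about it — field names, the space-padded JSON header, the column order — is an assumption for illustration, not an existing protocol; a real version would also carry edge and route-line sections plus validation:

```typescript
// Assumed layout:
//   [u32 headerLen (LE)]
//   [headerLen bytes of UTF-8 JSON, space-padded so the payload starts
//    on a 4-byte boundary]
//   [xs: f32 * nodeCount][ys: f32 * nodeCount][vertexIds: u32 * nodeCount]

function encodeEnvelope(xs: Float32Array, ys: Float32Array, ids: Uint32Array): ArrayBuffer {
  let json = JSON.stringify({ version: 1, nodeCount: xs.length });
  while ((4 + json.length) % 4 !== 0) json += " "; // keep f32/u32 views aligned
  const headerBytes = new TextEncoder().encode(json);
  const buf = new ArrayBuffer(4 + headerBytes.length + xs.byteLength + ys.byteLength + ids.byteLength);
  new DataView(buf).setUint32(0, headerBytes.length, true);
  new Uint8Array(buf, 4, headerBytes.length).set(headerBytes);
  let off = 4 + headerBytes.length;
  new Float32Array(buf, off, xs.length).set(xs); off += xs.byteLength;
  new Float32Array(buf, off, ys.length).set(ys); off += ys.byteLength;
  new Uint32Array(buf, off, ids.length).set(ids);
  return buf;
}

function decodeEnvelope(buf: ArrayBuffer) {
  const headerLen = new DataView(buf).getUint32(0, true);
  const header = JSON.parse(new TextDecoder().decode(new Uint8Array(buf, 4, headerLen)));
  const n: number = header.nodeCount;
  let off = 4 + headerLen;
  // Zero-copy views over the response buffer; no per-node objects.
  const xs = new Float32Array(buf, off, n); off += n * 4;
  const ys = new Float32Array(buf, off, n); off += n * 4;
  const vertexIds = new Uint32Array(buf, off, n);
  return { version: header.version as number, xs, ys, vertexIds };
}
```

On the frontend this would pair with `await response.arrayBuffer()` (or a streamed read) instead of `response.json()`, and the decoded views can be handed to `renderer.init(...)` without copying.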
## Comparison Table

| Option | Closeness to GPU shape | Avoids giant object graph | Supports all current frontend data | Streaming-friendly | Implementation cost | Recommendation |
| --- | --- | --- | --- | --- | --- | --- |
| Current row JSON | Poor | No | Yes | Poor | Already done | Replace |
| Columnar JSON | Medium | No | Yes | Medium | Low | Good interim |
| NDJSON chunked columnar JSON | Medium | Partially | Yes | Good | Medium | Situational |
| MessagePack row-oriented | Poor | No | Yes | Good | Medium | Not enough alone |
| MessagePack columnar | Medium | Partially | Yes | Good | Medium | Viable but secondary |
| Arrow IPC | Very high | Mostly yes | Yes | Good | Medium-high | Strong candidate |
| FlatBuffers | High | Yes | Yes | Medium | High | Good but specialized |
| Protobuf | Low-medium | No practical win here | Yes | Medium | Medium-high | Not recommended |
| Custom binary typed-array envelope | Very high | Yes | Yes | Good | High | Strongest fit |
## Recommended Data Contract Shapes

### Recommended shape for any non-row-oriented solution

The frontend does not need node/edge objects as its primary graph transport.

The main graph payload should be modeled as:

- `vertex_ids`
- `xs`
- `ys`
- `edge_sources`
- `edge_targets`
- `route_line_vertices`
- `node_labels`
- `node_iris`
- `meta`

This can be represented as:

- columnar JSON
- Arrow columns
- FlatBuffers vectors
- custom binary sections

### Fields that can be removed from the frontend transport immediately

Without changing current visible behavior, the main graph transport does not need to include:

- `nodes[].termType`
- `edges[].predicate_id`
- `meta.predicates`
- `meta.ttl_path`
- `meta.sparql_endpoint`
- `meta.include_bnodes`
- `meta.layout_engine`
- `meta.layout_root_iri`
- `route_segments[].edge_index`
- `route_segments[].kind`

Important:

Some of those fields are still needed by the backend's **internal snapshot**, especially for selection queries and hierarchy layout. That argues for splitting:

- the internal snapshot model
- the frontend transport DTO

instead of continuing to reuse one struct for both.
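The slimmed frontend-facing DTO can be sketched as a TypeScript type; all names are assumptions that mirror the column list and the audited meta subset above:

```typescript
// Hypothetical transport DTO: only what the renderer, hover UX,
// and selection endpoints actually consume.
interface GraphTransportMeta {
  backend: string;
  nodes: number;
  edges: number;
  graph_query_id: string;
  node_limit: number;
  edge_limit: number;
}

interface GraphTransport {
  vertex_ids: Uint32Array;
  xs: Float32Array;
  ys: Float32Array;
  edge_sources: Uint32Array;
  edge_targets: Uint32Array;
  route_line_vertices: Float32Array;
  node_labels: string[]; // indexed by node position
  node_iris: string[];   // indexed by node position
  meta: GraphTransportMeta;
}
```

Whether the columns arrive as JSON arrays, Arrow vectors, or raw binary sections, the decode adapter would produce this one shape, keeping the rest of the frontend transport-agnostic.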
## Additional Architectural Notes

### A worker is complementary, not a transport format

Web Workers can move parsing/build work off the main thread, and `ArrayBuffer` is transferable. That is useful, but it does not by itself solve the current over-allocation problem if the payload is still one giant row-oriented JSON document.

Workers are most valuable when paired with:

- binary columnar payloads
- streamed columnar chunks
- transfer of `ArrayBuffer`s rather than giant JS object graphs
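Transfer semantics can be demonstrated without a worker via `structuredClone` with a transfer list, which uses the same mechanism as a `postMessage` transfer: the buffer is moved, not copied, and the source is left detached.

```typescript
// Moving an ArrayBuffer instead of cloning it: after the transfer the
// source buffer is detached (byteLength becomes 0). postMessage to a
// worker with a transfer list behaves the same way.
const payload = new Float32Array([0.5, 1.5, 2.5]);
const moved = structuredClone(payload.buffer, { transfer: [payload.buffer] });

const view = new Float32Array(moved);
console.log(view[2]);                   // 2.5 — the data arrived intact
console.log(payload.buffer.byteLength); // 0 — the original is detached
```

This is why a binary columnar payload composes so well with a worker: the worker can decode the response and hand the finished typed-array buffers to the main thread at effectively zero cost, which is impossible for a giant row-object graph.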
### The backend can keep a richer internal snapshot than it sends

This repo already caches snapshots server-side. Selection and triple queries are built from the backend snapshot plus the small `graphMeta` values sent back by the client.

That means the frontend transport can be much slimmer than the backend snapshot representation, as long as the backend retains its richer internal data.

This is the cleanest way to avoid losing information while optimizing the frontend transport.
## Final Recommendation

### Best long-term option

Pick one of:

1. **Custom binary typed-array envelope**
2. **Apache Arrow IPC**

Reason:

- both map naturally to the renderer's actual input model
- both avoid the giant row-object parse path
- both can preserve all current frontend-visible information

### Best low-risk migration path

If you want an incremental step before going binary:

1. split the backend internal snapshot from the frontend transport DTO
2. move `/api/graph` to **columnar JSON**
3. keep only the metadata fields the frontend actually uses
4. later replace the same columnar DTO with Arrow or a custom binary format

That path reduces waste immediately and keeps the eventual binary migration straightforward.
## Sources

Official documentation and primary sources used for the comparison:

- MDN `Response.json()`
  - https://developer.mozilla.org/en-US/docs/Web/API/Response/json
- MDN `TextDecoderStream`
  - https://developer.mozilla.org/en-US/docs/Web/API/TextDecoderStream
- MDN Web Workers
  - https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers
- MDN Transferable Objects
  - https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects
- Apache Arrow JavaScript
  - https://arrow.apache.org/js/current/
  - https://arrow.apache.org/js/main/functions/Arrow.dom.tableFromIPC.html
- NDJSON specification
  - https://github.com/ndjson/ndjson-spec
- MessagePack for JavaScript
  - https://github.com/msgpack/msgpack-javascript
- FlatBuffers overview and JavaScript docs
  - https://flatbuffers.dev/
  - https://flatbuffers.dev/languages/javascript/
- Protocol Buffers overview
  - https://protobuf.dev/overview/
- Streaming JSON parser references
  - https://github.com/juanjoDiaz/streamparser-json
  - https://rictic.github.io/jsonriver/