Graphistry Architecture

Deployment Model

Client/server model for direct users, embedding users, and admins - Live diagram

Engines and connectors - chart

Server Software Architecture

Service Overview

Graphistry uses a microservices architecture with GPU-accelerated graph processing. GPU services exchange data in the Apache Arrow format, a columnar format designed for efficient data interchange between processes.

Proxy Layer

Service	Role
caddy	External SSL/TLS termination, primary entry point (ports 80, 443)
nginx	Internal reverse proxy, routes requests to backend services

Frontend Services

Service	Role
nexus	Django backend API, user management, dataset metadata, file uploads
streamgl-viz	WebGL graph visualization frontend, WebSocket/Falcor protocol
pivot	Investigation interface for graph exploration

GPU Services

Service	Role	GPU Technology
streamgl-gpu	Graph layout computation (ForceAtlas2)	OpenCL
forge-etl-python	GPU ETL orchestrator, data processing	CUDA/cuDF
dask-cuda-worker	Distributed GPU data transformation	RAPIDS/cuDF

Infrastructure Services

Service	Role
postgres	Metadata storage (users, sessions, dataset metadata)
redis	Caching, session storage, message queuing
dask-scheduler	Distributed computing coordinator

Data Flow

Graph Visualization Pipeline

1. Upload      User uploads data via API or UI
                    |
                    v
2. Ingest      nexus -> forge-etl-python (file processing)
                    |
                    v
3. Transform   forge-etl-python -> dask-cuda-worker (RAPIDS GPU processing)
                    |
                    v
4. Layout      streamgl-gpu (ForceAtlas2 GPU layout computation)
                    |
                    v
5. Render      streamgl-viz -> Browser (WebGL visualization)

Request Flow

Browser -> caddy:443 -> nginx -> backend services
                          |
                          +-- nexus:8000 (API, auth)
                          +-- streamgl-viz:8080 (visualization)
                          +-- forge-etl-python:8080 (data)
                          +-- streamgl-gpu:8080 (layout)
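
As a rough illustration of the diagram above, the internal upstreams can be written as a small lookup table. This is only a sketch: the service names and ports come from the diagram, while the `UPSTREAMS` table and the `upstream_url` helper are hypothetical names introduced here, and the actual nginx routing rules are not shown in this section. It assumes services are reachable by service name over plain HTTP on the internal Docker network.

```python
# Internal upstreams as listed in the request-flow diagram (service -> port).
UPSTREAMS = {
    "nexus": 8000,
    "streamgl-viz": 8080,
    "forge-etl-python": 8080,
    "streamgl-gpu": 8080,
}


def upstream_url(service, path="/"):
    """Build the internal HTTP URL a proxy would forward to (hypothetical helper)."""
    port = UPSTREAMS[service]  # raises KeyError for a service not in the diagram
    return f"http://{service}:{port}/{path.lstrip('/')}"


print(upstream_url("nexus"))
print(upstream_url("streamgl-gpu"))
```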

Service Dependencies

Startup Order

Services start in dependency order:

1. Infrastructure: postgres, redis
2. Distributed Computing: dask-scheduler, dask-cuda-worker
3. Core API: nexus
4. GPU Services: forge-etl-python, streamgl-gpu
5. Applications: streamgl-viz, pivot, notebook
6. Proxies: nginx, caddy

Runtime Dependencies

Service	Depends On
nexus	postgres
forge-etl-python	dask-scheduler, dask-cuda-worker, postgres, redis
dask-cuda-worker	dask-scheduler, GPU
streamgl-gpu	GPU (OpenCL)
streamgl-viz	nexus, forge-etl-python
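
The dependency table above can be turned into a valid start order with a topological sort. The following is a minimal sketch using Python's standard `graphlib` module (Python 3.9+). The `RUNTIME_DEPS` mapping and the `startup_order` function are illustrative names, not part of Graphistry, and the GPU hardware entries are left out because they are not services.

```python
from graphlib import TopologicalSorter

# Runtime dependencies copied from the table above (service -> services it needs).
# The "GPU" hardware entries are omitted because they are not services.
RUNTIME_DEPS = {
    "nexus": {"postgres"},
    "forge-etl-python": {"dask-scheduler", "dask-cuda-worker", "postgres", "redis"},
    "dask-cuda-worker": {"dask-scheduler"},
    "streamgl-gpu": set(),
    "streamgl-viz": {"nexus", "forge-etl-python"},
}


def startup_order(deps):
    """Return one valid start order: each service appears after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = startup_order(RUNTIME_DEPS)
print(order)
```

Several orders can satisfy the same constraints, so the printed list is one valid order rather than the only one. `TopologicalSorter` raises `graphlib.CycleError` if the table ever contains a circular dependency.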

GPU Resource Management

Per-Service GPU Assignment

Each GPU service can be assigned to specific GPUs via environment variables:

# Global default
CUDA_VISIBLE_DEVICES=0,1,2,3

# Per-service overrides
FORGE_CUDA_VISIBLE_DEVICES=0,1     # forge-etl-python
STREAMGL_CUDA_VISIBLE_DEVICES=2,3  # streamgl-gpu
DCW_CUDA_VISIBLE_DEVICES=0,1       # dask-cuda-worker

See Environment Variables for details.
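
To make the override behavior concrete, here is a small sketch of how an effective device list could be resolved for each service. It rests on two assumptions not confirmed by this section: that an unset or empty per-service variable falls back to the global `CUDA_VISIBLE_DEVICES`, and that devices are given as integer indices. `OVERRIDE_VARS` and `visible_devices` are hypothetical names used only for illustration.

```python
import os

# Per-service override variable names taken from the listing above.
OVERRIDE_VARS = {
    "forge-etl-python": "FORGE_CUDA_VISIBLE_DEVICES",
    "streamgl-gpu": "STREAMGL_CUDA_VISIBLE_DEVICES",
    "dask-cuda-worker": "DCW_CUDA_VISIBLE_DEVICES",
}


def visible_devices(service, env=None):
    """Resolve GPU indices for a service: per-service override, else global default.

    Assumption: an unset or empty override falls back to CUDA_VISIBLE_DEVICES.
    """
    env = os.environ if env is None else env
    raw = env.get(OVERRIDE_VARS[service]) or env.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in raw.split(",") if tok.strip()]


example_env = {
    "CUDA_VISIBLE_DEVICES": "0,1,2,3",
    "FORGE_CUDA_VISIBLE_DEVICES": "0,1",
    "STREAMGL_CUDA_VISIBLE_DEVICES": "2,3",
}
print(visible_devices("forge-etl-python", example_env))  # [0, 1]
print(visible_devices("dask-cuda-worker", example_env))  # [0, 1, 2, 3] (falls back)
```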

Multi-Worker Configuration

Workers are distributed across GPUs using round-robin assignment:

CUDA_VISIBLE_DEVICES=0,1,2,3
FORGE_NUM_WORKERS=16      # 4 workers per GPU
STREAMGL_NUM_WORKERS=16   # 4 workers per GPU
DASK_NUM_WORKERS=4        # 1 worker per GPU

See GPU Configuration Wizard for automated configuration.
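
The per-GPU counts in the comments above follow from round-robin assignment: worker i lands on GPU number i modulo the GPU count. The sketch below reproduces that arithmetic. `assign_round_robin` is an illustrative function, not Graphistry's actual scheduler code.

```python
from collections import Counter


def assign_round_robin(num_workers, gpus):
    """Map worker index -> GPU id by cycling through the GPU list."""
    if not gpus:
        raise ValueError("at least one GPU is required")
    return {w: gpus[w % len(gpus)] for w in range(num_workers)}


gpus = [0, 1, 2, 3]                    # CUDA_VISIBLE_DEVICES=0,1,2,3
forge = assign_round_robin(16, gpus)   # FORGE_NUM_WORKERS=16
dask = assign_round_robin(4, gpus)     # DASK_NUM_WORKERS=4

print(Counter(forge.values()))  # 4 workers on each of the 4 GPUs
print(Counter(dask.values()))   # 1 worker on each of the 4 GPUs
```

When the worker count is not a multiple of the GPU count, the lower-indexed GPUs in the list receive one extra worker each.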

Network Architecture

Internal Network

All services communicate on an internal Docker network (grph_net). Only caddy exposes external ports.

Ports

Port	Service	Access
80, 443	caddy	External (user access)
8000	nexus	Internal
8080	streamgl-*, forge-etl-python	Internal
5432	postgres	Internal
6379	redis	Internal

SSL/TLS

External: Caddy handles SSL termination with automatic certificate management
Internal: Services communicate over HTTP within Docker network

Scaling

Horizontal Scaling

GPU workers: Increase FORGE_NUM_WORKERS, STREAMGL_NUM_WORKERS, DASK_NUM_WORKERS
Multi-GPU: Add GPUs via CUDA_VISIBLE_DEVICES

Vertical Scaling

GPU memory: Use GPUs with more VRAM for larger graphs
CPU/RAM: More cores and memory for larger concurrent user counts

See Deployment Planning for capacity guidance.
