Graphistry Architecture#
See also:
Deployment Model#
Client/server model for direct users, embedding users, and admins - Live diagram
Engines and connectors - chart
Server Software Architecture#
Service Overview#
Graphistry uses a microservices architecture with GPU-accelerated graph processing. All GPU services communicate using Apache Arrow format for high-performance data exchange.
Proxy Layer#
| Service | Role |
|---|---|
| caddy | External SSL/TLS termination, primary entry point (ports 80, 443) |
| nginx | Internal reverse proxy, routes requests to backend services |
Frontend Services#
| Service | Role |
|---|---|
| nexus | Django backend API, user management, dataset metadata, file uploads |
| streamgl-viz | WebGL graph visualization frontend, WebSocket/Falcor protocol |
| pivot | Investigation interface for graph exploration |
GPU Services#
| Service | Role | GPU Technology |
|---|---|---|
| streamgl-gpu | Graph layout computation (ForceAtlas2) | OpenCL |
| forge-etl-python | GPU ETL orchestrator, data processing | CUDA/cuDF |
| dask-cuda-worker | Distributed GPU data transformation | RAPIDS/cuDF |
Infrastructure Services#
| Service | Role |
|---|---|
| postgres | Metadata storage (users, sessions, dataset metadata) |
| redis | Caching, session storage, message queuing |
| dask-scheduler | Distributed computing coordinator |
Data Flow#
Graph Visualization Pipeline#
1. Upload      User uploads data via API or UI
       |
       v
2. Ingest      nexus -> forge-etl-python (file processing)
       |
       v
3. Transform   forge-etl-python -> dask-cuda-worker (RAPIDS GPU processing)
       |
       v
4. Layout      streamgl-gpu (ForceAtlas2 GPU layout computation)
       |
       v
5. Render      streamgl-viz -> Browser (WebGL visualization)
Request Flow#
Browser -> caddy:443 -> nginx -> backend services
|
+-- nexus:8000 (API, auth)
+-- streamgl-viz:8080 (visualization)
+-- forge-etl-python:8080 (data)
+-- streamgl-gpu:8080 (layout)
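As a rough illustration of the request flow above, the internal backend addresses can be written down as a small lookup. The service names and ports come from the diagram. Resolving a service by its name on the Docker network, the `BACKENDS` mapping, and the `internal_url` helper are assumptions made for this sketch and are not part of Graphistry.

```python
# Internal backends and their ports, as listed in the request flow diagram.
BACKENDS = {
    "nexus": 8000,
    "streamgl-viz": 8080,
    "forge-etl-python": 8080,
    "streamgl-gpu": 8080,
}


def internal_url(service: str, path: str = "/") -> str:
    """Build the plain-HTTP URL for a backend (hypothetical helper).

    Assumes the service name resolves as a hostname on the internal network.
    """
    port = BACKENDS[service]  # raises KeyError for services not in the diagram
    return f"http://{service}:{port}/{path.lstrip('/')}"


print(internal_url("nexus", "/api"))
```

A lookup like this is only useful for reasoning about the topology or for health-check scripts run inside the network; external clients always go through caddy on port 443.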
Service Dependencies#
Startup Order#
Services start in dependency order:
1. Infrastructure: postgres, redis
2. Distributed Computing: dask-scheduler, dask-cuda-worker
3. Core API: nexus
4. GPU Services: forge-etl-python, streamgl-gpu
5. Applications: streamgl-viz, pivot, notebook
6. Proxies: nginx, caddy
Runtime Dependencies#
| Service | Depends On |
|---|---|
| nexus | postgres |
| forge-etl-python | dask-scheduler, dask-cuda-worker, postgres, redis |
| dask-cuda-worker | dask-scheduler, GPU |
| streamgl-gpu | GPU (OpenCL) |
| streamgl-viz | nexus, forge-etl-python |
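The runtime dependencies can be checked against the startup order with a topological sort. The sketch below is illustrative only: it uses Python's standard-library `graphlib` rather than anything Graphistry ships, and the `DEPENDS_ON` mapping and `startup_order` function are hypothetical names. The GPU entries in the table are hardware requirements, so they are left out of the graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Service -> services it depends on, copied from the runtime dependency table.
DEPENDS_ON = {
    "nexus": {"postgres"},
    "forge-etl-python": {"dask-scheduler", "dask-cuda-worker", "postgres", "redis"},
    "dask-cuda-worker": {"dask-scheduler"},
    "streamgl-gpu": set(),  # depends only on GPU hardware
    "streamgl-viz": {"nexus", "forge-etl-python"},
}


def startup_order(depends_on):
    """Return one valid start order in which every service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = startup_order(DEPENDS_ON)
print(order)
```

Several orders satisfy the same constraints, so the exact sequence printed may differ from the startup list while still being valid. A cycle in the mapping would raise `graphlib.CycleError`.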
GPU Resource Management#
Per-Service GPU Assignment#
Each GPU service can be assigned to specific GPUs via environment variables:
# Global default
CUDA_VISIBLE_DEVICES=0,1,2,3
# Per-service overrides
FORGE_CUDA_VISIBLE_DEVICES=0,1 # forge-etl-python
STREAMGL_CUDA_VISIBLE_DEVICES=2,3 # streamgl-gpu
DCW_CUDA_VISIBLE_DEVICES=0,1 # dask-cuda-worker
See Environment Variables for details.
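To make the override behavior concrete, here is a minimal sketch of how the effective GPU list for a service could be resolved from these variables. It assumes that an unset or empty per-service variable falls back to `CUDA_VISIBLE_DEVICES` and that devices are given as integer indices. The `OVERRIDE_VARS` mapping and `visible_gpus` function are hypothetical and do not reflect Graphistry's actual implementation.

```python
import os

# Per-service override variable names, taken from the example above.
OVERRIDE_VARS = {
    "forge-etl-python": "FORGE_CUDA_VISIBLE_DEVICES",
    "streamgl-gpu": "STREAMGL_CUDA_VISIBLE_DEVICES",
    "dask-cuda-worker": "DCW_CUDA_VISIBLE_DEVICES",
}


def visible_gpus(service, env=None):
    """Return the GPU indices for a service.

    Assumption: the per-service override wins when set and non-empty;
    otherwise the global CUDA_VISIBLE_DEVICES value is used.
    """
    env = os.environ if env is None else env
    raw = env.get(OVERRIDE_VARS[service]) or env.get("CUDA_VISIBLE_DEVICES", "")
    return [int(part) for part in raw.split(",") if part.strip()]


example_env = {
    "CUDA_VISIBLE_DEVICES": "0,1,2,3",
    "STREAMGL_CUDA_VISIBLE_DEVICES": "2,3",
}
print(visible_gpus("streamgl-gpu", example_env))      # [2, 3]
print(visible_gpus("forge-etl-python", example_env))  # [0, 1, 2, 3]
```

Passing the environment as a dictionary keeps the function easy to test without touching the real process environment.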
Multi-Worker Configuration#
Workers are distributed across GPUs using round-robin assignment:
CUDA_VISIBLE_DEVICES=0,1,2,3
FORGE_NUM_WORKERS=16 # 4 workers per GPU
STREAMGL_NUM_WORKERS=16 # 4 workers per GPU
DASK_NUM_WORKERS=4 # 1 worker per GPU
See GPU Configuration Wizard for automated configuration.
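The round-robin distribution described above can be sketched in a few lines. This is an illustration of the general technique, not Graphistry's scheduler. The `assign_workers` function is a hypothetical name, and the worker-index-modulo-GPU-count rule is an assumption about how the cycling works.

```python
from collections import Counter


def assign_workers(num_workers, gpus):
    """Map worker index -> GPU id by cycling through the GPUs in order."""
    if not gpus:
        raise ValueError("at least one GPU is required")
    return {worker: gpus[worker % len(gpus)] for worker in range(num_workers)}


gpus = [0, 1, 2, 3]               # CUDA_VISIBLE_DEVICES=0,1,2,3
forge = assign_workers(16, gpus)  # FORGE_NUM_WORKERS=16
dask = assign_workers(4, gpus)    # DASK_NUM_WORKERS=4

print(Counter(forge.values()))  # each of the 4 GPUs receives 4 workers
print(dask)                     # {0: 0, 1: 1, 2: 2, 3: 3}
```

When the worker count is not a multiple of the GPU count, the first GPUs in the list receive one extra worker each, which is worth keeping in mind when sizing GPU memory.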
Network Architecture#
Internal Network#
All services communicate on an internal Docker network (grph_net). Only caddy exposes external ports.
Ports#
| Port | Service | Access |
|---|---|---|
| 80, 443 | caddy | External (user access) |
| 8000 | nexus | Internal |
| 8080 | streamgl-*, forge-etl-python | Internal |
| 5432 | postgres | Internal |
| 6379 | redis | Internal |
SSL/TLS#
External: Caddy handles SSL termination with automatic certificate management
Internal: Services communicate over HTTP within Docker network
Scaling#
Horizontal Scaling#
GPU workers: Increase FORGE_NUM_WORKERS, STREAMGL_NUM_WORKERS, DASK_NUM_WORKERS
Multi-GPU: Add GPUs via CUDA_VISIBLE_DEVICES
Vertical Scaling#
GPU memory: Use GPUs with more VRAM for larger graphs
CPU/RAM: Add cores and memory to support more concurrent users
See Deployment Planning for capacity guidance