GPU Configuration Wizard#
Auto-configure optimal GPU settings for your Graphistry deployment based on your hardware.
Overview#
The GPU Configuration Wizard simplifies multi-GPU configuration by:
Auto-detecting available GPUs via nvidia-smi
Supporting 140+ hardware presets for cloud and on-prem environments
Generating optimal worker counts using a simple replication model
Exporting settings directly to custom.env
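As a rough illustration of the auto-detection step, GPU indices can be read with nvidia-smi's --query-gpu=index --format=csv,noheader options and joined into a comma-separated list. The helper names below are hypothetical and are not taken from gpu-config-wizard.sh; the wizard's actual detection logic may differ.

```shell
# Hypothetical helpers for illustration; not part of gpu-config-wizard.sh.

# Join one-index-per-line input into a comma-separated list (e.g. "0,1,2").
join_indices() {
  paste -sd, -
}

# Print detected GPU indices, or nothing if nvidia-smi is not installed.
detect_gpu_indices() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index --format=csv,noheader | join_indices
  fi
}
```

Calling detect_gpu_indices on a GPU host should print a list suitable for CUDA_VISIBLE_DEVICES; on a host without nvidia-smi it prints nothing.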
Usage#
# Interactive mode - displays recommended settings
./etc/scripts/gpu-config-wizard.sh
# Export mode - print to stdout (for copying to custom.env)
./etc/scripts/gpu-config-wizard.sh -E
# Export mode - append directly to custom.env file
./etc/scripts/gpu-config-wizard.sh -E ./data/config/custom.env
# Use hardware preset
./etc/scripts/gpu-config-wizard.sh -p aws-p3-8xlarge
# Custom GPU count
./etc/scripts/gpu-config-wizard.sh -n 4
# Specific GPU indices
./etc/scripts/gpu-config-wizard.sh -g 0,2,5-7
# Custom worker multipliers
./etc/scripts/gpu-config-wizard.sh -w 8,8,2
# List all available presets
./etc/scripts/gpu-config-wizard.sh -l
Options#
| Option | Description |
|---|---|
| -n | Use N GPUs (indices 0 to N-1) |
| -g | Use specific GPUs (e.g., 0,2,5-7) |
| -w | Worker multipliers: forge,streamgl,dask (default: 4,4,1) |
| -p | Use hardware preset (140+ available) |
| -E | Export env vars (stdout or append to file) |
| -l | List all available presets |
|  | Show help message |
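The GPU index syntax mixes single indices and ranges. A minimal sketch of how such a spec could be expanded, assuming ranges are inclusive (the wizard's own parsing is not shown here and may differ; expand_gpu_spec is a hypothetical name):

```shell
# Hypothetical helper: expand "0,2,5-7" into "0,2,5,6,7".
# Assumption: ranges such as 5-7 include both endpoints.
expand_gpu_spec() {
  spec=$1
  out=""
  old_ifs=$IFS
  IFS=','
  for part in $spec; do
    case $part in
      *-*)
        # Range: walk from the start index to the end index.
        i=${part%-*}
        end=${part#*-}
        while [ "$i" -le "$end" ]; do
          out="${out:+$out,}$i"
          i=$((i + 1))
        done
        ;;
      *)
        # Single index: append as-is.
        out="${out:+$out,}$part"
        ;;
    esac
  done
  IFS=$old_ifs
  printf '%s\n' "$out"
}
```

Under that assumption, expand_gpu_spec 0,2,5-7 selects five GPUs.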
Replication Model#
The wizard uses a simple replication model:
Total workers = N GPUs x multiplier
Default multipliers (per GPU):
forge-etl-python: 4 workers
streamgl-gpu: 4 workers
dask-cuda-worker: 1 worker
Running more workers than GPUs is intentional, so that several users can be served concurrently:
Multiple users can run blocking tasks simultaneously
Cached operations can be served while compute runs
Round-robin GPU assignment distributes load
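The replication arithmetic above can be sketched as a small shell function. The function name is hypothetical; the variable names and default multipliers are the ones this page documents.

```shell
# Hypothetical helper: print worker counts for N GPUs.
# $1 = GPU count, $2 = "forge,streamgl,dask" multipliers (default 4,4,1).
worker_counts() {
  gpus=$1
  mults=${2:-4,4,1}
  forge_m=${mults%%,*}      # first field
  rest=${mults#*,}
  streamgl_m=${rest%%,*}    # second field
  dask_m=${rest#*,}         # third field
  echo "FORGE_NUM_WORKERS=$((gpus * forge_m))"
  echo "STREAMGL_NUM_WORKERS=$((gpus * streamgl_m))"
  echo "DASK_NUM_WORKERS=$((gpus * dask_m))"
}
```

For example, worker_counts 4 gives 16 forge, 16 streamgl, and 4 dask workers, and worker_counts 8 2,2,1 gives 16, 16, and 8.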
Examples#
# Auto-detect GPUs, use default multipliers
./etc/scripts/gpu-config-wizard.sh
# Use 4 GPUs with default multipliers (16 forge, 16 streamgl, 4 dask)
./etc/scripts/gpu-config-wizard.sh -n 4
# Use specific GPUs with custom multipliers
./etc/scripts/gpu-config-wizard.sh -g 2,3,5-7 -w 2,2,1
# Simulate DGX A100 configuration
./etc/scripts/gpu-config-wizard.sh -p dgx-a100
# AWS p3.8xlarge preset with custom multipliers
./etc/scripts/gpu-config-wizard.sh -p aws-p3-8xlarge -w 2,2,1
# Print configuration to stdout (for copying)
./etc/scripts/gpu-config-wizard.sh -p dgx-a100 -E
# Export configuration directly to custom.env file
./etc/scripts/gpu-config-wizard.sh -p dgx-a100 -E ./data/config/custom.env
Hardware Presets#
NVIDIA DGX Systems#
| Preset | GPUs | Description |
|---|---|---|
| dgx-a100 | 8 | NVIDIA DGX A100 (8x A100 80GB) |
|  | 8 | NVIDIA DGX H100 (8x H100 80GB) |
|  | 4 | NVIDIA DGX Station A100 (4x A100 80GB) |
|  | 8 | NVIDIA DGX-1 (8x V100 32GB) |
|  | 16 | NVIDIA DGX-2 (16x V100 32GB) |
|  | 1 | NVIDIA DGX Spark ARM (1x GB200 128GB) |
|  | 8 | NVIDIA DGX B200 (8x B200 192GB) |
NVIDIA Grace Hopper / Blackwell#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | NVIDIA GH200 Grace Hopper ARM (1x GH200 96GB) |
|  | 2 | NVIDIA GH200 NVL ARM (2x GH200 96GB) |
|  | 2 | NVIDIA GB200 NVL2 ARM (2x GB200 384GB) |
|  | 72 | NVIDIA GB200 NVL72 ARM (72x GB200 13.5TB total) |
|  | 8 | NVIDIA HGX B200 (8x B200 192GB) |
|  | 8 | NVIDIA HGX B100 (8x B100 192GB) |
AWS EC2 Instances#
| Preset | GPUs | Description |
|---|---|---|
|  | 8 | AWS p5.48xlarge (8x H100 80GB) |
|  | 8 | AWS p4d.24xlarge (8x A100 40GB) |
|  | 8 | AWS p4de.24xlarge (8x A100 80GB) |
|  | 8 | AWS p3.16xlarge (8x V100 16GB) |
| aws-p3-8xlarge | 4 | AWS p3.8xlarge (4x V100 16GB) |
|  | 1 | AWS p3.2xlarge (1x V100 16GB) |
|  | 1 | AWS g5.xlarge (1x A10G 24GB) |
|  | 4 | AWS g5.12xlarge (4x A10G 24GB) |
|  | 8 | AWS g5.48xlarge (8x A10G 24GB) |
|  | 1 | AWS g4dn.xlarge (1x T4 16GB) |
|  | 4 | AWS g4dn.12xlarge (4x T4 16GB) |
|  | 1 | AWS g5g.xlarge Graviton2 ARM (1x T4 16GB) |
|  | 2 | AWS g5g.16xlarge Graviton2 ARM (2x T4 16GB) |
|  | 2 | AWS g5g.metal Graviton2 ARM (2x T4 16GB) |
Microsoft Azure#
| Preset | GPUs | Description |
|---|---|---|
|  | 8 | Azure ND96asr v4 (8x A100 40GB) |
|  | 8 | Azure ND96amsr A100 v4 (8x A100 80GB) |
|  | 8 | Azure ND96isr H100 v5 (8x H100 80GB) |
|  | 1 | Azure NC24ads A100 v4 (1x A100 80GB) |
|  | 2 | Azure NC48ads A100 v4 (2x A100 80GB) |
|  | 4 | Azure NC96ads A100 v4 (4x A100 80GB) |
|  | 1 | Azure NC6s v3 (1x V100 16GB) |
|  | 2 | Azure NC12s v3 (2x V100 16GB) |
|  | 4 | Azure NC24s v3 (4x V100 16GB) |
|  | 1 | Azure NC4as T4 v3 (1x T4 16GB) |
|  | 4 | Azure NC64as T4 v3 (4x T4 16GB) |
|  | 1 | Azure NV6ads A10 v5 (1x A10 4GB, 1/6 GPU) |
|  | 1 | Azure NV36ads A10 v5 (1x A10 24GB) |
|  | 2 | Azure NV72ads A10 v5 (2x A10 24GB) |
|  | 1 | Azure NCads H100 v5 ARM (1x H100 80GB) |
Google Cloud Platform#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | GCP a2-highgpu-1g (1x A100 40GB) |
|  | 2 | GCP a2-highgpu-2g (2x A100 40GB) |
|  | 4 | GCP a2-highgpu-4g (4x A100 40GB) |
|  | 8 | GCP a2-highgpu-8g (8x A100 40GB) |
|  | 16 | GCP a2-megagpu-16g (16x A100 40GB) |
|  | 1 | GCP a2-ultragpu-1g (1x A100 80GB) |
|  | 2 | GCP a2-ultragpu-2g (2x A100 80GB) |
|  | 4 | GCP a2-ultragpu-4g (4x A100 80GB) |
|  | 8 | GCP a2-ultragpu-8g (8x A100 80GB) |
|  | 8 | GCP a3-highgpu-8g (8x H100 80GB) |
|  | 1 | GCP n1-standard (1x T4 16GB) |
|  | 4 | GCP n1-standard (4x T4 16GB) |
|  | 1 | GCP g2-standard-4 ARM (1x L4 24GB) |
|  | 2 | GCP g2-standard-16 ARM (2x L4 24GB) |
Oracle Cloud Infrastructure#
| Preset | GPUs | Description |
|---|---|---|
|  | 4 | OCI BM.GPU.A100-v2.8 (4x A100 40GB) |
|  | 8 | OCI BM.GPU.H100.8 (8x H100 80GB) |
|  | 4 | OCI BM.GPU.A10.4 (4x A10 24GB) |
|  | 1 | OCI VM.GPU.A10.1 (1x A10 24GB) |
|  | 2 | OCI VM.GPU.A10.2 (2x A10 24GB) |
|  | 4 | OCI BM.GPU.A10.4 ARM (4x A10 24GB) |
Lambda Labs#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | Lambda Labs (1x A100 40GB) |
|  | 2 | Lambda Labs (2x A100 40GB) |
|  | 4 | Lambda Labs (4x A100 40GB) |
|  | 8 | Lambda Labs (8x A100 40GB) |
|  | 1 | Lambda Labs (1x H100 80GB) |
|  | 8 | Lambda Labs (8x H100 80GB) |
|  | 1 | Lambda Labs (1x A10 24GB) |
|  | 4 | Lambda Labs (4x RTX A6000 48GB) |
CoreWeave#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | CoreWeave (1x A100 40GB) |
|  | 1 | CoreWeave (1x A100 80GB) |
|  | 8 | CoreWeave (8x A100 40GB) |
|  | 1 | CoreWeave (1x H100 80GB) |
Workstations#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | Workstation (1x RTX 4090 24GB) |
|  | 2 | Workstation (2x RTX 4090 24GB) |
|  | 1 | Workstation (1x RTX A6000 48GB) |
|  | 2 | Workstation (2x RTX A6000 48GB) |
|  | 4 | Workstation (4x RTX A6000 48GB) |
Consumer / Hobbyist#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | Consumer (1x RTX 5090 32GB) |
|  | 2 | Consumer (2x RTX 5090 32GB) |
|  | 1 | Consumer (1x RTX 4090 24GB) |
|  | 2 | Consumer (2x RTX 4090 24GB) |
|  | 4 | Consumer (4x RTX 4090 24GB) |
|  | 1 | Consumer (1x RTX 3090 24GB) |
|  | 2 | Consumer (2x RTX 3090 24GB) |
|  | 4 | Consumer (4x RTX 3090 24GB) |
|  | 8 | Hobbyist (8x RTX 3090 24GB) |
|  | 10 | Hobbyist (10x RTX 3090 24GB) |
Rack Servers#
| Preset | GPUs | Description |
|---|---|---|
|  | 24 | Rack server (24x A100 40GB) |
|  | 32 | Rack server (32x A100 40GB) |
Development#
| Preset | GPUs | Description |
|---|---|---|
|  | 1 | Development (1x RTX 4080 16GB) |
|  | 2 | Development (2x RTX 3060 12GB) |
Supercomputers#
| Preset | GPUs | Description |
|---|---|---|
|  | 256 | NVIDIA DGX SuperPOD (32x DGX H100) |
|  | 256 | NVIDIA DGX SuperPOD (32x DGX B200) |
|  | 32 | NVIDIA DGX BasePOD (4x DGX H100) |
|  | 100000 | xAI Colossus Supercomputer (100k H100) |
|  | 16000 | Meta Research SuperCluster (16000x A100 80GB) |
|  | 14400 | Microsoft Eagle Supercomputer (14400x H100 80GB) |
Generated Configuration#
The wizard generates environment variables for data/config/custom.env:
# GPU Assignment (all services share all GPUs)
CUDA_VISIBLE_DEVICES=0,1,2,3
# Worker Configuration (N GPUs x multiplier)
FORGE_NUM_WORKERS=16
STREAMGL_NUM_WORKERS=16
DASK_NUM_WORKERS=4
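To sanity-check a generated file against the replication model, one option is to divide each worker count by the number of entries in CUDA_VISIBLE_DEVICES. This is a sketch only: implied_multipliers is a hypothetical helper, and it assumes plain KEY=VALUE lines with no quoting.

```shell
# Hypothetical check: report the per-GPU multiplier implied by an env file.
# Assumes unquoted KEY=VALUE lines; the last assignment of a key is used.
implied_multipliers() {
  env_file=$1
  devices=$(sed -n 's/^CUDA_VISIBLE_DEVICES=//p' "$env_file" | tail -n 1)
  # Count the comma-separated GPU entries.
  n_gpus=$(printf '%s\n' "$devices" | tr ',' '\n' | grep -c . || true)
  if [ "$n_gpus" -eq 0 ]; then
    echo "no CUDA_VISIBLE_DEVICES entry found in $env_file" >&2
    return 1
  fi
  for key in FORGE_NUM_WORKERS STREAMGL_NUM_WORKERS DASK_NUM_WORKERS; do
    value=$(sed -n "s/^${key}=//p" "$env_file" | tail -n 1)
    [ -n "$value" ] || continue   # skip keys that are not set
    echo "$key: $value workers / $n_gpus GPUs = $((value / n_gpus)) per GPU"
  done
}
```

Run against the sample above, this reports 4, 4, and 1 workers per GPU, matching the default multipliers.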
Applying Configuration#
After generating settings:
Copy the settings to data/config/custom.env (or use -E to export directly)
Restart GPU services:
./graphistry up --force-recreate forge-etl-python streamgl-gpu dask-cuda-worker
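Because -E with a file path appends to the file, running the wizard more than once against the same custom.env may leave repeated assignments. Which assignment takes effect depends on how the file is loaded, which this page does not specify, so it can help to look for duplicates before restarting. The helper below is hypothetical and assumes plain KEY=VALUE lines.

```shell
# Hypothetical helper: list variable names assigned more than once in an env file.
duplicate_env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort | uniq -d
}
```

For example, duplicate_env_keys ./data/config/custom.env prints nothing when every variable is assigned at most once.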
Fine-Grained Control#
For advanced tuning beyond the wizard, set these environment variables directly in data/config/custom.env:
Worker counts:
FORGE_NUM_WORKERS - forge-etl-python workers
STREAMGL_NUM_WORKERS - streamgl-gpu workers
DASK_NUM_WORKERS - dask-cuda-worker instances
Per-service GPU assignment:
FORGE_CUDA_VISIBLE_DEVICES - forge-etl-python
DCW_CUDA_VISIBLE_DEVICES - dask-cuda-worker
STREAMGL_CUDA_VISIBLE_DEVICES - streamgl-gpu
DASK_SCHEDULER_CUDA_VISIBLE_DEVICES - dask-scheduler
GAK_PUBLIC_CUDA_VISIBLE_DEVICES - graph-app-kit-public
GAK_PRIVATE_CUDA_VISIBLE_DEVICES - graph-app-kit-private
NOTEBOOK_CUDA_VISIBLE_DEVICES - notebook
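As an illustration of per-service assignment, the sketch below prints settings that pin three of the services to chosen GPU lists. The helper name and the particular split are hypothetical; only the variable names come from the list above, and the right split depends on the workload.

```shell
# Hypothetical helper: print per-service GPU pins for custom.env.
# $1 = GPUs for forge-etl-python, $2 = GPUs for streamgl-gpu,
# $3 = GPUs for dask-cuda-worker.
per_service_gpu_env() {
  echo "FORGE_CUDA_VISIBLE_DEVICES=$1"
  echo "STREAMGL_CUDA_VISIBLE_DEVICES=$2"
  echo "DCW_CUDA_VISIBLE_DEVICES=$3"
}
```

For example, per_service_gpu_env 0,1 2,3 0,1,2,3 would place forge-etl-python on GPUs 0 and 1, streamgl-gpu on GPUs 2 and 3, and let dask-cuda-worker use all four.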
See Performance Tuning - Multi-GPU for detailed configuration guidance.