Compute
GPU compute in Hubify Labs — RunPod integration, pod management, cost optimization, and the GPU inference playbook.
Hubify Labs gives you on-demand access to high-end GPU compute for running experiments. Currently powered by RunPod, with Modal serverless functions coming soon.
Supported Hardware
| GPU | VRAM | Best For | Cost Range |
|---|---|---|---|
| H200 | 141 GB | Large-scale MCMC, foundation model inference, multi-survey sweeps | $$$ |
| H100 | 80 GB | Training runs, medium MCMC chains, anomaly detection | $$ |
| A100 | 80 GB | General GPU compute, smaller models | $ |
| CPU | N/A | Data preprocessing, analysis, lightweight tasks | Free tier |
Pod Lifecycle
Provision
When an experiment needs a GPU, Hubify provisions a pod on RunPod. The system selects the optimal GPU type based on the experiment's memory and compute requirements.
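The selection step can be sketched as "cheapest GPU whose VRAM fits the job". This is an illustrative sketch only, using the VRAM figures from the hardware table above; the `pick_gpu` helper and its ordering are assumptions, not Hubify's actual scheduler.

```python
# Hypothetical sketch of VRAM-based GPU selection. The real scheduler
# also weighs compute requirements and current availability.
GPUS = [
    ("A100", 80),    # (name, VRAM in GB), ordered cheapest first
    ("H100", 80),
    ("H200", 141),
]

def pick_gpu(required_vram_gb: float) -> str:
    """Return the cheapest GPU whose VRAM fits the experiment."""
    for name, vram in GPUS:
        if vram >= required_vram_gb:
            return name
    raise ValueError(f"no supported GPU has {required_vram_gb} GB of VRAM")

print(pick_gpu(60))   # fits on an A100
print(pick_gpu(100))  # needs an H200
```

A job needing 60 GB lands on the A100; one needing 100 GB skips both 80 GB cards and gets the H200.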
Initialize
The pod boots with your lab's environment: dependencies installed, data mounted, SSH keys configured.
Execute
Your experiment runs on the pod. Logs stream in real time. Intermediate results checkpoint to persistent storage.
Teardown
When the experiment completes (or fails), the pod is torn down automatically. Results are saved to your lab before teardown.
Cost Optimization
Hubify automatically optimizes for cost:
```
total_cost = runtime_hours * cost_per_hour

H200 finishes in 1 hour  at $4/hr → $4
H100 finishes in 3 hours at $2/hr → $6
→ System picks H200 (cheaper overall)
```
You can set a monthly budget cap per lab. When you approach the limit, experiments queue instead of launching, and you get a notification.
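The comparison above reduces to "minimize total cost, not hourly rate". A minimal sketch, using the numbers from the example (the `candidates` dict and helper are illustrative, not Hubify's implementation):

```python
# Hypothetical sketch: the faster, pricier GPU can be cheaper overall.
candidates = {
    "H200": {"runtime_hours": 1, "cost_per_hour": 4.0},
    "H100": {"runtime_hours": 3, "cost_per_hour": 2.0},
}

def total_cost(spec):
    return spec["runtime_hours"] * spec["cost_per_hour"]

best = min(candidates, key=lambda gpu: total_cost(candidates[gpu]))
print(best, total_cost(candidates[best]))  # H200 wins at $4 despite the higher rate
```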
GPU Inference Playbook
Warning: Always use `torch.utils.data.DataLoader` with `num_workers=16`, `pin_memory=True`, and `prefetch_factor=4` for image/data inference. This gives a 32x speedup over serial processing.
Key rules from the playbook:
- Never use serial PIL decoding for batch image processing
- Never use `ProcessPoolExecutor` for GPU-bound work
- Never use HuggingFace streaming for production inference
- Always pin memory and prefetch for GPU DataLoaders
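A playbook-compliant loader looks roughly like this. The dataset and batch size are placeholders for your own data; only the three loader flags come from the playbook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset standing in for your real decoded images.
dataset = TensorDataset(torch.randn(1024, 3, 224, 224))

loader = DataLoader(
    dataset,
    batch_size=64,        # placeholder; tune to your model and VRAM
    num_workers=16,       # parallel decoding instead of serial PIL
    pin_memory=True,      # pinned host memory for faster host-to-GPU copies
    prefetch_factor=4,    # keep batches staged ahead of the GPU
)
```

With workers decoding in parallel and batches prefetched, the GPU never waits on the input pipeline.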
Persistent Storage
Each lab gets persistent storage that survives pod teardowns:
- `/workspace/` on pods maps to your lab's persistent volume
- Experiment outputs are automatically synced back to the lab
- Datasets can be pre-staged in persistent storage for fast access
SSH Access
Every running pod is accessible via SSH for debugging:
```bash
# Get SSH command for a running pod
hubify pod ssh EXP-054

# Direct SSH (shown in pod details)
ssh root@205.196.19.52 -p 11452
```
Idle Pod Detection
Note: An idle GPU is treated as a violation. Hubify monitors pod utilization, alerts you when a pod is sitting idle, and suggests the next experiment to deploy on it.
If a pod finishes its assigned experiment and no follow-up is queued, the system:
- Alerts you that the pod is idle
- Suggests experiments from the queue that could use this pod
- Auto-deploys the next experiment if you have auto-schedule enabled
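The idle-handling policy above can be sketched as a small decision function. The utilization threshold, queue shape, and return values are all assumptions for illustration, not Hubify's real implementation.

```python
# Hypothetical sketch of idle-pod handling.
IDLE_UTIL_THRESHOLD = 0.05   # assume <5% GPU utilization counts as idle

def handle_idle_pod(utilization, queue, auto_schedule=False):
    """Return the action taken for a pod at the given utilization."""
    if utilization >= IDLE_UTIL_THRESHOLD:
        return "busy"                    # pod is working; do nothing
    if not queue:
        return "alert"                   # idle and nothing queued: alert only
    if auto_schedule:
        return f"deploy:{queue[0]}"      # auto-deploy the next queued experiment
    return f"suggest:{queue[0]}"         # alert plus a suggested experiment

print(handle_idle_pod(0.90, ["EXP-055"]))        # busy
print(handle_idle_pod(0.01, ["EXP-055"]))        # suggest:EXP-055
print(handle_idle_pod(0.01, ["EXP-055"], True))  # deploy:EXP-055
```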
CLI
```bash
# List active pods
hubify pod list

# Launch a pod manually
hubify pod create --gpu h100 --hours 4

# Check pod status
hubify pod status pod-abc123

# SSH into a pod
hubify pod ssh pod-abc123

# Terminate a pod
hubify pod stop pod-abc123

# View cost summary
hubify pod cost --month current
```
Coming Soon: Modal
Modal integration will add serverless GPU functions — pay per second, no pod management. Ideal for short-lived tasks like figure generation, small inferences, and data transformations.