# Run Your First Experiment
Launch your first GPU-powered experiment in Hubify Labs — from setup to results in minutes.
This guide walks you through running your first experiment on GPU compute. We will use a simple MCMC chain as an example, but the workflow applies to any experiment type.
## Prerequisites
- A lab (create one first if you haven't already)
- GPU compute connected (set up RunPod) — or use CPU for this tutorial
## Overview
Every experiment follows the same lifecycle:
DRAFT → QUEUED → RUNNING → QC_GATE → COMPLETE
You define it. The orchestrator queues it. An agent runs it on a GPU pod. QC validates the results. Done.
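The lifecycle above can be sketched as a simple state machine. This is a conceptual sketch only — the state names mirror the statuses shown, but the `next_state` helper is illustrative and not part of the Hubify API:

```python
from enum import Enum

class ExperimentState(Enum):
    DRAFT = "draft"
    QUEUED = "queued"
    RUNNING = "running"
    QC_GATE = "qc_gate"
    COMPLETE = "complete"

# The forward path every experiment follows (failure/retry paths omitted).
LIFECYCLE = [
    ExperimentState.DRAFT,
    ExperimentState.QUEUED,
    ExperimentState.RUNNING,
    ExperimentState.QC_GATE,
    ExperimentState.COMPLETE,
]

def next_state(state):
    """Return the next state on the happy path, or None at COMPLETE."""
    i = LIFECYCLE.index(state)
    return LIFECYCLE[i + 1] if i + 1 < len(LIFECYCLE) else None
```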
## Option 1: Natural Language
The fastest way to run an experiment is to describe it to the orchestrator.
Open the Orchestrator Chat in Captain View and type:
```
Run a test MCMC chain with 1000 samples on the base Planck dataset.
Use an H100 pod. Save the chain output and a posterior plot.
```
The orchestrator will:
1. Create the experiment (EXP-001)
2. Allocate an H100 pod
3. Assign the Research Lead
4. Execute and report back when complete
Alternatively, submit the same request from the CLI:
```bash
hubify experiment run "Test MCMC chain, 1000 samples, Planck base, H100 pod"
```
## Option 2: Structured Definition
For more control, define the experiment explicitly.
### Write a config file
Create an experiment config:
```yaml
# experiment.yaml
name: "test-mcmc-planck"
description: "Test MCMC chain on Planck base likelihood"
script: run_cobaya.py
config: planck_base.yaml
pod:
  gpu: h100
  timeout: 2h
outputs:
  - chain_samples.txt
  - posterior_plot.png
qc:
  convergence_threshold: 1.10  # Relaxed for test run
  min_samples: 1000
```
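You can sanity-check a config locally before submitting. A minimal sketch using PyYAML; the required-key set here is an assumption inferred from the example config above, not a published schema:

```python
import yaml  # third-party dependency: pip install pyyaml

# Top-level keys the submit step appears to need, inferred from the
# example config above (hypothetical -- the real schema may differ).
REQUIRED = {"name", "script", "config", "pod", "outputs"}

def check_config(text):
    """Parse an experiment config and return any missing top-level keys."""
    cfg = yaml.safe_load(text)
    return sorted(REQUIRED - cfg.keys())

# Example: a config missing its outputs list.
print(check_config("name: test\nscript: run.py\nconfig: c.yaml\npod: {gpu: h100}"))
# -> ['outputs']
```

To check your real file, pass its contents in the same way, e.g. `check_config(open("experiment.yaml").read())`.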
### Submit the experiment

```bash
hubify experiment run --file experiment.yaml
```
### Watch the logs

```bash
hubify logs EXP-001 --follow
```
You will see real-time output from the pod:
```
[10:42:01] Pod provisioned: h100-abc123
[10:42:15] Environment initialized
[10:42:20] Starting Cobaya MCMC sampler...
[10:43:05] Sample 100/1000
[10:44:12] Sample 500/1000
[10:45:30] Sample 1000/1000
[10:45:31] Chain complete. Writing output...
[10:45:35] QC gate: checking convergence...
[10:45:36] QC PASS: R-hat = 1.04 (threshold: 1.10)
[10:45:37] Experiment COMPLETE
```
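The convergence number the QC gate reports is the Gelman-Rubin R-hat statistic: it compares between-chain and within-chain variance, and values near 1.0 indicate the chains have mixed. A minimal sketch of how that number is computed (illustrative only; Hubify's actual QC implementation is not shown here):

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for an array of shape (n_chains, n_samples)."""
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)        # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()  # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return float(np.sqrt(var_hat / W))

# Four well-mixed chains drawn from the same distribution should land
# close to 1.0 and pass the 1.10 threshold used in this tutorial.
rng = np.random.default_rng(0)
chains = rng.normal(size=(4, 1000))
print(gelman_rubin(chains) < 1.10)  # -> True
```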
### Review results

```bash
# View experiment summary
hubify experiment status EXP-001

# Download outputs
hubify experiment outputs EXP-001 --download ./results/

# View in Data Explorer
hubify data open EXP-001
```
## Understanding the Output
After completion, your experiment includes:
| Output | Description |
|---|---|
| chain_samples.txt | Raw MCMC chain (space-delimited, weights in column 1) |
| posterior_plot.png | Auto-generated posterior distribution plot |
| experiment_log.txt | Full execution log |
| qc_report.json | QC gate results (convergence, completeness) |
| reproducibility.json | Git SHA, dependencies, config checksums |
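Because chain_samples.txt is space-delimited with weights in column 1, you can load it directly with NumPy. A short sketch — the three-row chain here is made up purely for illustration:

```python
import io
import numpy as np

# A tiny made-up stand-in for chain_samples.txt; real files use the same
# layout: space-delimited, weight in column 1, parameter samples after it.
chain_text = io.StringIO(
    "2 0.30 1.1\n"
    "1 0.32 1.0\n"
    "3 0.29 1.2\n"
)
raw = np.loadtxt(chain_text)
weights, samples = raw[:, 0], raw[:, 1:]

# Weighted posterior mean of each parameter column.
post_mean = np.average(samples, axis=0, weights=weights)
```

On a real run, replace the StringIO with the path to the downloaded file, e.g. np.loadtxt("results/chain_samples.txt").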
## What Happens Next
The Houston Method requires every completed experiment to generate follow-up tasks:
- Scientific analysis — What do the results mean?
- Knowledge base update — Record findings in the wiki
- Paper integration — Tag results for paper sections if applicable
- Queue expansion — Generate 5-15 new tasks based on what was learned
The orchestrator handles this automatically after QC passes.
## Troubleshooting
<AccordionGroup>

<Accordion title="Experiment stuck in QUEUED">

Check that compute is connected and pods are available:

```bash
hubify pod list
hubify pod budget
```

</Accordion>

<Accordion title="QC gate failed">

View the QC report for details:

```bash
hubify experiment qc EXP-001
```

Common fixes: increase sample count, check input data, adjust convergence threshold.

</Accordion>

<Accordion title="Pod crashed mid-experiment">

Resume from the last checkpoint:

```bash
hubify experiment resume EXP-001 --from-checkpoint latest
```

</Accordion>

</AccordionGroup>