Experiments
Experiments are the atomic unit of research progress — GPU-powered tasks with automatic logging, QC gates, and reproducibility tracking.
An experiment is a discrete, trackable research task with a defined lifecycle, assigned compute, quality control, and full provenance tracking.
Experiment Lifecycle
```
DRAFT → QUEUED → RUNNING → QC_GATE → COMPLETE / FAILED
```
Draft
Define the experiment: name, description, input data, expected outputs, and compute requirements. This can be done manually or by an agent interpreting your natural-language request.
Queued
The experiment enters the queue. The orchestrator assigns it to an agent and allocates a GPU pod based on compute requirements.
Running
The assigned agent executes the experiment on the allocated pod. Logs stream in real time. Intermediate results are checkpointed.
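Checkpointing of intermediate results could look like the following minimal sketch (the `save_checkpoint` helper and its file layout are illustrative, not part of the Hubify API):

```python
import json
from pathlib import Path

def save_checkpoint(step: int, state: dict, ckpt_dir: str = "checkpoints") -> Path:
    """Write an intermediate-state snapshot so a failed run can resume."""
    out_dir = Path(ckpt_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out = out_dir / f"step_{step:06d}.json"   # zero-padded so files sort by step
    out.write_text(json.dumps(state))
    return out
```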
QC Gate
Every experiment must pass a quality control gate before results are accepted. The QC gate checks:
- Output completeness (all expected files produced)
- Statistical validity (convergence, error bounds)
- Reproducibility (config + data + code are frozen)
- Cross-model review (a different model verifies the results)
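The first and third checks above can be sketched as follows (function names and signatures are illustrative, not the actual QC implementation):

```python
import hashlib
from pathlib import Path

def qc_completeness(expected: list[str], output_dir: str) -> bool:
    """Output completeness: every expected file was produced."""
    return all((Path(output_dir) / f).exists() for f in expected)

def qc_checksum(path: str, expected_sha256: str) -> bool:
    """Reproducibility: verify a file against its frozen SHA-256 checksum."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest == expected_sha256
```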
Complete / Failed
Experiments that pass QC are marked complete and their results flow into the knowledge base, paper pipeline, and lab site. Failed experiments are logged with diagnostics for debugging.
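The lifecycle above can be modeled as a small state machine. This is a sketch under the assumption that only the transitions shown in the diagram are legal:

```python
from enum import Enum

class Stage(Enum):
    DRAFT = "draft"
    QUEUED = "queued"
    RUNNING = "running"
    QC_GATE = "qc_gate"
    COMPLETE = "complete"
    FAILED = "failed"

# Legal transitions between lifecycle stages (assumed from the diagram).
TRANSITIONS = {
    Stage.DRAFT: {Stage.QUEUED},
    Stage.QUEUED: {Stage.RUNNING},
    Stage.RUNNING: {Stage.QC_GATE, Stage.FAILED},
    Stage.QC_GATE: {Stage.COMPLETE, Stage.FAILED},
    Stage.COMPLETE: set(),   # terminal
    Stage.FAILED: set(),     # terminal
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move an experiment to the next stage, rejecting illegal jumps."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```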
The Houston Method
Hubify Labs enforces a mandatory completion protocol for every experiment:
Note: Nothing is "complete" without the full sequence: QC gate → scientific analysis → interpretation → cross-survey connection → site sync → queue expansion → backup.
Every completed experiment must generate 5-15 new tasks — questions raised, follow-up analyses needed, or new hypotheses to test. This ensures the research queue never runs dry.
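The queue-expansion rule could be enforced with a check like this (a sketch; the helper name is hypothetical):

```python
def validate_queue_expansion(new_tasks: list[str]) -> list[str]:
    """Houston Method rule: each completed experiment must
    spawn 5-15 follow-up tasks so the queue never runs dry."""
    if not 5 <= len(new_tasks) <= 15:
        raise ValueError(f"expected 5-15 follow-up tasks, got {len(new_tasks)}")
    return new_tasks
```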
Experiment Properties
| Property | Description |
|---|---|
| `id` | Unique identifier (e.g., EXP-054) |
| `name` | Human-readable name |
| `status` | Current lifecycle stage |
| `agent` | Assigned agent(s) |
| `pod` | GPU pod allocation |
| `inputs` | Input datasets, configs, parameters |
| `outputs` | Result files, figures, metrics |
| `qc_score` | Quality control score (0-100) |
| `duration` | Wall-clock runtime |
| `cost` | Compute cost in USD |
| `parent` | Parent experiment (if this is a follow-up) |
Compute Allocation
When an experiment is queued, the system selects the optimal pod:
```python
# Sketch of the allocation rule: pick the pod with the lower
# estimated total cost (runtime estimates and rates are inputs).
def select_pod(est_hours: dict, usd_per_hour: dict) -> str:
    h100 = est_hours["H100"] * usd_per_hour["H100"]
    h200 = est_hours["H200"] * usd_per_hour["H200"]
    return "H100" if h100 < h200 else "H200"
```
You can override this by specifying a pod type explicitly.
Reproducibility
Every experiment automatically captures:
- Git commit of the codebase at execution time
- Exact package versions (pip freeze / conda list)
- Config files (YAML, JSON) used
- Input data checksums (SHA-256)
- Random seeds
This means any experiment can be re-run identically months or years later.
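A provenance snapshot along these lines can be captured with standard tooling. This is an illustrative sketch, not the Hubify implementation (`capture_provenance` and `sha256_of` are hypothetical names):

```python
import hashlib
import subprocess
import sys
from pathlib import Path

def sha256_of(path: str) -> str:
    """Checksum an input file for the provenance record."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def capture_provenance(data_paths: list[str], seed: int) -> dict:
    """Snapshot commit, package versions, data checksums, and seed."""
    return {
        "git_commit": subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True).strip(),
        "packages": subprocess.check_output(
            [sys.executable, "-m", "pip", "freeze"], text=True).splitlines(),
        "data_sha256": {p: sha256_of(p) for p in data_paths},
        "seed": seed,
    }
```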
CLI
```bash
# Run an experiment
hubify experiment run --name "mcmc-base" --pod h100 --config config.yaml

# Check status
hubify experiment status EXP-054

# List recent experiments
hubify experiment list --limit 20

# View logs
hubify logs EXP-054 --follow

# Rerun a failed experiment
hubify experiment rerun EXP-054
```
Chaining Experiments
Experiments can depend on each other. When experiment A completes, experiment B automatically starts with A's outputs as inputs:
```yaml
# experiment-chain.yaml
chain:
  - name: "data-preprocessing"
    pod: h100
    script: preprocess.py
  - name: "mcmc-sampling"
    pod: h200
    script: run_mcmc.py
    depends_on: "data-preprocessing"
  - name: "convergence-check"
    pod: cpu
    script: check_convergence.py
    depends_on: "mcmc-sampling"
```
```bash
hubify experiment run --chain experiment-chain.yaml
```
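Resolving `depends_on` links into a valid run order is a topological sort. A minimal sketch using the standard library (the chain format mirrors the YAML above; `execution_order` is a hypothetical helper):

```python
from graphlib import TopologicalSorter

def execution_order(chain: list[dict]) -> list[str]:
    """Order chain steps so every step runs after its dependency."""
    ts = TopologicalSorter()
    for step in chain:
        deps = [step["depends_on"]] if "depends_on" in step else []
        ts.add(step["name"], *deps)
    return list(ts.static_order())   # raises CycleError on circular deps
```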