Experiments

Experiments are the atomic unit of research progress — GPU-powered tasks with automatic logging, QC gates, and reproducibility tracking.

An experiment is a discrete, trackable research task with a defined lifecycle, assigned compute, a quality-control gate, and full provenance tracking.

Experiment Lifecycle

DRAFT → QUEUED → RUNNING → QC_GATE → COMPLETE / FAILED
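The lifecycle above can be sketched as a small state machine. This is an illustrative transition map, not part of the Hubify Labs API; the state names follow the diagram.

```python
# Legal lifecycle transitions, keyed by current state.
TRANSITIONS = {
    "DRAFT": {"QUEUED"},
    "QUEUED": {"RUNNING"},
    "RUNNING": {"QC_GATE"},
    "QC_GATE": {"COMPLETE", "FAILED"},
    "COMPLETE": set(),   # terminal
    "FAILED": set(),     # terminal
}

def advance(state: str, target: str) -> str:
    """Move to `target` only if the lifecycle allows it."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {target}")
    return target
```

Note that an experiment cannot jump straight from RUNNING to COMPLETE: results are only accepted after the QC gate.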

Draft

Define the experiment: name, description, input data, expected outputs, and compute requirements. This can be done manually or by an agent interpreting your natural-language request.

Queued

The experiment enters the queue. The orchestrator assigns it to an agent and allocates a GPU pod based on compute requirements.

Running

The assigned agent executes the experiment on the allocated pod. Logs stream in real time. Intermediate results are checkpointed.

QC Gate

Every experiment must pass a quality control gate before results are accepted. The QC gate checks:

  • Output completeness (all expected files produced)
  • Statistical validity (convergence, error bounds)
  • Reproducibility (config + data + code are frozen)
  • Cross-model review (a different model verifies the results)
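A minimal sketch of how such a gate could be driven: every named check must pass, and any failure blocks acceptance. The check names mirror the list above, but the function and its signature are assumptions for illustration, not the real Hubify Labs API.

```python
from typing import Callable, Dict, List, Tuple

def run_qc_gate(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every check; return (passed, names of failed checks)."""
    failed = [name for name, check in checks.items() if not check()]
    return (not failed, failed)

# Stub checks standing in for the real validators:
checks = {
    "output_completeness": lambda: True,
    "statistical_validity": lambda: True,
    "reproducibility": lambda: True,
    "cross_model_review": lambda: False,  # a second model flagged the results
}
passed, failed = run_qc_gate(checks)
# passed is False; failed == ["cross_model_review"]
```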

Complete / Failed

Experiments that pass QC are marked complete and their results flow into the knowledge base, paper pipeline, and lab site. Failed experiments are logged with diagnostics for debugging.

The Houston Method

Hubify Labs enforces a mandatory completion protocol for every experiment:

Note: Nothing is "complete" without: QC gate → scientific analysis → interpretation → cross-survey connection → site sync → queue expansion → backup.

Every completed experiment must generate 5-15 new tasks — questions raised, follow-up analyses needed, or new hypotheses to test. This ensures the research queue never runs dry.
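The queue-expansion rule above could be enforced with a simple bounds check at completion time. `enqueue_followups` is a hypothetical helper sketched here for illustration; only the 5-15 range comes from the protocol.

```python
MIN_FOLLOWUPS, MAX_FOLLOWUPS = 5, 15  # per the Houston Method

def enqueue_followups(parent_id: str, tasks: list) -> list:
    """Accept a completed experiment's follow-up tasks, enforcing the 5-15 rule."""
    if not MIN_FOLLOWUPS <= len(tasks) <= MAX_FOLLOWUPS:
        raise ValueError(
            f"{parent_id}: expected {MIN_FOLLOWUPS}-{MAX_FOLLOWUPS} "
            f"follow-up tasks, got {len(tasks)}"
        )
    # Each follow-up records its parent so provenance chains stay intact.
    return [{"parent": parent_id, "description": t} for t in tasks]
```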

Experiment Properties

Property    Description
--------    -----------
id          Unique identifier (e.g., EXP-054)
name        Human-readable name
status      Current lifecycle stage
agent       Assigned agent(s)
pod         GPU pod allocation
inputs      Input datasets, configs, parameters
outputs     Result files, figures, metrics
qc_score    Quality control score (0-100)
duration    Wall-clock runtime
cost        Compute cost in USD
parent      Parent experiment (if this is a follow-up)
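One way the properties above might map onto a record type. The field names mirror the table; the dataclass itself is illustrative, not the system's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Experiment:
    id: str                            # e.g., "EXP-054"
    name: str                          # human-readable name
    status: str                        # current lifecycle stage
    agent: List[str]                   # assigned agent(s)
    pod: str                           # GPU pod allocation
    inputs: dict = field(default_factory=dict)   # datasets, configs, parameters
    outputs: dict = field(default_factory=dict)  # result files, figures, metrics
    qc_score: Optional[float] = None   # 0-100, set after the QC gate
    duration: Optional[float] = None   # wall-clock runtime
    cost: Optional[float] = None       # compute cost in USD
    parent: Optional[str] = None       # parent experiment id, if a follow-up

exp = Experiment(id="EXP-054", name="mcmc-base", status="QUEUED",
                 agent=["agent-1"], pod="h100")
```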

Compute Allocation

When an experiment is queued, the system selects the optimal pod:

# Allocate whichever pod minimizes estimated total cost
def select_pod(estimated_runtime, cost_per_hour):
    def total_cost(pod):
        return estimated_runtime[pod] * cost_per_hour[pod]
    return "H100" if total_cost("H100") < total_cost("H200") else "H200"

You can override this by specifying a pod type explicitly.

Reproducibility

Every experiment automatically captures:

  • Git commit of the codebase at execution time
  • Exact package versions (pip freeze / conda list)
  • Config files (YAML, JSON) used
  • Input data checksums (SHA-256)
  • Random seeds

This means any experiment can be re-run identically months or years later.
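A minimal sketch of this provenance capture, using only the Python standard library. The manifest layout and function name are assumptions for illustration; the real system's format may differ.

```python
import hashlib
import random
import subprocess
import sys

def capture_manifest(config_paths, data_paths, seed=None):
    """Freeze code version, packages, configs, data checksums, and seed."""
    seed = seed if seed is not None else random.randrange(2**32)

    def sha256(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    return {
        # Git commit of the codebase at execution time
        "git_commit": subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True).stdout.strip(),
        # Exact package versions
        "packages": subprocess.run(
            [sys.executable, "-m", "pip", "freeze"],
            capture_output=True, text=True).stdout.splitlines(),
        # Config files and input data, pinned by SHA-256
        "configs": {p: sha256(p) for p in config_paths},
        "data_checksums": {p: sha256(p) for p in data_paths},
        "seed": seed,
    }
```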

CLI

# Run an experiment
hubify experiment run --name "mcmc-base" --pod h100 --config config.yaml

# Check status
hubify experiment status EXP-054

# List recent experiments
hubify experiment list --limit 20

# View logs
hubify logs EXP-054 --follow

# Rerun a failed experiment
hubify experiment rerun EXP-054

Chaining Experiments

Experiments can depend on each other. When experiment A completes, experiment B automatically starts with A's outputs as inputs:

# experiment-chain.yaml
chain:
  - name: "data-preprocessing"
    pod: h100
    script: preprocess.py
  - name: "mcmc-sampling"
    pod: h200
    script: run_mcmc.py
    depends_on: "data-preprocessing"
  - name: "convergence-check"
    pod: cpu
    script: check_convergence.py
    depends_on: "mcmc-sampling"

Run the chain with:

hubify experiment run --chain experiment-chain.yaml
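Under the hood, the `depends_on` entries define a dependency graph, and a valid run order is a topological sort of that graph. Here is a sketch using the standard library's `graphlib`, with chain data mirroring experiment-chain.yaml above:

```python
from graphlib import TopologicalSorter

# Steps from experiment-chain.yaml, reduced to names and dependencies.
chain = [
    {"name": "data-preprocessing"},
    {"name": "mcmc-sampling", "depends_on": "data-preprocessing"},
    {"name": "convergence-check", "depends_on": "mcmc-sampling"},
]

# Map each step to the set of steps it waits on.
graph = {
    step["name"]: {step["depends_on"]} if "depends_on" in step else set()
    for step in chain
}
order = list(TopologicalSorter(graph).static_order())
# order == ["data-preprocessing", "mcmc-sampling", "convergence-check"]
```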