Features

Everything you need to run a research lab.

14 capabilities. Each one complete, not a prototype. Independent research at institution scale.

Start your lab · Explore labs
01
Architecture

Three surfaces. One lab.

Web app · Desktop app · CLI/TUI — each is the full research IDE.

Your work doesn't care which window you're in. Pick up a session in the browser, continue it in the desktop app, or drive it from the terminal. The orchestrator on Fly.io keeps everything in sync — experiments, papers, agents, memory — across all three surfaces in real time.

The Web app is the full research IDE in your browser. No install. Open it on any machine. The Desktop app is the same IDE compiled to a native macOS app — file drop, dock badge, system notifications, hubify:// deep links. The CLI/TUI is a ~120-command Go binary with a bubbletea TUI that mirrors every web view.

Web app
Full IDE · no install · any browser
Desktop app
Full IDE · native macOS · file drop
CLI · TUI
~120 commands · bubbletea TUI
02
Compute

AI-native experiment dispatch.

The orchestrator routes every job to the cheapest credible compute target.

Every experiment dispatch answers four questions in order: (1) Does this task have a tensor operation in the hot path? (2) How long will it run? (3) Is there an existing pod with free capacity? (4) Does the job have checkpoint discipline? The router answers these questions automatically — you never log into RunPod directly.

The cost difference is real: a GPU pod running CPU work costs 10-20× more than a CPU pod running the same job. The §41 router catches these mismatches before they happen. Before any job that processes > 10,000 items or runs > 30 minutes, the orchestrator runs a micro-batch test to measure throughput, GPU/CPU utilization, and VRAM high-water mark, then extrapolates the full-run cost.
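
A minimal sketch of that routing logic, in TypeScript (every name and threshold here is illustrative, not the real router API):

```ts
// Hypothetical sketch of the four-question dispatch check.
// Field and function names are illustrative, not the real router API.
type Job = {
  tensorOpInHotPath: boolean; // Q1: is there a tensor op in the hot path?
  estMinutes: number;         // Q2: how long will it run?
  checkpointed: boolean;      // Q4: does it flush state periodically?
};

type Pod = { gpu: boolean; freeSlots: number };

function route(job: Job, pods: Pod[]): string {
  const needsGpu = job.tensorOpInHotPath; // Rule 1: CPU vs GPU, biggest cost lever
  if (!job.checkpointed && job.estMinutes > 10) {
    // Rule 4: refuse long jobs without checkpoint discipline
    throw new Error("refuse dispatch: no checkpointing on a long-running job");
  }
  // Q3 / Rule 3: reuse an existing pod with free capacity first
  const existing = pods.find((p) => p.gpu === needsGpu && p.freeSlots > 0);
  if (existing) return "reuse-existing-pod";
  // Q2 / Rule 2: short jobs go serverless, long jobs get a pod (threshold assumed)
  if (job.estMinutes < 30) return "serverless";
  return needsGpu ? "new-gpu-pod" : "new-cpu-pod";
}
```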

Rule 1
CPU vs GPU — biggest cost lever
Rule 2
Pod vs Serverless — the duration question
Rule 3
Reuse existing pods before spinning new ones
Rule 4
Checkpoint every 10 min — no data loss
03
Always-on

The orchestrator never sleeps.

Fly.io machine. Cron every 5 min. Standups 3×/day. Idle-GPU watchdog.

The orchestrator is a long-running process on Fly.io — not a serverless function, not a cron trigger. It wakes up every 5 minutes to check for work. Three times a day it runs a standup: it reads the current experiment queue, paper status, agent health, and credit balance, then writes a structured report. Overnight it runs the publish-ready loop on any paper that hasn't shipped.

The idle-GPU watchdog checks every 15 minutes for pods that are running but not processing any work. It sends a SIGTERM, flushes the checkpoint, and stops the pod. That alone typically saves $50-200 a month.
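
One watchdog pass, sketched under the same assumptions (flushCheckpoint and stopPod are hypothetical stand-ins for the real pod controls):

```ts
// Hypothetical watchdog pass, run every 15 minutes.
type RunningPod = { id: string; lastWorkAt: number }; // epoch ms of last processed item

const IDLE_MS = 15 * 60 * 1000;

async function watchdogPass(pods: RunningPod[]): Promise<void> {
  const now = Date.now();
  for (const pod of pods) {
    if (now - pod.lastWorkAt > IDLE_MS) {
      await flushCheckpoint(pod.id); // the SIGTERM -> checkpoint flush step
      await stopPod(pod.id);         // stop paying for the idle GPU
    }
  }
}

declare function flushCheckpoint(id: string): Promise<void>;
declare function stopPod(id: string): Promise<void>;
```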

Cron
Every 5 min — experiment queue, agent health
Standups
08:07 · 13:13 · 18:23 — structured reports
Watchdog
Idle-GPU detection + graceful pod stop
Overnight
Publish-ready loop — papers that haven't shipped
04
Security

Lab Sovereignty Rule.

Read OK. Write FORBIDDEN. Triple-enforced at CLI + MCP + API.

Every agent that runs in your lab has read access to shared cross-lab data and write access only to its own lab. The Lab Sovereignty Rule is enforced at three independent layers: the CLI checks it before any mutation, the MCP server validates it before any tool call, and the REST API rejects unauthorized writes with a 403.

Agents can read other labs' shared datasets and learnings — that's how cross-lab knowledge flows. But they cannot write to another lab's experiments, papers, or agent memory. The boundary is absolute. If an agent bug tries to write to the wrong lab, all three enforcement layers catch it independently.
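
The guard each layer implements is conceptually small. A sketch with illustrative names:

```ts
// The check all three layers implement independently; names are illustrative.
type WriteRequest = { agentLabId: string; targetLabId: string };

function assertLabSovereignty(req: WriteRequest): void {
  // Reads against the shared tier are allowed; writes must stay in-lab.
  if (req.agentLabId !== req.targetLabId) {
    throw new Error("403: Lab Sovereignty Rule, cross-lab write forbidden");
  }
}
```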

CLI layer
hubify write commands check lab ownership
MCP layer
Tool calls validated against lab_id claim
API layer
403 on unauthorized write attempts
Cross-lab read
Shared datasets + learnings — opt-in
05
Science

No echo chambers. Cross-model peer review.

Claude · gpt-5.4 · Gemini · Grok — 4 reviewers that disagree with each other.

Every paper gets reviewed by four AI models before any human reads it. The reviewers run on different model families — not just different prompts on the same model. They're designed to disagree. A paper that clears all four is genuinely robust. A paper that fails three is telling you something important.

Each review returns a structured verdict: approve / concern / reject — plus a ranked concern list, a summary, and a confidence score. The readiness score auto-computes from the review ensemble: each approval adds 10 points, each rejection subtracts 8, each concern subtracts 2. The target is 70+ before arXiv submission.
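
The scoring rule, as a sketch (the verdict shape is illustrative):

```ts
// The ensemble arithmetic as described above.
type Verdict = "approve" | "concern" | "reject";

function ensembleDelta(verdicts: Verdict[]): number {
  return verdicts.reduce((score, v) => {
    if (v === "approve") return score + 10;
    if (v === "reject") return score - 8;
    return score - 2; // concern
  }, 0);
}

// Four approvals contribute +40; since the target is 70+, the rest of the
// readiness score presumably comes from other checks (e.g. the preflight).
```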

Claude Opus 4.7
Primary reviewer · deep reasoning
gpt-5.4
Cross-reviewer · adversarial framing
Gemini 2.5 Pro
Cross-reviewer · multi-modal analysis
Grok 3
Cross-reviewer · skeptic mode
06
Protocol

Houston Method v2. Post-experiment ritual.

9 mandatory tasks. Every experiment. No exceptions.

After every experiment completes — pass or fail — the Houston Method v2 protocol fires automatically. It creates 9 mandatory follow-up tasks: write findings, generate figure, peer review, update knowledge wiki, check for anomalies, promote contributions, plan next experiment, update paper draft, and update project roadmap.

The 9-task protocol is not a suggestion. It's enforced at the platform level. The orchestrator tracks protocol completion per experiment. A lab with 0% compliance has 0 completed protocols — that doesn't mean nothing happened; it means the ritual wasn't followed. Captain view shows a live compliance bar.
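
As a sketch, the post-experiment hook amounts to a loop over nine task templates (identifiers paraphrased from the list above; createTask is a hypothetical helper):

```ts
// Sketch of the post-experiment hook firing the 9 mandatory tasks.
const HOUSTON_V2_TASKS = [
  "write-findings",
  "generate-figure",
  "trigger-peer-review",
  "update-knowledge-wiki",
  "check-anomalies",
  "promote-contributions",
  "plan-next-experiment",
  "update-paper-draft",
  "update-roadmap",
] as const;

async function onExperimentComplete(experimentId: string): Promise<void> {
  // Fires on pass or fail alike; the ritual is unconditional.
  for (const kind of HOUSTON_V2_TASKS) {
    await createTask({ experimentId, kind, mandatory: true });
  }
}

declare function createTask(t: {
  experimentId: string;
  kind: string;
  mandatory: boolean;
}): Promise<void>;
```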

Step 1
Write findings to knowledge wiki
Step 2
Generate figure from results
Step 3
Trigger cross-model peer review
Step 4
Check for anomalies + outliers
Steps 5–9
Promote · plan next · update paper · roadmap
07
Structure

The hierarchy. Locked taxonomy.

Lab → Project → Pipeline → Experiment → Task. No shortcuts.

Every piece of research lives somewhere in the hierarchy. A Lab contains Projects. A Project contains Pipelines. A Pipeline contains Experiments. An Experiment contains Tasks. This isn't organizational theater — it's how the orchestrator navigates your work, how the CLI understands context flags, and how cross-lab sharing knows what's shareable.

Above the Lab sits the Global layer — shared across all labs, read-only from any lab's agents. Below the Task is the Intent layer — conversation threads, brainstorm sessions, and notes that haven't been formalized yet. The hierarchy is the address space of your research.
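
Rendered as TypeScript types, the taxonomy reads roughly like this (a sketch, not the real Convex schema):

```ts
// The locked taxonomy as TypeScript types.
type Global = { labs: Lab[] };                    // shared tier, read-only to agents
type Lab = { slug: string; projects: Project[] }; // isolated research environment
type Project = { pipelines: Pipeline[] };         // research program
type Pipeline = { experiments: Experiment[] };    // experiment sequence
type Experiment = { tasks: Task[] };              // atomic unit of work
type Task = { intents: Intent[] };                // protocol steps
type Intent = { kind: "thread" | "brainstorm" | "note" }; // not yet formalized
```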

Global
Shared learnings · cross-lab datasets
Lab
Isolated research environment
Project → Pipeline
Research programs + experiment sequences
Experiment → Task
Atomic units of work + protocol steps
08
Intelligence

4-layer memory. Agents read the right scope.

User · Agent · Lab · Global — automatic context injection per layer.

Every agent reads from four memory layers automatically. User memory contains your preferences, research style, and explicit instructions — things you've told the platform about yourself. Agent memory contains each agent's learned patterns, mental models, and episode history — what the agent has learned from past runs. Lab memory contains the lab's shared context: scientific background, running conventions, current paper drafts. Global memory contains cross-lab learnings that have been promoted to the shared tier.

When the orchestrator picks up a new experiment, it reads from all four layers and constructs a context-rich prompt — without you having to re-explain your preferences every session. The memory layers are visible and editable in the Memory view inside the IDE.
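
A sketch of that injection step (readMemory is a hypothetical helper; the layer names come straight from the description above):

```ts
// Sketch of the four-layer context injection.
type Layer = "user" | "agent" | "lab" | "global";

async function buildContext(userId: string, agentId: string, labId: string) {
  const layers: Array<[Layer, string]> = [
    ["user", await readMemory("user", userId)],       // preferences, style
    ["agent", await readMemory("agent", agentId)],    // mental models, episodes
    ["lab", await readMemory("lab", labId)],          // background, conventions
    ["global", await readMemory("global", "shared")], // promoted learnings
  ];
  return layers.map(([name, text]) => `## ${name} memory\n${text}`).join("\n\n");
}

declare function readMemory(layer: Layer, id: string): Promise<string>;
```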

User layer
Preferences · research style · explicit instructions
Agent layer
Per-agent mental models + episode history
Lab layer
Scientific background · running conventions
Global layer
Cross-lab promoted learnings (read-only)
09
Publishing

Publish-ready loop. Five autonomous rounds.

No future-research punts. No incomplete sections. 7-point preflight.

The publish-ready loop runs 5 rounds of autonomous improvement on a paper before any human reviews it. Each round checks: abstract present, PDF compiled, target journal set, ≥2 peer reviews, readiness ≥70%, no blocking rejects, ≥3,000 words. If a round fails, the orchestrator writes a note explaining what's missing and what it did to fix it.

The loop's hard rule: no future-research punts. 'This will be studied in future work' is not an acceptable response to a reviewer concern. The orchestrator is trained to either answer the concern with available data or explicitly acknowledge the limitation with a specific reason why the current paper scope doesn't address it.
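
The 7-point preflight, expressed as a checklist function (the field names are illustrative):

```ts
// The 7-point preflight as a checklist.
type PaperState = {
  abstract?: string;
  pdfCompiled: boolean;
  targetJournal?: string;
  reviewCount: number;
  readiness: number;      // 0-100
  blockingRejects: number;
  wordCount: number;
};

function preflight(p: PaperState): string[] {
  const failures: string[] = [];
  if (!p.abstract) failures.push("abstract missing");
  if (!p.pdfCompiled) failures.push("PDF not compiled");
  if (!p.targetJournal) failures.push("no target journal set");
  if (p.reviewCount < 2) failures.push("fewer than 2 peer reviews");
  if (p.readiness < 70) failures.push("readiness below 70%");
  if (p.blockingRejects > 0) failures.push("blocking reject outstanding");
  if (p.wordCount < 3000) failures.push("under 3,000 words");
  return failures; // empty list means the round passes
}
```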

Round 1–5
Autonomous improvement pass — no human needed
Preflight
7-point checklist — abstract, PDF, journal, reviews
Hard rule
No future-research punts — address or acknowledge
Output
arXiv-ready tar.gz package with LaTeX + figures
10
Sandbox

Vibe coding sandbox.

Spin up a full research environment in 60 seconds. No config.

Every lab comes with a vibe-coding environment — a sandboxed Vercel deployment where you can iterate on lab site design, figure layouts, and paper templates with natural language. Describe what you want, the AI writes the code, the sandbox builds it, and you see the result. The Lab Sovereignty Rule applies here too — the sandbox can't write to your experiments or papers.

The sandbox isn't a toy. It's the same environment used to build the five existing lab sites (BigBounce, PTA-GW, Chirality, Dark Matter, ETI). Each has a different visual identity while sharing the same data model underneath.

Build time
60 seconds from blank to deployed site
Outputs
Lab site · paper pages · figure galleries
Safety
Sandbox cannot write to lab data
Hosting
Vercel auto-deploy from lab subdomain
11
Visualization

Activity Graph. The neural brain view.

SVG force-directed graph. Every node is a real Convex record.

The Activity Graph shows your lab as a living neural network. Agents are sage green nodes. Skills are tan. Projects and hubs are blue. Papers and contributions are lavender. Experiments are rose. Edges are drawn from agents to the experiments they've run, from papers to the experiments that contributed data, from skills to the agents that hold them.

The graph is force-directed: nodes repel each other, edges pull them together. Pulsing dots travel along edges in real time — each dot represents a recent action logged to the activity feed. Hover any node to see its Convex data in an info card.
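
Conceptually, each animation frame runs the classic repel/attract step. A minimal sketch, not the production SVG renderer:

```ts
// One step of a force-directed layout.
type GraphNode = { x: number; y: number; vx: number; vy: number };
type GraphEdge = [number, number]; // indices into the node array

function tick(nodes: GraphNode[], edges: GraphEdge[], repel = 500, spring = 0.01) {
  // Every pair of nodes pushes apart, fading with squared distance.
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const dx = nodes[j].x - nodes[i].x;
      const dy = nodes[j].y - nodes[i].y;
      const d2 = dx * dx + dy * dy || 1;
      const f = repel / d2;
      nodes[i].vx -= f * dx; nodes[i].vy -= f * dy;
      nodes[j].vx += f * dx; nodes[j].vy += f * dy;
    }
  }
  // Every edge pulls its endpoints together like a spring.
  for (const [a, b] of edges) {
    const dx = nodes[b].x - nodes[a].x;
    const dy = nodes[b].y - nodes[a].y;
    nodes[a].vx += spring * dx; nodes[a].vy += spring * dy;
    nodes[b].vx -= spring * dx; nodes[b].vy -= spring * dy;
  }
  // Integrate positions with damping so the layout settles.
  for (const n of nodes) {
    n.x += n.vx; n.y += n.vy;
    n.vx *= 0.9; n.vy *= 0.9;
  }
}
```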

Agents
#5EE89A sage — orchestrators + leads + workers
Papers
#B88AE8 lavender — papers + contributions
Experiments
#E878A0 rose — experiment nodes
Live pulses
Real-time activity dots on edges
12
Network

Cross-lab comm gateway.

Labs talk to each other. Shared datasets. Promoted learnings.

The Lab Sovereignty Rule says agents can't write to other labs — but they can read from them. The cross-lab gateway is the bus that makes this safe. A lab can opt in to sharing specific datasets, learnings, and contributions. Other labs' agents can query the shared tier — not the full lab, just the promoted shared slice.

The gateway is not a message-passing system. It's a read-only query bus with opt-in sharing. The result is a network of labs that can learn from each other without compromising data sovereignty. The Dossier view shows the cross-lab stats table: experiments, papers, agents, contributions, and novelty per lab.
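
What a cross-lab read looks like in sketch form (the gateway interface is illustrative; the query names mirror shared.listDatasets and shared.listLearnings):

```ts
// Illustrative gateway interface for the read-only query bus.
type SharedDataset = { id: string; name: string };
type SharedLearning = { id: string; summary: string };

interface SharedGateway {
  listDatasets(opts: { lab: string }): Promise<SharedDataset[]>;   // shared.listDatasets
  listLearnings(opts: { lab: string }): Promise<SharedLearning[]>; // shared.listLearnings
  // No write methods: the gateway is a read-only query bus.
}

async function borrowKnowledge(gateway: SharedGateway, fromLab: string) {
  const datasets = await gateway.listDatasets({ lab: fromLab });
  const learnings = await gateway.listLearnings({ lab: fromLab });
  return { datasets, learnings }; // the promoted shared slice, nothing more
}
```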

Shared datasets
Opt-in dataset sharing — public subset
Shared learnings
Promoted insights — read-only cross-lab
Gateway query
shared.listDatasets + shared.listLearnings
Sovereignty
Lab data stays in its own boundary
13
Protocol

MCP server. Agents drive the platform.

28+ tools. Cursor-compatible. Any MCP-aware client.

The MCP server exposes the full Hubify Labs platform as a set of AI-callable tools. An agent can list experiments, create tasks, trigger peer review, query the knowledge wiki, dispatch GPU pods, and read lab memory — all through the MCP protocol. Any MCP-aware client (Cursor, Claude Desktop, your own agent) can drive the platform.

The MCP server runs as a separate process alongside the Fly.io orchestrator. It uses the same Convex client as the web app — there's no separate database or API to keep in sync. Mutations go directly to Convex; queries come back in real time.
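
A minimal sketch of what one of those tools could look like with the MCP TypeScript SDK (the tool name and the backing query are illustrative, not Hubify's actual tool set):

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "hubify-labs", version: "1.0.0" });

// One hypothetical tool out of the 28+: list a lab's experiments.
server.tool("experiments_list", { lab: z.string() }, async ({ lab }) => ({
  content: [{ type: "text", text: JSON.stringify(await listExperiments(lab)) }],
}));

await server.connect(new StdioServerTransport());

// Stand-in for the Convex-backed query the real server would make.
declare function listExperiments(lab: string): Promise<unknown>;
```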

28+ tools
experiments · papers · agents · tasks · figures · datasets
Protocol
Model Context Protocol — JSON-RPC 2.0 over stdio
Clients
Cursor · Claude Desktop · any MCP-aware agent
Transport
Same Convex backend as web app — no sync lag
14
Terminal

CLI · TUI. `hubify` Go binary.

~120 commands. --lab flag. Every web view mirrored in the terminal.

The `hubify` CLI is a compiled Go binary (~120 commands) with a bubbletea TUI that mirrors every view in the web app. `hubify experiments list`, `hubify papers list`, `hubify agents status`, `hubify tasks add`, `hubify pods list` — all backed by the same Convex API as the web app. The `--lab` flag targets any lab by slug or ID.

The TUI mode (`hubify tui`) renders a full interactive terminal UI: sidebar navigation, view panels, chat input, live Convex subscriptions. You can drive your entire research lab from a terminal window — useful when the web app isn't available, when you're SSH'd into a pod, or when you just prefer the keyboard.

~120 commands
experiments · papers · pods · agents · tasks · surveys
--lab flag
Target any lab by slug or ID
TUI mode
Full interactive terminal UI — hubify tui
Auth
CLI auth token via hubify auth login

Ready to start your lab?

Your orchestrator is already running. Your agents are already wired. You're productive on day one.

Create your lab · Read the guides