Multi-Agent System

A hierarchical agent team with cross-model peer review, reasoning-based routing, and full visibility into agent communication.

Hubify Labs runs a hierarchical multi-agent system designed to mirror the structure of a real research group. The system is built on three principles: hierarchy for efficiency, cross-model review for accuracy, and full transparency for trust.

Architecture

                  ┌──────────────┐
                  │   Captain    │  ← You
                  │  (Human)     │
                  └──────┬───────┘
                         │
                  ┌──────▼───────┐
                  │ Orchestrator │  ← Claude Opus 4.6
                  │  (Router)    │
                  └──┬───┬───┬───┘
            ┌────────┘   │   └────────┐
     ┌──────▼──────┐ ┌──▼────┐ ┌─────▼──────┐
     │Research Lead│ │Paper  │ │Compute Lead│
     │ (Opus 4.6)  │ │Lead   │ │ (Haiku 4.5)│
     └──┬────┬────┘ └──┬────┘ └─────┬──────┘
        │    │         │             │
     Workers Workers  Workers     Workers
     (Haiku) (Haiku)  (Haiku)     (Haiku)

Reasoning-Based Routing

Every task has a reasoning requirement. The orchestrator routes accordingly:

High reasoning

**Models:** Claude Opus 4.6, gpt-5.4

**Tasks:** Research strategy, paper drafting, peer review, novel scientific analysis, hypothesis generation, cross-survey interpretation.

**Handled by:** Orchestrator or Lead agents.

Medium reasoning

**Models:** Claude Haiku 4.5, gpt-5.4-mini

**Tasks:** Data analysis, code generation, experiment configuration, statistical testing, literature summarization.

**Handled by:** Lead agents or senior Workers.

Low reasoning

**Models:** Claude Haiku, GPT-3.5-turbo

**Tasks:** Data formatting, file management, wiki updates, figure export, LaTeX compilation, log parsing.

**Handled by:** Worker agents.
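
The routing rule above can be sketched as a lookup from task kind to tier, then from tier to a model and a handler. This is a minimal illustration, not Hubify's actual router: the tier names (`HIGH`, `MEDIUM`, `LOW`), the `TASK_TIER` mapping, the choice of the first listed model and handler, and the fallback to the high tier for unknown tasks are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Reasoning(Enum):
    # Assumed tier labels; the docs only describe three groups of models.
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Models and handlers copied from the three groups listed above.
TIERS = {
    Reasoning.HIGH: {
        "models": ["Claude Opus 4.6", "gpt-5.4"],
        "handlers": ["Orchestrator", "Lead agent"],
    },
    Reasoning.MEDIUM: {
        "models": ["Claude Haiku 4.5", "gpt-5.4-mini"],
        "handlers": ["Lead agent", "Senior Worker"],
    },
    Reasoning.LOW: {
        "models": ["Claude Haiku", "GPT-3.5-turbo"],
        "handlers": ["Worker agent"],
    },
}

# Hypothetical classification of a few task kinds named in the lists above.
TASK_TIER = {
    "peer review": Reasoning.HIGH,
    "hypothesis generation": Reasoning.HIGH,
    "statistical testing": Reasoning.MEDIUM,
    "code generation": Reasoning.MEDIUM,
    "latex compilation": Reasoning.LOW,
    "log parsing": Reasoning.LOW,
}


@dataclass(frozen=True)
class Route:
    tier: Reasoning
    model: str
    handler: str


def route(task_kind: str) -> Route:
    """Pick a tier for the task, then the first listed model and handler."""
    # Assumption: unknown task kinds fall back to the high tier.
    tier = TASK_TIER.get(task_kind.strip().lower(), Reasoning.HIGH)
    spec = TIERS[tier]
    return Route(tier, spec["models"][0], spec["handlers"][0])
```

Calling `route("log parsing")` yields a low-tier route handled by a Worker agent, while an unrecognized task kind is sent to the high tier so that it is never under-served.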

Cross-Model Peer Review

Warning: Every significant output is reviewed by a model from a different provider. Same-model review is not permitted.

The review matrix assigns each provider's output to reviewers from other providers, so no model checks its own work:

| Output from | Reviewed by |
|---|---|
| Claude | GPT-4, Gemini, or Grok |
| GPT-4 | Claude, Gemini, or Perplexity |
| Gemini | Claude or GPT-4 |

Reviews check for:

  • Factual accuracy and hallucination detection
  • Logical consistency with prior results
  • Mathematical and statistical correctness
  • Missing citations or prior work
  • Overstatements and unsupported claims
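
The review matrix can be read as a lookup table from the authoring model family to its permitted reviewers. The sketch below assumes a hypothetical `pick_reviewer` helper and an `available` set of reviewers that are currently reachable; neither is part of the documented system.

```python
# Review matrix copied from the table above: author -> allowed reviewers.
REVIEW_MATRIX = {
    "Claude": ["GPT-4", "Gemini", "Grok"],
    "GPT-4": ["Claude", "Gemini", "Perplexity"],
    "Gemini": ["Claude", "GPT-4"],
}


def pick_reviewer(author: str, available: set) -> str:
    """Return the first permitted reviewer that is currently available."""
    candidates = REVIEW_MATRIX.get(author)
    if candidates is None:
        raise KeyError(f"no review policy defined for {author!r}")
    for reviewer in candidates:
        # The inequality check is a guard: self-review is never allowed,
        # even if the matrix were edited incorrectly.
        if reviewer != author and reviewer in available:
            return reviewer
    raise LookupError(f"no permitted reviewer available for {author!r}")
```

For example, `pick_reviewer("Claude", {"Gemini", "Grok"})` returns `"Gemini"`. If only Claude itself is available, the function raises rather than fall back to same-model review.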

Activity Feed

All agent communication is visible in the Activity Feed, a real-time, color-coded stream:

[10:42] 🟢 Research Lead completed EXP-054 (MCMC base chain)
[10:43] 🔵 Orchestrator → Paper Lead: "Integrate EXP-054 results into Section 4"
[10:44] 🔵 Paper Lead → Draft Worker: "Update posterior table with new chain means"
[10:45] 🟡 QC Worker flagged EXP-055: convergence R-hat = 1.08 (threshold: 1.05)
[10:46] 🔴 Research Lead escalated: "EXP-055 needs longer chains. Request H200 pod."
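
Feed lines in the format shown above are regular enough to parse for filtering or archiving. The following is a rough sketch that assumes every entry follows the `[HH:MM] marker text` layout of the sample; `FeedEntry` and `parse_entry` are hypothetical names. It deliberately does not interpret the color markers, since their meanings are not defined here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# "[10:42] <marker> <body>", matching the sample feed lines.
ENTRY = re.compile(r"^\[(?P<time>\d{2}:\d{2})\]\s+(?P<marker>\S+)\s+(?P<body>.+)$")
# 'Sender → Recipient: "text"' for agent-to-agent messages.
MESSAGE = re.compile(r'^(?P<sender>.+?) → (?P<recipient>.+?): "(?P<text>.*)"$')


@dataclass
class FeedEntry:
    time: str
    marker: str
    body: str
    sender: Optional[str] = None
    recipient: Optional[str] = None


def parse_entry(line: str) -> FeedEntry:
    """Split one feed line into timestamp, color marker, and body."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized feed line: {line!r}")
    entry = FeedEntry(match["time"], match["marker"], match["body"])
    # Agent-to-agent messages additionally carry a sender and a recipient.
    message = MESSAGE.match(entry.body)
    if message is not None:
        entry.sender = message["sender"]
        entry.recipient = message["recipient"]
    return entry
```

Status lines such as the `EXP-054` completion parse with `sender` and `recipient` left as `None`; directed messages such as the Orchestrator's instruction to the Paper Lead fill both fields.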

Tilldone Pattern

When a worker fails a task, the lead agent takes over rather than just reporting the failure:

  1. Worker attempts the task
  2. Worker fails (error, bad output, QC failure)
  3. Lead agent receives the failure with full context
  4. Lead agent executes the task itself using higher reasoning
  5. If the lead also fails, it escalates to the orchestrator

This pattern ensures tasks complete without constant human intervention.
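
The five steps above amount to walking an escalation chain and handing each agent the failures collected so far. Here is a minimal sketch under that reading; `TaskFailed`, `run_till_done`, and the `(name, attempt)` agent representation are hypothetical, and raising an error once every agent has failed is an assumption, since the steps above stop at the orchestrator.

```python
from typing import Callable, List, Tuple


class TaskFailed(Exception):
    """Raised by an agent on an error, bad output, or QC failure."""


# An agent is a name plus a callable taking the task and prior failures.
Agent = Tuple[str, Callable[[str, List[str]], str]]


def run_till_done(task: str, chain: List[Agent]) -> Tuple[str, str]:
    """Try each agent in order (worker, lead, orchestrator).

    Returns (agent_name, result) from the first agent that succeeds.
    """
    failures: List[str] = []
    for name, attempt in chain:
        try:
            # Each agent receives the full failure context so far.
            return name, attempt(task, list(failures))
        except TaskFailed as exc:
            failures.append(f"{name}: {exc}")
    # Assumption: if every agent fails, surface the error to the human.
    raise TaskFailed("all agents failed: " + "; ".join(failures))
```

A chain such as `[("worker", worker_fn), ("lead", lead_fn), ("orchestrator", orch_fn)]` reproduces the pattern: the lead only runs if the worker raises `TaskFailed`, and it sees the worker's failure message when it does.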

Adding Agents

# Add a specialized lead
hubify agent add --role lead --name "Cosmology Lead" --model claude-opus \
  --specialty "MCMC analysis, CMB power spectra, dark energy constraints"

# Add workers
hubify agent add --role worker --name "Figure Generator" --model claude-haiku
hubify agent add --role worker --name "Data Processor" --model claude-haiku

# View the full roster
hubify agent list --tree

Agent Metrics

Each agent tracks:

  • Tasks completed vs failed
  • Average task duration
  • QC pass rate
  • Review acceptance rate
  • Cost per task
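
To make these definitions concrete, the sketch below shows one way the five metrics could be derived from raw per-task counters. The `AgentMetrics` class, its field names, and the convention of returning 0.0 when there is no data are assumptions for illustration; the cost unit is left unspecified.

```python
from dataclasses import dataclass
from typing import Optional


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division: 0.0 when nothing has been recorded yet."""
    return numerator / denominator if denominator else 0.0


@dataclass
class AgentMetrics:
    completed: int = 0
    failed: int = 0
    total_seconds: float = 0.0
    total_cost: float = 0.0
    qc_passed: int = 0
    qc_checked: int = 0
    reviews_accepted: int = 0
    reviews_total: int = 0

    def record(self, *, ok: bool, seconds: float, cost: float,
               qc_pass: Optional[bool] = None,
               review_accepted: Optional[bool] = None) -> None:
        """Record one finished task; QC and review outcomes are optional."""
        if ok:
            self.completed += 1
        else:
            self.failed += 1
        self.total_seconds += seconds
        self.total_cost += cost
        if qc_pass is not None:
            self.qc_checked += 1
            self.qc_passed += int(qc_pass)
        if review_accepted is not None:
            self.reviews_total += 1
            self.reviews_accepted += int(review_accepted)

    @property
    def tasks(self) -> int:
        return self.completed + self.failed

    @property
    def avg_duration(self) -> float:
        return _ratio(self.total_seconds, self.tasks)

    @property
    def qc_pass_rate(self) -> float:
        return _ratio(self.qc_passed, self.qc_checked)

    @property
    def review_acceptance_rate(self) -> float:
        return _ratio(self.reviews_accepted, self.reviews_total)

    @property
    def cost_per_task(self) -> float:
        return _ratio(self.total_cost, self.tasks)
```

Keeping raw counters and computing the rates on demand means averages stay correct as new tasks are recorded.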