Multi-Agent System
A hierarchical agent team with cross-model peer review, reasoning-based routing, and full visibility into agent communication.
Hubify Labs runs a hierarchical multi-agent system designed to mirror the structure of a real research group. The system is built on three principles: hierarchy for efficiency, cross-model review for accuracy, and full transparency for trust.
Architecture
┌──────────────┐
│ Captain │ ← You
│ (Human) │
└──────┬───────┘
│
┌──────▼───────┐
│ Orchestrator │ ← Claude Opus 4.6
│ (Router) │
└──┬───┬───┬───┘
┌────────┘ │ └────────┐
┌──────▼──────┐ ┌──▼────┐ ┌─────▼──────┐
│Research Lead│ │Paper │ │Compute Lead│
│ (Opus 4.6) │ │Lead │ │ (Haiku 4.5)│
└──┬────┬────┘ └──┬────┘ └─────┬──────┘
│ │ │ │
Workers Workers Workers Workers
(Haiku) (Haiku) (Haiku) (Haiku)
Reasoning-Based Routing
Every task has a reasoning requirement. The orchestrator routes accordingly:
**Models:** Claude Opus 4.6, gpt-5.4
**Tasks:** Research strategy, paper drafting, peer review, novel scientific analysis, hypothesis generation, cross-survey interpretation.
**Handled by:** Orchestrator or Lead agents.
**Models:** Claude Haiku 4.5, gpt-5.4-mini
**Tasks:** Data analysis, code generation, experiment configuration, statistical testing, literature summarization.
**Handled by:** Lead agents or senior Workers.
**Models:** Claude Haiku, GPT-3.5-turbo
**Tasks:** Data formatting, file management, wiki updates, figure export, LaTeX compilation, log parsing.
**Handled by:** Worker agents.
Cross-Model Peer Review
Warning: Every significant output is reviewed by a model from a different provider. Same-model review is not permitted.
The review matrix ensures no blind spots:
| Output from | Reviewed by |
|---|---|
| Claude | GPT-4, Gemini, or Grok |
| GPT-4 | Claude, Gemini, or Perplexity |
| Gemini | Claude or GPT-4 |
Reviews check for:
- Factual accuracy and hallucination detection
- Logical consistency with prior results
- Mathematical and statistical correctness
- Missing citations or prior work
- Overstatements and unsupported claims
Activity Feed
All agent communication is visible in the Activity Feed — a real-time, color-coded stream:
[10:42] 🟢 Research Lead completed EXP-054 (MCMC base chain)
[10:43] 🔵 Orchestrator → Paper Lead: "Integrate EXP-054 results into Section 4"
[10:44] 🔵 Paper Lead → Draft Worker: "Update posterior table with new chain means"
[10:45] 🟡 QC Worker flagged EXP-055: convergence R-hat = 1.08 (threshold: 1.05)
[10:46] 🔴 Research Lead escalated: "EXP-055 needs longer chains. Request H200 pod."
Tilldone Pattern
When a worker fails a task, the lead agent takes over rather than just reporting the failure:
- Worker attempts the task
- Worker fails (error, bad output, QC failure)
- Lead agent receives the failure with full context
- Lead agent executes the task itself using higher reasoning
- If the lead also fails, it escalates to the orchestrator
This pattern ensures tasks complete without constant human intervention.
Adding Agents
# Add a specialized lead
hubify agent add --role lead --name "Cosmology Lead" --model claude-opus \
--specialty "MCMC analysis, CMB power spectra, dark energy constraints"
# Add workers
hubify agent add --role worker --name "Figure Generator" --model claude-haiku
hubify agent add --role worker --name "Data Processor" --model claude-haiku
# View the full roster
hubify agent list --tree
Agent Metrics
Each agent tracks:
- Tasks completed vs failed
- Average task duration
- QC pass rate
- Review acceptance rate
- Cost per task