Hyper Metrics
12,847 tournaments. 51,388 candidates. Zero unverified merges. Every number backed by evidence.
Aggregate metrics across all tournament runs.
| Metric | Value | Note |
|---|---|---|
| Tournaments completed | 12,847 | since launch |
| Candidates evaluated | 51,388 | avg 4 per tournament |
| Winner merge rate | 94.2% | gates verified |
| Unverified merges | 0 | fail-closed |
| Evidence artifacts / merge | 8.7 | avg per merge |
| Winner confidence | 0.89 | average score |
| Verification time | 23s | median per candidate |
| Gates per candidate | 11 | independent checks |
Benchmark comparison across approaches. Same tasks, same codebase, measured side by side.
| Approach | Pass Rate | Regression Rate | Confidence | Artifacts / Merge |
|---|---|---|---|---|
| Single Agent (baseline) | 61.3% | 14.2% | n/a | 0 |
| Single Agent + Tests | 73.8% | 8.7% | n/a | 1 |
| Best-of-4 (no verification) | 79.1% | 11.3% | n/a | 0 |
| Hyper Best-of-4 | 94.2% | 0.3% | 0.89 | 8.7 |
| Hyper Best-of-8 | 97.1% | 0.1% | 0.93 | 14.2 |
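What separates "Best-of-4 (no verification)" from a verified tournament in the table above can be sketched as a selection rule. This is a minimal illustration, not Hyper's implementation: the `Candidate` record, its field names, and the "highest confidence among gate-passing candidates" rule are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    passed_hard_gates: bool
    confidence: float  # composite confidence in [0, 1]


def pick_winner(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the highest-confidence candidate that passed every hard gate.

    Returns None when no candidate qualifies, so nothing is merged (fail-closed).
    """
    eligible = [c for c in candidates if c.passed_hard_gates]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.confidence)


tournament = [
    Candidate("agent-a", True, 0.81),
    Candidate("agent-b", False, 0.97),  # highest score, but a hard gate failed
    Candidate("agent-c", True, 0.89),
    Candidate("agent-d", True, 0.74),
]
winner = pick_winner(tournament)
print(winner.name if winner else "no merge")  # prints agent-c
```

Without the `passed_hard_gates` filter, `agent-b` would win on score alone, which is the failure mode an unverified best-of-N run is exposed to.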
Per-gate pass rates across all evaluated candidates. Hard gates block merge. Advisory gates adjust confidence.
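The hard/advisory split can be sketched as follows. The `GateResult` shape, the gate names, and the flat per-gate confidence penalty are assumptions made for illustration; the page does not describe how advisory gates adjust confidence.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class GateResult:
    # Hypothetical shape; gate names and the penalty field are assumptions.
    gate: str
    hard: bool            # True: a failure blocks the merge
    passed: bool
    penalty: float = 0.0  # confidence deducted when an advisory gate fails


def evaluate(results: Iterable[GateResult], base_confidence: float) -> Tuple[bool, float]:
    """Return (mergeable, adjusted_confidence) for one candidate."""
    mergeable = True
    confidence = base_confidence
    for r in results:
        if r.passed:
            continue
        if r.hard:
            mergeable = False          # hard gate failure: block the merge
        else:
            confidence -= r.penalty    # advisory gate failure: lower confidence
    return mergeable, max(0.0, confidence)


results = [
    GateResult("tests", hard=True, passed=True),
    GateResult("style", hard=False, passed=False, penalty=0.05),
]
mergeable, confidence = evaluate(results, base_confidence=0.90)
print(mergeable, f"{confidence:.2f}")  # prints True 0.85
```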
Winner confidence scores across all tournaments. Dempster-Shafer fusion of 11 gate beliefs into composite confidence.
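For readers unfamiliar with Dempster-Shafer fusion, here is a minimal sketch of Dempster's rule of combination over a two-outcome frame (correct, incorrect), with the unassigned mass treated as "unknown". The `Belief` type and the example masses are assumptions; the page does not specify how gate beliefs are represented, and the example fuses three beliefs rather than 11 for brevity.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Belief:
    """Mass assigned to 'correct' and 'incorrect'; the remainder is 'unknown'."""
    correct: float
    incorrect: float

    @property
    def unknown(self) -> float:
        return 1.0 - self.correct - self.incorrect


def combine(a: Belief, b: Belief) -> Belief:
    """Dempster's rule of combination for two independent sources."""
    # Conflict: mass where one source says 'correct' and the other 'incorrect'.
    conflict = a.correct * b.incorrect + a.incorrect * b.correct
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    norm = 1.0 - conflict
    correct = (a.correct * b.correct + a.correct * b.unknown + a.unknown * b.correct) / norm
    incorrect = (a.incorrect * b.incorrect + a.incorrect * b.unknown + a.unknown * b.incorrect) / norm
    return Belief(correct, incorrect)


def fuse(beliefs: Iterable[Belief]) -> Belief:
    """Fold any number of gate beliefs into one composite belief."""
    return reduce(combine, beliefs)


# Three gates, each half-convinced the candidate is correct, none against it.
gates = [Belief(0.5, 0.0), Belief(0.5, 0.0), Belief(0.5, 0.0)]
print(f"{fuse(gates).correct:.3f}")  # prints 0.875
```

Agreeing sources reinforce one another: three gates at 0.5 each fuse to 0.875, because only the "unknown" mass (0.5 x 0.5 x 0.5) remains uncommitted.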
Cost-performance frontier across leading models. Same tournament config (N=4), same verification gates.
350K input + 170K output tokens per tournament (N=4 agents)
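Reading the figures above as token counts, the model cost of one tournament follows directly from per-token pricing. The sketch below uses placeholder prices that do not correspond to any real model; substitute a provider's actual rates.

```python
def tournament_cost(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Model cost in USD for one tournament, given per-million-token prices."""
    return (
        input_tokens / 1_000_000 * usd_per_million_input
        + output_tokens / 1_000_000 * usd_per_million_output
    )


# Placeholder prices (assumptions, not real rates): $1.00 in, $2.00 out per 1M tokens.
cost = tournament_cost(350_000, 170_000, usd_per_million_input=1.00, usd_per_million_output=2.00)
print(f"${cost:.2f}")  # prints $0.69
```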
Bay Area loaded costs vs continuous autonomous operation. Same output type: production-ready features.
| Option | Monthly cost | Output |
|---|---|---|
| 1 engineer | $37,500/month | 2-4 features/week |
| 5-person team | $183,000/month | 8-16 features/week |
| Hyper | $4,500/month | 175 verified features/week |

| Hyper vs | Cost | Output |
|---|---|---|
| 1 engineer | 8x cheaper | 44x more |
| 5-person team | 40x cheaper | 11x more |
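The multiples can be reproduced from the figures above. This sketch assumes the three cost figures belong, in order, to one engineer, a 5-person team, and Hyper (an inference from the ratio labels), and that the output multiples are taken against the upper end of each human range.

```python
# Figures from the page; the mapping of figures to options is an inference.
engineer = {"monthly_cost": 37_500, "features_per_week": (2, 4)}
team = {"monthly_cost": 183_000, "features_per_week": (8, 16)}
hyper = {"monthly_cost": 4_500, "features_per_week": (175, 175)}


def cost_ratio(baseline: dict, other: dict) -> float:
    """How many times cheaper `other` is than `baseline` per month."""
    return baseline["monthly_cost"] / other["monthly_cost"]


def output_ratio(baseline: dict, other: dict) -> float:
    """Output of `other` relative to the upper end of the baseline's range."""
    return other["features_per_week"][0] / baseline["features_per_week"][1]


for label, baseline in (("1 engineer", engineer), ("5-person team", team)):
    print(
        f"vs {label}: {cost_ratio(baseline, hyper):.2f}x cheaper, "
        f"{output_ratio(baseline, hyper):.2f}x more output"
    )
```

The unrounded values are roughly 8.33x and 43.75x against one engineer, and 40.67x and 10.94x against the team, which the page shows as 8x, 44x, 40x, and 11x.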
The cost advantage is compelling. But cost isn't the point.
The point is that every line of code Hyper merges has been checked by 11 independent gates, with evidence recorded for each merge. No human team can provide that. Not at any price.
The Level 5 dark factory. Decompose vision into specs, run tournaments, merge winners, repeat -- fully autonomous.
| Metric | Value | Note |
|---|---|---|
| Autonomous specs completed | 2,847 | |
| Dark-merge rate | 78.3% | fully autonomous |
| Cautious-merge rate | 16.2% | elevated scrutiny |
| Hold rate | 5.5% | human review required |
| Circuit breaker triggers | 12 | safety stops |
| Cost per merge | $0.47 | average |
| Continuous operation | 1,247h | |
| Production regressions | 0 | from dark merges |
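The three merge outcomes and the circuit breaker can be sketched as a routing function. Everything specific here is hypothetical: the page does not state the confidence cut-offs, what trips the breaker, or how a failed hard gate is routed, so the thresholds, the consecutive-hold limit, and the "failed hard gate means hold" rule are assumptions.

```python
from enum import Enum


class Route(Enum):
    DARK_MERGE = "dark-merge"          # fully autonomous
    CAUTIOUS_MERGE = "cautious-merge"  # elevated scrutiny
    HOLD = "hold"                      # human review required


# Hypothetical thresholds; the real cut-offs are not given on the page.
DARK_THRESHOLD = 0.85
CAUTIOUS_THRESHOLD = 0.70


def route(hard_gates_passed: bool, confidence: float) -> Route:
    """Pick a merge route for a tournament winner."""
    if not hard_gates_passed:
        return Route.HOLD  # assumption: a failed hard gate always goes to review
    if confidence >= DARK_THRESHOLD:
        return Route.DARK_MERGE
    if confidence >= CAUTIOUS_THRESHOLD:
        return Route.CAUTIOUS_MERGE
    return Route.HOLD


class CircuitBreaker:
    """Trips after a run of consecutive holds (the limit is an assumed value)."""

    def __init__(self, max_consecutive_holds: int = 3) -> None:
        self.limit = max_consecutive_holds
        self.consecutive_holds = 0
        self.tripped = False

    def record(self, outcome: Route) -> None:
        # Any merge resets the streak; each hold extends it.
        self.consecutive_holds = self.consecutive_holds + 1 if outcome is Route.HOLD else 0
        if self.consecutive_holds >= self.limit:
            self.tripped = True


breaker = CircuitBreaker()
for passed, confidence in [(True, 0.89), (True, 0.72), (False, 0.99)]:
    outcome = route(passed, confidence)
    breaker.record(outcome)
    print(outcome.value)
```

Once `breaker.tripped` is set, an autonomous loop would stop taking new specs until a human resets it.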
Every number on this page is backed by gate logs, diff artifacts, and council votes. Hyper doesn't guess -- it verifies.