Daily Review — 2026-06-28
What Shipped (Last 24h)
3 commits landed in the last 24 hours.
| Commit | PR | Description |
|---|---|---|
| 5b8722e2 | #3362 | fix(runner): detect Nvidia ResourceExhausted as rate limit |
| 84842e14 | #3360 | ops: service still running v0.80.31 after v0.80.32 release |
| 430039f2 | #3359 | docs(posts): daily review 2026-06-27 |
Closed Issues (Last 24h)
| Issue | Description |
|---|---|
| #3361 | bug(review): OpenCode Nvidia ResourceExhausted review failure |
| #3358 | ops: service still running v0.80.31 after v0.80.32 release |
The most significant fix was #3362 / #3361: OpenCode review sessions were hitting Nvidia ResourceExhausted errors that the parser was not recognising as rate limits. The fix classifies them correctly so the runner now applies a model cooldown rather than treating them as fatal failures.
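The classification change can be illustrated with a minimal sketch. It assumes the parser works on the provider's raw error text; the function name `classify_error` and the marker strings are hypothetical and are not taken from the orch source. Only the `rate_limit` and `failed` outcome labels come from this review.

```python
# Hypothetical marker strings; the real parser's patterns are not shown in this review.
RATE_LIMIT_MARKERS = (
    "resourceexhausted",
    "resource_exhausted",
    "rate limit",
    "too many requests",
)


def classify_error(message: str) -> str:
    """Return "rate_limit" for errors that should cool the model down,
    and "failed" for everything else."""
    text = message.lower()
    if any(marker in text for marker in RATE_LIMIT_MARKERS):
        return "rate_limit"
    return "failed"


# An error that mentions ResourceExhausted is treated as a rate limit.
nvidia_outcome = classify_error("Nvidia ResourceExhausted: quota exceeded")
# An unrelated error stays a plain failure.
other_outcome = classify_error("unexpected end of stream")
```

The point of the distinction is that a `rate_limit` outcome can feed a cooldown, while a `failed` outcome is treated as a failure of the task itself.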
Operational Health
Throughput (Last 24h)
| Metric | Count |
|---|---|
| Status changes | 315 |
| Dispatches | 102 |
| Pushes | 99 |
| Branch deletes | 84 |
| Review starts | 48 |
| Review decisions | 47 |
| PRs created | 47 |
| Routed | 43 |
| Errors | 7 |
| Reroutes | 2 |
Throughput remained healthy throughout the window: 47 PRs were created and 47 review decisions recorded, with only 7 errors and 2 reroutes across 102 dispatches. The engine stayed stable.
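For reference, the error and reroute rates implied by the table can be worked out as below. Using dispatches as the denominator is an assumption of this sketch; the helper name `rate` is illustrative.

```python
# Counts copied from the throughput table above.
dispatches = 102
errors = 7
reroutes = 2


def rate(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


error_rate = rate(errors, dispatches)      # 6.9
reroute_rate = rate(reroutes, dispatches)  # 2.0
```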
Agent / Model Outcomes (Last 24h)
| Agent | Model | Outcome | Count |
|---|---|---|---|
| claude | sonnet | success | 39 |
| codex | gpt-5.4 | success | 10 |
| kimi | opus | success | 10 |
| opencode | opencode/mimo-v2.5-free | success | 8 |
| opencode | opencode/deepseek-v4-flash-free | success | 5 |
| opencode | opencode/north-mini-code-free | success | 5 |
| opencode | opencode/nemotron-3-ultra-free | success | 4 |
| claude | sonnet | failed | 2 |
| codex | gpt-5.5 | failed | 1 |
| codex | gpt-5.5 | success | 1 |
| minimax | opus | rate_limit | 1 |
| minimax | sonnet | rate_limit | 1 |
| opencode | opencode/mimo-v2.5-free | failed | 1 |
| opencode | opencode/nemotron-3-ultra-free | failed | 1 |
| claude | haiku | success | 1 |
Claude/sonnet dominated success volume (39), followed by Codex/gpt-5.4 (10) and Kimi/opus (10). OpenCode contributed a healthy spread across four free models. The two claude:sonnet failures and the codex:gpt-5.5 failure were isolated; failure counts for both are 0 in the current KV state, so they recovered cleanly.
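The per-agent and per-outcome totals behind this summary can be reproduced from the table rows. The sketch below is plain aggregation over the rows as published; the variable names are illustrative.

```python
from collections import Counter

# (agent, model, outcome, count) rows copied from the table above.
ROWS = [
    ("claude", "sonnet", "success", 39),
    ("codex", "gpt-5.4", "success", 10),
    ("kimi", "opus", "success", 10),
    ("opencode", "opencode/mimo-v2.5-free", "success", 8),
    ("opencode", "opencode/deepseek-v4-flash-free", "success", 5),
    ("opencode", "opencode/north-mini-code-free", "success", 5),
    ("opencode", "opencode/nemotron-3-ultra-free", "success", 4),
    ("claude", "sonnet", "failed", 2),
    ("codex", "gpt-5.5", "failed", 1),
    ("codex", "gpt-5.5", "success", 1),
    ("minimax", "opus", "rate_limit", 1),
    ("minimax", "sonnet", "rate_limit", 1),
    ("opencode", "opencode/mimo-v2.5-free", "failed", 1),
    ("opencode", "opencode/nemotron-3-ultra-free", "failed", 1),
    ("claude", "haiku", "success", 1),
]

by_outcome = Counter()
by_agent_outcome = Counter()
for agent, _model, outcome, count in ROWS:
    by_outcome[outcome] += count
    by_agent_outcome[(agent, outcome)] += count

total = sum(by_outcome.values())
```

With the table's counts this gives 83 successes, 5 failures and 2 rate limits out of 90 recorded outcomes, with OpenCode accounting for 22 of the successes.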
What Went Well
- Nvidia ResourceExhausted fix shipped and closed same day. #3361 was filed, #3362 fixed, and both closed within the review window: a rapid root-cause-to-merge cycle.
- Broad agent diversity. Claude, Codex, Kimi, and four OpenCode models all recorded successful completions. The pool remains resilient.
- Low error and reroute rate. 7 errors and 2 reroutes against 102 dispatches is a healthy signal; the engine did not need to fight against systematic failures.
- Routing fallback worked correctly for this very task. The router's LLM selected minimax for this review task, immediately detected it was cooled, and fell back to claude:sonnet without operator intervention.
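The fallback in the last bullet can be sketched as follows. This is a minimal illustration, not the router's actual code: the function `pick_agent`, the shape of the cooldown map, and the choice of `minimax:opus` as the cooled key are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def pick_agent(preferred, fallbacks, cooldowns, now):
    """Return the first candidate that is not cooling down, or None.

    cooldowns maps "agent:model" to the time its cooldown expires.
    """
    for candidate in [preferred, *fallbacks]:
        until = cooldowns.get(candidate)
        if until is None or until <= now:
            return candidate
    return None


# Illustrative state: the preferred choice is cooled, the fallback is not.
now = datetime(2026, 6, 28, 23, 1, tzinfo=timezone.utc)
cooldowns = {"minimax:opus": now + timedelta(days=3, hours=10)}

chosen = pick_agent("minimax:opus", ["claude:sonnet"], cooldowns, now)
```

Checking the cooldown inside the same selection pass is what lets the fallback happen without a second routing round or operator action.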
What Failed
1. Service is still running v0.80.31
The CLI and service both report orch 0.80.31. The latest published release is v0.80.34 (which includes the Nvidia ResourceExhausted rate-limit fix from #3362, plus fix(sync): edge-trigger stale model-pool alert log from v0.80.32). The stale-alert warning is therefore still visible in every sync tick:
agent model pool appears stale: persistent model failures in heavily cooled pool
affected_agents=["minimax:1/2:opus"]
This is a deployment lag issue, not a regression in the new code. The fix exists; it just has not been deployed.
Action required: brew update && brew upgrade orch && brew services restart orch
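The v0.80.32 change is known here only by its commit title, "edge-trigger stale model-pool alert log". As a generic illustration of edge-triggered logging (not the actual patch), the sketch below logs only when the stale condition turns on, instead of on every sync tick while it stays on. The class name and the tick sequence are hypothetical.

```python
class EdgeTriggeredAlert:
    """Fire only on the transition from "not stale" to "stale"."""

    def __init__(self):
        self._active = False

    def update(self, condition: bool) -> bool:
        """Record this tick's condition; return True if the alert should be logged."""
        fire = condition and not self._active
        self._active = condition
        return fire


alert = EdgeTriggeredAlert()
# Pool is stale for three ticks, recovers, then goes stale again.
ticks = [True, True, True, False, True]
fired = [alert.update(stale) for stale in ticks]
# fired == [True, False, False, False, True]
```

A level-triggered check would log on every tick where the condition holds, which matches the per-tick repetition described above.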
2. minimax:opus remains in a long cooldown
Active cooldown state:
minimax:opus 3d10h remaining (persisted)
Failure counts: minimax=5, minimax:haiku=5, minimax:opus=4, minimax:sonnet=1.
The minimax pool continues to be effectively dead. The router is correctly skipping it, but the extended cooldown means minimax will remain unavailable for another 3+ days unless cleared manually. This is not new; it has persisted across multiple review windows. The behaviour is correct (exponential backoff protecting against a consistently failing pool), but the stale-alert spam it produces will only stop once the service is upgraded to a release that contains the v0.80.32 edge-trigger fix, the latest being v0.80.34.
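To show why repeated failures lead to multi-day cooldowns, here is a minimal sketch of capped exponential backoff. The base interval, the cap and the doubling factor are all hypothetical values chosen for illustration; the review does not state orch's actual parameters.

```python
from datetime import timedelta

BASE = timedelta(minutes=5)  # hypothetical first cooldown
CAP = timedelta(days=7)      # hypothetical upper bound
MAX_EXPONENT = 20            # keeps the multiplication well within timedelta limits


def cooldown_for(failure_count: int) -> timedelta:
    """Double the cooldown for each consecutive failure, capped at CAP."""
    if failure_count <= 0:
        return timedelta(0)
    exponent = min(failure_count - 1, MAX_EXPONENT)
    return min(BASE * 2 ** exponent, CAP)


first = cooldown_for(1)   # 5 minutes
fourth = cooldown_for(4)  # 40 minutes
capped = cooldown_for(30) # 7 days
```

Under these assumed parameters a handful of consecutive failures already stretches the cooldown to hours, and a persistently failing pool sits at the cap until its failure count is reset.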
3. Two claude:sonnet failures
The two claude:sonnet failures did not trigger cooldown escalation (failure count reset to 0 in KV), so they were likely transient. No pattern to investigate.
Routing Accuracy
Routing was accurate this window. The one notable routing event was the LLM choosing minimax for this daily-review task and the fallback system correctly redirecting to claude:sonnet in the same routing pass — exactly the intended behaviour. No mis-routing patterns or silent model failures observed.
Stuck / Pending Work
| Task | Status | Note |
|---|---|---|
| internal:154468 | in_progress | This daily review |
| internal:154469 | created | Evening retrospective (dispatched same tick) |
| internal:154470 | created | Weekly review (dispatched same tick) |
| External blocked tasks | blocked | GitHub Actions billing failures and CI-failure limits in downstream repos — unchanged from yesterday |
No new stuck-task pattern inside gabrielkoerich/orch itself.
Issues
Open issues in gabrielkoerich/orch at review time: 0.
No new issues warranted from this review. The minimax cooldown spam is already addressed by v0.80.32; the Nvidia fix is in v0.80.34 — both are deployment lag only. The two transient claude failures require no action.
Priorities for Tomorrow
- Upgrade the running service to v0.80.34. The single most impactful action available — delivers both the edge-trigger stale-pool fix (v0.80.32) and the Nvidia ResourceExhausted rate-limit classification (v0.80.34) in one step.
- Confirm warning volume drops post-upgrade. If the minimax stale-pool warning persists after upgrading to v0.80.34, that is a new regression and should be investigated. If it disappears, the noise was deployment lag only.
- Watch minimax cooldown expiry. The active minimax:opus cooldown clears in ~3.5 days. When it does, observe whether the first new dispatch succeeds or re-triggers the rate-limit cycle.
- Continue monitoring codex:gpt-5.5. It had one success and one failure in this window. It is currently at failure_count=0, so the failure was transient; keep an eye on it for early signs of a broader outage.
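For the post-upgrade check in the priorities above, one simple approach is to count log lines containing the warning text before and after the upgrade. This is a minimal sketch that assumes log lines are available as strings. The function name and the sample lines are made up for illustration; only the warning text is taken from this review.

```python
STALE_WARNING = "agent model pool appears stale"


def count_stale_warnings(lines):
    """Count log lines that contain the stale-pool warning text."""
    return sum(1 for line in lines if STALE_WARNING in line)


# Made-up sample input, not real service output.
before_upgrade = [
    "agent model pool appears stale: persistent model failures in heavily cooled pool",
    "some unrelated line",
    "agent model pool appears stale: persistent model failures in heavily cooled pool",
]
after_upgrade = [
    "some unrelated line",
]

before_count = count_stale_warnings(before_upgrade)
after_count = count_stale_warnings(after_upgrade)
```

Comparing counts over equal-length windows keeps the comparison fair, since the warning previously appeared once per sync tick.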
Prepared by Orch automation (internal:154468) at 2026-06-28T23:01Z.