Morning Review — 2026-04-05
Recent Commits & Progress
Exceptional overnight output — 40+ commits since yesterday morning. The theme shifted from reliability fixes to performance and correctness of the review/dispatch pipeline:
Review & dispatch correctness:
- review subscriber leaks dispatch key on semaphore-full and transition-failure paths (#1878): tasks were getting stuck in NeedsReview until restart because the dispatch key was never released on error paths.
- auto_unblock_blocked_tasks uses external_id for dispatch key: blocked tasks remained stuck because unblock used the wrong key; a follow-up to the dispatch key leak pattern.
- store error in tick.rs causes tasks to skip review and be marked done: a silent store failure made tasks bypass review entirely.
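One common way to prevent this class of leak is to tie the key's release to a guard's lifetime, so that every exit path frees it. The sketch below illustrates that pattern only; `DispatchKeys`, `DispatchGuard`, and `start_review` are hypothetical names, and it is not necessarily how #1878 fixed the problem.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of in-flight dispatch keys (not the orchestrator's real type).
#[derive(Clone, Default)]
pub struct DispatchKeys {
    held: Arc<Mutex<HashSet<String>>>,
}

/// Releases its key when dropped, so early returns and `?` cannot leak it.
pub struct DispatchGuard {
    keys: DispatchKeys,
    key: String,
}

impl DispatchKeys {
    /// Returns a guard if the key was free, or `None` if it is already held.
    pub fn try_acquire(&self, key: &str) -> Option<DispatchGuard> {
        let mut held = self.held.lock().unwrap();
        if held.insert(key.to_string()) {
            Some(DispatchGuard { keys: self.clone(), key: key.to_string() })
        } else {
            None
        }
    }

    pub fn is_held(&self, key: &str) -> bool {
        self.held.lock().unwrap().contains(key)
    }
}

impl Drop for DispatchGuard {
    fn drop(&mut self) {
        self.keys.held.lock().unwrap().remove(&self.key);
    }
}

/// Stand-in for a review start that can fail after the key has been taken.
pub fn start_review(keys: &DispatchKeys, key: &str, semaphore_full: bool) -> Result<(), String> {
    let _guard = keys
        .try_acquire(key)
        .ok_or_else(|| format!("{key} already dispatched"))?;
    if semaphore_full {
        // The guard is dropped on this error path, releasing the key.
        return Err("semaphore full".to_string());
    }
    Ok(()) // The guard is dropped here as well.
}
```

With this shape, a failed `start_review` leaves the key free for the next attempt, while a key held by another guard stays held until that guard is dropped.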
Performance:
- review_poll Phase 3 checks is_pr_merged sequentially (#1877): now uses join_all, matching Phase 2's parallel approach.
- tick_unblock_parents calls GitHub API per child task instead of local store (#1876): was O(N×M) API calls per tick; now uses the local store.
- consolidate 7 sequential DB queries in get_metrics_summary (#1797): already landed yesterday but visible in today's run data.
Cooldown correctness:
- kv_increment failure in cooldown silently returns 1: critical. Failure counts weren't persisting, so exponential backoff never advanced past attempt 1 for failed agents.
- expand_alias silently coerces invalid cron alias params to 0: cron jobs with malformed params were running at the wrong intervals.
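To make the kv_increment issue concrete, here is a minimal sketch of a failure counter that propagates store errors instead of substituting 1, feeding an exponential backoff. The `KvStore` type, the 60-second base, and the 4-hour cap are assumptions for illustration, not the project's actual values.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical key-value store; `fail_writes` simulates a backend outage.
#[derive(Default)]
pub struct KvStore {
    counters: HashMap<String, u32>,
    pub fail_writes: bool,
}

impl KvStore {
    /// Increments and returns the new count, or an error if the write fails.
    pub fn kv_increment(&mut self, key: &str) -> Result<u32, String> {
        if self.fail_writes {
            return Err(format!("kv write failed for {key}"));
        }
        let count = self.counters.entry(key.to_string()).or_insert(0);
        *count += 1;
        Ok(*count)
    }
}

/// Exponential backoff: base * 2^(attempt - 1), capped at `max`.
pub fn cooldown_for(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Clamp the exponent so the shift cannot overflow a u32.
    let exp = attempt.saturating_sub(1).min(20);
    base.saturating_mul(1u32 << exp).min(max)
}

/// Records a failure and returns the cooldown; a store error is surfaced, not hidden.
pub fn record_failure(store: &mut KvStore, agent: &str) -> Result<Duration, String> {
    let attempt = store.kv_increment(agent)?;
    Ok(cooldown_for(
        attempt,
        Duration::from_secs(60),
        Duration::from_secs(4 * 3600),
    ))
}
```

The design point is that the caller decides what to do when the counter cannot be written; returning a default of 1 would pin every cooldown at the base duration.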
Session/recovery:
- parse_session_name breaks for repos with hyphens: sessions for repos like my-repo were never matched, which affected stuck-task detection.
- touch updated_at after session exit: stuck-task recovery was triggering too early for fast-completing tasks.
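The hyphen problem is typical of parsers that split on every `-`. A hedged sketch of a hyphen-tolerant parser follows; the `orch-<repo>-<task_id>` layout, the `SessionName` type, and the `parse_session` function are assumptions, since the real session name format is not shown here.

```rust
/// Parsed pieces of a session name. The `orch-<repo>-<task_id>` layout is an assumption.
#[derive(Debug, PartialEq)]
pub struct SessionName<'a> {
    pub repo: &'a str,
    pub task_id: u64,
}

/// Parses a session name, keeping any hyphens that belong to the repo name.
pub fn parse_session(name: &str) -> Option<SessionName<'_>> {
    let rest = name.strip_prefix("orch-")?;
    // Split at the LAST hyphen: everything before it is the repo, even if it contains hyphens.
    let (repo, id) = rest.rsplit_once('-')?;
    if repo.is_empty() {
        return None;
    }
    let task_id = id.parse().ok()?;
    Some(SessionName { repo, task_id })
}
```

Under that assumed layout, `parse_session("orch-my-repo-52349")` yields repo `my-repo` and task id `52349`, whereas a naive split on every hyphen would treat `my` as the repo.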
Cleanup:
- Deduplicated 3 pairs of rate-limiting functions in github/http.rs (#1872)
- Kill orphaned review tmux session on auto_merge (#1871)
- Logging improvements: PR creation skips, branch deletion, review outcome categorization
Operational Health
Overall: healthy. No crashes, no Tokio panics, empty error log.
Agent success rates (last 24h)
| Agent | Model | Successes | Failures |
|---|---|---|---|
| claude | sonnet | 73 | 1 |
| minimax | opus | 53 | 3 |
| codex | gpt-5.3-codex | 28 | 4 |
| claude | haiku | 17 | 0 |
| opencode | github-copilot/gpt-5-mini | 16 | 1 |
| opencode | qwen3.6-plus-free | 14 | 0 |
| claude | opus | 13 | 0 |
| kimi | opus | 0 | 6 |
kimi cooldown: Auth errors + billing cycle. Cooldowns persist: kimi agent ~2h15m remaining, kimi:haiku ~4h remaining. Generic backoff system is handling this correctly — routing redirects to claude/minimax/codex.
Task activity (last 12h): 1290 status changes, 355 dispatches, 332 pushes, 268 branch deletes, 169 review starts, 161 PR creates, 145 review decisions. 29 errors (all transient HTTP retries). 7 re-routes (expected for failed first attempts).
Notable log events
- GitHub GraphQL HTTP failures (×2 transient): Retried and succeeded. No circuit breaker triggered.
- internal:52349 silent failure: opencode session started but produced no output within 30 min; stuck-task recovery correctly reset it to new, re-dispatched it to claude, and the task completed successfully (PR #395 merged).
- kimi auth error at routing: Handled correctly. The pool entry failed, cooldown was applied, and the next pool entry (minimax:haiku) succeeded.
Error log
/opt/homebrew/var/log/orch.error.log is 0 bytes — no service crashes since last restart.
Stuck / Blocked Tasks
| Task | Status | Reason |
|---|---|---|
| #38243 | blocked | Migrate integration tests to Surfpool (different project) |
| #35832 | blocked | Adapter integration tests against devnet (different project) |
| #35829–35831 | blocked | Mainnet deploy, landing page, telemetry (different project) |
All blocked tasks are in an unrelated project (Solana/oblivion). No orch tasks stuck from this project. Clean queue.
Retro Follow-Ups
| Priority from 2026-04-04 retro | Status |
|---|---|
| Verify pre-dispatch validation stable (#1836) | ✓ Confirmed — no false positives observed in overnight runs |
| Watch for async blocking calls (std::fs::) | Partially addressed: review.rs fixed ×2, engine fixed. Additional pass warranted |
| kimi recovery | In progress — cooldowns at ~2-4h, will auto-recover |
| Backlog clear → new cycle | ✓ New issues are flowing through (dispatch key bugs, cron fixes) |
Priorities for Today
- Verify dispatch key leak fix is stable: #1878 fixed the review subscriber dispatch key leak; the auto_unblock_blocked_tasks wrong-key fix also landed. Watch that no tasks accumulate in needs_review or blocked unexpectedly across today's runs.
- kv_increment silence fix follow-up: because kv_increment was silently returning 1 on failure, exponential backoff never advanced. Now that it's fixed, cooldown durations will actually escalate as designed. Verify that the first post-fix failures produce the correct escalating cooldown in cooldown list.
- Async blocking audit: three passes to fix std::fs in async contexts (engine, review.rs round 1, review.rs round 2) suggest more may remain. A targeted rg 'std::fs::' src/, followed by a check of which hits sit inside async fns, would surface the remaining risk.
- kimi recovery window: kimi agent auto-recovers in ~2.25h, kimi:haiku in ~4h. No action unless the cooldown doesn't clear after the window.
- Cron timing correctness: before the expand_alias fix, malformed alias params were coerced to 0, so any misconfigured cron aliases were running at interval 0 (immediately/every tick). Verify scheduled jobs are running at their intended intervals after the fix deploys.
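For the cron item, the underlying pattern is rejecting a malformed parameter instead of defaulting it to 0. The sketch below assumes a hypothetical `every_minutes:<n>` alias syntax and a hypothetical `parse_every_minutes` helper; the real expand_alias grammar is not shown in this review.

```rust
/// Parses the numeric parameter of a hypothetical `every_minutes:<n>` alias.
/// Invalid or zero values are reported as errors rather than coerced to 0.
pub fn parse_every_minutes(alias: &str) -> Result<u32, String> {
    let raw = alias
        .strip_prefix("every_minutes:")
        .ok_or_else(|| format!("unknown alias: {alias}"))?;
    let minutes: u32 = raw
        .trim()
        .parse()
        .map_err(|e| format!("invalid interval {raw:?}: {e}"))?;
    if minutes == 0 {
        return Err("interval must be at least 1 minute".to_string());
    }
    Ok(minutes)
}
```

A job whose alias fails to parse can then be skipped and logged at load time, which makes a misconfiguration visible instead of letting it run on every tick.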