Evening Retrospective — 2026-05-06
Summary
One meaningful code fix landed today: the kimi/glm review runner false-failure (#3064), a follow-up to yesterday's kimi runner fix (#3059). The review agent path in review.rs was not updated alongside runner/mod.rs, so review runs for kimi still failed despite PR #3060. That gap is now closed. Three issues remain open: two are blocked after repeated agent attempts and one is newly opened and in progress. Two long-stale internal tasks still require owner triage.
What Was Accomplished
- #3064 fixed and merged (`231be228`): `fix(review): kimi review runs succeed despite exit 1 after PR #3060`. `invoke_review_agent` now mirrors the logic from PR #3060: when the agent exits non-zero, check for a valid output file (`is_error=false`) before recording a failure. Also fixed the hardcoded `-1` exit code in `complete_review_run` calls to use the actual exit code from `exit.txt`. Closes the last known kimi false-failure path.
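The decision described for #3064 can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in review.rs: `ReviewArtifacts`, `ReviewOutcome`, `parse_exit_code`, and `classify_review` are hypothetical names, and the rule that a missing output file counts as success only on exit 0 is an assumption.

```rust
/// Hypothetical summary of what a finished review run left on disk.
#[derive(Debug, Clone, PartialEq)]
pub struct ReviewArtifacts {
    /// Exit code parsed from `exit.txt`, if the file was present and numeric.
    pub exit_code: Option<i32>,
    /// `Some(is_error)` when a parseable output file exists, `None` otherwise.
    pub output_is_error: Option<bool>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ReviewOutcome {
    Success { exit_code: i32 },
    Failure { exit_code: i32 },
}

/// Parse the contents of `exit.txt`, tolerating surrounding whitespace.
pub fn parse_exit_code(exit_txt: &str) -> Option<i32> {
    exit_txt.trim().parse().ok()
}

/// Decide the outcome of a review run.
/// A valid output file with `is_error == false` wins over a non-zero exit.
pub fn classify_review(artifacts: &ReviewArtifacts) -> ReviewOutcome {
    // Report the real exit code; fall back to -1 only when it is unknown.
    let exit_code = artifacts.exit_code.unwrap_or(-1);
    let succeeded = match artifacts.output_is_error {
        Some(is_error) => !is_error,
        // Assumption: with no output file, only exit 0 counts as success.
        None => exit_code == 0,
    };
    if succeeded {
        ReviewOutcome::Success { exit_code }
    } else {
        ReviewOutcome::Failure { exit_code }
    }
}
```

Keeping the rule in one function like this is one way to let the runner and review paths share it, so a later change to exit-code handling cannot land in only one of them.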
What Failed / Still Pending
- #3051 still open (`bug(router): gpt-5.3-codex not filtered for opencode agent`): Two agent attempts have failed to land code. The issue points at `is_known_unavailable_model()` needing `gpt-5.3-codex` added for the opencode runner path. Morning review confirmed 3 failures in the prior 24h from this pattern. Status: `blocked` after 2 attempts.
- #3052 still open (`bug(runner): SSH auth failure in push permanently blocks tasks`): Two attempts, no committed code fix. Push path needs SSH handshake errors treated as transient with backoff. Status: `blocked` after 2 attempts.
- #3065 newly opened (`bug: CI-failure-blocked tasks stay stuck for 24h even when PR is already closed`): New bug. Tasks blocked on CI failure are not re-evaluated when the PR closes. Status: `in_progress` with opencode/claude-sonnet-4.6.
- internal:148540 (11+ days blocked): Still unresolved. Owner triage needed immediately; this is well beyond the failure threshold.
- internal:148850 (4 days blocked): Still blocked. Review agent failure threshold exceeded.
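For #3051, the change the issue points at is small. The sketch below shows one possible shape for it. The `(agent, model)` signature, the `KNOWN_UNAVAILABLE` table, and the `routable_models` helper are assumptions made for illustration; the real `is_known_unavailable_model()` may look different.

```rust
/// Hypothetical deny-list of (agent, model) pairs that must never be dispatched.
const KNOWN_UNAVAILABLE: &[(&str, &str)] = &[
    // The entry #3051 asks for on the opencode runner path.
    ("opencode", "gpt-5.3-codex"),
];

/// Returns true when `model` is known to be unavailable for `agent`.
pub fn is_known_unavailable_model(agent: &str, model: &str) -> bool {
    KNOWN_UNAVAILABLE
        .iter()
        .any(|(a, m)| *a == agent && *m == model)
}

/// Drop unavailable models from a candidate list before routing.
pub fn routable_models<'a>(agent: &str, candidates: &[&'a str]) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|m| !is_known_unavailable_model(agent, m))
        .collect()
}
```

Filtering before dispatch is what keeps a known-bad model from consuming a dispatch cycle and incrementing the failure count.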
Routing Accuracy & Agent Observations
- The two kimi fixes (#3059, #3064) together close the full false-failure loop: runner now handles exit-1 on NDJSON completion, and the review runner now also handles exit-1 with valid output. Both paths are consistent.
- Morning review confirmed LLM routing was operational today (minimax, claude, kimi haiku all used) — no round-robin fallback observed. This is an improvement over the prior week pattern.
- opencode/gpt-5.3-codex failures (#3051) persist. Each failure wastes a dispatch cycle and increments failure count. The fix is known and small; the issue is agent execution failing to deliver code.
- #3065 (CI-blocked task resurrection) is a new pattern — tasks blocked waiting for CI that has already concluded or been abandoned stay stuck. This could affect any task with a closed PR.
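One possible rule for the #3065 pattern is sketched below. The fix is still in progress, so none of this reflects the eventual change: `PrState`, `BlockReason`, `Action`, and `next_action` are hypothetical names, and treating a merged PR the same as a closed one is an assumption.

```rust
/// Hypothetical PR states as seen by the orchestrator.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrState {
    Open,
    Closed,
    Merged,
}

/// Hypothetical reasons a task can be blocked.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BlockReason {
    CiFailure,
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    KeepBlocked,
    Reevaluate,
}

/// A task blocked on CI failure is re-evaluated as soon as its PR is no
/// longer open, instead of waiting out the full blocked window.
pub fn next_action(reason: BlockReason, pr: PrState) -> Action {
    match (reason, pr) {
        (BlockReason::CiFailure, PrState::Closed | PrState::Merged) => Action::Reevaluate,
        _ => Action::KeepBlocked,
    }
}
```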
Performance / Bottlenecks
- No new watchdog stalls today. The `llm_budget_secs` fix from #3050 (30s default) appears to have stabilized tick timing.
- One GitHub 503 was recovered automatically (morning review noted this).
- `push_failed` pattern (opencode/gpt-5-mini, 2 failures in prior 24h) not confirmed in today's data. May have been transient or related to the same SSH issue as #3052.
Learnings
- Review runner must mirror runner changes: When fixing agent exit-code handling in `runner/mod.rs`, always check `review.rs` for the same pattern. These two paths handle similar completion detection logic and both need to be updated together.
- Two agent attempts is the empirical limit for #3051 and #3052: These issues have survived 2 agent attempts each. Either the task prompt needs more specificity (exact file + function name), or the agent needs to be different (try `agent:claude` label override). Consider adding `agent:claude complexity:simple` labels to force a targeted approach.
Priorities for Tomorrow (Morning Review)
- Triage internal:148540 (12 days blocked). Run `orch task close internal:148540 --note "exceeded triage window"` or `orch task unblock internal:148540`. This task is well past the point of being actionable.
- Triage internal:148850 (4 days blocked). Run `orch task unblock internal:148850` or close it.
- Force-route #3051 with agent:claude. Two opencode attempts failed. Add the `agent:claude complexity:simple` labels to the issue and run `orch task unblock 3051`. The fix is: add `"gpt-5.3-codex"` to `is_known_unavailable_model()` in the opencode runner.
- Force-route #3052 with agent:claude. Same pattern. The fix is: detect `sign_and_send_pubkey` / SSH handshake errors in the push path and treat them as transient with exponential backoff.
- Monitor #3065. New issue, already in_progress. Check the outcome in the morning.
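The #3052 fix described above can be sketched as a classifier plus a capped exponential backoff. This is a minimal sketch under stated assumptions, not the push path's actual code. Only the `sign_and_send_pubkey` marker comes from this report; the second marker, the 2s base, the 60s cap, and all function and type names are assumptions.

```rust
use std::time::Duration;

/// Substrings in push stderr treated as transient SSH failures.
/// `sign_and_send_pubkey` is named in the report; the second marker is an
/// assumed example of a handshake-stage error.
const TRANSIENT_SSH_MARKERS: &[&str] = &["sign_and_send_pubkey", "kex_exchange_identification"];

pub fn is_transient_push_error(stderr: &str) -> bool {
    TRANSIENT_SSH_MARKERS.iter().any(|m| stderr.contains(*m))
}

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

#[derive(Debug, Clone, PartialEq)]
pub enum PushDecision {
    RetryAfter(Duration),
    Fail,
}

/// Retry transient SSH failures with backoff; fail on anything else or
/// once the attempt budget is spent.
pub fn decide(stderr: &str, attempt: u32, max_attempts: u32) -> PushDecision {
    if is_transient_push_error(stderr) && attempt < max_attempts {
        let delay = backoff_delay(attempt, Duration::from_secs(2), Duration::from_secs(60));
        PushDecision::RetryAfter(delay)
    } else {
        PushDecision::Fail
    }
}
```

Bounding both the delay and the number of attempts means a persistent SSH problem still surfaces as a failure rather than retrying forever, while a brief outage no longer blocks the task permanently.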
Prepared by Orch automation (internal task internal:149129).