Morning Review — 2026-05-06
Recent Commits (last 24h)
| Hash | Message |
|---|---|
| 13e07473 | docs(posts): add evening retrospective for 2026-05-05 (internal:149072) (#3061) |
| 3ed47351 | bug(runner): kimi agent exits with code 1 on successful completion — NDJSON terminal_reason:completed not detected before error path (#3060) |
| 4c692b93 | docs: morning review 2026-05-05 (#3058) |
One meaningful code fix landed: the kimi/glm false-failure bug (#3059/3ed47351). The runner now checks for terminal_reason:completed in NDJSON output before treating a non-zero exit as failure. This eliminates spurious outcome=failed records that were driving unnecessary cooldowns and re-routes.
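
As a rough illustration of the check described above, the sketch below classifies a run by looking for a `terminal_reason` of `completed` in the agent's NDJSON output before consulting the exit code. The `Outcome` enum and the `reported_completed` and `classify` functions are hypothetical names, not taken from the Orch source, and the whitespace-stripped substring match is a simplification; a real implementation would presumably parse each line as JSON.

```rust
/// Hypothetical outcome type; the real runner's record may differ.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Success,
    Failed,
}

/// Returns true if any NDJSON line reports `"terminal_reason": "completed"`.
/// Simplification: matches on whitespace-stripped text instead of parsing JSON,
/// so it could be fooled by the same text appearing inside a string value.
fn reported_completed(ndjson: &str) -> bool {
    ndjson.lines().any(|line| {
        let compact: String = line.chars().filter(|c| !c.is_whitespace()).collect();
        compact.contains("\"terminal_reason\":\"completed\"")
    })
}

/// Consult the agent's own terminal record first; only a non-zero exit
/// without a `completed` record is treated as a failure.
fn classify(exit_code: i32, ndjson: &str) -> Outcome {
    if reported_completed(ndjson) || exit_code == 0 {
        Outcome::Success
    } else {
        Outcome::Failed
    }
}
```

Under these assumptions, an agent that exits with code 1 after emitting a `completed` record is classified as `Outcome::Success`, which is the case that previously produced spurious `outcome=failed` records.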
Operational Summary
Orch v0.70.26. Pipeline active. Agent breakdown for last 24h:
| Agent | Model | Outcome | Count |
|---|---|---|---|
| glm | opus | success | 17 |
| codex | gpt-5.3-codex | success | 12 |
| minimax | opus | success | 12 |
| opencode | github-copilot/claude-sonnet-4.6 | success | 11 |
| opencode | github-copilot/gpt-5-mini | success | 7 |
| claude | sonnet | success | 6 |
| kimi | opus | success | 5 |
| claude | opus | success | 3 |
| opencode | gpt-5.3-codex | failed | 3 |
| kimi | opus | failed | 2 |
| opencode | github-copilot/claude-opus-4.6 | failed | 2 |
| opencode | github-copilot/gpt-5-mini | push_failed | 2 |
| codex | gpt-5.3-codex | failed | 1 |
opencode/gpt-5.3-codex still failing: 3 failures in the last 24h, all with the same `Model not found` pattern. #3051 is open and blocked after 2 attempts. No code fix landed.
push_failed on opencode/gpt-5-mini — 2 push failures; #3052 (SSH retry) still open and blocked.
Log Highlights
- LLM routing working this morning: The router used minimax, claude, and kimi (haiku model) to classify tasks and did not fall back to round-robin at this morning's startup. This is an improvement over yesterday.
- Watchdog triggered once: `tick loop has not completed a tick in 69s (threshold 60s)` during the morning job burst, when 3 internal tasks were created simultaneously (morning-review, morning-briefing, twitter-trending-watch). Single occurrence; slow tick resolved.
- GitHub 503: One GitHub server error (503) retried and recovered automatically.
- Error log clean: `/opt/homebrew/var/log/orch.error.log` is empty; clean state from last restart.
Task Snapshot
| Status | Task | Age | Note |
|---|---|---|---|
| in_progress | internal:149092 | now | This review |
| blocked | #3051 | 2d, 2 tries | gpt-5.3-codex opencode filter |
| blocked | #3052 | 2d, 2 tries | SSH push retry |
| blocked | internal:148850 | 3d | Review agent failure threshold |
| blocked | internal:148540 | 11d | Self-improvement, failure threshold |
Retro Follow-Up (from 2026-05-05 evening)
| Priority | Status |
|---|---|
| Land opencode/gpt-5.3-codex filter | ❌ Still 3 failures today — not fixed |
| Land SSH push retry | ❌ #3052 still blocked, no code |
| Triage internal:148540 (10d blocked) | ❌ Still blocked, now 11d |
| Investigate router LLM cooldown | ✅ LLM routing active this morning (no fallback observed) |
Active Blockers
- #3051 (opencode/gpt-5.3-codex not filtered): Two agent attempts have failed to land a code fix. The `is_known_unavailable_model()` function in the opencode runner path needs `gpt-5.3-codex` added to its exclusion list. Owner action or `orch task unblock 3051` with different agent/model guidance.
- #3052 (SSH push retry): Two attempts, no code fix. The push path needs to treat SSH handshake failures as transient and apply backoff. Owner action or `orch task unblock 3052`.
- internal:148540 (11 days): This task has exceeded the failure threshold and is beyond normal retry. Either close it (`orch task close internal:148540`) or triage manually.
- internal:148850 (3 days): Review agent failure threshold exceeded. Triage needed.
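
For reference, the #3051 change could be as small as the following sketch. Only the function name `is_known_unavailable_model` and the model name `gpt-5.3-codex` come from this review; the signature, the `UNAVAILABLE_OPENCODE_MODELS` constant, and the exact-match comparison are assumptions about code that was not inspected here.

```rust
/// Hypothetical exclusion list; only `gpt-5.3-codex` is named in the review.
const UNAVAILABLE_OPENCODE_MODELS: &[&str] = &["gpt-5.3-codex"];

/// Assumed signature: the real function may take different arguments.
/// Matches the whole model string (case-insensitively), so provider-prefixed
/// names such as `github-copilot/gpt-5-mini` are not affected.
fn is_known_unavailable_model(model: &str) -> bool {
    let model = model.trim();
    UNAVAILABLE_OPENCODE_MODELS
        .iter()
        .any(|excluded| excluded.eq_ignore_ascii_case(model))
}
```

Matching the full string rather than a suffix is a deliberate choice in this sketch: it avoids excluding a provider-prefixed variant that may still be routable.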
Priorities for Today
- Triage blocked tasks: `orch task unblock all` or manually close `internal:148540` and `internal:148850`. These consume DB state and show up as noise in every review.
- Apply #3051 fix: Check `src/engine/runner/agents/opencode.rs` or equivalent for model filtering; add `gpt-5.3-codex` to the exclusion list. This is a small, targeted change.
- Apply #3052 fix: Add SSH error pattern to transient-push-failure detection in `src/engine/runner/git_ops.rs` or response fallback path.
- Monitor push_failed pattern: 2 `push_failed` for opencode/gpt-5-mini in 24h. If this grows, investigate whether it's the same SSH issue as #3052.
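
One possible shape for the #3052 fix is sketched below: classify push stderr as transient by pattern, then retry with capped exponential backoff. Everything here is illustrative. The pattern list, the function names (`is_transient_push_failure`, `backoff_delay`, `push_with_retry`), and the 500 ms base and 8 s cap are assumptions, and the actual push path in `git_ops.rs` may be structured quite differently.

```rust
use std::time::Duration;

/// Hypothetical stderr fragments treated as transient SSH failures (lowercase).
const TRANSIENT_SSH_PATTERNS: &[&str] = &[
    "kex_exchange_identification",
    "ssh_exchange_identification",
    "connection reset by peer",
    "connection timed out",
];

/// True if the push stderr looks like an SSH handshake or connection failure.
fn is_transient_push_failure(stderr: &str) -> bool {
    let lower = stderr.to_ascii_lowercase();
    TRANSIENT_SSH_PATTERNS.iter().any(|p| lower.contains(*p))
}

/// Exponential backoff: 500 ms * 2^attempt, capped at 8 s (assumed values).
fn backoff_delay(attempt: u32) -> Duration {
    let base_ms: u64 = 500;
    let cap_ms: u64 = 8_000;
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(base_ms.saturating_mul(factor).min(cap_ms))
}

/// Runs `push` up to `max_attempts` times, sleeping between transient failures.
/// Non-transient errors are returned immediately without a retry.
fn push_with_retry<F>(max_attempts: u32, mut push: F) -> Result<(), String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut attempt = 0;
    loop {
        match push() {
            Ok(()) => return Ok(()),
            Err(stderr) if attempt + 1 < max_attempts && is_transient_push_failure(&stderr) => {
                std::thread::sleep(backoff_delay(attempt));
                attempt += 1;
            }
            Err(stderr) => return Err(stderr),
        }
    }
}
```

Passing the push as a closure keeps the retry policy testable without a real remote; permission or rejection errors fall through to the final arm and still surface as `push_failed` on the first attempt.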
Prepared by Orch automation (internal task internal:149092, attempt 1).