Morning Review — 2026-05-09


Recent Commits (last 24h)

Hash	Message
`26d79aaf`	docs(posts): add evening retrospective for 2026-05-08 (internal:149254) (#3084)
`d59e028f`	Github issues synced only after restart (#3083)
`b6b2d38d`	fix(runner): synthesize done when NDJSON envelope reports success but result lacks AgentResponse schema (#3082)
`7d37dbfe`	feat(version): warn when deployed service is behind latest release (#3080)
`36334282`	bug(runner): codex --full-auto flag placed before exec subcommand — CLI 0.128.0 broke autonomous codex dispatch (#3076)
`aefb7548`	fix(review): check all attempt dirs for output when exit-1 and output.json missing

Headline: yesterday's two open blockers (#3072 kimi exit-1, #3073 codex --full-auto) are both closed. The 36334282 runner fix changed the failure pattern: pre-fix runs failed with unexpected argument '--full-auto', and post-fix runs no longer hit that error.
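For readers unfamiliar with the flag-order bug, here is a minimal sketch of the corrected argument ordering. The function name `build_codex_argv` and its parameters are hypothetical and are not taken from the runner; the only facts drawn from this review are that `--full-auto` was being placed before the `exec` subcommand and that the CLI rejected it there.

```python
from typing import List


def build_codex_argv(prompt: str, full_auto: bool = True) -> List[str]:
    """Build a codex argv with subcommand flags placed after `exec`.

    Hypothetical helper for illustration only; the real runner code
    is not shown in this review.
    """
    argv = ["codex", "exec"]
    if full_auto:
        # The flag follows the subcommand. Emitting it before `exec`
        # is the ordering that produced "unexpected argument '--full-auto'".
        argv.append("--full-auto")
    argv.append(prompt)
    return argv
```

Building the list in one place keeps the ordering testable without invoking the CLI itself.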

Operational Summary

Orch service: 0.71.1 running, 0.71.2 available, so a patch upgrade is pending (brew upgrade orch && brew services restart orch). The CLI is at 0.71.0.
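The version gap above can be classified mechanically. The sketch below assumes plain dotted numeric versions such as 0.71.1; `parse_version` and `upgrade_kind` are hypothetical names and do not describe how the feat(version) warning is actually implemented.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.71.1' (or 'v0.71.1') into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def upgrade_kind(running: str, latest: str) -> str:
    """Return 'current', 'major', 'minor', or 'patch' for a version pair."""
    run, new = parse_version(running), parse_version(latest)
    if run >= new:
        return "current"
    # The first component that differs names the size of the gap.
    for label, a, b in zip(("major", "minor", "patch"), run, new):
        if a != b:
            return label
    return "patch"
```

With the numbers from this review, `upgrade_kind("0.71.1", "0.71.2")` returns `"patch"`, since only the third component differs.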

Agent breakdown for last 24h (task_runs):

Agent	Model	Outcome	Count
kimi	opus	success	14
opencode	github-copilot/gpt-5-mini	success	13
minimax	opus	success	10
claude	sonnet	success	9
opencode	github-copilot/claude-sonnet-4.6	success	9
codex	gpt-5.3-codex	failed	8
glm	opus	success	7
claude	sonnet	failed	3
kimi	opus	failed	3
codex	gpt-5.3-codex	success	2
opencode	github-copilot/gpt-5-mini	failed	2
opencode	github-copilot/gpt-5.3	failed	2
kimi	opus	rate_limit	1

codex/gpt-5.3-codex: 8 failed / 10 total, but most of the failures are pre-deploy. Detail by timestamp:

6 of the 8 failures were --full-auto flag errors prior to the runner fix landing.
2 post-fix failures (00:10Z and 11:18Z) show a different pattern: codex exit 0: empty-output-exit0. Low volume, monitor.
Most recent codex success at 06:29Z.

kimi/opus: 14 success, 3 failed, 1 rate_limit (3 of 18 runs failed, about 17%). The failure rate looks healthy now that the exit-1 / output.json fix is in.
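The per-pair failure rates can be reproduced from the breakdown table. The sketch below hard-codes the table rows from this review; the function name `failure_rates` is hypothetical, and counting `rate_limit` runs in the total but not as failures is an assumption made here, not something the review states.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[str, str, str, int]  # (agent, model, outcome, count)

ROWS = [
    ("kimi", "opus", "success", 14),
    ("opencode", "github-copilot/gpt-5-mini", "success", 13),
    ("minimax", "opus", "success", 10),
    ("claude", "sonnet", "success", 9),
    ("opencode", "github-copilot/claude-sonnet-4.6", "success", 9),
    ("codex", "gpt-5.3-codex", "failed", 8),
    ("glm", "opus", "success", 7),
    ("claude", "sonnet", "failed", 3),
    ("kimi", "opus", "failed", 3),
    ("codex", "gpt-5.3-codex", "success", 2),
    ("opencode", "github-copilot/gpt-5-mini", "failed", 2),
    ("opencode", "github-copilot/gpt-5.3", "failed", 2),
    ("kimi", "opus", "rate_limit", 1),
]


def failure_rates(rows: Iterable[Row]) -> Dict[str, float]:
    """Return failed / total per 'agent/model' key."""
    totals: Dict[str, int] = defaultdict(int)
    failed: Dict[str, int] = defaultdict(int)
    for agent, model, outcome, count in rows:
        key = f"{agent}/{model}"
        totals[key] += count  # every outcome counts toward the total
        if outcome == "failed":
            failed[key] += count
    return {key: failed[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Print the worst pairs first.
    for key, rate in sorted(failure_rates(ROWS).items(), key=lambda kv: -kv[1]):
        print(f"{key}: {rate:.0%}")
```

On these rows codex/gpt-5.3-codex comes out at 0.80 and kimi/opus at about 0.17, matching the counts quoted above.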

Task Snapshot

Status	Task	Note
in_progress	internal:149285	This review
open issues	(none)	All issues closed

gh issue list --state open returns no open issues — backlog is clear.

Retro Follow-Up (from 2026-05-08 evening)

Priority	Status
Finish runner fix for #3073 (codex flag order)	✅ Closed — `36334282` shipped
Implement primary-path rescue for kimi exit-1 / `output.json`	✅ Closed — `aefb7548` covers attempt dirs; #3072/#3071 closed
Spot-check `task_runs` for repeated error patterns	✅ Done in this review

Active Cooldowns

Key	Remaining	Reason
`codex:gpt-5.3-codex`	2h57m	persisted (model failures)
`glm:haiku`	10h38m	persisted
`opencode:github-copilot/claude-opus-4.6`	8h38m	persisted
`opencode:github-copilot/gpt-5.3`	9h32m	persisted

All of these are standard model-level cooldowns from the generic backoff system. None require intervention.
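To see which cooldown clears first, the Remaining column can be parsed and sorted. This is a minimal sketch that assumes the `NhMMm` format shown in the table; `parse_remaining` and `expiry_order` are hypothetical helpers, not part of orch.

```python
import re
from datetime import timedelta
from typing import Dict, List

_DURATION = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?$")

# Values copied from the Active Cooldowns table.
COOLDOWNS = {
    "codex:gpt-5.3-codex": "2h57m",
    "glm:haiku": "10h38m",
    "opencode:github-copilot/claude-opus-4.6": "8h38m",
    "opencode:github-copilot/gpt-5.3": "9h32m",
}


def parse_remaining(text: str) -> timedelta:
    """Parse strings such as '2h57m', '45m', or '3h' into a timedelta."""
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes = (int(group) if group else 0 for group in match.groups())
    return timedelta(hours=hours, minutes=minutes)


def expiry_order(cooldowns: Dict[str, str]) -> List[str]:
    """Return cooldown keys sorted by remaining time, soonest first."""
    return sorted(cooldowns, key=lambda key: parse_remaining(cooldowns[key]))
```

Here `expiry_order(COOLDOWNS)` puts `codex:gpt-5.3-codex` first and `glm:haiku` last.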

Log Health

Watchdog warns: morning burst produced repeated WATCHDOG: tick loop has not completed a tick in 1025s/2548s events (~10:13Z–12:02Z). Tasks did get processed; this is the same morning-cron-burst pattern already addressed by router.llm_budget_secs=30s and max_tasks_per_tick=1. Settled architecture — not refiling.
Silence detection killed three tasks during the stall window (internal:149285, 149286, 149287); failover to claude succeeded.
GitHub HTTP transients: error decoding response body and 5xx circuit-breaker on api.github.com/graphql around 11:01Z–12:02Z — retry path handled them.
Telegram notification: a single DNS error at 12:02Z — transient, no impact.
/opt/homebrew/var/log/orch.error.log is 0 bytes (truncated by the latest restart), so it shows no errors since that restart.
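The watchdog warnings above follow a simple pattern: compare the time since the last completed tick against a threshold. The sketch below illustrates that pattern only. The class `TickWatchdog`, its threshold parameter, and the injected clock are assumptions for illustration; the message text is the one quoted in the log entry above.

```python
import time
from typing import Callable, Optional


class TickWatchdog:
    """Warns when the tick loop goes too long without completing a tick."""

    def __init__(
        self,
        warn_after_secs: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._warn_after_secs = warn_after_secs  # hypothetical threshold
        self._clock = clock
        self._last_completed = clock()

    def tick_completed(self) -> None:
        """Record that a tick finished; resets the stall timer."""
        self._last_completed = self._clock()

    def check(self) -> Optional[str]:
        """Return a warning string if the loop is stalled, else None."""
        stalled_secs = self._clock() - self._last_completed
        if stalled_secs >= self._warn_after_secs:
            return (
                "WATCHDOG: tick loop has not completed a tick in "
                f"{int(stalled_secs)}s"
            )
        return None
```

Injecting the clock makes the stall condition easy to exercise in a test without sleeping.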

Priorities for Today

Run the upgrade: service is on 0.71.1, latest is 0.71.2 — brew update && brew upgrade orch && brew services restart orch. Use the new feat(version) warning to keep the deployment current.
Watch codex post-fix: confirm --full-auto flag errors do not reappear in the next 24h, and keep an eye on the new empty-output-exit0 pattern (2 occurrences). If it climbs, file a generic-classifier issue (do not add per-model handling).
Plan smaller morning bursts: the backlog is empty, so today is a quiet day and a good window to check whether the morning-cron-burst stalls have actually subsided now that #3073/#3072 are closed and there are fewer failover/cooldown cycles to absorb.
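If the empty-output-exit0 pattern does climb, a generic classifier could look roughly like the sketch below. It keys only on exit code and output, never on the agent or model, in line with the "no per-model handling" note above. The function name `classify_run` and its signature are hypothetical; only the label string comes from the logs quoted in this review.

```python
from typing import Optional


def classify_run(exit_code: int, output: str) -> Optional[str]:
    """Return a failure label for a finished run, or None if it looks fine.

    Model-agnostic on purpose: the same rule applies to every agent.
    """
    if exit_code == 0 and not output.strip():
        # Process reported success but produced nothing usable.
        return "empty-output-exit0"
    return None
```

For example, `classify_run(0, "  \n")` returns `"empty-output-exit0"`, while a run with exit code 0 and non-empty output returns `None`.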

Issues Filed This Review

None. No new operational problems requiring an issue. Open backlog is empty; recurring patterns (morning watchdog warns, GitHub transients) are either settled-architecture or expected transients per the closed-issue history.

Prepared by Orch automation (internal task internal:149285, attempt 1).
