Gabriel Koerich Orchestrator

Evening Retrospective — 2026-03-03

Summary

Strong ops day focused on agent reliability. The biggest wins were eliminating git-fetch failures via pre-fetching, fixing orch stream output delivery, and breaking a needs_review re-route loop. Several resilience improvements landed around GitHub rate limits, webhook deduplication, and re-routing failed agents with no commits. Service health looks stable.


Morning Review Recap

Priority | Outcome
Monitor status-based review workflow | No regressions observed; review gate loop with no PR fixed.
Consider reducing stuck threshold | Not addressed today.
Keep an eye on review edge cases | Review gate loop fixed; no new stuck in_review reports.

Tasks Completed Today

Area | Changes | Notes
Agent workflow | pre-fetch branches + remove git fetch from agent prompt; re-route on agent failure with no commits; break needs_review re-route loop | Eliminates sandbox git-fetch failures and prevents infinite reroute loops.
Streaming & sessions | orch stream output fixed; duplicate tmux session creation fixed | Restores live output observability and avoids duplicate sessions.
GitHub sync reliability | proactive API rate limit handling; webhook delivery dedupe; review gate loop when no PR fixed | Fewer stuck tasks during API blips and no-PR review loops.
Performance | N+1 GitHub API calls removed in dashboard/status | CLI should be noticeably faster.
Opencode integration | permissions blocked by global config fixed; NDJSON review response parsing fixed | Reduces opencode-specific failures.
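
The NDJSON review-response fix can be pictured with the sketch below. NDJSON carries one JSON value per line, so a tolerant parser reads line by line and skips blank or malformed lines instead of failing the whole response. The function name `parse_ndjson` and the sample event fields are hypothetical; this is not orch's or opencode's actual code.

```python
import json
from typing import Any


def parse_ndjson(raw: str) -> list[dict[str, Any]]:
    """Parse newline-delimited JSON, skipping blank or malformed lines."""
    events: list[dict[str, Any]] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a truncated or non-JSON line
        if isinstance(value, dict):
            events.append(value)
    return events


# Hypothetical payload: two valid records, one blank line, one truncated line.
sample = '{"type": "text", "text": "LGTM"}\n\n{"type": "done"}\n{"type": "tex'
events = parse_ndjson(sample)
```

Skipping bad lines rather than raising is a design choice: a single truncated record at the end of a stream should not discard the review verdict that came before it.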

What Didn't Go Well

  • GitHub API instability continues to surface (rate limits, transient failures). Mitigation landed today, but this remains a recurring risk.
  • Routing loop risk: the needs_review re-route loop fix suggests earlier logic allowed repeated re-dispatches without progress. Fixed, but worth monitoring.

Prompt Effectiveness

Prompt | Assessment
prompts/agent_system.md | Clear and strict; workflow checklist and sandbox constraints are explicit.
prompts/route.md | Good; label-bias guidance is explicit and should reduce misrouting.
prompts/review_task.md | Strong structure, but still instructs git fetch which conflicts with sandbox guidance. Recommend aligning with “no fetch” workflow.

No prompt edits were made today; alignment between review workflow and sandbox constraints is the main candidate.


Routing Accuracy

  • No obvious misroutes today. The fixes were about failure recovery, not incorrect executor selection.
  • Re-route logic is now more conservative when agents fail without commits, which should reduce wasted cycles.
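
The conservative re-route behavior described above can be sketched as a small decision function. Everything here is illustrative: the `Attempt` type, the `next_action` name, the returned strings, and the cap of two re-routes are assumptions, not orch's actual routing code.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    agent: str    # which executor ran the task
    failed: bool  # the agent exited with an error
    commits: int  # commits the agent produced on the task branch


def next_action(attempts: list[Attempt], agents: list[str], max_reroutes: int = 2) -> str:
    """Decide what to do after the latest attempt on a task."""
    last = attempts[-1]
    if not last.failed or last.commits > 0:
        return "continue"  # there is work to review; do not re-dispatch
    tried = {a.agent for a in attempts}
    untried = [a for a in agents if a not in tried]
    reroutes_so_far = len(attempts) - 1
    if reroutes_so_far >= max_reroutes or not untried:
        return "blocked"  # stop instead of looping without progress
    return f"reroute:{untried[0]}"


agents = ["claude", "kimi", "opencode"]
first = next_action([Attempt("claude", failed=True, commits=0)], agents)
```

The cap and the "never retry an agent that already failed" rule are what prevent an unbounded loop: each re-route must use up either an untried agent or one unit of the budget.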

Performance & Bottlenecks

  • GitHub API: rate-limit handling improved; still a bottleneck during spikes.
  • Webhook delivery: deduping should reduce redundant sync work.
  • CLI performance: N+1 API calls removed in dashboard/status.
  • No lock contention observed in the commit set; live streaming now works again.
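
Webhook deduplication of the kind mentioned above can be sketched as a bounded "seen" set. This example assumes the dedupe key is the delivery ID GitHub sends in the `X-GitHub-Delivery` header; the class name `DeliveryDeduper` and the capacity are hypothetical.

```python
from collections import OrderedDict


class DeliveryDeduper:
    """Remember recent delivery IDs so repeated deliveries can be skipped."""

    def __init__(self, capacity: int = 1024) -> None:
        self._seen: OrderedDict[str, None] = OrderedDict()
        self._capacity = capacity

    def is_new(self, delivery_id: str) -> bool:
        if delivery_id in self._seen:
            self._seen.move_to_end(delivery_id)  # refresh recency
            return False
        self._seen[delivery_id] = None
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


deduper = DeliveryDeduper(capacity=2)
results = [deduper.is_new(d) for d in ["a", "a", "b", "c", "a"]]
```

The bound keeps memory flat on a long-running service; the trade-off is that an ID evicted from the window is treated as new if it shows up again, as the final `"a"` in the example does.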

New Issues Filed

None. gh issue list --state open failed due to GitHub API connectivity, so existing issues could not be checked and nothing new was filed, to avoid creating duplicates.


Tomorrow's Priorities

  1. Align review prompt with sandbox constraints (remove git fetch step, rely on pre-fetched refs). Confirm no open issue already exists before filing.
  2. Monitor API rate limit behavior with the new proactive handling under real load.
  3. Watch re-route behavior for needs_review tasks to ensure the loop fix sticks under concurrent task traffic.

Evening Update — 2026-03-03 (Late)

Morning Review Follow-ups

Priority | Status | Notes
Align review prompt | ✅ Done | prompts/review_task.md now uses git rebase origin/main instead of git fetch; matches agent_system.md guidance.
Reduce stuck thresholds | ❌ Not done | Still pending; low priority but would improve responsiveness.
Monitor review edge cases | ✅ Done | Review gate loop fix holding; no new stuck in_review reports.

Today's Commits (Major Security & Reliability Push)

The following significant changes landed today:

Commit | Description | Impact
8c630f0 | Add tmux session env helpers, avoid embedding GH_TOKEN in runner scripts | Security: tokens no longer in per-task scripts
6a467ee | Integrate async GitHub App token resolution in GhHttp | Reliability: native auth without gh CLI
a78c955 | CI: scope contents:write permission to release job only | Security: least privilege
6601d25 | fix: improve capture diffing | Reliability
843e1c5 | Drop runtime gh dependency from Homebrew formula; prefer native GhHttp | Simplicity: fewer runtime deps
e89b234 | Secure GH_TOKEN handling | Security
0157f63 | Make GhHttp primary for PR creation, remove gh CLI fallback | Reliability
22bca09 | Tests: prevent secret leakage in task artifacts | Security
21f0396 | Consolidate gh CLI usage, add safe wrapper | Reliability
f4a16e3 | Eliminate on-disk runner scripts — PTY-based agent runner | Simplicity
c27a1b2 | Add GitHub App auth and robust token resolver | Reliability

Theme: Major hardening around token handling, removing gh CLI dependency, and improving auth robustness.
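
To illustrate the token-handling theme, the sketch below passes a token to a child process through its environment instead of writing it into a script file or a command line. The helper name `run_with_token` and the dummy token are hypothetical; orch's own helpers (for example the tmux session env helpers from 8c630f0) may work differently.

```python
import os
import subprocess
import sys


def run_with_token(argv: list[str], token: str) -> subprocess.CompletedProcess:
    """Run a child process with GH_TOKEN set only in its environment."""
    env = dict(os.environ)
    env["GH_TOKEN"] = token  # never interpolated into argv or a file on disk
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# The child reads the token from its environment; here it only reports the length.
child = [sys.executable, "-c", "import os; print(len(os.environ['GH_TOKEN']))"]
result = run_with_token(child, token="dummy-token-for-illustration")
```

Because the token never appears in `argv` or in a generated script, it does not end up in per-task files that could later be captured as artifacts.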

Current Open Issues

Issue | Status | Agent | Task
#404 | in_progress | minimax | Evening retrospective (this task)
#395 | needs_review | claude | Discord Gateway WebSocket
#386 | needs_review | claude | PTY-based runner (done, needs merge)
#378 | in_progress | kimi | Centralize token resolver
#372 | needs_review | opencode | Code development

Observations

  1. Prompt alignment complete: Both agent_system.md and review_task.md now consistently instruct agents to use git rebase instead of git fetch, matching the orchestrator's pre-fetch workflow.
  2. Security posture improved: GH_TOKEN no longer embedded in per-task runner scripts; GitHub App auth integrated natively.
  3. gh CLI dependency reduced: Native GhHttp now handles PR creation and other GitHub API operations.
  4. Test failures observed: Some CI runs failed (see recent gh run list output). The review gate passed, but the failures may still need investigation.
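
A leak check in the spirit of the "prevent secret leakage in task artifacts" tests can be sketched as a scan of an artifact directory for a known secret value. The function name `files_containing`, the file names, and the dummy secret are all hypothetical; the real tests in 22bca09 are not shown in this retrospective.

```python
import tempfile
from pathlib import Path


def files_containing(root: Path, secret: str) -> list[Path]:
    """Return every file under root whose bytes include the secret."""
    needle = secret.encode()
    return sorted(
        p for p in root.rglob("*") if p.is_file() and needle in p.read_bytes()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "clean.log").write_text("agent finished\n")
    (root / "leaky.sh").write_text("export GH_TOKEN=dummy-secret\n")
    leaks = [p.name for p in files_containing(root, "dummy-secret")]
```

A test built on this would assert that the list is empty after a task run; the example deliberately plants one leaky file to show what a failure looks like.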

Tomorrow's Priorities (Updated)

  1. Investigate test failures: check recent CI failures for root cause.
  2. Merge #386: the PTY runner change is in needs_review and should be ready.
  3. Stuck threshold reduction: still pending; revisit lowering the timeouts.
  4. Monitor #378: token resolver centralization is in progress.
