Close the Loop

I’ve been shipping with Codex for almost a year. The workflow that changed things most is the goal feature combined with a disciplined four-phase loop.

Conversation Before Code

Bring Codex the feature or problem. Ask what the possible approaches are. Explore trade-offs. Settle on one. The critical constraint: do not let Codex start coding yet. This is the planning phase. You are building alignment, not a diff.

Goals work best when the task is well-defined. A vague goal produces vague loops. A clear implementation decision produces a clear finish line.

Set the Goal

Once you have agreed on the approach, hand control to the model with this prompt:

Okay, now set your own goal using set_goal() for this session to close the loop by [implementing XYZ] and leave no issues behind. Phase 1: plan comprehensive fix. Phase 2: implement. Phase 3: comprehensive review on all working changes. Phase 4: evaluate — either fix issues by restarting the loop at Phase 1, or, if no actionable findings, goal met.

The model sets the goal itself. From here, it runs.

The Loop

The four-phase structure is what makes this more than a long prompt.

Plan → Implement → Review → Evaluate. If review in phase three finds something, phase four sends it back to the start. It does not patch and move on. It replans.

This self-correction is what removes most of the rework. The model is not just executing; it is checking its own work against a known standard and deciding whether that standard has been met.
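The control flow of the loop can be sketched in a few lines of Python. This is an illustration of the four phases, not how Codex implements goals internally: run_goal_loop, LoopResult, and the plan, implement, and review callables are all hypothetical names, and the max_iterations cap is an added assumption the workflow above does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    iterations: int  # number of full plan-implement-review passes that ran
    goal_met: bool   # True only if a review pass returned no findings


def run_goal_loop(
    plan: Callable[[List[str]], str],
    implement: Callable[[str], None],
    review: Callable[[], List[str]],
    max_iterations: int = 10,  # assumption: a safety cap, not part of the original prompt
) -> LoopResult:
    """Hypothetical sketch of the Plan -> Implement -> Review -> Evaluate loop."""
    findings: List[str] = []
    for iteration in range(1, max_iterations + 1):
        current_plan = plan(findings)  # Phase 1: plan, or replan around open findings
        implement(current_plan)        # Phase 2: implement the plan
        findings = review()            # Phase 3: comprehensive review of all working changes
        if not findings:               # Phase 4: no actionable findings, goal met
            return LoopResult(iterations=iteration, goal_met=True)
        # Actionable findings: go back to Phase 1 instead of patching in place.
    return LoopResult(iterations=max_iterations, goal_met=False)
```

The point of the sketch is the branch in phase four: findings feed the next plan rather than triggering a spot fix, so every correction is reviewed again before the goal can close.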

Sessions routinely run for several hours. The longest one I had recently was sixteen hours.

What Makes the Review Phase Work

Phase three is only as good as the review tooling behind it.

The skill doing the work is jig-review:comprehensive-review from jig-skills. When the goal reaches phase three and the model sees “comprehensive review,” it picks up this skill automatically, because that phrase is what the skill’s trigger is configured to match. The skill runs $cc:review (Claude Code’s own review skill) and Codex’s native review independently over the same scope, then deduplicates and merges the findings, attributing each to Claude, Codex, or both and ordering them by severity. Running $cc:review inside Codex requires the cc-plugin-codex bridge; the bridge is a prerequisite, not the reviewer.

Two independent passes over the same diff catch different things. When both reviewers flag the same issue independently, that agreement is what elevates it: a marginal finding from one reviewer becomes a confirmed finding from both.
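A minimal sketch of the merge step, to make the attribution concrete. It does not reproduce the actual jig-review logic, which is not shown here: Finding, MergedFinding, merge_findings, the severity labels, and the matching rule (same file, same line, same normalized summary) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Assumed severity scale; lower rank sorts first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    summary: str
    severity: str


@dataclass
class MergedFinding:
    finding: Finding
    sources: Set[str]

    @property
    def attribution(self) -> str:
        # "both" when two reviewers agree, otherwise the single reviewer's name.
        return "both" if len(self.sources) > 1 else next(iter(self.sources))


def merge_findings(claude: List[Finding], codex: List[Finding]) -> List[MergedFinding]:
    """Deduplicate two review passes and order the result by severity."""
    merged: Dict[Tuple[str, int, str], MergedFinding] = {}
    for source, findings in (("Claude", claude), ("Codex", codex)):
        for f in findings:
            # Hypothetical matching rule: same location and same normalized summary.
            key = (f.file, f.line, f.summary.strip().lower())
            entry = merged.get(key)
            if entry is None:
                merged[key] = MergedFinding(finding=f, sources={source})
                continue
            entry.sources.add(source)
            # On a duplicate, keep whichever copy reports the higher severity.
            if SEVERITY_RANK[f.severity] < SEVERITY_RANK[entry.finding.severity]:
                entry.finding = f
    # Most severe first; within a severity, findings confirmed by both come first.
    return sorted(
        merged.values(),
        key=lambda m: (
            SEVERITY_RANK[m.finding.severity],
            -len(m.sources),
            m.finding.file,
            m.finding.line,
        ),
    )
```

Exact-match keys are the simplest possible rule. Two reviewers rarely word a finding identically, so a real merge would likely need fuzzier matching than this.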

After the Loop

When the goal completes, I usually run one more manual comprehensive review. It occasionally surfaces a few things worth cleaning up, but far fewer than a regular session used to leave behind.

Before goals, rework was the norm. Now the loop handles most of it before I see the output.

The goal feature gives the model something it usually lacks in long sessions: a formal obligation to keep checking until the work is actually done.
