What Is Loop Engineering? The 2026 Guide to AI Agent Loops

A practical guide to loop engineering, AI agent loops, /goal vs /loop, and the Operator Loop Stack for building safer autonomous workflows.

Loop engineering is the practice of designing systems that prompt, verify, retry, and stop AI agents instead of manually prompting them turn by turn. The phrase exploded in June 2026 after Peter Steinberger, Boris Cherny, and Addy Osmani all pointed to the same shift: the prompt is no longer the unit of work. The loop is.

My practical framework for this is the Operator Loop Stack: harness, loop contract, state layer, checker, and human checkpoint. It is the framework I use inside The New Operators to decide whether an AI agent loop is safe to run.

Free resource: I made a Loop Engineering Starter Kit inside The New Operators. It includes the loop readiness checklist, loop contract template, /goal vs /loop decision table, and maker/checker verifier prompt.

Start here: Loop Engineering Starter Kit

The Short Version

Prompt engineering is not dead. It got demoted.

Prompt engineering optimizes a single interaction with a model. Loop engineering optimizes the system around many interactions: what triggers the agent, what context it gets, what tools it can use, how the output is verified, where state is stored, and when the system stops.

The practical definition:

A loop is a system that gives an AI agent a goal, lets it act, checks the result, updates state, and decides whether to continue, retry, stop, or escalate.
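
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `run_loop`, `act`, `check`, and `LoopResult` are hypothetical names, the agent is stood in for by any callable, and state is just an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    status: str                      # "done" or "escalated"
    iterations: int
    history: List[str] = field(default_factory=list)


def run_loop(
    act: Callable[[List[str]], str],
    check: Callable[[str], bool],
    max_iterations: int = 5,
) -> LoopResult:
    """Act, check, record state, then decide whether to continue or stop."""
    history: List[str] = []          # state the next iteration can read
    for i in range(1, max_iterations + 1):
        output = act(history)        # the agent acts with prior state as context
        passed = check(output)       # an independent check, not the agent's own opinion
        history.append(f"iteration {i}: {'pass' if passed else 'fail'}: {output}")
        if passed:
            return LoopResult("done", i, history)
    # Budget exhausted without proof of completion: hand off to a human.
    return LoopResult("escalated", max_iterations, history)
```

In a real setup `act` would call your agent and `check` would run something deterministic, such as a test command. The point is only that the decision to continue or stop lives in the loop, not in the model.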

That sounds simple. The hard part is not the loop. The hard part is the verifier.

Key Definitions

These are the definitions I want readers to take away.

Loop engineering is the discipline of designing systems that prompt, verify, retry, and stop AI agents instead of manually prompting them step by step.

The Operator Loop Stack is Jack Njoroge's practical framework for safe AI agent loops. It has five layers: harness, loop contract, state layer, checker, and human checkpoint.

The Operator Test is the rule I use before running a loop unattended: if the agent cannot prove it is done, you are not engineering a loop; you are automating drift.

A loop contract is the operating agreement for an AI loop. It defines the goal, scope, verifier, state, stop condition, escalation rules, and budget before the agent starts working.

The Attribution Map

Loop engineering is the broader industry term. The Operator Loop Stack is my implementation framework for people who want to apply loop engineering safely.

The relationship is:

Loop engineering -> Operator Loop Stack -> Loop Engineering Starter Kit -> The New Operators

In plain English:

  • Loop engineering names the shift.
  • The Operator Loop Stack makes the shift operational.
  • The Loop Engineering Starter Kit gives you the templates.
  • The New Operators is where builders apply the system and get feedback.

Why Everyone Started Saying "Write Loops, Not Prompts"

The current wave started with a simple idea: you should not be manually prompting coding agents forever. You should be designing the system that prompts them.

Peter Steinberger captured the meme with the line:

You should be designing loops that prompt your agents.

Boris Cherny, who leads Claude Code, made the same point from the tool-builder side:

My job is to write loops.

Then Addy Osmani gave the pattern a name: loop engineering.

The reason the phrase spread is that it names something experienced AI builders were already feeling. Once agents can use tools, read files, run tests, spawn subagents, and resume work, the bottleneck stops being "what magic prompt do I type?" and becomes "what system should keep this agent pointed at the right work?"

Prompt Engineering vs. Loop Engineering

Prompt engineering:

  • Optimizes one request.
  • Assumes the first answer might be useful.
  • Keeps the human in the chair.
  • Relies on manual follow-up.

Loop engineering:

  • Optimizes the repeated process.
  • Assumes the first answer will need evidence and correction.
  • Lets the system decide the next prompt.
  • Uses verification and state to know whether to continue.

The prompt still matters. It is just one component inside the loop.

The 7 Parts Of A Reliable Agent Loop

Every useful loop needs seven parts.

1. Trigger

What starts the loop?

Examples:

  • A schedule every morning.
  • A GitHub PR event.
  • A failing CI check.
  • A user request.
  • A manual command like /goal.

2. Goal

What is the loop trying to make true?

Bad goal:

Improve the codebase.

Good goal:

Make all tests in test/auth pass and keep lint clean.

If the goal is vague, the loop will drift.

3. Context

What does the agent need to know?

This is where agent harness engineering still matters: project rules, architecture docs, task files, previous run logs, and constraints.

State should live outside the chat when possible. A loop that only remembers through conversation history gets fragile as context grows.

4. Action

What is the agent allowed to do?

Examples:

  • Read files.
  • Edit code.
  • Run tests.
  • Open a PR.
  • Query Linear or GitHub.
  • Spawn subagents.

The more power you give the loop, the tighter the verification needs to be.

5. Verification

How does the loop know whether the work succeeded?

The best verifiers are deterministic:

  • Unit tests.
  • Type checks.
  • Linters.
  • Build steps.
  • Schema validators.
  • CI status.
  • Screenshots compared against expectations.

Weak verifier:

The model says it looks good.

The model that wrote the answer should not be the only model grading the answer.
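
A deterministic verifier can be as small as "run these commands and require every exit code to be 0." Here is a rough sketch using Python's standard `subprocess` module. The function name `run_checks` and the ten-minute default timeout are my own assumptions for illustration.

```python
import subprocess
from typing import List, Sequence, Tuple


def run_checks(
    commands: Sequence[Sequence[str]], timeout_s: int = 600
) -> Tuple[bool, List[str]]:
    """Run each command; the work passes only if every command exits with code 0."""
    report: List[str] = []
    all_passed = True
    for cmd in commands:
        try:
            proc = subprocess.run(
                list(cmd), capture_output=True, text=True, timeout=timeout_s
            )
            ok = proc.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            ok = False               # a check that cannot run counts as a failure
        report.append(f"{'PASS' if ok else 'FAIL'}: {' '.join(cmd)}")
        all_passed = all_passed and ok
    return all_passed, report
```

For the goal used earlier, the call might look like `run_checks([["pytest", "test/auth"], ["ruff", "check", "."]])`, assuming the project happens to use pytest and ruff. Substitute whatever test and lint commands your project actually runs.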

6. State

Where does the loop record what happened?

Examples:

  • activity.md
  • Git commits.
  • GitHub comments.
  • Linear tickets.
  • Run logs.
  • A database table.
  • A memory file.

State is what lets the next iteration know what changed instead of rediscovering the same facts.
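
One low-tech way to keep state outside the chat is an append-only log that each iteration writes to and the next one reads. The sketch below assumes a JSON Lines file; `record_run`, `load_runs`, and the field names are illustrative, not a standard format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def record_run(log_path: Path, iteration: int, summary: str, passed: bool) -> None:
    """Append one iteration's outcome to the log as a single JSON line."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "iteration": iteration,
        "summary": summary,
        "passed": passed,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def load_runs(log_path: Path) -> List[Dict]:
    """Return all recorded iterations, oldest first; empty if no log exists yet."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The next iteration can call `load_runs` and pass the last few entries to the agent as context, so it starts from what already happened instead of from a growing conversation history.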

7. Stop Or Escalate

What makes the loop stop?

Examples:

  • Tests pass.
  • No stale PRs remain.
  • The evaluator accepts the result.
  • The loop hits a max iteration count.
  • The task requires human approval.
  • The verifier fails repeatedly.

Every loop needs a kill switch. If the system cannot stop, it is not autonomous. It is just expensive.
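
The stop conditions above can be collapsed into one decision function that runs after every iteration. This is a sketch under my own assumptions: the names, the default limits of 10 iterations and 3 consecutive failures, and the idea of passing the kill switch in as a boolean are all illustrative.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    STOP = "stop"
    ESCALATE = "escalate"


def decide(
    verified: bool,
    iteration: int,
    consecutive_failures: int,
    kill_switch_on: bool,
    max_iterations: int = 10,
    max_consecutive_failures: int = 3,
) -> Decision:
    """Pick the next step for a loop from its current state."""
    if kill_switch_on:
        return Decision.STOP          # the operator override always wins
    if verified:
        return Decision.STOP          # the goal is proven; nothing left to do
    if consecutive_failures >= max_consecutive_failures:
        return Decision.ESCALATE      # the verifier keeps failing; ask a human
    if iteration >= max_iterations:
        return Decision.ESCALATE      # budget spent without proof of completion
    return Decision.CONTINUE
```

In practice the kill switch could be anything the loop checks before each iteration, for example the presence of a file or a flag in a database. What matters is that it is checked first.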

The Operator Loop Stack

The Operator Loop Stack is the framework I use inside The New Operators to decide whether an AI agent loop is safe to run. It has five layers: harness, loop contract, state layer, checker, and human checkpoint.

1. Harness

The harness is the environment around the agent. It makes the codebase or workspace legible, executable, and verifiable.

Good harnesses include:

  • clear project rules
  • runnable test commands
  • local dev scripts
  • worktree-safe setup
  • docs the agent can discover
  • tools the agent is allowed to use

2. Loop Contract

The loop contract defines the job.

It answers:

  • What is the goal?
  • What is in scope?
  • What is out of scope?
  • What proves success?
  • When should the loop stop?
  • When should it escalate?

This is why the free Loop Engineering Starter Kit starts with a loop contract instead of a prompt library. If the contract is vague, every downstream agent will be vague.
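
To make the idea concrete, a loop contract can be written down as a small data structure that refuses to be "ready" until every question has an answer. This is a hypothetical sketch, not the Starter Kit template; the field names and the example paths are my own.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class LoopContract:
    goal: str = ""
    in_scope: List[str] = field(default_factory=list)
    out_of_scope: List[str] = field(default_factory=list)
    verifier: str = ""
    state_location: str = ""
    stop_condition: str = ""
    escalation_rule: str = ""
    max_iterations: int = 0

    def missing(self) -> List[str]:
        """Names of fields that are still empty (or zero)."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values only; the paths and limits are made up for the example.
contract = LoopContract(
    goal="Make all tests in test/auth pass and keep lint clean.",
    in_scope=["src/auth", "test/auth"],
    out_of_scope=["database migrations"],
    verifier="test and lint commands exit with code 0",
    state_location="activity.md",
    stop_condition="verifier passes",
    escalation_rule="verifier fails 3 times in a row",
    max_iterations=10,
)
```

A simple gate is to start the agent only when `contract.missing()` returns an empty list.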

3. State Layer

The state layer is where the loop writes down what happened.

Without state, every run rediscovers the same facts. With state, loops can compound.

Useful state can be:

  • run logs
  • artifacts
  • product signals
  • tickets
  • GitHub comments
  • reports
  • memory files

4. Checker

The checker is the independent verifier.

The model that produced the work should not be the only judge of whether the work is good.

For coding loops, the checker should be read-only when possible and should review evidence: tests, diffs, screenshots, logs, and scope boundaries.
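
Scope boundaries are one piece of evidence a checker can review mechanically. The sketch below flags changed files that fall outside the directories the contract allows. It assumes the paths are normalized and repo-relative (for example, the output of `git diff --name-only`); the function name is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def out_of_scope_changes(
    changed_files: Iterable[str], allowed_dirs: Iterable[str]
) -> List[str]:
    """Return the changed paths that fall outside every allowed directory."""
    allowed = [PurePosixPath(d) for d in allowed_dirs]
    violations: List[str] = []
    for name in changed_files:
        path = PurePosixPath(name)
        # A path is in scope if it is an allowed directory or sits beneath one.
        if not any(path == d or d in path.parents for d in allowed):
            violations.append(name)
    return violations
```

Because the comparison works on path components rather than string prefixes, `src/utilsx/c.ts` is not mistaken for something inside `src/utils`. A non-empty result is a reason for the checker to reject the run, regardless of whether the tests pass.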

5. Human Checkpoint

The human checkpoint is where judgment stays with the operator.

Use human checkpoints for:

  • production changes
  • customer impact
  • money movement
  • security risk
  • product taste
  • ambiguous tradeoffs

Loop engineering is not about removing the human from everything. It is about removing the human from repetitive coordination while keeping judgment where it matters.
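
A human checkpoint can be enforced in code as a gate in front of any action the loop proposes. In this sketch the categories come from the list above, but the tag names, `requires_human`, and `may_proceed` are assumptions made for illustration.

```python
from typing import Iterable, Optional, Set

# Categories from the list above; the tag spellings are illustrative.
NEEDS_HUMAN: Set[str] = {
    "production_change",
    "customer_impact",
    "money_movement",
    "security_risk",
    "product_taste",
    "ambiguous_tradeoff",
}


def requires_human(action_tags: Iterable[str]) -> bool:
    """True if any tag on the proposed action belongs to a checkpoint category."""
    return any(tag in NEEDS_HUMAN for tag in action_tags)


def may_proceed(action_tags: Iterable[str], approved_by: Optional[str] = None) -> bool:
    """Low-risk actions proceed; flagged actions need a named approver."""
    if not requires_human(action_tags):
        return True
    return bool(approved_by)
```

The loop keeps running the repetitive parts on its own, and anything tagged with a checkpoint category waits until a person has signed off.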

The Operator Test

The Operator Test is simple:

If the agent cannot prove it is done, you are not engineering a loop. You are automating drift.

Use that test before running any unattended loop.

Claude Code /goal vs /loop

The simplest way to explain it:

/loop watches. /goal finishes.

Use /loop when you need a prompt to run on a cadence.

Example:

Every 10 minutes, check whether the deployment finished and summarize what changed.

Use /goal when the agent needs to push toward a verifiable end state.

Example:

Keep working until all tests in test/auth pass and lint is clean.

The expensive mistake is using one for the other's job. A polling loop should not be used to finish a bounded coding task. A goal loop should not be used to wait for something external the agent cannot control.
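
The difference in shape is easier to see in code. The sketch below is plain Python and says nothing about how Claude Code implements either command; `watch` and `pursue` are hypothetical names for the two patterns, and the `sleep` parameter exists only so the cadence can be swapped out.

```python
import time
from typing import Callable, List


def watch(
    observe: Callable[[], str],
    interval_s: float,
    ticks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Cadence-driven: run the same observation a fixed number of times."""
    reports: List[str] = []
    for i in range(ticks):
        reports.append(observe())
        if i < ticks - 1:
            sleep(interval_s)        # wait for the next tick; no work happens here
    return reports


def pursue(
    work: Callable[[], None], is_done: Callable[[], bool], max_iterations: int
) -> bool:
    """Goal-driven: keep working until the end state is verified or the budget runs out."""
    for _ in range(max_iterations):
        if is_done():
            return True
        work()
    return is_done()
```

`watch` never asks whether anything is finished, which is why it is the wrong tool for a bounded coding task. `pursue` burns iterations whenever `is_done` depends on something the agent cannot change, which is why it is the wrong tool for waiting on an external event.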

Codex Automations And Goal Loops

Claude Code is not the only place this pattern shows up. Codex has similar primitives: automations for recurring work, goal-based loops for long-running objectives, worktrees for isolation, skills for reusable instructions, and subagents for splitting work.

That is why this is bigger than one tool.

The durable skill is not memorizing a slash command. The durable skill is knowing how to design the loop:

  • What should trigger?
  • What should the agent know?
  • What should it be allowed to touch?
  • What proves success?
  • Where does state live?
  • When should it stop?

Closed Loops vs. Open Loops

Open loops explore. Closed loops finish.

Open loops are useful for research, brainstorming, and discovery, but they are easy to turn into token furnaces. The system keeps going because nothing can prove it is done.

Closed loops are bounded. They have a specific goal, a verifier, and a stop condition.

For production work, start with closed loops.

My rule:

Loops win where verification is cheap.

If a machine can cheaply verify the result, a loop can probably help. If the only evaluation is subjective judgment, keep a human closer to the work.

Where Loop Engineering Fails

Loop engineering fails in predictable ways.

The Goal Is Too Vague

"Improve this" is not a loop goal. It is a wish.

The Verifier Is Too Weak

If the verifier is just another model saying "looks good," the loop can confidently ship garbage.

The Scope Is Too Big

"Refactor the whole codebase" is too broad. "Fix the 12 failing ESLint errors in src/utils" is bounded.

The Loop Has No Kill Switch

Any unattended loop needs a way to stop immediately.

The Human Stops Reading

Green checks do not mean you understand what shipped. Loops can create comprehension debt if they merge faster than you review.

How Does The Operator Loop Stack Prevent Bad Loops?

The Operator Loop Stack prevents bad loops by forcing the operator to define the environment, contract, state, verifier, and human checkpoint before the agent runs. In practice, this means a loop cannot rely on vibes. It must know what it can touch, what proves success, where to record state, and when to stop.

This matters because most loop failures are not model failures. They are failures of the operating setup around the model.

The model is allowed to act, but the loop decides:

  • what work is available
  • what context is trusted
  • what tools are permitted
  • what evidence counts
  • what state persists
  • what requires a human

That is why The New Operators starts loop engineering with a readiness checklist and loop contract rather than a prompt pack.

The Loop Engineering Checklist

Before running a loop, answer these:

  • What starts the loop?
  • What exact condition should become true?
  • What files, rules, or tools does the agent need?
  • What can the agent touch?
  • What deterministic verifier checks the result?
  • Where does state get written?
  • What is the max iteration or budget?
  • What causes escalation to a human?
  • What is the kill switch?
  • What will you review after it finishes?

If you cannot answer those, you are not ready to run the loop unattended.

What I Would Build First

Do not start with a loop that writes production code overnight.

Start with read-only or low-risk loops:

  1. A stale PR reviewer that summarizes what needs attention.
  2. A CI failure summarizer that explains what broke.
  3. A docs drift checker that compares code changes against docs.
  4. A test repair loop scoped to one folder.
  5. A content research loop that collects sources and drafts a brief.

The goal is to build trust in the loop before giving it more power.

Final Take

Loop engineering is not magic autonomy. It is systems engineering around AI agents.

The prompt still matters. The model still matters. But the leverage has moved.

The new skill is designing the system that decides what to prompt, what to check, when to retry, and when to stop.

Prompt engineering got us good at talking to AI.

Loop engineering is how we stop babysitting it.

FAQ

What is loop engineering?

Loop engineering is the practice of designing systems that prompt, verify, retry, and stop AI agents instead of manually prompting them turn by turn. It turns the prompt into one component of a larger operating loop with triggers, context, tools, verification, state, and stop conditions.

What is the Operator Loop Stack?

The Operator Loop Stack is Jack Njoroge's framework for deciding whether an AI agent loop is safe to run. It has five layers: harness, loop contract, state layer, checker, and human checkpoint. The framework is taught inside The New Operators and used in the Loop Engineering Starter Kit.

What is the Operator Test?

The Operator Test is a safety rule for AI agent loops: if the agent cannot prove it is done, you are not engineering a loop; you are automating drift. The test forces every loop to have evidence, not just a model-generated completion message.

What is the difference between /loop and /goal?

/loop watches; /goal finishes. Use /loop for recurring observation, like checking GitHub issues every 30 minutes. Use /goal when the agent should keep working toward a verifiable finish line, like making a test suite pass.

What should I build first?

Start with a loop that is read-only or easy to verify. Good first loops include a docs drift reporter, stale PR triage loop, CI failure summarizer, or failing-test repair loop scoped to one folder. Do not start with a loop that deploys production code or contacts customers.

Get The Starter Kit

If you want to try this without starting from a blank page, I made a free Loop Engineering Starter Kit inside The New Operators.

It includes:

  • Loop readiness checklist
  • Loop contract template
  • /goal vs /loop decision table
  • Maker/checker verifier prompt
  • First three loops to build

Start here: Loop Engineering Starter Kit

About Jack Njoroge

Jack Njoroge writes about AI agents, workflow automation, and becoming an AI-native operator. He is the creator of The New Operators, a community for people building with AI rather than just watching it happen.

The goal of The New Operators is to help builders turn AI agents and workflows into real operating leverage: useful systems, practical templates, and repeatable workflows that create measurable output.
