The False Done
Claude’s output
"All tests pass. Here’s how to verify it yourself."
Reality
FAIL: test_auth_token
AssertionError: token
expired — not checked
That sentence has cost you more hours than any bug. The loop ran 38 iterations, but Claude exited on iteration 12, so the remaining 26 ran against a broken state that you had to discover manually.
The Homework Handoff
What you got
"You can verify this by running: npm run test"
What you needed
47 passed, 0 failed
Build: success
Commit: ready
Claude doesn’t ship work. It ships instructions for you to do the work. “You can verify this by running…” is not an output. It’s a handoff. And the token cost is indistinguishable from that of a session that actually shipped.
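The difference between a handoff and an output can be sketched in a few lines: instead of printing the command for someone else to run, run it and report what actually happened. This is a minimal illustration only. The names `RunReport` and `run_and_report` are hypothetical and are not part of Claude, CES, or npm.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class RunReport:
    """What a session should hand back: the evidence, not the instructions."""
    command: list[str]
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        # A zero exit code is the only thing treated as success here.
        return self.exit_code == 0


def run_and_report(command: list[str], timeout_s: float = 600.0) -> RunReport:
    """Run the verification command and capture its real result."""
    proc = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of discarding them
        text=True,
        timeout=timeout_s,
    )
    return RunReport(command, proc.returncode, proc.stdout + proc.stderr)


# Hypothetical usage (assumes npm is installed and the project defines a test script):
#   report = run_and_report(["npm", "run", "test"])
#   print("PASS" if report.passed else "FAIL", report.output)
```

The design choice is that "done" is derived from the exit code and captured output of the command, never from a sentence claiming the command would pass.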
The Runaway Loop
Without CES
Iteration 38/???
Claude: "Still working on this…"
[billable time burns]
With CES
Iteration 3
[CES] EXIT_SIGNAL: true
47 passed, 0 failed
Session complete.
Iteration 38 of infinity. Without an enforced exit condition, Claude either iterates forever or exits prematurely. Both failures cost the same: billable time, no output.
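One way to picture an enforced exit condition is a loop with two guards: a hard iteration cap (so it cannot run forever) and an independent check that must pass before a "done" claim is accepted (so it cannot exit early on a false claim). The sketch below is an assumption about what such a gate could look like, not a description of how CES is implemented. `run_loop`, `step`, and `verify` are hypothetical names.

```python
from typing import Callable


def run_loop(
    step: Callable[[int], bool],
    verify: Callable[[], bool],
    max_iterations: int = 10,
) -> dict:
    """Drive an agent loop with a cap and a verified exit.

    step(i)  -> True when the agent *claims* it is done after iteration i.
    verify() -> True only when an independent check (e.g. the test suite) passes.
    """
    for i in range(1, max_iterations + 1):
        claims_done = step(i)
        # A claim alone never ends the loop; it must survive verification.
        if claims_done and verify():
            return {"exit_signal": True, "iterations": i, "reason": "verified"}
    # Guard against the runaway case: stop and say so explicitly.
    return {
        "exit_signal": False,
        "iterations": max_iterations,
        "reason": "iteration cap reached",
    }
```

With this shape, a premature "all tests pass" on iteration 2 is simply rejected if `verify()` fails, and a session that never converges stops at the cap with an explicit reason instead of burning time silently.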
“I wasted about 5 hours today trying to accomplish tasks that could have been done in 30–40 minutes… Beyond the usual infinite loops Claude Code often finds itself in (it has been executing a simple file refactor task for 783 seconds as I write this), the 4.0 models have the fun new feature of consistently lying to you in order to speed along development.”
“Claude is like an extremely confident junior dev with extreme amnesia, losing track of what they’re doing easily.”
“The other day we spent four and a half hours trying to fix something. Going in circles. Finally I said: start over from scratch. It picked a different approach and everything worked. That happens every week.”
“Anyone else here doing full-stack Next.js in Cursor and watching the Claude quota evaporate before lunch? Massive context windows from all the components, pages, and DB logic would smoke the default limits fast.”
“Claude Code was spending 85% of its context window reading node_modules… 85,000 out of 100,000 tokens were being consumed by dependency code, build artifacts, and git internals.”