Iterative self-refinement from the NousResearch autoreason paper (github.com/NousResearch/autoreason).
Scored a perfect 42/42 Borda sweep on writing tasks. Every other approach, including those using stronger models, degraded output below a single-pass baseline.
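Each iteration runs a tournament: a Critic finds problems in the current draft, Author B revises it, and a Synthesizer combines the best of both versions. Three blind judges then rank the candidates, and the rankings are tallied by Borda count. The run converges when the incumbent wins twice in a row.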
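The tournament loop described above can be sketched roughly as follows. This is an illustrative sketch, not the autoreason implementation: the function names (borda_winner, refine), the candidate labels "A", "B", "AB", the tie-break in favor of the incumbent, and the max_rounds cap are all assumptions made for the example, and the critic, reviser, synthesizer, and judges are plain callables standing in for model calls.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Judge = Callable[[Dict[str, str]], List[str]]  # returns labels ranked best-first


def borda_winner(rankings: List[List[str]], incumbent: str) -> str:
    """Tally Borda points: with n candidates, rank 1 earns n-1 points, last earns 0."""
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    best = max(scores.values())
    leaders = [label for label, score in scores.items() if score == best]
    # Assumption: a tie that includes the incumbent is resolved in its favor.
    return incumbent if incumbent in leaders else sorted(leaders)[0]


def refine(
    draft: str,
    critic: Callable[[str], str],
    revise: Callable[[str, str], str],
    synthesize: Callable[[str, str], str],
    judges: List[Judge],
    max_rounds: int = 10,  # hypothetical safety cap, not from the source
) -> str:
    incumbent = draft
    streak = 0  # consecutive rounds won by the incumbent
    for _ in range(max_rounds):
        critique = critic(incumbent)
        revision = revise(incumbent, critique)
        synthesis = synthesize(incumbent, revision)
        # A real run would shuffle or anonymize labels so judges stay blind.
        candidates = {"A": incumbent, "B": revision, "AB": synthesis}
        rankings = [judge(candidates) for judge in judges]
        winner = borda_winner(rankings, incumbent="A")
        if winner == "A":
            streak += 1
            if streak == 2:  # incumbent won twice in a row: converged
                break
        else:
            incumbent = candidates[winner]
            streak = 0
    return incumbent
```

With three judges and three candidates, each judge hands out 2, 1, and 0 points, so a candidate ranked first by all three judges collects the maximum of 6 points in that round.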
Runs on Haiku 4.5 at $0.005/1k. A full autoreason run costs ~$0.075 — cheaper than a single Sonnet call, with better output on writing and strategy tasks.
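As a rough sanity check on those two figures, the quoted rate and run cost imply a total usage per run. The sketch below assumes the $0.005/1k rate applies flatly to everything a run consumes; the listing does not say what the "1k" counts, so the result is expressed in unnamed billing units, and the variable names are made up for the example.

```python
from decimal import Decimal

rate_per_1k = Decimal("0.005")  # quoted rate, in dollars per 1k units
run_cost = Decimal("0.075")     # quoted approximate cost of a full run

# Implied usage under the flat-rate assumption.
implied_units = run_cost / rate_per_1k * 1000
print(implied_units)
```

Under that assumption a full run works out to about 15,000 billing units spread across all drafting, critique, synthesis, and judging calls.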
Converged final output after adversarial refinement
Full reasoning trace showing each round's draft, critique, candidates, and judging
Example prompts
Write a landing page hero section for a developer tool that turns screenshots into code
Review this function and suggest improvements: function debounce(fn, ms) { let t; return (...a) => { clearTimeout(t); t = setTimeout(() => fn(...a), ms); }; }
Write a cold outreach email to a VP of Engineering about our observability platform
Explain the CAP theorem to a senior engineer who thinks they understand it but probably doesn't