AI for Automation
2026-03-20 · AI agents · Claude Code · AI competition · self-learning AI · optimization · autonomous AI

AI beat competition winners — nobody taught it how

AI agents taught themselves 8 strategies, beat the competition winners on 5 problems, and cracked one problem no one had ever solved — with zero human guidance.


A developer just proved that AI agents can teach themselves to outperform human experts — without a single hint. The project, called Agent-SAT, turned multiple Claude Code agents loose on a real international optimization competition. The result: 220 out of 229 problems solved, five solutions that beat the competition winners, and one solution to a problem that had never been solved before.

The project hit 163 points on Hacker News and is trending on GitHub — not because of fancy UI, but because of what it means for AI autonomy.

Agent-SAT GitHub repository — autonomous AI agent that teaches itself optimization

What actually happened

The 2024 MaxSAT Evaluation is a global competition where teams build software to solve extremely hard optimization problems — the kind that decide everything from chip design to airline scheduling. Human teams spend months fine-tuning their solvers before submitting.

Developer Ilia Zintchenko took a different approach: he gave AI agents the competition problems, a blank notebook, and zero instructions. No strategies. No hints. No human coaching. The agents had to figure everything out themselves.

The scoreboard after the AI finished:

  • 220 of 229 problems solved (96%)
  • 5 solutions better than what human teams submitted
  • 30 solutions matching the best-known optimal answers
  • 1 problem solved that had no known solution at all
  • 37.5% improvement on one problem over the competition winner

How the agents taught themselves

The architecture is surprisingly simple. Multiple AI agents (running Claude Code) work simultaneously across different computers. They share a single GitHub repository as their "brain" — a file called expert.md where they write down everything they learn.

Here's how the cycle works:

  1. An agent reads the accumulated knowledge from previous runs
  2. It tries different approaches on competition problems
  3. It records what worked (and what didn't) in the shared knowledge base
  4. It pushes its discoveries to GitHub so other agents can build on them

No coordination needed — just git pull and git push. The agents figured out the rest.
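The cycle above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's actual code: expert.md is the real knowledge file from the repository, but the function names, lesson format, and commit message here are assumptions.

```python
# Sketch of the learn-record-share cycle. Only expert.md comes from
# the project; everything else is illustrative.
import pathlib
import subprocess
import tempfile

KNOWLEDGE_FILE = "expert.md"

def read_knowledge(repo: pathlib.Path) -> str:
    """Step 1: read the accumulated knowledge from previous runs."""
    f = repo / KNOWLEDGE_FILE
    return f.read_text() if f.exists() else ""

def record_lesson(repo: pathlib.Path, lesson: str) -> None:
    """Step 3: append what worked (or didn't) to the shared notebook."""
    with (repo / KNOWLEDGE_FILE).open("a") as f:
        f.write(f"- {lesson}\n")

def sync(repo: pathlib.Path, message: str) -> None:
    """Step 4: share discoveries via plain git -- no extra coordination."""
    for cmd in (["git", "pull", "--rebase"],
                ["git", "add", KNOWLEDGE_FILE],
                ["git", "commit", "-m", message],
                ["git", "push"]):
        subprocess.run(cmd, cwd=repo, check=True)

# Demo of the local half of the cycle (no git repo needed):
with tempfile.TemporaryDirectory() as d:
    repo = pathlib.Path(d)
    record_lesson(repo, "soft-clause count matters more than variable count")
    print(read_knowledge(repo))
```

The design point is that the knowledge base is just a file in version control: any number of agents on any number of machines can join or leave, and git's merge machinery handles concurrent discoveries.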

Eight strategies the AI invented on its own

The most remarkable part: the agents didn't just try random things. They independently developed eight distinct problem-solving strategies, each optimized for different types of problems:

1. Greedy SAT — For problems with fewer moving parts: tackle the biggest pieces first

2. Core-guided search — Find the hardest constraints and relax them one at a time

3. Weighted optimization — Prioritize the most important rules when not everything can be satisfied

4. Clause-weighting — Dynamically adjust which rules matter more as the search progresses

5. Tabu search — Avoid revisiting dead ends by keeping a "do not return" list

6. Multi-initialization — Start from multiple different angles and keep the best result

7. Biased-SAT — Deliberately make "wrong" moves to escape traps

8. RC2 with CaDiCaL — A powerful combination the AI discovered was underused by human teams

The agent also learned when to use each strategy. Its knowledge base includes rules like: "The single most important factor is number of soft clauses, not total variables or clauses" — a nuanced insight that took human researchers years to establish.
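To make one of these concrete, strategy 5 (tabu search) can be sketched on a toy weighted instance. This is a generic local-search sketch, not the agents' actual solver: the clause encoding (weight plus signed-integer literals), the tenure, and the iteration budget are all assumptions for illustration.

```python
# Minimal tabu search over truth assignments for a toy weighted
# MaxSAT instance. A clause is (weight, [literals]); literal 3 means
# "variable 3 is True", -3 means "variable 3 is False".
import random

def cost(assignment, clauses):
    # Sum the weights of clauses the assignment leaves unsatisfied.
    return sum(weight for weight, lits in clauses
               if not any(assignment[abs(l)] == (l > 0) for l in lits))

def tabu_search(clauses, n_vars, iters=200, tenure=5, seed=0):
    rng = random.Random(seed)
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    best, best_cost = dict(assignment), cost(assignment, clauses)
    tabu = {}  # variable -> last step at which flipping it is forbidden
    for step in range(iters):
        # Evaluate flipping each non-tabu variable; take the best move,
        # even if it makes things worse (that is how we escape dead ends).
        candidates = []
        for v in range(1, n_vars + 1):
            if tabu.get(v, -1) >= step:
                continue  # on the "do not return" list
            assignment[v] = not assignment[v]
            candidates.append((cost(assignment, clauses), v))
            assignment[v] = not assignment[v]
        if not candidates:
            continue
        c, v = min(candidates)
        assignment[v] = not assignment[v]
        tabu[v] = step + tenure  # forbid undoing this flip for a while
        if c < best_cost:
            best_cost, best = c, dict(assignment)
    return best, best_cost

# Toy instance: the optimum leaves only one weight-1 clause unsatisfied.
clauses = [(2, [1]), (1, [-1]), (3, [2, 3]), (1, [-2])]
best, best_cost = tabu_search(clauses, n_vars=3)
```

The "do not return" list is what distinguishes tabu search from plain hill climbing: by briefly forbidding the reversal of recent flips, the search is forced through worse states instead of oscillating at a local optimum.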

The problem nobody could solve

Perhaps the most striking result: the AI found the first-ever solution to a problem called pseudoBoolean mod010. No human team had cracked it during the official competition. The AI agents, working iteratively and building on each other's partial progress, found an answer.

On another problem (switchingactivity_74), the AI achieved a cost of 10 — compared to the competition winner's cost of 16. That's a 37.5% improvement over the best human entry.
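The percentage is measured relative to the winner's cost, which is easy to verify:

```python
# Improvement on switchingactivity_74, relative to the winning entry.
winner_cost, agent_cost = 16, 10
improvement = (winner_cost - agent_cost) / winner_cost
print(f"{improvement:.1%}")  # prints 37.5%
```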

What the AI still can't do

The project is refreshingly honest about limitations. Nine problems remain unsolved — mostly massive instances with over 16 million variables. The agents also exhibit what the creator calls "tunnel vision": they sometimes fixate on one problem for hours instead of moving to easier wins. And despite instructions to work continuously, agents typically stop after several hours on their own.

The biggest unsolved challenge: a problem with 2.5 million variables where the AI's best answer is still 602 times worse than the reference — a humbling reminder that AI autonomy has clear boundaries.

Why this matters beyond math puzzles

MaxSAT problems aren't academic exercises. They model real-world decisions: scheduling airline crews, designing computer chips, optimizing delivery routes, and planning factory layouts. Better MaxSAT solvers directly translate to better solutions for billion-dollar logistics problems.

But the bigger story is the method itself. An AI that can teach itself a specialized domain — accumulating knowledge, developing strategies, and improving iteratively — could theoretically do the same in fields like drug discovery, materials science, or financial modeling.

Try it yourself: The project is open-source and designed to run with Claude Code. You'll need API access and a GitHub token.

# Clone and launch on a cloud VM
git clone https://github.com/iliazintchenko/agent-sat
cd agent-sat
./run.sh --host ec2-user@your-ip --agents 3

# Or run locally
./run_local.sh

The repository lists two contributors: iliazintchenko (the human creator) and claude (the autonomous agent). It may be one of the first GitHub projects where an AI earned its contributor credit by genuinely teaching itself a skill nobody programmed it to have.

