AI for Automation
2026-03-28

Meta HyperAgents: Self-Rewrites Code, +218% Gain

Meta's HyperAgents AI rewrites its own source code each run to become a better learner — achieving a 218% coding improvement with no human retraining.


An AI That Rewrites Itself — Every Single Run

Artificial intelligence research has long chased a tantalizing goal: a system that doesn't just learn from data, but learns how to learn — autonomously modifying its own architecture, strategy, or code to become more capable over time. Between March 19 and 24, 2026, Meta AI Research published a paper titled Hyperagents (arXiv: 2603.19461) that represents a significant step toward that goal. The system it describes, called DGM-H (Darwin Gödel Machine Hyperagents), is an AI that rewrites its own source code at the end of every run in order to improve its performance on the next run — and it achieved a 218% improvement in coding ability on held-out test problems, entirely through self-modification and with no human retraining.

The paper was authored by Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, and Tatiana Shavrina — a team spanning Meta AI Research and collaborating institutions. The code is publicly available at github.com/facebookresearch/Hyperagents under a Creative Commons BY 4.0 license, and the work has been referenced in the context of ICLR 2026 (the International Conference on Learning Representations — one of the most prestigious academic venues in machine learning).

How the Two-Agent Architecture Works

The key architectural insight of DGM-H is the introduction of two distinct agents that coexist inside a single, fully editable program. The first is the task agent: it receives a problem (write code that passes these tests, solve this math problem, grade this paper) and attempts to solve it. The second is the meta agent: its job is not to solve the task, but to observe how the task agent performed and then rewrite the task agent's code to make it better at future problems.

This two-level structure is not new — it is inspired by the theoretical Gödel Machine (a self-referential AI concept proposed by computer scientist Jürgen Schmidhuber, where a system can rewrite any part of itself so long as it can prove the rewrite will improve performance) and the Darwin Gödel Machine (DGM) from prior work. What makes DGM-H (the Hyperagents version) novel and significantly more powerful is a single critical extension: the meta agent can also rewrite its own code. This creates recursive self-improvement — the system that improves the learner is itself subject to improvement.

To understand why this matters, consider the analogy of a student and a tutor. In the original DGM, a fixed tutor watches a student and rewrites the student's study habits. In DGM-H, the tutor also reflects on its own tutoring methods and rewrites those as well — becoming a better tutor run after run. The student improves. The tutor improves. And crucially, nobody from outside the room intervenes.
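The two-level loop can be sketched in a few lines of Python. To be clear, this is a hypothetical toy, not the paper's implementation: agents are plain strings, the benchmark is a stub scorer, and a "rewrite" just appends a tag. But the control flow mirrors the description above: the meta agent rewrites the task agent, the rewrite is kept only if it scores better, and then the meta agent rewrites itself.

```python
import hashlib
import random

def evaluate(agent_code: str) -> float:
    """Stub benchmark: a deterministic pseudo-score in [0, 1) per candidate."""
    return random.Random(agent_code).random()

def rewrite(target: str, strategy: str) -> str:
    """Stub rewrite: propose an edited variant of `target`, tagged by strategy."""
    tag = hashlib.sha1(strategy.encode()).hexdigest()[:6]
    return f"{target}|edit:{tag}"

task_agent = "baseline-task-agent"   # solves problems
meta_agent = "baseline-meta-agent"   # rewrites code, including its own
best_score = evaluate(task_agent)

for run in range(5):
    # 1. The meta agent rewrites the task agent...
    candidate = rewrite(task_agent, meta_agent)
    score = evaluate(candidate)
    if score > best_score:
        # ...and the rewrite survives only if it improves the benchmark.
        task_agent, best_score = candidate, score
    # 2. The key DGM-H extension: the meta agent also rewrites itself.
    meta_agent = rewrite(meta_agent, meta_agent)
```

In the real system the "rewrite" step is an LLM editing actual source files inside a sandbox, and the meta rewrite is itself evaluated rather than blindly adopted; the sketch only shows where the recursion enters the loop.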

The Benchmark Numbers in Detail

The results published in the paper are striking across multiple domains. On the Polyglot coding benchmark (a multi-language programming evaluation dataset), the system's pass@1 rate (the rate at which the AI solves a problem on its very first try, without multiple attempts) improved dramatically:

  • On the training set: pass@1 improved from 14% to 34% — a relative gain of approximately 143%.
  • On the held-out test set (problems the system had never seen during its self-modification runs): pass@1 improved from 8.4% to 26.7% — a relative gain of 218%. This is the headline figure, and it matters precisely because these are unseen problems. The system is not memorizing; it is genuinely becoming a better problem-solver.
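These figures are relative gains, i.e. the improvement measured against the starting score rather than in absolute percentage points. A quick sanity check on the arithmetic, using the numbers above (the solved/total counts in the usage line are illustrative, not from the paper):

```python
def pass_at_1(solved_first_try: int, total: int) -> float:
    """Fraction of problems solved on the very first attempt, as a percentage."""
    return 100.0 * solved_first_try / total

def relative_gain(old_pct: float, new_pct: float) -> float:
    """Relative improvement in percent: (new - old) / old * 100."""
    return (new_pct - old_pct) / old_pct * 100.0

train_gain = relative_gain(14.0, 34.0)   # ~143% on the training set
test_gain = relative_gain(8.4, 26.7)     # ~218%: the headline figure
```

For example, `pass_at_1(267, 1000)` gives 26.7%, and an 8.4% → 26.7% jump is a 2.18x multiplier even though the absolute gain is only 18.3 percentage points — which is why the relative framing makes the headline number look so large.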

The improvements extend far beyond coding. On a paper review accuracy task (evaluating AI-generated scientific papers), DGM-H reached 71% accuracy starting from a 0% baseline, beating the previous state-of-the-art AI-Scientist-v2 system's score of 63%. In robotics simulations, the system's performance score rose from 0.060 to 0.372, a 520% gain, and — in a detail that stunned the research team — the robot independently discovered a jumping behavior that was never programmed into it. On a math grading transfer task (applying a grading rubric to an entirely new mathematical domain), human-designed systems scored 0% while DGM-H scored 63%.

What makes the robotics result particularly notable is that it represents an emergent capability — a behavior that was not explicitly programmed, but spontaneously arose from the self-improvement process. The researchers documented three such emergent capabilities in total: the system spontaneously developed persistent memory across generations (it began storing information from one run to inform the next), it built its own performance-tracking mechanisms (monitoring which types of rewrites worked and which did not), and it developed compute-aware planning (tackling large structural changes first and saving fine-tuning refinements for later, mirroring experienced human engineering practice).

These are not features that were designed into DGM-H. They are behaviors the system invented for itself because they were useful for improving performance. This spontaneous development of engineering best practices from first principles is one of the most philosophically significant aspects of the research.

Safety, Oversight, and What Comes Next

The Meta research team is clear that all experiments were conducted in sandboxed environments (isolated computing containers where the AI's code changes cannot affect systems outside the experiment) with continuous human oversight. The self-rewriting process is constrained — the system cannot rewrite arbitrary files or access external systems; it operates within a defined program boundary. The researchers acknowledge that scaling this approach raises important safety questions about verifying that self-modifications remain aligned with intended goals.

The system's name draws on two historical ideas: Darwin's theory of evolution by natural selection (iterative improvement through variation and selection) and Gödel's incompleteness theorems (the mathematical insight that sufficiently powerful formal systems can refer to themselves). DGM-H combines both: it evolves through self-referential code modification, selecting the rewrites that improve performance and discarding those that do not, across successive generations of runs.

For practitioners, the immediately accessible output is the open-source codebase. Researchers can clone the repository, run the Polyglot coding experiments, and observe the self-improvement process firsthand. The Creative Commons BY 4.0 license permits broad use, including commercial research, with attribution. The 218% coding improvement figure will likely be cited extensively in future work on self-improving AI systems — it is among the strongest empirical results ever published for autonomous recursive self-improvement on held-out benchmarks.

Whether DGM-H represents a prototype of systems that will eventually improve themselves beyond human-designed baselines is a question the AI research community will be debating for years. What the paper demonstrates clearly, however, is that the concept is no longer purely theoretical. An AI wrote better code by rewriting the part of itself that writes code — and the numbers are now in the public record.

