Large language models (LLMs) have already amazed us by reading, writing, and answering questions with impressive skill. But once their initial training is done, their knowledge tends to stay frozen, making it tricky to teach them new facts or skills — especially when we don’t have much task-specific data for retraining.
I recently came across MIT's new SEAL framework, an approach that flips that limitation on its head. Instead of relying on pre-designed training data and fixed instructions, SEAL lets AI models generate their own study notes and decide how best to train themselves. It’s a bit like how we humans prepare for tests — by rewriting notes, summarizing key ideas, and testing ourselves repeatedly, instead of just rereading textbooks.
How SEAL lets AI learn like a student
The core idea behind SEAL (short for Self-Adapting Large Language Models) is that the AI produces short natural-language instructions called self-edits. These notes don’t just restate information; they can spell out new implications, summarize key points, or even suggest training tweaks such as adjusting the learning rate. The AI then fine-tunes itself on these self-made notes, slightly updating its internal parameters.
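To make the self-edit idea concrete, here is a toy Python sketch. Everything in it is a hypothetical stand-in, not SEAL's actual implementation: in the real system the LLM itself writes the notes via prompting, and a gradient-based fine-tuning step updates its weights. Here a simple string rule fakes the note-writing and a dictionary of per-token weights fakes the parameters.

```python
# Toy sketch of a SEAL-style self-edit step. Both helpers are hypothetical
# stand-ins: in SEAL the model itself writes the notes, and fine-tuning is
# a real gradient update rather than a dictionary bump.

def generate_self_edit(passage: str) -> str:
    """Restate a passage as study notes. SEAL prompts the LLM for this;
    here we fake it by listing each sentence as an 'implication'."""
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return "\n".join(f"Implication: {s}." for s in sentences)

def finetune(params: dict, notes: str, lr: float = 0.1) -> dict:
    """Stand-in for a small supervised update: nudge a per-token weight
    for every token appearing in the self-generated notes."""
    updated = dict(params)
    for token in notes.lower().split():
        updated[token] = updated.get(token, 0.0) + lr
    return updated

passage = "The Apollo program ended in 1972. Saturn V was its launch vehicle."
notes = generate_self_edit(passage)
params = finetune({}, notes)
print(notes.splitlines()[0])  # → Implication: The Apollo program ended in 1972.
```

The point of the sketch is the data flow, not the math: the model's own output becomes its next batch of training data.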
Just like humans, complex AI systems can’t remain static for their entire lifetimes. They are constantly facing new inputs. SEAL aims to create models that keep improving themselves.
SEAL operates in two loops. In the inner loop, the model generates self-edits based on new readings and updates itself accordingly. Then it tests its own improvements by answering questions or solving puzzles. The outer loop uses reinforcement learning to keep only those self-edits that actually help performance — effectively teaching the AI how to write better notes over time.
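The two loops above can be sketched as a tiny simulation. This is not the paper's algorithm verbatim (SEAL runs real fine-tuning inside the inner loop and a rejection-sampling-style RL method outside it); here a single number stands in for a self-edit's quality and `evaluate` is a toy scoring function, so the code only illustrates the control flow: sample candidate edits, test each one, and reinforce only the ones that actually help.

```python
import random

random.seed(0)

def evaluate(self_edit: float) -> float:
    """Toy stand-in for the inner loop's test: in SEAL this means
    fine-tuning on the self-edit, then measuring downstream accuracy."""
    return min(self_edit, 1.0)

def seal_outer_loop(policy: float = 0.3, rounds: int = 20, samples: int = 4) -> float:
    """Outer loop: sample candidate self-edits around the current policy,
    score each one, and move toward a candidate only if it scores higher."""
    for _ in range(rounds):
        candidates = [policy + random.gauss(0.0, 0.1) for _ in range(samples)]
        best = max(candidates, key=evaluate)       # inner loop: train + test each
        if evaluate(best) > evaluate(policy):      # keep only helpful edits
            policy += 0.5 * (best - policy)        # reinforce the better edit
    return policy

final = seal_outer_loop()
print(round(final, 2))  # the policy only ever moves toward higher-scoring edits
```

The property worth noticing is that the policy never updates on a self-edit that fails its own test, which mirrors how SEAL's reinforcement learning keeps only performance-improving edits.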
Turning text into lasting knowledge
One of the coolest tests for SEAL was teaching the AI new factual knowledge. Instead of training directly on the original text, SEAL lets the model generate notes that highlight logical implications and key facts from a passage. Then the model trains on these notes using small updates.

Here’s where it gets interesting: without any adaptation, the model answered about 33% of the questions correctly. Training directly on the original passages barely bumped that up. But training on its own generated notes improved accuracy to nearly 40%. Even more impressive, notes generated by GPT-4.1 pushed accuracy to about 46%, while SEAL’s own self-learned notes nudged it further to 47%, surpassing the notes written by a much larger model.
And this wasn’t just a fluke; SEAL kept its edge when learning from hundreds of passages simultaneously, suggesting it genuinely learned a general skill: how to write great study notes.
Adapting on the fly for problem solving
SEAL also shines on puzzle-like reasoning tasks that demand quick adaptation. Imagine a small AI given just a few examples of visual pattern puzzles with colored grids. Without any adaptation, success was zero; with simple test-time training it reached only 20%. But after SEAL’s self-editing process rehearsed multiple study plans and picked the best one, success jumped to over 70%!
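That "rehearse several study plans, keep the best" step can be sketched in a few lines. Everything task-specific here is a hypothetical stand-in: the fields of the study plan, the scoring function, and the assumption that certain augmentations help are all invented for illustration, while in SEAL each plan would drive a real fine-tuning run that is then scored on held-out puzzle examples.

```python
import random

random.seed(1)

def sample_study_plan() -> dict:
    """A self-edit expressed as a tiny config: which data augmentations
    to use and which learning rate (fields are hypothetical)."""
    return {
        "rotate": random.choice([True, False]),
        "reflect": random.choice([True, False]),
        "lr": random.choice([1e-4, 1e-3]),
    }

def train_and_score(plan: dict) -> float:
    """Toy evaluation: pretend both augmentations and the smaller
    learning rate help. In SEAL this is a fine-tune-then-test run."""
    score = 0.4 * plan["rotate"] + 0.4 * plan["reflect"]
    return score + (0.2 if plan["lr"] == 1e-4 else 0.0)

plans = [sample_study_plan() for _ in range(8)]
best = max(plans, key=train_and_score)
print(best)
```

Even this crude best-of-N selection captures the mechanic: the model proposes its own training recipes and lets empirical results pick the winner.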

This is a massive boost, showing how self-generated training strategies can help models adapt in real time to new challenges. While a human-designed ideal training plan still hits 100%, SEAL demonstrates that AI can develop its own clever study methods, cutting down the need for human-crafted solutions.

The challenges ahead and why this matters
Of course, SEAL isn’t perfect. One ongoing problem is catastrophic forgetting, where learning new information causes the model to gradually forget what it previously knew. The AI doesn’t crash outright, but older knowledge erodes as new self-edits overwrite it.
Also, running these self-edits requires fine-tuning and testing steps that take up to 45 seconds each, which could become expensive or slow with bigger models or massive datasets. Solutions like letting AIs generate their own tests to evaluate themselves might reduce this overhead in the future.

Despite the hurdles, SEAL points us toward a future where AI models don’t get stuck as static entities but instead keep growing, revising what they know and how they know it — much like how people learn throughout their lives. This capability would be a game changer for AI assistants that need to stay updated, scientific research bots that digest new papers, or educational tools that improve by catching their own mistakes and filling in gaps.
SEAL offers a concrete path toward language models that are not just trained once and frozen, but that continue to learn in a data-constrained world.
In other words, teaching AI to take and learn from its own notes might be the breakthrough needed for models that evolve continuously, making them more resilient, adaptable, and ultimately, smarter.
Key takeaways
- SEAL enables AI models to generate self-edits—study notes that help them improve continuously without human-designed datasets.
- Training on self-generated notes raised knowledge retention and reasoning success dramatically, showing models can learn how to learn.
- Challenges like catastrophic forgetting and costly training remain, but the approach points toward adaptable, lifelong learning AI systems.
It’s exciting to watch AI inch closer to learning more like we do – revising knowledge, testing itself, and growing over time instead of just stopping after initial training. SEAL is a step in that direction, and I can’t wait to see where this idea leads next.



