Nando de Freitas posts an Interventional SFT method that he says prevents delusions in LLM agents through a one-line code change during supervised fine-tuning

Tests on over 30 prompts showed gains in factual accuracy.

Original post

One line of code is all it takes to prevent LLM agent delusions, instead of post-training patches like RL. https://love4all.ai/blog/why-it-is-important-to-understand-causality-and-agency/ ❤️ 4 ∀ https://github.com/nandodef/love4all-ai/tree/main/docs/files

4:11 AM · May 17, 2026

Nando de Freitas @NandoDF

The road here wasn’t easy. It started with our work on delusions with @AdaptiveAgents @ShaneLegg @scott_e_reed and many other bright scientists:

https://arxiv.org/pdf/2110.10819

But instead of counterfactual learning, the theory of international imitation as a route to agency provided the foundation:

https://adaptiveagents.org/universal_ai_as_imitation

The research was accelerated by @OpenAI GPT5.5 and Codex. When I ran out of Pro credits 😅 I switched to @AnthropicAI Claude. I wish there were special LLM licenses for academic work @gdb @sama @DarioAmodei 🙏

The bottleneck for research these days is computational resources/energy. I’m glad that startups like @cusp_ai are addressing the energy challenges.

This research was possible thanks to my @CIFAR_News fellowship - the 🇨🇦 gift that keeps on giving - and my adjunct/associated professorships @UBC_CS and @CompSciOxford

Nando de Freitas@NandoDF

11:11 AM · May 17, 2026 · 38.8K Views

11:29 AM · May 17, 2026 · 4.7K Views

Nando de Freitas @NandoDF

@AdaptiveAgents @ShaneLegg @scott_e_reed Typo: universal imitation, not international imitation 😅 🌌🌍

12:40 PM · May 17, 2026 · 2.9K Views

QUOTE POST

Pedro A. Ortega @AdaptiveAgents

Very excited about this! Just fine-tune on the observation tokens and ignore the action ones to treat the agent's output as a causal intervention.

This is one of those moments when I'm surprised the maths works in practice 😅.

1:07 PM · May 17, 2026 · 3.3K Views
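
A minimal sketch of what the quote post above describes, assuming the "one line" amounts to masking action tokens out of the supervised fine-tuning loss so that only observation tokens contribute. The function name `observation_only_nll`, the boolean mask, and the toy probabilities are hypothetical illustrations; none of this is taken from the linked repository.

```python
import math

def observation_only_nll(token_log_probs, is_observation):
    """Mean negative log-likelihood over observation tokens only."""
    # Assumption: the mask keeps tokens produced by the environment
    # (observations) and drops tokens the agent emitted itself (actions),
    # so the agent's own outputs are never used as training targets.
    kept = [-lp for lp, obs in zip(token_log_probs, is_observation) if obs]
    return sum(kept) / len(kept) if kept else 0.0

# Toy trajectory: observation, action, observation, action.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.125)]
mask = [True, False, True, False]

masked = observation_only_nll(log_probs, mask)
unmasked = observation_only_nll(log_probs, [True] * len(log_probs))
print(f"observation-only loss: {masked:.4f}")
print(f"all-token loss:        {unmasked:.4f}")
```

In PyTorch, the same effect is commonly obtained by setting the labels at action positions to the `ignore_index` of `torch.nn.functional.cross_entropy` (default `-100`), which excludes those positions from the loss.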
