Tech · 47d ago

Richard Sutton, University of Alberta professor, shares a 26-word distillation of his Bitter Lesson urging AI researchers to favor computation-scaling methods over embedded human knowledge

AI Judge changed title after evaluation, original title: "Richard Sutton, University of Alberta professor and Turing Award winner, posts 26-word summary of the Bitter Lesson favoring scalable search and learning over human knowledge"

Gary Marcus and others reply debating reliance on human data and priors.

Original post

Richard Sutton@RichardSSutton

The bitter lesson in 26 words:

Don’t be distracted by human knowledge, as AI has been historically. Instead focus on methods for creating knowledge that scale with computation, like search and learning.

9:58 AM · May 18, 2026 · 447.4K Views

Sentiment

Positive users praise Richard Sutton's Bitter Lesson insights on scalable search and learning without human knowledge or taste, while negative users dismiss the claims as unrealistic and lacking evidence.

Positive: 50.4% · Negative: 49.6%

51 comments with sentiment.

Posts from X

Most Activity

Views: 40.3K · Likes: 361 · Retweets: 29 · Replies: 20

Jiaxin Wen@jiaxinwen22

I might be one of the few people who is most bearish on human research taste and bullish on automated research:
- "AIs can only do hyperparameter search" is mainly a skill issue with bad automated research setups.
- human taste is overrated, e.g. frontier labs / neolabs are doing pretty similar things.
- human taste might win in a low-compute world, but not in the high-compute world we're entering.

47d ago

Bookmarks: 184

Herbie Bradley@herbiebradley

Some takes about RSI from discussions with many smart researchers & thinkers:

1. Many RSI (or automated AI R&D) debates converge to similar cruxes: is a 1000x sample efficiency improvement possible, can you just simulate reality and train on it with no sim2real gap, can we easily make models good at "fuzzy" tasks? People like to assume that automated research agents will find such breakthroughs specifically *because* without them, progress could be heavily bottlenecked on data or continued compute scale-ups.

2. The Yudkowsky "genius brain in a box" framing of ASI has latent influence on many researcher views even though people may not be aware of it. A common move is to "flip" predictions, as they go further out, from assuming LLM or deep learning-specific properties of future AI to assuming "von Neumann x1000", human brain-like properties. I'd like to see more thought-out reasoning of why this flip should occur at any particular point (eg pre or post automated AI R&D)—this question is a crux behind many predictions like AI 2027.

3. There are some cracks in this worldview beginning to show: predictions from a few years ago that models would be less jagged now than they are, or that they would be more deceptive, synthetic data would work better, etc. Many of these seem like prediction errors from imagining future models as a "human brain in a box", but LLMs are empirically a different kind of intelligence. Most models of software-only intelligence explosion are also coarse enough to mostly ignore properties of LLMs.

4. Views about fast RSI progress seem to be correlated with (a) belief that synthetic data is all you need (b) belief in very high GDP growth and an industrial explosion because of automated firms (c) having worked only in AI research or in small organizations.

5. Key technical things to track over the next 1-2 years: does RL increase in its generalization, AI lab data spend, can we automate synthetic RL env construction, best practices for FDEs deploying AI into large enterprises, coherency of AI personas, how powerful will multi-agent scaling of test-time compute be, and continual learning.

6. Overall I think the "RSI leading to *fast* takeoff" frame had huge alpha in 2022, moderate in 2024, and potentially is of neutral usefulness in 2026 for predicting the future.

46d ago

Dileep George@dileeplearning

Here's a better lesson: don't fall for the bitter lesson.

47d ago

Rishabh Agarwal@agarwl_

Perplexed by this take: Sure, let's not mainly do supervised learning on human knowledge, but it makes sense to build off it instead of the *let's do it from scratch*.

People cite AlphaGo vs AlphaGo Zero as a quintessential example of how using human-generated data is suboptimal but it was *imitating* it that was suboptimal.

What if we learned from that data assuming it was suboptimal in the first place (so not supervised learning but an RL-like mindset of using that data)?

47d ago

kache@yacineMTB

GOD GAVE US THE UNIVERSE, THE ORACLE. WE MUST MINE IT

47d ago

Chirasmita Mallick@chirasmita16

@RichardSSutton wrote "The Bitter Lesson" in 2019.

Silicon Valley read it and scaled LLMs to the moon.

Sutton says they got it wrong. 🧵

46d ago

rohan anil@_arohan_

Two humans armed with the same codex / claude code version can produce completely different outcomes, as measured on a log scale.

Pranav Shyam@recurseparadox

falseposting. value of understanding at all time highs 📈

46d ago

Gary Marcus@GaryMarcus

i wonder whether this needs an update.

current methods, such as they are, leverage massive amounts of human knowledge as their primary fuel. they would be lost without it.

and they even build some knowledge into their system prompts.

and lately they build knowledge into their harnesses, usually by over 50 tools that have been carefully crafted with human knowledge.

47d ago

Bojan Tunguz@tunguz

One way of looking at the bitter lesson.

46d ago

Thomas G. Dietterich@tdietterich

@RichardSSutton I think it is worth studying human knowledge, particularly understanding the structure of its abstractions, as they can provide guidance about the kinds of things humans learn and machines do not (yet). I wouldn't call that "distraction".

47d ago

Shubhendu Trivedi@_onionesque

The bitter lesson is that we are doomed to endless bitter lesson exegesis.

47d ago

Jiaxin Wen@jiaxinwen22

However, I hope humans keep doing our own research, with *strong* tastes and priors. Not all research is about outcomes. Sometimes you just want to solve/understand a problem in a way that feels like yours.

47d ago

Julian Togelius@togelius

Meanwhile, Claude's system prompt is the size of a novel, and the harness is the size of a small operating system. Modern LLMs are trained on most of human knowledge. "AI" operates in a human world, and intelligence cannot be cleanly separated from knowledge about the world.

45d ago

ib@Indian_Bronson

@RichardSSutton yeah cc: @tszzl @khoomeik

https://plato.stanford.edu/entries/epistemology-india/

More AI researchers should read the Nyāya-Sūtras and Tattvacintāmaṇi.

47d ago

Xiao Ma@infoxiao

bitter and sad

47d ago

Pranav Shyam@recurseparadox

falseposting. value of understanding at all time highs 📈

Ryan Brewer@ryanbrewer

My entire job is now codex and managing codex threads, I’m genuinely curious what the software engineering job even is anymore. The value of my understanding of any system goes down every single day. Very weird times

47d ago

Lucas Beyer (bl16)@giffmana

@agarwl_ discarding human data (be it in LLM or in non-trivial "classic" RL env) is just complete nonsense. Even if Rich says it. But I'm not sure he's actually saying this here.

Guess he should use more than 26 words, but then it wouldn't sound Ilya-style mysterious anymore :)

47d ago

Wenting Zhao@wzhao_nlp

❤️

46d ago

Minh Nhat Nguyen@menhguin

@jiaxinwen22 median human taste is not great. the tails are pretty good, but for the purposes of "will AI replace a profession" the median is not hard to beat.

47d ago

Herbie Bradley@herbiebradley

Related interesting point

christian@curious_vii

It's a good question. My own perspective on this is downstream of three pretty strong metaphysical claims:

1. The future is not only uncertain, not only unknowable, not only incalculable, but actually unimaginable. And so inputs generated from unbounded exploration on some frontier (be it scarce context, digital exhaust, or embodied experience) are essential inputs to the goal-setting process.

2. The second is that goals that are also good cannot be cast arbitrarily and must instead be discovered by meditating upon and then contemplating the aforementioned data in point one. This is where I think I fundamentally disagree with some of the synthetic data and automated research maximalists. AI models themselves cannot contemplate. An AI system can be useful as a macroscopic lens or sensor/actuator network tissue (and a pretty good substitute for bureaucracy per se), but at the end of the day, only a person, an embodied human being, can actually discern what is good and desirable, latent in that data.

3. The third is that those good goals are emergent only in the context of some relationship between human persons. You might be able to discover something that is technically true through some kind of radical commitment to scientific inquiry, but unless all that is ultimately tied to that which is good that another person can also contemplate, then it is not good or ultimately desirable in the sense that it invites actions at the expense of other actions.

Now, with all that in mind, there are myriad approaches to generating the raw material from which you can derive good goals and the means by which you can iteratively pursue them. But I don't think you can ever take a person out of the loop, even if those loops grow by many orders of magnitude from here on out.

46d ago