Original post
Alex Rives @alexrives · #1087 in AI

Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.

The new model delivers state-of-the-art performance on protein interactions, especially antibodies, a critical modality for therapeutics.

We have designed and validated miniprotein binders and single-chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity.

We’re also releasing an atlas of 6.8 billion proteins and 1.1 billion predicted structures.

ESMFold2 is built on a state-of-the-art language model that has been trained on billions of protein sequences.

A world model of protein biology emerges through language modeling.

We’ve applied the techniques of mechanistic interpretability, originally developed for large language models, to understand the concepts ESM uses to represent proteins.

The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction that mirrors the understanding of protein biology developed through a century of empirical science.

This understanding emerges without prior knowledge, just from language modeling of protein sequences.

Language models are becoming a powerful substrate to understand and program biology.

The design of protein interactions is one of the most fundamental problems in biophysics and has critical implications for the discovery of new medicines. A simple gradient-based search with the model was able to discover high-affinity protein binders.

I'm excited by the potential this has to accelerate basic science and the understanding of proteins, and especially by the new avenues it opens up for therapeutic design and medicine.

5:22 AM · May 27, 2026 · 283.3K Views
Posts from X
VIEWS 60.1K
biohub @biohub

Proteins are the machinery of life. Scientists have cataloged billions of protein sequences—but their biology is still mostly unknown.

Today we're releasing a world model of protein biology: a scientific engine for prediction, design, and discovery that consists of ESMFold2, ESMC, and ESM Atlas. Together, they're helping to open up a new way for researchers to design proteins and speed up scientific discovery.

Our mission is to cure or prevent disease. To do that, we need to accelerate science. That's why we're releasing all three openly. https://bit.ly/3PGf1dk

7d · Views 60.1K · Likes 326 · Bookmarks 141
BOOKMARKS 143

today was a massive day for protein engineering.

esmfold2 dropped—next gen of the esm series, fully open on @huggingscience. 1.1 billion predicted structures, 6.8 billion sequences. 800m more entries than the alphafold db, and reportedly edging out alphafold3 on protein complexes, including antibody–antigen binding.

alongside it: the new esm atlas. a huge expansion of known protein space, heavy on metagenomic sequences from soil, ocean, and the parts of biology that have been least characterised (until now!!)

and if that weren't enough, litefold dropped the fineweb of proteins, so every major protein database (pdb included) aggregated, cleaned, and made plug-and-play in one place.

these are the releases that push the whole field forward, and the pace of open science right now is almost motion-sickness inducing

all of it on http://huggingscience.co (and ofc @huggingface)

7d · Views 30.8K · Likes 319 · Bookmarks 143
LIKES 370 · RETWEETS 98
nature @Nature

A newly released AI tool has generated an atlas of more than one billion predicted protein structures and billions more protein sequences.

https://go.nature.com/4fblM0Z

7d · Views 41.1K · Likes 370 · Bookmarks 143
REPLIES 17
Roshan Rao @proteinrosh

Announcing ESMFold2, our new state-of-the-art structure prediction model capable of predicting structure from single sequences or MSAs. ESMFold2 improves on benchmarks of protein-protein interaction and is particularly strong on predictions of antibody-antigen complexes.

7d · Views 41.2K · Likes 297 · Bookmarks 56