Cursor AI releases Composer 2.5, its most advanced coding model, which scores 62 on the Artificial Analysis Coding Agent Index at 10-60 times lower cost than leading rivals

QUOTE POST

#23Sasha Rush@SRUSH_NLP

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views

QUOTE POST

#23Sasha Rush@SRUSH_NLP

Been working on text feedback / OPSD in Composer. Really interesting space, and a much more to be explored.

7:03 PM · May 18, 2026 · 2.3K Views

REPLY

#23Sasha Rush@SRUSH_NLP

@eliebakouch Unfortunately can’t give a precise answer. Both were scaled significantly.

elie@eliebakouch

@srush_nlp really cool work congrats to the team. if you can answer, do you have rough estimate on how the compute was allocated between RL and continual pt in composer 2 -> 2.5?

7:13 PM · May 18, 2026 · 528 Views

7:15 PM · May 18, 2026 · 189 Views

QUOTE POST

#23Sasha Rush@SRUSH_NLP

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:39 PM · May 18, 2026 · 30.3K Views

REPLY

#23Sasha Rush@SRUSH_NLP

@adityagrover_ @siyan_zhao Thanks for publishing! We like your paper a lot.

Siyan, I think we had a chance to meet at NeurIPS. Let us know if you ever are in NYC and want to come by and chat.

Aditya Grover@adityagrover_

.@siyan_zhao will be talking more about our work introducing OPSD at ICML in July. https://siyan-zhao.github.io/blog/2026/opsd/

5:08 PM · May 19, 2026 · 1.4K Views

6:23 PM · May 19, 2026 · 841 Views

QUOTE POST

#68clem 🤗@CLEMENTDELANGUE

Very cool to see Cursor doubling down on training great models. In my opinion, ultimately all serious companies in AI will want to train models themselves, based on open-source instead of outsourcing AI to others via APIs!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:27 PM · May 18, 2026 · 23.6K Views

QUOTE POST

#76Elon Musk@ELONMUSK

Try Composer 2.5 on Cursor!

Michael Truell@mntruell

Composer 2.5 is now the most-chosen model in Cursor. We're giving everyone 10x usage for the rest of the day. Enjoy!

4:54 PM · May 19, 2026 · 33.8M Views

7:25 PM · May 19, 2026 · 36M Views

QUOTE POST

#76Elon Musk@ELONMUSK

Try it out!

(Partially trained on Colossus 2)

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:10 PM · May 18, 2026 · 10.8M Views

QUOTE POST

#76Elon Musk@ELONMUSK

Try Composer 2.5

8:14 AM · May 21, 2026 · 6.2M Views

QUOTE POST

#83rohan anil@_AROHAN_

Same pretrain base but little more intelligent on the tasks. Very cool work with Muon (yes, I didn’t bring up Shampoo b2=0.0 yet), cool to see textual feedback in RL.

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

11:59 PM · May 18, 2026 · 7.3K Views

QUOTE POST

#167Emad@EMOSTAQUE

This is such an interesting chart layout, like it a lot!

Congrats to @cursor_ai team on the 2.5 launch 🚀

6:12 PM · May 18, 2026 · 8.8K Views

REPLY

#259Teknium 🪽@TEKNIUM

@elonmusk Is it accessible through the SuperGrok subscription too?

Elon Musk@elonmusk

Try it out! (Partially trained on Colossus 2)

5:10 PM · May 18, 2026 · 10.8M Views

11:06 PM · May 18, 2026 · 3.3K Views

QUOTE POST

#357Boris Power@BORISMPOWER

Great work - exciting to see you training a very powerful coding model!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:39 PM · May 18, 2026 · 3.3K Views

QUOTE POST

#413Aditya Grover@ADITYAGROVER_

OPSD now being used at frontier scale to train Composer 2.5. Great work by @srush_nlp and team!

My feed has been filled with commentary in the last few days speculating the behavior of OPSD at scale. Refreshing to see some verification🙂

Sasha Rush@srush_nlp

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:39 PM · May 18, 2026 · 30.3K Views

5:08 PM · May 19, 2026 · 7.6K Views

REPLY

#413Aditya Grover@ADITYAGROVER_

.@siyan_zhao will be talking more about our work introducing OPSD at ICML in July. https://siyan-zhao.github.io/blog/2026/opsd/

Aditya Grover@adityagrover_

OPSD now being used at frontier scale to train Composer 2.5. Great work by @srush_nlp and team! My feed has been filled with commentary in the last few days speculating the behavior of OPSD at scale. Refreshing to see some verification🙂

5:08 PM · May 19, 2026 · 7.6K Views

5:08 PM · May 19, 2026 · 1.4K Views

POST

#427Yuchen Jin@YUCHENJ_UW

Cursor’s Composer 2.5 stirred up the coding war.

Now we have 3 labs capable of training strong coding models: Anthropic, OpenAI, SpaceX (+Cursor).

Wouldn’t be surprised if Google drops a strong coding model tomorrow at I/O.

This is the chatbot war all over again: OpenAI leads, then the market gets divided by other AI labs. Same thing is happening to coding models.

3:35 AM · May 19, 2026 · 26K Views

QUOTE POST

#506Michael Truell@MNTRUELL

Composer 2.5 is a significant step up from Composer 2.

This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:57 PM · May 18, 2026 · 995.1K Views

QUOTE POST

#506Michael Truell@MNTRUELL

Composer 2.5 is now the most-chosen model in Cursor.

We're giving everyone 10x usage for the rest of the day. Enjoy!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:54 PM · May 19, 2026 · 33.8M Views

QUOTE POST

#583Niklas Muennighoff@MUENNIGHOFF

Composer 2.5 sits on the Pareto frontier

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:38 PM · May 18, 2026 · 5.4K Views

QUOTE POST

#612Ravid Shwartz Ziv@ZIV_RAVID

Self distillation everywhere 🥳

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

8:19 PM · May 18, 2026 · 2.9K Views

QUOTE POST

#694Dan Fu@REALDANFU

🎼2⃣5⃣

Together AI@togethercompute

Congrats to the @cursor_ai team on Composer 2.5 — a huge milestone for agentic coding models. Together AI, the AI Native Cloud, is proud to partner on this launch. Composer 2.5 is pushing the frontier for coding agents and turning heads for its speed and quality. Excited to keep building with the Cursor team!

9:58 PM · May 18, 2026 · 9.8K Views

10:00 PM · May 18, 2026 · 761 Views

QUOTE POST

#716elie@ELIEBAKOUCH

CursorBench vs Artificial Analysis Coding index

cursor harness seems to almost always improve score for opus, but makes the cost/task higher

opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views

REPLY

#716elie@ELIEBAKOUCH

we don't have the data point for model without cursor harness so i just put the same score, but the reality is likely lower. also very small number of data point not sure we can do some strong conclusion here but sharing anyways!

elie@eliebakouch

CursorBench vs Artificial Analysis Coding index cursor harness seems to almost always improve score for opus, but makes the cost/task higher opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views

2:56 AM · May 21, 2026 · 297 Views

QUOTE POST

#716elie@ELIEBAKOUCH

v1 is here

elie@eliebakouch

correlation between CursorBench and Artificial Analysis reported scores benchmarks like IFBench or tau2 show ~0 correlation with CursorBench. opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, gpt 5.5 shows the opposite pattern

9:31 PM · May 20, 2026 · 23.3K Views

3:02 AM · May 21, 2026 · 1.2K Views

REPLY

#716elie@ELIEBAKOUCH

@srush_nlp oh i need to update the plot wait

Sasha Rush@srush_nlp

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views

2:04 AM · May 21, 2026 · 431 Views

QUOTE POST

#716elie@ELIEBAKOUCH

@srush_nlp here you go, but honestly not enough data point to make strong conclusion imo, just cool to see this plot imo

elie@eliebakouch

CursorBench vs Artificial Analysis Coding index cursor harness seems to almost always improve score for opus, but makes the cost/task higher opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views

2:57 AM · May 21, 2026 · 464 Views

REPLY

#716elie@ELIEBAKOUCH

@srush_nlp really cool work congrats to the team. if you can answer, do you have rough estimate on how the compute was allocated between RL and continual pt in composer 2 -> 2.5?

Sasha Rush@srush_nlp

Been working on text feedback / OPSD in Composer. Really interesting space, and a much more to be explored.

7:03 PM · May 18, 2026 · 2.3K Views

7:13 PM · May 18, 2026 · 528 Views

QUOTE POST

#716elie@ELIEBAKOUCH

cursor is at frontier scale, both in terms of performance and compute

if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens

if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.

assumptions are a lower bound: 35% MFU, FP8, ~3-4% sparsity like K2, H100 efficiency. model/token allocation is the mean between K2+K2.5 data point and Inclusion AI compute optimal rules for MoE

really impressed by the progression between composer 2 and composer 2.5

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

6:54 PM · May 18, 2026 · 63.9K Views
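
The figures in the post above can be sanity-checked with a short back-of-envelope script. This is only a sketch: the post does not say which compute formula it used, so the common 6 × active parameters × tokens approximation for training FLOPs is an assumption here, and the function and scenario names are hypothetical. The parameter and token counts are the ones quoted in the post.

```python
# Back-of-envelope check of the parameter/token figures quoted in the post.
# ASSUMPTION (not from the post): training compute ~= 6 * active_params * tokens.

def train_flops(active_params: float, tokens: float) -> float:
    """Approximate training FLOPs with the 6 * N * D rule of thumb."""
    return 6.0 * active_params * tokens


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of a mixture-of-experts model's parameters active per token."""
    return active_params / total_params


# Figures as quoted in the post; the scenario labels are illustrative.
scenarios = {
    "composer 2.5 budget as a pre-train": {
        "active": 200e9, "total": 6.3e12, "tokens": 56e12,
    },
    "composer 3 with 50% to pre-training": {
        "active": 500e9, "total": 15.3e12, "tokens": 135e12,
    },
}

for name, s in scenarios.items():
    flops = train_flops(s["active"], s["tokens"])
    frac = active_fraction(s["active"], s["total"])
    print(f"{name}: {flops:.2e} FLOPs, active fraction {frac:.1%}")
```

Under that assumption the two scenarios come to roughly 6.7e25 and 4.1e26 FLOPs, and both active fractions land a little above 3%, which is consistent with the "~3-4% sparsity like K2" assumption stated in the post.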

REPLY

#716elie@ELIEBAKOUCH

> if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.

i meant pre/mid/continual training here (whatever is not RL), and also put 10% estimation in the attached picture

elie@eliebakouch

cursor is at frontier scale, both in terms of performance and compute if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens. assumptions are a lower bound: 35% MFU, FP8, ~3-4% sparsity like K2, H100 efficiency. model/token allocation is the mean between K2+K2.5 data point and Inclusion AI compute optimal rules for MoE really impressed by the progression between composer 2 and composer 2.5

6:54 PM · May 18, 2026 · 63.9K Views

6:59 PM · May 18, 2026 · 3.3K Views

QUOTE POST

#716elie@ELIEBAKOUCH

when you do continual pre training at this scale on traces that look like RL rollouts, does it hurt RL if the mid training data is very similar to what you RL on? what if it's the same data but with different rollouts from another model?

intuitively i'd say yes, same intuition as what minimax M1 found with SFT cold start/RL overlap, but does this change at ~T token scale?

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

11:05 PM · May 18, 2026 · 9.6K Views

REPLY

#716elie@ELIEBAKOUCH

maybe cc @agarwl_ @_lewtun if you have some thought here 👀

elie@eliebakouch

when you do continual pre training at this scale on traces that look like RL rollouts, does it hurt RL if the mid training data is very similar to what you RL on? what if it's the same data but with different rollouts from another model? intuitively i'd say yes, same intuition as what minimax M1 found with SFT cold start/RL overlap, but does this change at ~T token scale?

11:05 PM · May 18, 2026 · 9.6K Views

5:13 PM · May 21, 2026 · 233 Views

QUOTE POST

#839Beff (e/acc)@BEFFJEZOS

This is very bullish for SpaceXAI

4:20 PM · May 19, 2026 · 122.3K Views

REPLY

#839Beff (e/acc)@BEFFJEZOS

@elonmusk Back in the top 3 🔥🔥

Elon Musk@elonmusk

Try Composer 2.5

8:14 AM · May 21, 2026 · 6.2M Views

8:31 AM · May 21, 2026 · 1.6K Views

REPLY

#934Philipp Schmid@_PHILSCHMID

@ericzakariasson looks good

eric zakariasson@ericzakariasson

composer 1 was fast
composer 2 was fast and intelligent
composer N:

4:52 PM · May 18, 2026 · 105.4K Views

5:18 PM · May 18, 2026 · 953 Views

QUOTE POST

#978Jacob Jackson@JBFJA

Our new model is out. It stacks up nicely against the frontier!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:59 PM · May 18, 2026 · 3.6K Views

QUOTE POST

#980Lisan al Gaib@SCALING01

yeah that's pretty good

xAI might be able to cook with Cursor data + 10T model

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:36 PM · May 18, 2026 · 689.7K Views

QUOTE POST

#1086Ashvin Nair@ASHVINAIR

We just trained another really really good model, please try it! Frontier intelligence + very fast

5:01 PM · May 18, 2026 · 9.2K Views

QUOTE POST

#1245Alex Volkov@ALTRYNE

New Composer from Cursor team! Great to see their ack. to the Kimi base + how much they moved the model forward!

This isn't the one they are training on XAI Colossus, that one is coming and would likely slap hard!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:58 PM · May 18, 2026 · 1.5K Views

REPLY

#1262samsja@SAMSJA19

@srush_nlp Congrats guys

Sasha Rush@srush_nlp

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views

2:00 AM · May 21, 2026 · 204 Views

QUOTE POST

#1284Kevin Frans@KVFRANS

We used a pretty cool "RL with text feedback" formulation to train this one (see blog post for some details). As RL tasks get longer in horizon, I think it's a ripe time to think about ways we can extract signals that avoid the variance explosion.

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

7:14 PM · May 18, 2026 · 18K Views

QUOTE POST

#1318Sam Whitmore@SJWHITMORE

composer 2.5 is really really great. I had it on last week for some testing, forgot that it was on, & totally didn’t realize I wasn’t on gpt 5.5 (my usual) for a while. the team did a fantastic job!!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:50 PM · May 18, 2026 · 9.3K Views

REPLY

#1333Shaun Maguire@SHAUNMMAGUIRE

@mntruell Congrats!

Michael Truell@mntruell

Composer 2.5 is a significant step up from Composer 2. This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.

4:57 PM · May 18, 2026 · 995.1K Views

6:49 PM · May 18, 2026 · 732 Views

QUOTE POST

#1496Chubby♨️@KIMMONISMUS

Intelligence too cheap to meter. This is the real deal. Composer 2.5 is an efficiency-beast

Chubby♨️@kimmonismus

Huge, did NOT expect that release. Evals looks very solid, significant jump compared to composer 2! But: it’s 10x more efficient than the competition. Looks really exciting. Need to try it out

9:55 PM · May 18, 2026 · 74.1K Views

9:57 PM · May 18, 2026 · 37.9K Views

QUOTE POST

#1496Chubby♨️@KIMMONISMUS

Huge, did NOT expect that release. Evals looks very solid, significant jump compared to composer 2!

But: it’s 10x more efficient than the competition. Looks really exciting. Need to try it out

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

9:55 PM · May 18, 2026 · 74.1K Views

REPLY

#1711🍓🍓🍓@IRULETHEWORLDMO

it’s rather close already ^^

🍓🍓🍓@iruletheworldmo

the xai cursor combo is creating the first real challenge to openai and anthropic

doesn’t hurt that anthropic are essentially giving their direct competitors billions each day

expect composer to become sota quite soon.

1:25 PM · May 21, 2026 · 15.9K Views

1:26 PM · May 21, 2026 · 4K Views

QUOTE POST

#1733Ryo Lu@RYOLU_

frontier smart extremely efficient Composer 2.5 is here

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:52 PM · May 18, 2026 · 31.2K Views

QUOTE POST

#1749Sualeh Asif@SUALEHASIF996

We've gotten really really good at RL. Composer 2.5 is fighting well-above its weight class.

Very excited for the next release as we scale model sizes and FLOPs with @SpaceXAI!

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

5:28 PM · May 18, 2026 · 677K Views

REPLY

#1894Nick Dobos@NICKADOBOS

@ericzakariasson Fat slow and genius

eric zakariasson@ericzakariasson

composer 1 was fast
composer 2 was fast and intelligent
composer N:

4:52 PM · May 18, 2026 · 105.4K Views

6:08 PM · May 18, 2026 · 1.4K Views

QUOTE POST

#1965eric zakariasson@ERICZAKARIASSON

i wrote a guide on optimizing context usage 6 months ago that i never posted. back then with the models available, you could only pick 2 of 3:

1. intelligent
2. fast
3. cheap

intelligent + fast = expensive
fast + cheap = dumb
cheap + intelligent = slow

now, with composer 2.5, this is no longer true and the post is obsolete. looking at TPS, avg cost per task, and score from cursorbench, it's clearly capable of all three

but benchmarks are just benchmarks. what matters is how it feels to use and if it can actually accomplish your tasks. from the feedback so far, that's very much the case

go try it out if you haven't already

1:02 PM · May 21, 2026 · 23K Views

REPLY

#1965eric zakariasson@ERICZAKARIASSON

@VictorTaelin just going to leave this here

Taelin@VictorTaelin

@ericzakariasson I'd love to try it but unfortunately I had to sell my eyes to cover Anthropic bills, so now I can only use textual models via an API. Please let me know if you support this!

2:21 PM · May 21, 2026 · 1.8K Views

3:15 PM · May 21, 2026 · 759 Views

QUOTE POST

#1965eric zakariasson@ERICZAKARIASSON

composer 1 was fast
composer 2 was fast and intelligent
composer N:

Cursor@cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views

4:52 PM · May 18, 2026 · 105.4K Views
