3d ago

Cursor AI releases Composer 2.5, its most advanced coding model, which scores 62 on the Artificial Analysis Coding Agent Index at 10-60 times lower cost than leading rivals

The launch marks the start of an expanded scaling collaboration with SpaceXAI.

Original post

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

9:43 AM · May 18, 2026

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:03 PM · May 18, 2026 · 2.3K Views

@eliebakouch Unfortunately can’t give a precise answer. Both were scaled significantly.

elie @eliebakouch

@srush_nlp really cool work congrats to the team. if you can answer, do you have rough estimate on how the compute was allocated between RL and continual pt in composer 2 -> 2.5?

7:13 PM · May 18, 2026 · 528 Views
7:15 PM · May 18, 2026 · 189 Views

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:39 PM · May 18, 2026 · 30.3K Views

@adityagrover_ @siyan_zhao Thanks for publishing! We like your paper a lot.

Siyan, I think we had a chance to meet at NeurIPS. Let us know if you ever are in NYC and want to come by and chat.

Aditya Grover @adityagrover_

.@siyan_zhao will be talking more about our work introducing OPSD at ICML in July. https://siyan-zhao.github.io/blog/2026/opsd/

5:08 PM · May 19, 2026 · 1.4K Views
6:23 PM · May 19, 2026 · 841 Views

Very cool to see Cursor doubling down on training great models. In my opinion, ultimately all serious companies in AI will want to train models themselves, based on open-source instead of outsourcing AI to others via APIs!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:27 PM · May 18, 2026 · 23.6K Views

Try Composer 2.5 on Cursor!

Michael Truell @mntruell

Composer 2.5 is now the most-chosen model in Cursor. We're giving everyone 10x usage for the rest of the day. Enjoy!

4:54 PM · May 19, 2026 · 33.8M Views
7:25 PM · May 19, 2026 · 36M Views

Try it out!

(Partially trained on Colossus 2)

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:10 PM · May 18, 2026 · 10.8M Views

Same pretrain base but little more intelligent on the tasks. Very cool work with Muon (yes, I didn’t bring up Shampoo b2=0.0 yet), cool to see textual feedback in RL.

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
11:59 PM · May 18, 2026 · 7.3K Views

This is such an interesting chart layout, like it a lot!

Congrats to @cursor_ai team on the 2.5 launch 🚀

6:12 PM · May 18, 2026 · 8.8K Views

@elonmusk Is it accessible through the SuperGrok subscription too?

Elon Musk @elonmusk

Try it out! (Partially trained on Colossus 2)

5:10 PM · May 18, 2026 · 10.8M Views
11:06 PM · May 18, 2026 · 3.3K Views

Great work - exciting to see you training a very powerful coding model!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:39 PM · May 18, 2026 · 3.3K Views

OPSD now being used at frontier scale to train Composer 2.5. Great work by @srush_nlp and team!

My feed has been filled with commentary in the last few days speculating the behavior of OPSD at scale. Refreshing to see some verification🙂

Sasha Rush @srush_nlp

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:39 PM · May 18, 2026 · 30.3K Views
5:08 PM · May 19, 2026 · 7.6K Views

.@siyan_zhao will be talking more about our work introducing OPSD at ICML in July. https://siyan-zhao.github.io/blog/2026/opsd/

Aditya Grover @adityagrover_

OPSD now being used at frontier scale to train Composer 2.5. Great work by @srush_nlp and team! My feed has been filled with commentary in the last few days speculating the behavior of OPSD at scale. Refreshing to see some verification🙂

5:08 PM · May 19, 2026 · 7.6K Views
5:08 PM · May 19, 2026 · 1.4K Views

Cursor’s Composer 2.5 stirred up the coding war.

Now we have 3 labs capable of training strong coding models: Anthropic, OpenAI, SpaceX (+Cursor).

Wouldn’t be surprised if Google drops a strong coding model tomorrow at I/O.

This is the chatbot war all over again: OpenAI leads, then the market gets divided by other AI labs. Same thing is happening to coding models.

3:35 AM · May 19, 2026 · 26K Views

Composer 2.5 is a significant step up from Composer 2.

This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:57 PM · May 18, 2026 · 995.1K Views

Composer 2.5 is now the most-chosen model in Cursor.

We're giving everyone 10x usage for the rest of the day. Enjoy!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:54 PM · May 19, 2026 · 33.8M Views

Composer 2.5 sits on the Pareto frontier

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:38 PM · May 18, 2026 · 5.4K Views
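
For readers unfamiliar with the term, a model "sits on the Pareto frontier" of cost versus score when no other model is at least as cheap and at least as good while being strictly better on one of the two. The sketch below is only an illustration of that definition: the model names and numbers are hypothetical placeholders, not figures from Cursor's chart.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    cost: float   # cost per task, lower is better (hypothetical units)
    score: float  # benchmark score, higher is better


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.score >= b.score
    strictly_better = a.cost < b.cost or a.score > b.score
    return no_worse and strictly_better


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Return the non-dominated models, sorted by increasing cost."""
    frontier = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(frontier, key=lambda m: m.cost)


# Hypothetical data, for illustration only.
models = [
    Model("A", cost=1.0, score=50.0),
    Model("B", cost=2.0, score=45.0),   # dominated by A (pricier and worse)
    Model("C", cost=5.0, score=70.0),
    Model("D", cost=10.0, score=70.0),  # dominated by C (same score, pricier)
]

frontier_names = [m.name for m in pareto_frontier(models)]
```

The pairwise check is quadratic in the number of models, which is fine for a chart with a handful of points.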

Self distillation everywhere 🥳

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
8:19 PM · May 18, 2026 · 2.9K Views

🎼2⃣5⃣

Together AI @togethercompute

Congrats to the @cursor_ai team on Composer 2.5 — a huge milestone for agentic coding models. Together AI, the AI Native Cloud, is proud to partner on this launch. Composer 2.5 is pushing the frontier for coding agents and turning heads for its speed and quality. Excited to keep building with the Cursor team!

9:58 PM · May 18, 2026 · 9.8K Views
10:00 PM · May 18, 2026 · 761 Views

CursorBench vs Artificial Analysis Coding index

cursor harness seems to almost always improve score for opus, but makes the cost/task higher

opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views

we don't have the data point for model without cursor harness so i just put the same score, but the reality is likely lower. also very small number of data point not sure we can do some strong conclusion here but sharing anyways!

elie @eliebakouch

CursorBench vs Artificial Analysis Coding index

cursor harness seems to almost always improve score for opus, but makes the cost/task higher

opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views
2:56 AM · May 21, 2026 · 297 Views

v1 is here

elie @eliebakouch

correlation between CursorBench and Artificial Analysis reported scores

benchmarks like IFBench or tau2 show ~0 correlation with CursorBench.

opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, gpt 5.5 shows the opposite pattern

9:31 PM · May 20, 2026 · 23.3K Views
3:02 AM · May 21, 2026 · 1.2K Views

@srush_nlp oh i need to update the plot wait

Sasha Rush @srush_nlp

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views
2:04 AM · May 21, 2026 · 431 Views

@srush_nlp here you go, but honestly not enough data points to make a strong conclusion imo, just cool to see this plot

elie @eliebakouch

CursorBench vs Artificial Analysis Coding index

cursor harness seems to almost always improve score for opus, but makes the cost/task higher

opposite for gpt5.5 (codex scores are higher but cost/task with cursor is lower)

2:55 AM · May 21, 2026 · 6.6K Views
2:57 AM · May 21, 2026 · 464 Views

@srush_nlp really cool work congrats to the team. if you can answer, do you have rough estimate on how the compute was allocated between RL and continual pt in composer 2 -> 2.5?

Sasha Rush @srush_nlp

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

7:03 PM · May 18, 2026 · 2.3K Views
7:13 PM · May 18, 2026 · 528 Views

cursor is at frontier scale, both in terms of performance and compute

if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens

if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.

assumptions are a lower bound: 35% MFU, FP8, ~3-4% sparsity like K2, H100 efficiency. model/token allocation is the mean between K2+K2.5 data point and Inclusion AI compute optimal rules for MoE

really impressed by the progression between composer 2 and composer 2.5

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
6:54 PM · May 18, 2026 · 63.9K Views
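
The numbers in the estimate above can be sanity-checked with a back-of-envelope calculation. This is a rough sketch, not the method used in the post: it assumes the common approximation that training compute is about 6 x active parameters x tokens, and it assumes a dense FP8 peak of roughly 1.979e15 FLOP/s for an H100 (an approximate vendor figure). The function names are made up for illustration; only the parameter counts, token counts, and the 35% MFU come from the post.

```python
# Back-of-envelope check of the quoted estimate.
# Assumptions (not from the post): the 6 * N_active * D compute rule and
# the H100 dense FP8 peak throughput below.
H100_FP8_DENSE_PEAK = 1.979e15  # FLOP/s, approximate vendor peak (assumption)
MFU = 0.35                      # model FLOPs utilization, as stated in the post


def training_flops(active_params: float, tokens: float) -> float:
    """Approximate training compute for an MoE: 6 * active params * tokens."""
    return 6.0 * active_params * tokens


def gpu_hours(flops: float, peak: float = H100_FP8_DENSE_PEAK, mfu: float = MFU) -> float:
    """GPU-hours needed to deliver `flops` at the given peak and utilization."""
    return flops / (peak * mfu) / 3600.0


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of total parameters that are active per token."""
    return active_params / total_params


# Scenario 1: 200B active, ~6.3T total, ~56T tokens.
flops_25 = training_flops(200e9, 56e12)
# Scenario 2: 500B active, 15.3T total, 135T tokens.
flops_3 = training_flops(500e9, 135e12)

if __name__ == "__main__":
    print(f"scenario 1: {flops_25:.3e} FLOPs, {gpu_hours(flops_25):.3e} H100-hours")
    print(f"scenario 2: {flops_3:.3e} FLOPs, {gpu_hours(flops_3):.3e} H100-hours")
    print(f"active fraction 1: {active_fraction(200e9, 6.3e12):.2%}")
    print(f"active fraction 2: {active_fraction(500e9, 15.3e12):.2%}")
```

Under these assumptions the first scenario comes to about 6.7e25 FLOPs, and both configurations have an active fraction of roughly 3.2 to 3.3%, which is consistent with the "~3-4% sparsity" stated in the post.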

> if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.

i meant pre/mid/continual training here (whatever is not RL), and also put 10% estimation in the attached picture

elie @eliebakouch

cursor is at frontier scale, both in terms of performance and compute

if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens

if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.

assumptions are a lower bound: 35% MFU, FP8, ~3-4% sparsity like K2, H100 efficiency. model/token allocation is the mean between K2+K2.5 data point and Inclusion AI compute optimal rules for MoE

really impressed by the progression between composer 2 and composer 2.5

6:54 PM · May 18, 2026 · 63.9K Views
6:59 PM · May 18, 2026 · 3.3K Views

when you do continual pre training at this scale on traces that look like RL rollouts, does it hurt RL if the mid training data is very similar to what you RL on? what if it's the same data but with different rollouts from another model?

intuitively i'd say yes, same intuition as what minimax M1 found with SFT cold start/RL overlap, but does this change at ~T token scale?

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
11:05 PM · May 18, 2026 · 9.6K Views

maybe cc @agarwl_ @_lewtun if you have some thought here 👀

elie @eliebakouch

when you do continual pre training at this scale on traces that look like RL rollouts, does it hurt RL if the mid training data is very similar to what you RL on? what if it's the same data but with different rollouts from another model?

intuitively i'd say yes, same intuition as what minimax M1 found with SFT cold start/RL overlap, but does this change at ~T token scale?

11:05 PM · May 18, 2026 · 9.6K Views
5:13 PM · May 21, 2026 · 233 Views

@elonmusk Back in the top 3 🔥🔥

Elon Musk @elonmusk

Try Composer 2.5

8:14 AM · May 21, 2026 · 6.2M Views
8:31 AM · May 21, 2026 · 1.6K Views

@ericzakariasson looks good

eric zakariasson @ericzakariasson

composer 1 was fast
composer 2 was fast and intelligent
composer N:

4:52 PM · May 18, 2026 · 105.4K Views
5:18 PM · May 18, 2026 · 953 Views

Our new model is out. It stacks up nicely against the frontier!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:59 PM · May 18, 2026 · 3.6K Views

yeah that's pretty good

xAI might be able to cook with Cursor data + 10T model

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:36 PM · May 18, 2026 · 689.7K Views

We just trained another really really good model, please try it! Frontier intelligence + very fast

5:01 PM · May 18, 2026 · 9.2K Views

New Composer from Cursor team! Great to see their ack. to the Kimi base + how much they moved the model forward!

This isn't the one they are training on XAI Colossus, that one is coming and would likely slap hard!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:58 PM · May 18, 2026 · 1.5K Views

@srush_nlp Congrats guys

Sasha Rush @srush_nlp

Huh, so you’re saying it’s smart, cheap, and fast.

1:59 AM · May 21, 2026 · 11.7K Views
2:00 AM · May 21, 2026 · 204 Views

We used a pretty cool "RL with text feedback" formulation to train this one (see blog post for some details). As RL tasks get longer in horizon, I think it's a ripe time to think about ways we can extract signals that avoid the variance explosion.

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
7:14 PM · May 18, 2026 · 18K Views

composer 2.5 is really really great. I had it on last week for some testing, forgot that it was on, & totally didn’t realize I wasn’t on gpt 5.5 (my usual) for a while. the team did a fantastic job!!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:50 PM · May 18, 2026 · 9.3K Views

@mntruell Congrats!

Michael Truell @mntruell

Composer 2.5 is a significant step up from Composer 2. This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.

4:57 PM · May 18, 2026 · 995.1K Views
6:49 PM · May 18, 2026 · 732 Views

Intelligence too cheap to meter. This is the real deal. Composer 2.5 is an efficiency-beast

Chubby♨️ @kimmonismus

Huge, did NOT expect that release. Evals looks very solid, significant jump compared to composer 2! But: it’s 10x more efficient than the competition. Looks really exciting. Need to try it out

9:55 PM · May 18, 2026 · 74.1K Views
9:57 PM · May 18, 2026 · 37.9K Views

Huge, did NOT expect that release. Evals looks very solid, significant jump compared to composer 2!

But: it’s 10x more efficient than the competition. Looks really exciting. Need to try it out

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
9:55 PM · May 18, 2026 · 74.1K Views

it’s rather close already ^^

🍓🍓🍓 @iruletheworldmo

the xai cursor combo is creating the first real challenge to openai and anthropic

doesn’t hurt that anthropic are essentially giving their direct competitors billions each day

expect composer to become sota quite soon.

1:25 PM · May 21, 2026 · 15.9K Views
1:26 PM · May 21, 2026 · 4K Views

frontier smart
extremely efficient
Composer 2.5 is here

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:52 PM · May 18, 2026 · 31.2K Views

We've gotten really really good at RL. Composer 2.5 is fighting well-above its weight class.

Very excited for the next release as we scale model sizes and FLOPs with @SpaceXAI!

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
5:28 PM · May 18, 2026 · 677K Views

@ericzakariasson Fat slow and genius

eric zakariasson @ericzakariasson

composer 1 was fast
composer 2 was fast and intelligent
composer N:

4:52 PM · May 18, 2026 · 105.4K Views
6:08 PM · May 18, 2026 · 1.4K Views

i wrote a guide on optimizing context usage 6 months ago that i never posted. back then with the models available, you could only pick 2 of 3:

1. intelligent
2. fast
3. cheap

intelligent + fast = expensive
fast + cheap = dumb
cheap + intelligent = slow

now, with composer 2.5, this is no longer true and the post is obsolete. looking at TPS, avg cost per task, and score from cursorbench, it's clearly capable of all three

but benchmarks are just benchmarks. what matters is how it feels to use and if it can actually accomplish your tasks. from the feedback so far, that's very much the case

go try it out if you haven't already

1:02 PM · May 21, 2026 · 23K Views

@VictorTaelin just going to leave this here

Taelin @VictorTaelin

@ericzakariasson I'd love to try it but unfortunately I had to sell my eyes to cover Anthropic bills, so now I can only use textual models via an API. Please let me know if you support this!

2:21 PM · May 21, 2026 · 1.8K Views
3:15 PM · May 21, 2026 · 759 Views

composer 1 was fast
composer 2 was fast and intelligent
composer N:

Cursor @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

4:43 PM · May 18, 2026 · 19.4M Views
4:52 PM · May 18, 2026 · 105.4K Views