Cursor Composer 2.5: What It Actually Costs
Composer 2.5 finishes a coding task for about $0.07, 10-60x under Claude Opus and GPT-5.5 at near-equal benchmark scores. What the cheap headline leaves out.
By Capital & Compute
Cursor’s in-house coding model, Composer 2.5, finishes a coding task for about $0.07 in its standard mode. The frontier coding agents cost several dollars to do the same class of work: in the most recent independent head-to-head, the Claude Opus and GPT-5.5 agents came in at $4.10 and $4.82. That is a roughly 10x to 60x cost gap, depending on which Composer tier is compared, for a model that scores within four points of both on an independent coding leaderboard. On cost to finish the work, the price is the headline.
The catch is that $0.07 is the floor, not the bill. It is the standard tier, metered against a Cursor subscription rather than billed straight per token. The Fast tier that interactive sessions reach for costs roughly six times more per task. The launch-week promotion that doubled usage has expired. And the leaderboard parity rests on Cursor running the model inside its own harness. This is a review of what Composer 2.5 actually costs once those qualifiers are priced in.
What it actually costs
Cursor’s launch announcement, Introducing Composer 2.5 (May 2026), lists two tiers. Standard runs at $0.50 per million input tokens and $2.50 per million output tokens. Fast, tuned for low-latency interactive sessions, runs at $3.00 and $15.00. Set those against the frontier rivals and the gap is an order of magnitude: Claude Opus 4.8, Anthropic’s current flagship, lists at $5.00/$25.00 per million tokens, unchanged from the prior Opus 4.7, and GPT-5.5 lists at $5.00/$30.00. Standard Composer 2.5 output costs a tenth of Opus 4.8’s and a twelfth of GPT-5.5’s.
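The list-price arithmetic can be sketched in a few lines. This is a minimal illustration, not a billing calculator: the prices are the ones quoted above, but the token counts are hypothetical placeholders rather than measured figures, and caching discounts and plan allowances are ignored.

```python
# List prices in USD per million tokens (input, output), as quoted above.
LIST_PRICES = {
    "Composer 2.5 (standard)": (0.50, 2.50),
    "Composer 2.5 (Fast)": (3.00, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
}


def list_price_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one run at list price (no caching, no plan allowance)."""
    price_in, price_out = LIST_PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical token counts for one agentic task; an assumption for illustration.
HYPOTHETICAL_INPUT_TOKENS = 200_000
HYPOTHETICAL_OUTPUT_TOKENS = 20_000

for model in LIST_PRICES:
    cost = list_price_cost(model, HYPOTHETICAL_INPUT_TOKENS, HYPOTHETICAL_OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

Holding token counts equal across models is the weak point of this sketch: in practice each model consumes a different number of tokens to finish the same task.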
List price is the opening bid, not the cost. The number that clears is the cost to finish a representative task, thinking tokens and tool-calling loops included. Artificial Analysis, an independent benchmarking firm, publishes exactly that for its Coding Agent Index: the dollar cost for each model to complete the suite inside the harness it ships in. Composer 2.5 standard came in at $0.07 per task, Fast at $0.44. Opus 4.7 at max effort inside Claude Code cost $4.10, and GPT-5.5 at xhigh effort inside Codex cost $4.82. Artificial Analysis ran this index in May 2026, the week before Claude Opus 4.8 shipped on May 28. Opus 4.8 carries the same standard list price as 4.7 ($5.00/$25.00), so the gap should hold against the current flagship, but no independent re-run on 4.8 had been published as of this writing.
| Agent (harness) | Cost per task (USD) |
|---|---|
| Composer 2.5 (standard) | $0.07 |
| Composer 2.5 (Fast) | $0.44 |
| Opus 4.7 (Claude Code) | $4.10 |
| GPT-5.5 (Codex) | $4.82 |
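The multiples behind the 10x to 60x headline follow directly from the per-task figures in the table. A short sketch of the arithmetic, using only those four numbers:

```python
# Cost per task in USD, from the Artificial Analysis figures in the table above.
COST_PER_TASK_USD = {
    "Composer 2.5 (standard)": 0.07,
    "Composer 2.5 (Fast)": 0.44,
    "Opus 4.7 (Claude Code)": 4.10,
    "GPT-5.5 (Codex)": 4.82,
}


def cost_multiple(rival: str, composer_tier: str) -> float:
    """How many times more the rival costs per task than the given Composer tier."""
    return COST_PER_TASK_USD[rival] / COST_PER_TASK_USD[composer_tier]


for tier in ("Composer 2.5 (standard)", "Composer 2.5 (Fast)"):
    for rival in ("Opus 4.7 (Claude Code)", "GPT-5.5 (Codex)"):
        print(f"{rival} vs {tier}: {cost_multiple(rival, tier):.1f}x")
```

Unrounded, the multiples run from about 9x (Opus 4.7 against the Fast tier) to about 69x (GPT-5.5 against standard), so the 10x to 60x headline is a rounded range. The low end of that range belongs to the Fast tier, not to the $0.07 standard figure.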
For a Cursor Pro subscriber the mechanics are gentler still: Composer 2.5 draws against the plan’s usage allowance rather than a per-token meter, which makes the spend predictable in a way pay-as-you-go API billing is not. The per-task figures above are the right unit for comparison because they are cost to finish the work, not cost per token, which is the only honest way to price an agentic workload.
Why it is this cheap
Composer 2.5 is not a frontier model trained from scratch. Per Cursor’s announcement, it is built on Moonshot AI’s open-source Kimi K2.5 checkpoint, and roughly 85% of the total compute was applied after the base model, in Cursor’s post-training: about 25x more synthetic coding tasks than its predecessor, plus a targeted reinforcement-learning pass. Starting from an open checkpoint instead of a from-scratch pre-train is most of the reason the economics work. Cursor is paying to specialize a strong open model for its own harness, not to discover one.
That origin also explains the shape of the result: very strong inside Cursor’s agent loop, where the post-training was aimed, and merely competitive on general benchmarks that the specialization did not target.
What you give up: capability
The cost gap is real. So is a capability gap, and an honest review has to hold both. On Cursor’s own reported benchmarks the model is close to the frontier: 79.8% on SWE-Bench Multilingual against Opus 4.7’s 80.5%, and 63.2% on CursorBench v3.1 against Opus 4.7’s 64.8% at max effort, as compiled in a DataCamp comparison (2026). On terminal work the gap widens sharply: 69.3% on Terminal-Bench 2.0 against GPT-5.5’s 82.7%, a 13-point deficit.
Independent measurement tells the same story a notch lower. On Artificial Analysis’s Coding Agent Index, Composer 2.5 placed third at a score of 62, behind Opus 4.7 (66) and GPT-5.5 (65). Put that next to the per-task costs and the trade reads at a glance: the scores are nearly level while the costs are not close.
| Model | Coding Agent Index score |
|---|---|
| Composer 2.5 | 62 |
| GPT-5.5 | 65 |
| Opus 4.7 | 66 |
There is a measurement caveat worth naming, because it is the same one that runs through the whole cost-per-task picture. The Coding Agent Index runs each model in the harness it ships in: Composer inside Cursor, Opus inside Claude Code, GPT-5.5 inside Codex. So the figures bundle the model and its loop together. That is the right comparison if the question is “what does this product cost me to use,” and the wrong one if the question is “which raw model is best,” because the harness around a model moves its score and its bill as much as the weights do. Composer 2.5’s numbers are a verdict on Cursor-plus-Composer, not on the checkpoint alone.
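Pairing the index scores with the per-task costs makes the trade concrete. The sketch below uses only the figures reported above, and each entry describes a model-plus-harness bundle, not a raw model.

```python
# Coding Agent Index scores and per-task costs (USD), as reported above.
RESULTS = {
    "Composer 2.5 (standard)": {"score": 62, "cost": 0.07},
    "GPT-5.5 (Codex)": {"score": 65, "cost": 4.82},
    "Opus 4.7 (Claude Code)": {"score": 66, "cost": 4.10},
}


def compare(baseline: str, rival: str) -> tuple[int, float]:
    """Return (points the baseline trails by, how many times more the rival costs)."""
    base, other = RESULTS[baseline], RESULTS[rival]
    return other["score"] - base["score"], other["cost"] / base["cost"]


for rival in ("GPT-5.5 (Codex)", "Opus 4.7 (Claude Code)"):
    gap, multiple = compare("Composer 2.5 (standard)", rival)
    print(f"{rival}: +{gap} points for about {multiple:.0f}x the cost per task")
```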
The catch in the headline number
None of this makes the model expensive. It makes the single advertised figure a floor that a real workload rarely sits on. The pattern is familiar: a low sticker number that the actual bill drifts above once consumption, tier, and promotions are counted. It is the same gap a 2026 Microsoft Research preprint measured across frontier models, where the cheaper-listed model finished the work at a higher cost in roughly a third of matchups. Composer 2.5 does not reverse on its rivals (the gap is far too wide), but the headline-to-bill drift is the same mechanism, and it is why the honest unit is cost to finish a representative slice of your own tasks.
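One way to measure that unit on your own workload is to divide total spend, failed attempts included, by the number of tasks that actually finished. The sketch below assumes a hypothetical `Attempt` record and made-up log entries. It is not a Cursor API or export format.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One agent run against a task (hypothetical record shape)."""
    task_id: str
    cost_usd: float
    finished: bool


def cost_per_finished_task(attempts: list[Attempt]) -> float:
    """Total spend across all attempts divided by the count of distinct finished tasks."""
    total_spend = sum(a.cost_usd for a in attempts)
    finished_tasks = {a.task_id for a in attempts if a.finished}
    if not finished_tasks:
        raise ValueError("no task finished; cost per finished task is undefined")
    return total_spend / len(finished_tasks)


# Made-up sample log: one task needed a retry on the pricier tier.
sample_log = [
    Attempt("refactor-1", 0.07, True),
    Attempt("fix-2", 0.44, False),
    Attempt("fix-2", 0.44, True),
    Attempt("tests-3", 0.07, True),
]

print(f"${cost_per_finished_task(sample_log):.2f} per finished task")
```

Counting failed attempts in the numerator is the point of the exercise: retries and tier switches are what lift the real bill above the advertised floor.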
Who should run Composer 2.5
The model earns its place when the work lives inside Cursor and volume is the constraint. High-throughput, well-scoped coding (refactors, test scaffolding, routine fixes, large batches of small changes) is where a 10x to 60x cost cut compounds into real money and the few points of missing capability rarely bite. For a Pro subscriber the subscription-metered billing also removes the per-token anxiety that makes teams ration a frontier model.
The right question is not whether it is cheaper. It is whether the work fits the harness it was tuned for.
Reach for a frontier model instead when the task is the kind Composer 2.5 measurably trails on. Terminal-heavy automation and shell orchestration favor GPT-5.5 by more than ten points on Terminal-Bench. The hardest multi-file reasoning, long-context fidelity, and anything where a single wrong change is expensive still favor Claude Opus 4.8 or GPT-5.5, where the capability premium buys down risk. And if your workflow is not inside Cursor, the headline cost-per-task does not transfer: the figure is a property of the Cursor harness, not a portable per-token rate you can replicate elsewhere.
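The "capability premium buys down risk" argument can be put as a break-even calculation: expected cost is the run cost plus the probability of a wrong change times what that wrong change costs you. In the sketch below the run costs are the per-task figures reported earlier, but the two error rates are purely hypothetical assumptions chosen for illustration. They are not derived from any benchmark.

```python
def expected_cost(run_cost: float, p_wrong: float, wrong_change_cost: float) -> float:
    """Run cost plus the expected cost of shipping a wrong change."""
    return run_cost + p_wrong * wrong_change_cost


def break_even_wrong_change_cost(
    cheap_run: float, frontier_run: float, p_wrong_cheap: float, p_wrong_frontier: float
) -> float:
    """Wrong-change cost above which the frontier agent has the lower expected cost."""
    if p_wrong_cheap <= p_wrong_frontier:
        raise ValueError("the cheaper agent must have the higher error rate")
    return (frontier_run - cheap_run) / (p_wrong_cheap - p_wrong_frontier)


# Per-task run costs (USD) from the figures reported earlier.
COMPOSER_RUN, OPUS_RUN = 0.07, 4.10
# Hypothetical error rates; assumptions for illustration only.
P_WRONG_COMPOSER, P_WRONG_OPUS = 0.10, 0.06

threshold = break_even_wrong_change_cost(
    COMPOSER_RUN, OPUS_RUN, P_WRONG_COMPOSER, P_WRONG_OPUS
)
print(f"Frontier pays off once a wrong change costs more than ${threshold:.2f}")
```

Under these assumed rates the threshold is about $100 per wrong change. With your own measured error rates the threshold will move, which is why the escalation decision is task-specific.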
The verdict
Composer 2.5 is the clearest cost-per-task bargain in the current coding-agent field, and the bargain is not a mirage: an independent benchmark, not just Cursor’s marketing, puts it within four points of the frontier at a tenth to a sixtieth of the cost to run. The asterisks matter (the cheap tier is not the default, the promo has expired, the parity is harness-bound), but none of them erase the core result. For volume coding inside Cursor it is the rational default, and the moment to escalate to Claude Opus 4.8 or GPT-5.5 is task-specific, not blanket.
The broader signal is the one worth watching. A model post-trained from an open Kimi checkpoint now competes with from-scratch frontier models on coding, at a fraction of the price, by specializing hard for one harness. That is a different cost curve than the from-scratch frontier race the rest of the 2026 landscape is running, and if it holds, the pressure it puts on per-task pricing is the story, not the benchmark deltas.
Frequently asked questions
- How much does Cursor Composer 2.5 cost?
- Composer 2.5 lists at $0.50 per million input tokens and $2.50 output on the standard tier, and $3.00/$15.00 on the Fast tier. On Artificial Analysis's independent Coding Agent Index it finished a representative task for $0.07 on standard and $0.44 on Fast. For Cursor Pro subscribers it draws against the plan's usage allowance rather than a per-token meter.
- Is Composer 2.5 as good as Claude Opus or GPT-5.5?
- Close, not equal. On Artificial Analysis's Coding Agent Index it scored 62, three to four points behind GPT-5.5 (65) and Opus 4.7 (66). The gap widens on terminal work, where it scored 69.3% on Terminal-Bench 2.0 against GPT-5.5's 82.7%.
- Why is Cursor Composer 2.5 so cheap?
- It is post-trained from Moonshot AI's open-source Kimi K2.5 checkpoint rather than trained from scratch, with Cursor specializing it for its own agent harness. Starting from an open checkpoint instead of a from-scratch pre-train is most of why the economics work.
- When should I use a frontier model instead of Composer 2.5?
- Reach for GPT-5.5 on terminal-heavy automation, where it leads by more than ten points, and for Claude Opus 4.8 or GPT-5.5 on the hardest multi-file reasoning where a single wrong change is expensive. The cost-per-task advantage also only holds inside Cursor; the figure does not transfer to other harnesses.
- Does the $0.07 cost-per-task figure hold up in practice?
- It is the cheapest path, not the default one. The Fast tier that interactive editing reaches for costs about six times more ($0.44), the launch-week usage promotion has expired, and the figure is measured inside Cursor's own harness where the model was tuned. Treat $0.07 as a floor, not the typical bill.
Sources
- Cursor (Anysphere). (2026). Introducing Composer 2.5 [vendor announcement]. cursor.com/blog/composer-2-5
- Cursor (Anysphere). (2026). Changelog [vendor documentation; confirms Composer 2.5 is the current model and that retired composer-2 slugs route to it]. cursor.com/changelog
- Artificial Analysis. (2026). Cursor’s Composer 2.5: third on the Coding Agent Index and ~10-60x lower cost than rivals [independent benchmark]. artificialanalysis.ai/articles/cursor-composer-2-5-coding-agent-index
- DataCamp. (2026). Composer 2.5: Benchmarks, Pricing, and How It Compares [secondary analysis]. datacamp.com/blog/composer-2-5
- Artificial Analysis. (2026). Claude Opus 4.8 (max): Intelligence, Performance & Price Analysis [independent model profile; standard list price $5.00/$25.00 per M tokens; released May 28, 2026]. artificialanalysis.ai/models/claude-opus-4-8