Capital & Compute
Tracker · Updated June 2026

AI Model Release Tracker

The latest and upcoming AI models in 2026, with release dates, per-token pricing, and a link to every provider's own page. Released models are verified against official pricing; upcoming ones appear only when a primary source confirms them. To turn a sticker rate into what a task actually costs, use the cost-per-task calculator or put two models head to head in the model comparison. For monthly plan prices, see the AI coding plan comparison.

What new AI models launched in 2026?

As of June 2026, the models tracked here include Anthropic's Claude Fable 5 (released June 9, 2026), Claude Opus 4.8, Sonnet 4.6 and Haiku 4.5; OpenAI's GPT-5.5; Google's Gemini 3.1 Pro, 3.5 Flash and 3 Flash; and the leading China-built coding models DeepSeek V4, Alibaba's Qwen3.7 Max, Moonshot's Kimi K2.7 Code and Zhipu's GLM-5.2. Fable 5 is the most recent flagship. Each row links to the provider's official pricing page and is dated to when it was last verified.

Released and available

Models you can call today, newest first. Rates are USD per million tokens (Mtok), standard non-batch pricing, each verified against the provider's official page.

| Provider · Model | Status | Released / expected | Input / output (per Mtok) | Source |
| --- | --- | --- | --- | --- |
| Anthropic · Claude Fable 5 | Released | Jun 2026 | $10 / $50 | Source ↗ |
| Cohere · North Mini Code | Released | Jun 2026 | Not yet priced | Source ↗ |
| Anthropic · Claude Opus 4.8 | Released | | $5 / $25 | Source ↗ |
| Anthropic · Claude Sonnet 4.6 | Released | | $3 / $15 | Source ↗ |
| Anthropic · Claude Haiku 4.5 | Released | | $1 / $5 | Source ↗ |
| OpenAI · GPT-5.5 | Released | | $5 / $30 | Source ↗ |
| Google · Gemini 3.1 Pro | Released | | $2 / $12 | Source ↗ |
| Google · Gemini 3.5 Flash | Released | | $1.5 / $9 | Source ↗ |
| Google · Gemini 3 Flash | Released | | $0.5 / $3 | Source ↗ |
| DeepSeek (China) · DeepSeek V4 | Released | | $0.435 / $0.87 | Source ↗ |
| Alibaba (China) · Qwen3.7 Max | Released | | $2.5 / $7.5 | Source ↗ |
| Moonshot (China) · Kimi K2.7 Code | Released | | $0.95 / $4 | Source ↗ |
| Zhipu (China) · GLM-5.2 | Released | | $1.4 / $4.4 | Source ↗ |

The rate card is only the starting point. Model any of these in the cost-per-task calculator to see what a real coding task costs, and why the cheapest per-token model is often not the cheapest to finish the job.

Upcoming AI models

The next wave is already taking shape. Every model below is either officially announced or surfaced by credible reporting, and each links to its source. The status badge says which is which, and no release date appears unless a provider has stated one. The day a model ships, its confirmed pricing and date move up to the table above.

| Provider · Model | Status | Released / expected | Source |
| --- | --- | --- | --- |
| Google · Gemini 3.5 Pro | Announced | Coming soon | Source ↗ |
| OpenAI · GPT-5.6 | Rumored | Expected June 2026 (unconfirmed) | Source ↗ |

How this tracker stays honest

Every released model's per-token rate is read off the provider's own API pricing page, stored with that exact source URL, and dated to the day it was checked. A model reaches the upcoming table only when the provider has officially announced it or credible reporting has surfaced it, and that entry carries the reporting source. No release date is published unless a primary source states it, and no rumor appears without a citation. When a number cannot be confirmed against an official page, it is left off rather than guessed.
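One way to picture the rule above is as a record that cannot exist without its provenance. The sketch below is illustrative only and is not the tracker's actual code: the `RateEntry` class, its field names, the `https://example.com/...` URLs, and the day-of-month in the check date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RateEntry:
    """Hypothetical record: a rate plus the source and date that back it."""
    model: str
    input_per_mtok: Optional[float]   # USD per Mtok; None if not confirmed
    output_per_mtok: Optional[float]  # USD per Mtok; None if not confirmed
    source_url: str                   # where the figure was read
    checked_on: date                  # day the figure was checked

    def __post_init__(self) -> None:
        # A rate with no source is rejected instead of being stored.
        if not self.source_url.startswith("https://"):
            raise ValueError(f"{self.model}: a source URL is required")
        # Input and output rates are confirmed together or not at all.
        if (self.input_per_mtok is None) != (self.output_per_mtok is None):
            raise ValueError(f"{self.model}: rates must both be set or both unset")

    def display_rate(self) -> str:
        # An unconfirmed number is left off, never guessed.
        if self.input_per_mtok is None:
            return "Not yet priced"
        return f"${self.input_per_mtok:g} / ${self.output_per_mtok:g}"


# Placeholder URLs and check date, for illustration only.
opus = RateEntry("Claude Opus 4.8", 5.0, 25.0,
                 "https://example.com/pricing", date(2026, 6, 15))
unpriced = RateEntry("North Mini Code", None, None,
                     "https://example.com/model-page", date(2026, 6, 15))

print(opus.display_rate())      # $5 / $25
print(unpriced.display_rate())  # Not yet priced
```

Keeping the missing rate as `None`, not as zero or an estimate, is what lets the table show "Not yet priced" without inventing a figure.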

Released, preview, announced, rumored: what the labels mean

  • Released: generally available, callable today, with a verified rate card.
  • Preview: available to use but still rolling out or limited, so the rate may move.
  • Announced: the provider has officially confirmed the model and, usually, a window, but it is not yet callable.
  • Rumored: surfaced by credible reporting rather than the provider, with no confirmed date. Treated as a lead, attributed to its source, never as fact.
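The four labels can be read as a small decision procedure. The following is a simplified sketch, not the tracker's real logic: the `Status` enum, the `classify` function, and its three boolean inputs are hypothetical names, and the rate-card condition attached to "Released" is left out for brevity.

```python
from enum import Enum


class Status(Enum):
    RELEASED = "Released"
    PREVIEW = "Preview"
    ANNOUNCED = "Announced"
    RUMORED = "Rumored"


def classify(callable_today: bool,
             generally_available: bool,
             provider_confirmed: bool) -> Status:
    """Map three yes/no facts about a model to one of the four labels."""
    if callable_today:
        # Usable now: the only question is whether the rollout is complete.
        return Status.RELEASED if generally_available else Status.PREVIEW
    if provider_confirmed:
        # Officially confirmed by the provider, but not yet callable.
        return Status.ANNOUNCED
    # Surfaced by reporting only: a lead, never a fact.
    return Status.RUMORED


print(classify(True, True, True).value)     # Released
print(classify(False, False, True).value)   # Announced
print(classify(False, False, False).value)  # Rumored
```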

The sticker rate is not the bill

A low per-token price does not make a model cheap to run. What a coding task costs is set by how many tokens the agent burns reading context, reasoning, and generating output, and that swings by more than an order of magnitude between models and runs. A 2026 Microsoft Research preprint found the cheaper-listed model finished the same work at a higher cost in roughly a third of matchups, the price reversal phenomenon. The honest unit is cost to finish a representative slice of your own tasks: the cost-per-task calculator models it for every row in this tracker, and the Claude Code cost breakdown walks the math end to end.
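The arithmetic behind a price reversal is short enough to sketch. In the example below, the two rate pairs ($1 / $5 and $3 / $15) come from rows in the released table, but the token counts are invented purely for illustration and do not describe any measured run. The `Rate` class and `task_cost` function are hypothetical names, not part of the calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per million tokens (Mtok)."""
    input_per_mtok: float
    output_per_mtok: float


def task_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task: tokens used times the per-Mtok rate."""
    return (input_tokens * rate.input_per_mtok
            + output_tokens * rate.output_per_mtok) / 1_000_000


cheaper_listed = Rate(1.0, 5.0)    # $1 / $5 per Mtok
pricier_listed = Rate(3.0, 15.0)   # $3 / $15 per Mtok

# Hypothetical token counts: the cheaper-listed model rereads context
# and retries more, so it consumes several times the tokens.
cheap_total = task_cost(cheaper_listed, input_tokens=4_000_000, output_tokens=300_000)
pricey_total = task_cost(pricier_listed, input_tokens=900_000, output_tokens=60_000)

print(f"cheaper-listed model: ${cheap_total:.2f}")   # $5.50
print(f"pricier-listed model: ${pricey_total:.2f}")  # $3.60
```

Under these assumed counts the model with one third the sticker rate costs about 50% more to finish the task, which is why token consumption has to be measured on your own workload and cannot be inferred from the rate card.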

Frequently asked questions

What AI models are coming next?

As of June 2026 the tracker is watching two: Google's Gemini 3.5 Pro, which Google has announced and lists as "coming soon", and OpenAI's GPT-5.6, which OpenAI has not officially announced but which reporting and prediction markets widely expect in June 2026. Each entry is attributed to its source. The tracker never publishes a release date a provider has not stated, and never lists a rumor without a citation.

How much do new AI models cost to use?

Per-token API rates span more than an order of magnitude, from DeepSeek V4 at roughly $0.44 per million input tokens to Claude Fable 5 at $10. But the sticker rate is not the bill: what a task actually costs is set by how many tokens an agent burns, and the cheapest per-token model is often not the cheapest to finish the work. Model any row in the cost-per-task calculator to see the real figure.
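As a quick check on the "more than an order of magnitude" claim, the spread can be computed directly from the input rates listed in the released table. This is a minimal sketch; the dictionary simply transcribes those rates, and the variable names are arbitrary.

```python
# Input rates in USD per million tokens, transcribed from the released table.
input_rates = {
    "Claude Fable 5": 10.0,
    "Claude Opus 4.8": 5.0,
    "Claude Sonnet 4.6": 3.0,
    "Claude Haiku 4.5": 1.0,
    "GPT-5.5": 5.0,
    "Gemini 3.1 Pro": 2.0,
    "Gemini 3.5 Flash": 1.5,
    "Gemini 3 Flash": 0.5,
    "DeepSeek V4": 0.435,
    "Qwen3.7 Max": 2.5,
    "Kimi K2.7 Code": 0.95,
    "GLM-5.2": 1.4,
}

cheapest = min(input_rates, key=input_rates.get)
priciest = max(input_rates, key=input_rates.get)
spread = input_rates[priciest] / input_rates[cheapest]

print(f"{cheapest} -> {priciest}: about {spread:.0f}x")
# DeepSeek V4 -> Claude Fable 5: about 23x
```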

Where does this AI model data come from?

Every per-token rate is read off the provider's own official API pricing page, stored with that source URL, and stamped with the date it was checked. Release and announcement dates are recorded only when a primary source states them. Nothing here is taken from memory or an unsourced tracker.

Sources

Per-token rates for the released models are grounded to each provider's official API pricing page, verified June 2026.
