Gemini 3.5 Flash appears in the Google Cloud Console quota interface at $1.5 per million input tokens and $9 per million output tokens while posting a leading 47.1% on the APEX-Agents-AA leaderboard.
It records 76.2% on Terminal-Bench 2.1 and, per Google, runs 4x faster than other frontier models.
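To make the per-token prices above concrete, here is a minimal sketch of the arithmetic in Python. The two prices are the ones quoted above; the helper name `estimate_cost` and the example token counts are illustrative assumptions, and real billing may include items (caching, thinking tokens, pricing tiers) that this sketch ignores.

```python
from decimal import Decimal

# Prices quoted above, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = Decimal("1.5")
OUTPUT_PRICE_PER_MILLION = Decimal("9")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated dollar cost of one request (hypothetical helper)."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Illustrative token counts, not measurements: 200k tokens in, 50k tokens out.
print(estimate_cost(200_000, 50_000))
```

At these prices an output token costs six times as much as an input token, so output-heavy agentic runs are dominated by the second term.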
1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action.
We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows.
Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models.
Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale.
Some highlights we’re excited about 🔽

2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency.
It really helps you distill papers down to their essence and aid your understanding!
3/ Gemini 3.5 Flash is rolling out globally today. On behalf of the entire Gemini team, we're excited by what you'll be able to do with this model!
Read more here: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
My colleague @LeslieNooteboom generated these: we don't have it packaged up as a one-click link, but here's the prompt passed along by Leslie (thanks, Leslie!). Plug in the abstract of a paper at the bottom where it says '$abstract':
--- ❝You are a world-class creative developer. Build a beautiful, high-resolution, highly visual concept animation. MUST follow: * Output ONLY valid, fully self-contained HTML/CSS/JS code. Start with <!DOCTYPE html> and end with </html>. Do not include any markdown fences in the JSON property. * Self-contained, elegant LIGHT THEME aesthetic matching a clean technical paper (e.g., white or very light gray background, dark crisp typography, minimal harmonious colors). Avoid dark backgrounds. * Focus on ONE strong visual metaphor with graphical animations or elegant interactions. * MINIMIZE EXPLANATORY TEXT: Do not add a title or snippet of the abstract to this animation code, the full abstract is already displayed next to this animation. Let the visual movement and graphic structure explain the concept. Avoid generating heavy text paragraphs or excessive text boxes. Keep text to a few minimal labels or neat status badges. * Keep the visual script and logic extremely concise, under 200 lines of code. Do not build a complex engine or import massive libraries. * HIGH-RESOLUTION CANVAS: Always configure the HTML5 canvas for high-DPI/Retina screens by multiplying canvas.width/height by window.devicePixelRatio, setting its CSS style to the logical width/height, and scaling the context with ctx.scale(dpr, dpr). This avoids any blurry/low-resolution drawings. * RESPONSIVE SCALING & VIEWPORT: Design the visualization to be fully responsive, filling 100% of the viewport width and height (using 100vw/100vh with margin 0 and overflow hidden). Implement a window resize listener to update the canvas buffer dimensions dynamically when the viewport changes, ensuring no scrollbars or visual clipping. 
* HYBRID HTML-ON-CANVAS & OVERLAY COLLISION PREVENTION: Draw high-performance background graphics (particles, nodes, flows) on the high-DPI canvas, but overlay crisp, high-resolution HTML/CSS divs/labels/buttons on top of the canvas (using absolute positioning) for gorgeous typography and control panels. To prevent absolute overlay panels from hiding, blocking, or overlapping the visual animation components on standard or narrow screens:
1. **ULTRA-COMPACT FOOTPRINT**: All overlay cards must be extremely compact. Set a strict 'max-width' of **no more than 240px** (or 25% of viewport width). Avoid long text paragraphs, heavy padding (use 'p-2.5' or 'p-3'), and large stack buttons.
2. **COMPACT CONTROLS**: If offering option buttons, style them as small inline segmented pills or a minimal dropdown select box rather than a stack of wide, fat buttons. Keep text sizing small ('text-[10px]' or 'text-[11px]').
3. **TRANSPARENCY**: Use highly semi-transparent, elegant backgrounds (e.g., white with high transparency: 'rgba(255, 255, 255, 0.72)' and a backdrop blur 'backdrop-filter: blur(10px)') for all overlay cards so the underlying animation flows remain beautifully visible behind them.
4. **LAYOUT SAFETY BOUNDARIES**: Offset the center of the canvas drawings (like circles, nodes, or waves) horizontally or vertically (e.g., centering them in the remaining 75% clear space of the canvas) so they are never drawn directly underneath the control card. Scale down the radius/bounds dynamically if the viewport width contracts.
Ensure all script tags, function braces, and HTML elements are completely and properly closed. No placeholders, no labels like // ... (insert here).
Generate the concept animation based strictly on the following research paper abstract: "${abstract}" ❞
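For anyone scripting this rather than pasting by hand, the `${abstract}` placeholder happens to match the syntax of Python's standard `string.Template`. The sketch below is one possible way to fill it in, not how the original demo was built: the template is a shortened stand-in containing only the final line of the prompt, `build_prompt` is a hypothetical helper, and the sample abstract is a made-up placeholder.

```python
from string import Template

# Shortened stand-in for the full prompt above; only its final line is reproduced.
PROMPT_TEMPLATE = Template(
    "Generate the concept animation based strictly on the following "
    'research paper abstract: "${abstract}"'
)


def build_prompt(abstract: str) -> str:
    """Fill the ${abstract} placeholder (hypothetical helper)."""
    # substitute() raises KeyError if a placeholder is left unfilled.
    return PROMPT_TEMPLATE.substitute(abstract=abstract.strip())


# Placeholder abstract, for illustration only.
prompt = build_prompt("  We study sparse attention for long documents.  ")
print(prompt)
```

If the full prompt text were used as the template, any literal dollar sign in it would need to be written as `$$`, since `string.Template` treats `$` as the start of a placeholder.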
@JeffDean where can we try this? is there a site where you just put a paper name and get this kind of model card? would love to test it properly 🙏
Highly capable models that are fast are super important. Our new Gemini 3.5 Flash model is a great mix of fast and capable.
Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.
Gemini 3.5 Flash is amazing!
- Performs better than 3.1 Pro on coding & agentic tasks
- 4x faster than other frontier models
- 12x faster in @antigravity
- 800 tokens/sec!
- Often at less than half the cost
And Pro to come…
Try it in @antigravity, @GeminiApp & more - enjoy!
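As a rough sense of scale for the 800 tokens/sec figure, the sketch below converts a response length into decode time. It assumes the quoted rate is sustained output throughput and ignores time-to-first-token and network latency; the function name and the 20,000-token example are illustrative, not measurements.

```python
# Throughput figure quoted in the post above (output tokens per second).
TOKENS_PER_SECOND = 800


def generation_seconds(output_tokens: int, tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Estimate decode time only (hypothetical helper)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# A hypothetical 20,000-token response at the quoted rate.
print(generation_seconds(20_000))  # 25.0
```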

More info on the Gemini 3.5 Flash model here: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
1/ Today at Google I/O, we’re launching Gemini 3.5 Flash ⚡️⚡️⚡️!
Our mission was clear: bring frontier-level intelligence with unprecedented speed.
3.5 Flash delivers drastic intelligence (beating 3.1 Pro on almost every benchmark), at Flash speeds. 🧵

2/ What excites me most is 3.5 Flash’s breakthrough performance on complex, multi-step agentic tasks: it excels on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%).
3.5 Flash is built to be the ultra-fast reasoning engine powering your AI agents. The better the model, the better the agent you build on top of it.

4/ Gemini 3.5 Flash is available today globally - we can't wait to see the incredible agents and apps you all build with it!
Check out our blog for more. https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
3/ If you push Gemini 3.5 Flash with complex agents and tasks, it will shine brighter. Here, it generates fully tactile, interactive HTML/SVG hardware simulations, complete with bump-mapped metals, spring physics, and procedural audio, in a single shot. This is what happens when you connect reasoning across the full stack simultaneously at extremely low latency. 💡🔉
excited by gemini 3.5 flash but also sad that it's "pushing the frontier of ... cost" not cost effectiveness 😭
Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!
@eliebakouch Best twitter account.
correlation between CursorBench and Artificial Analysis reported scores: benchmarks like IFBench or tau2 show ~0 correlation with CursorBench. opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, while gpt 5.5 shows the opposite pattern
@eliebakouch I think multibenchmarks are less interesting in a post training dominated world. It was very cool when it was pretraining only and just got better across the board.
@eliebakouch I don’t know what Google does. But my read is that train 100 experts then OPD is exactly how you get a model good at 100 areas.
oh interesting, i don't have a strong opinion here but if you look at flash 3.0 (or even pro) -> flash 3.5 you get improvements on benchmarks across the board (not sure if it's the same base etc. tho, hard to compare). i'd say a recipe like train experts on RL then OPD kinda works to improve multiple domains at the same time?
@JeffDean @quocleix Great results! Congratulations
@demishassabis @antigravity @GeminiApp Are you aware that running the AA intelligence index costs almost 2x as much with 3.5 Flash as it did with 3.1 Pro?

@OfficialLoganK Uhm you don't increment the minor version number if you think it's a new era; doesn't add up.
Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way). The model is the product, please keep the feedback coming!
ouch
I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5. 3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell
it was a third of the reason i came to gdm back in 2023 (the other two being google hardware and search). no other player had a serious cloud business and frontier lab in one, and cutting down cost/boosting speed while preserving ~90% capability was obviously critical for any interesting dev work outside of the labs.
at some point i realized "frontier capabilities" actually didn't matter for companies with real businesses (eg meta, apple, etc). the people who really need the hype to play out are the ones whose entire market value counts on this one specific thing incrementally improving + future projected light cone of value captured. plenty of reasons to hate on apple, but in retrospect them sitting out this race might not have been the worst idea.
let's see if G sees value in course correcting on its priorities. if not... guess it's all on open-source to fill these shoes.
@suchenzang Breaks my heart. Flash was one of my favorite model lines. I have a dozen videos talking about how much I love it. I’ve yet to find a use case where price to perf on 3.5 makes sense. I’m trying, I’m just not seeing it (and nobody else has examples either)
Knowledge cutoff on this is very confusing. Is this a bug? Does Flash not know that vibecoding is a thing now? Does it not know about claude code!?
Gemini 3.5 Flash now live in aistudio
I miss the old flashes too, I didn’t make it to its retirement party, it flashed by - work of love dedication to the pursuit of algorithmic efficiency.
I think 3.5 is fine just not good enough to be a code model.
I have to mute the word Flash; it's quite triggering to see the fall from grace of what was once the king of efficient models.
Flash was probably the reason Haiku 3.5 was displaced; now that niche is fully relinquished to Haiku.
Half the current timeline is developers showing all the problems with it, and the other half is people selling it as the best agentic model yet. I wonder if they are reading any of the feedback.
I am actually quite confused.
What went wrong here: Is all 2025 + 2026 data all slop and compute inefficient? Why would you train a model in 2026 that misses an entire year+ of data and take it to market?
@scaling01 @PMinervini Have you tried it on Antigravity? It's a slight improvement on tool calls, but not a daily model to use for coding.
meh doesn't even beat Kimi or GLM
@scaling01 @PMinervini @eliebakouch @vincentweisser it would be fun for you guys to use this against claude and codex in auto research loop and see if it has good tastes.
@scaling01 @PMinervini @eliebakouch @vincentweisser In some sense, both claude and codex used human ingenuity and put it together in clever ways. While models lack taste in research, with the right prompting they can be driven to really amazing outcomes. This itself can be an eval if you run it and compare outcomes.
@zephyr_z9 I am curious why you say this. MRCR? This is a good guess/deduction.
Clearly has very low active parameters but a lot more total parameters
Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it.

@tunguz @mercor_ai we are deeply focused on real world use cases for Gemini, and it's also exciting to see so many benchmarks get better at capturing these use cases
@OfficialLoganK @mercor_ai Benchmarks were never the issue for Gemini models. They’ve consistently struggled with vibes though.

Try it in the Gemini API, Google AI Studio, Antigravity, AI Mode, Gemini App, and wherever else you use Gemini!
@demishassabis @rseroter @antigravity @GeminiApp seeing us continue to pack the intelligence of pro (and more) into a flash model year after year is still one of the most impressive things to witness
@theo Will be curious to see how it stacks up in real world use cases, not sure how representative AA index scores are given there’s a lot of coverage of purely academic benchmarks. It’s also much faster than 3.1 Pro.
@OfficialLoganK I think 3.5 Flash was not marketed in an honest way. In real world use, it's more expensive than 3.1 pro and not much better. Combined with the sunsetting of Gemini CLI (I liked where it was going) + the Railway stuff and I've lost a lot of faith.
Massive updates to @GoogleAIStudio and the Gemini API 🤯
- Gemini 3.5 Flash!
- managed agents so you can easily build agentic products with the antigravity harness
- native Android app creation right in AI Studio
- native workspace integrations
- 1 click export to antigravity
And so much more coming soon, the agentic era of Gemini is in full swing and the team is shipping at full pace.
Can’t wait for you all to try this all out!!
The tension with training models over many generations is that it becomes impossible to keep the same characteristics in the way you mention.
Flash used to mean just fast and cheap, now it means intelligent, fast, and super strong value. The nuance of this is difficult, but we are going to continue pushing the frontier of intelligence while trying to pack as much value into a model as possible.
It’s also worth noting that 3.1 flash-lite is super strong!

Google Antigravity is expanding, including a new standalone desktop app that acts as a central home for agent interaction. We’re also introducing a new Antigravity CLI providing a fast, lightweight way to deploy new agents instantly without a graphical user interface, and a new Antigravity SDK, to give direct access to the same agent harness powering our own products to customize and host agents on your own infrastructure.
With 3.5 Flash in Antigravity, developers can now do so much more. The new Antigravity ecosystem is coming to developers today.
Gemini Spark is your personal AI agent in the @GeminiApp that gets things done on your behalf, under your direction. It runs 24/7 (and yes - you can close your laptop). It’s powered by Gemini 3.5 and built on the Google Antigravity harness so it can complete long horizon tasks.
Spark will integrate seamlessly with tools, starting with ours, and soon with 3P tools with MCP. You’ll also be able to work with it through email + chat. Available to trusted testers this week and next week in Beta to AI Ultra users in the US.

Gemini Omni is our new model that can create anything from any input - starting with video. It combines Gemini’s intelligence with our generative media models, for a new level of world understanding, multimodality, and editing.
Gemini Omni Flash is rolling out today to Google AI Plus, Pro and Ultra subscribers globally through the @Geminiapp and Google Flow, to @YouTube Shorts this week, and to our developer and enterprise APIs in the coming weeks.
As models get even better, the need for transparency grows. Last year @nvidia adopted our SynthID invisible watermark, and today we’re excited to announce @OpenAI, Kakao, and @ElevenLabs will join them.
We’re also going further by adding C2PA Content Credentials verification to @GeminiApp alongside SynthID detection and bringing both to Search and Chrome so you can easily check whether content was captured by a camera, or created/edited with gen AI tools.

Gemini 3.5 Flash is transforming what you can do in Google Search with new agentic capabilities. A few things we’re introducing:
A new intelligent AI-powered Search box, our biggest upgrade in 25 years — rolling out globally.
New information agents that work in the background 24/7 to find exactly what you need at the right moment, and help you take action (coming this summer).
And with the power of Google Antigravity, we’re unlocking new agentic coding capabilities, so Search can build custom interactive experiences – like visual simulations, dashboards or trackers – for your individual questions.
Read more: https://blog.google/products-and-platforms/products/search/search-io-2026
We’re having much more natural conversations with Gemini directly inside many of our products, so we’re bringing this to two more:
Ask YouTube is a new experience for searching for content on YouTube. It gives you information in an easy to navigate layout, with videos best matched to what you’re looking for, and jumps right to the part most relevant to your query.
With voice-powered Docs Live you can brain dump whatever is on your mind, and let Gemini do the rest. Rolling out this summer, and the same voice capability is coming to Gmail and Keep then too.
Read my remarks: https://blog.google/innovation-and-ai/sundar-pichai-io-2026/

Workhorse model! (and hope you're enjoying your first I/O)
Insane evals for a Flash model! Gemini 3.5 Flash is really good for its size!
... and it is *so* fast ⚡️⚡️⚡️
Also had some early access to Gemini 3.5 Flash. Very fast for a flash model and very capable, though not as powerful as a full frontier model.
I added it to the gallery of procedurally generated one-shot towns (it made one error that it corrected): https://hg-20f7d1a3ce.netlify.app/#gemini-3-5-flash

Checking out Gemini 3.5 Flash, available today, which helped power Antigravity 2.0, Gemini Spark, and many more! https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5
Pretty wild to be able to automatically train AlphaZero and serve a playable Go demo with just 2 prompts using Gemini 3.5 Flash and Antigravity 2.0 ...
Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵
My notes on Gemini 3.5 Flash - 3x the price of Gemini 3 Flash but Google are planning to use it for many of their own products https://simonwillison.net/2026/May/19/gemini-35-flash/
The metrics on 3.5 Flash are major:
- 4x faster than other frontier models in output tokens per second
- Outperforms our previous 3.1 Pro model on nearly all benchmarks
- Shows massive improvement on coding and agentic benchmarks like Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)

Live from #GoogleIO, we’re introducing Gemini 3.5 Flash, our latest model with frontier performance for agents and coding. We’re rolling it out globally today, delivering frontier-level performance for real-world agentic workflows at 4x the speed of other frontier models. 🧵

The combination of performance and speed makes this model ideal for long-horizon and agentic tasks. Our enterprise partners are already seeing real-world results and accelerating their daily workflows. Read more about Gemini 3.5 in our blog: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
@suchenzang At least we still have principles
ouch
Quite the flex, you love to see it
Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. https://cursor.com/evals
Gemini 3.5 Flash delivers sustained frontier-level performance at lightning-quick speeds:
- Beats 3.1 Pro on coding & agentic benchmarks: Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)
- 4x faster than other frontier models (12x in Antigravity!)
- SOTA on multimodality with 83.6% on MMMU-Pro

Gemini 3.5 Flash is rolling out globally for consumers in the Gemini app and Search AI Mode, for developers via the Gemini API, Google Antigravity, and Google AI Studio, and for businesses on the Gemini Enterprise Agent Platform. Read more about the Gemini 3.5 era: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
Following our Gemini 3.5 Flash launch at #GoogleIO, check out this demo of what it can do right inside the Gemini app.
Watch 3.5 Flash build an interactive circuit helper, outputting a step-by-step physical build guide alongside a working simulation. It’s a great example of how the Gemini app can help students learn visually, breaking down new or complex concepts and guiding them through hands-on subjects interactively.
3.5 Flash is available globally today!
@benhylak 2.5-pro-exp-0325
flash 2 was last great google model.
Gemini 3.5 Flash announced!

Probably true. I think the reality is that we’re going to have a de facto licensing regime (“voluntary”), where the government will give green lights to the labs on releases. That is fine as a very temporary solution, but it’s opaque and essentially lawless. We will need to institutionalize this stuff and create transparent, objective, and predictable protocols to structure it.
I support having private bodies, overseen by government, having a major role in evaluating and setting technical standards, internal governance practices within labs, and similar. But more important than my specific idea is that we operate according to the rule of law, rather than create a de facto, opaque licensing regime in the name of maintaining the illusion that pre-deployment review is “voluntary” and that we are Not Regulating AI.
I want to make this prediction now so I can quote it later. Gemini Pro 3.5 and GPT-5.6 are both ready now, and both labs want to release them, but they are being held back for safety testing in a test flight of the new regulations in the forthcoming executive order.
RL roughly on trend, multimodality on trend, strange to see them report mediocre MRCR and ARC-AGI-2. Given the speed, it might well have fewer active parameters than Flash-3 (so they both shrink the batch and grow margin). Will be a successful model until we get some 5.6-Mini.

Gemini 3.5 Flash Benchmarks
I'm wrong, thanks @yourboiilevi it's G3 Flash base, they just serve it faster interesting

Just call it Gemini 4 Pro at this point
Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way). The model is the product, please keep the feedback coming!
@zephyr_z9 same model as 3 flash
Clearly has very low active parameters but a lot more total parameters
Google should do a reverse-Anthropic and add Gemini 3.5-SLOW
3-Flash was fast enough, and its cost made it attractive
this thing is maybe better in some ways but that's not worth the 3x price hike
Thinking again about it
> we spent the last 2.5 years putting the infrastructure, products, team, etc in place
yeah makes sense they have NOT pretrained a new model, or even refreshed the data. (There has been a bit of an update, eg it knows me) Google is very… longtermist.
“On what do I base the belief that the 2026 results were a "hypothetical projection"?
I base this on the fundamental structure of my data.” fair enough
I think the verdict is in, Gemini didn't have any post training breakthrough, except maybe through the floor. Outside of vision, massive disappointment. fucking V4-Flash gets stuff DONE faster. Then again I almost never used 3-Flash I'll likely almost never use this thing too

Wait, what?????? What kind of post training breakthrough did they make?? So the price increase is mostly due to smaller batch size to make it run faster
@zephyr_z9 well, if V4-Flash is possible then this is also possible as far as we know G3-Flash has like 1.2T params, no? plenty of juice left if you're Google
I think one neglected area in model evals is case studies of LLM-Hard questions. Like, here we see that literally nothing can crack #10 and #12 ArXivMath in a few shots. (somehow #6 yields to… Qwen-2B). If we aren't just training on test, CoTs of such problems deserve scrutiny.

Meh On MathArena, Gemini 3.5 Flash is neither bad nor great. It is very fast though: I ran 1000 queries in 30 minutes.
now admittedly it got 4 tries vs 3-4 for others, but still, lmao on apex-shortlist we see that top models struggle with #18 but those below them do not. Might it just be a ground truth failure? @j_dekoninck
IMO #6 was a famous one. Recently got cracked with GPT 5.5 Pro. But that's not very interesting. A year later OpenAI's best can do this hard thing everyone was aware of. Duh. Tells us little.

The recent OpenDeepThink from @wenhaocha1 et al (which I kinda reproduced) boasts +400 Elo on CF, but they also say this: "Of the seventeen unsolved problems across the Flash and 2.5 Pro runs, none crosses 5% pass@1 in any generation". I do not think models of this era literally do not have the "competence" for solving any particular programming task, it's all compositional. So I am generally much more intrigued by techniques that can break through this barrier than by amplifying pass@k by changing how we fiddle with K and partition it into l, m, n. Likewise for methods like PaCoRe from @StepFun_ai or the new MCTS from @ZyphraAI.

How do we get unsolvable things solved by trading compute for intelligence rather than "performance"? Ultimately that's the whole promise of this journey to AGI via scaling, isn't it. Is there a way that doesn't just rely on iterative training of models on synthetic data? If that were all, we're at risk of having to do an exponentially costly search for data recipes that do not exceed the inherent capability of models and thus lead to narrow-generalizing memorization of patterns and more false promises. Yes, we can evidently stack these chairs to a dizzying height if money is no issue, but could we at least evolve to processing them into plywood already?

Might be a reason GDM is so calm in the face of two "startups"; why Gemini is half-assing this main vector of market competition that is agentic SWE. Demis suspects that AGI has to be done the hard way, from the ground up. Raw bytes, universal predictors, world models; removing layers of human-digested slop between downstream outputs and bare metal as your stockpile of metal grows. The "synthetic data" stockpile might prove to be fairy gold if he's right.
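For readers less familiar with the pass@1 and pass@k figures discussed above, here is a minimal sketch of the commonly used unbiased pass@k estimator (draw n samples, count c correct, estimate the chance that at least one of k draws is correct). This is the standard textbook form and not necessarily how OpenDeepThink or any benchmark named in the post computes its numbers; the function name is made up for illustration.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c are correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    # 1 minus the probability that a random size-k subset is all incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)

# A problem that is never solved stays at 0 no matter how large k gets,
# which is the barrier the post is pointing at.
print(pass_at_k(n=100, c=0, k=50))
```

Under this estimator, raising k only helps on problems where at least one sample already succeeds; it cannot move a problem whose c is 0.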
@suchenzang @theo I wonder how 3.5 Pro fares while I have no love for 3.1 Pro it was a powerful model for its time
it was a third of the reason i came to gdm back in 2023 (the other two being google hardware and search). no other player had a serious cloud business and frontier lab in one, and cutting down cost/boosting speed while preserving ~90% capability was obviously critical for any interesting dev work outside of the labs. at some point i realized "frontier capabilities" actually didn't matter for companies with real businesses (eg meta, apple, etc). the people who really need the hype to play out are the ones whose entire market value counts on this one specific thing incrementally improving + future projected light cone of value captured. plenty of reasons to hate on apple, but in retrospect them sitting out this race might not have been the worst idea. let's see if G sees value in course correcting on its priorities. if not... guess it's all on open-source to fill these shoes.
@VictorTaelin available on openrouter
Narrator: they already fucked up
→ Gemini 3.5 Flash not available on API.
→ Fast mode locked to Antigravity only.
I don’t understand why companies keep doing this. They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money? Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use. It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is. And even those who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money. Your model is the product. You keep chasing old business models. Completely out of touch. Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!
~~~
Reposting this. I deleted before because they launched Antigravity CLI, which isn't an API, but at least gives us *some* flexibility. But no: I'm getting 10x slower TPS on CLI compared to the IDE. So either a bug or they really want you to use the visual IDE. So my money will keep going to Anthropic, unfortunately. 🤦‍♂️
Kind of a big deal
Meet Gemini 3.5 Flash — our strongest agentic and coding model yet. It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost. Generally available, starting today. 🧵 #GoogleIO
Holy shit man
Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.
nevermind
Everything AI released at Google I/O 2026
- Gemini Omni Flash
- Gemini 3.5 Flash (and in GA)
- Antigravity 2.0
- Managed Agents in the Gemini API
- AI Studio app in pre-order
- New SynthID partnerships
- AI Studio: native Android support, Workspace Integrations, and export to AGY
- Antigravity SDK and CLI
- Gemini Spark
- New Google AI Ultra subscription
And stay tuned, so much more to come!
@teortaxesTex That's for Q4 against GPT-6 and Claude 5.
@scaling01 I hope so.
Gemini 3.5 Pro is the Gemini Ultra we always wanted
Proud to have worked on recreating Alphazero. The future is super super exciting 🔥!
Gemini 3.5 Flash is out, and it's a major jump over Gemini 3 Flash in model capability for knowledge work. We've been evaluating it on our Box AI Complex Work Eval in early release, and the model delivers a 12 percentage point jump on complex document tasks.
For testing this model, we give the Box AI Agent (using Gemini 3.5) complex problems to solve that represent common but difficult knowledge worker tasks in banking, consulting, public sector, healthcare, and other industries. These tasks can be things like drafting reports, doing due diligence, and more, given a set of relevant documents.
In our tests, Gemini 3.5 Flash delivered jumps across every industry, including:
* Financial services: 81% vs 73% (+8pp)
* Public sector: 76% vs 59% (+17pp)
* Healthcare: 73% vs 51% (+22pp)
* Life Sciences: 67% vs 47% (+20pp)
Incredible to see the continued performance gains.
Gemini 3.5 Flash will be available soon in Box AI Studio and through the Box API. The Box MCP Server will soon be available in the Gemini app with more details to come.
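The per-industry Box figures above can be re-derived with a few lines of arithmetic. A minimal sketch, assuming the second number in each pair is the Gemini 3 Flash baseline (the post implies this but does not label the columns); the variable and function names are illustrative only.

```python
# (new score %, baseline score %) per industry, copied from the post above.
box_scores = {
    "Financial services": (81, 73),
    "Public sector": (76, 59),
    "Healthcare": (73, 51),
    "Life Sciences": (67, 47),
}

def deltas(scores):
    """Return {industry: (percentage-point gain, relative gain in %)}."""
    return {
        name: (new - old, round(100 * (new - old) / old, 1))
        for name, (new, old) in scores.items()
    }

gains = deltas(box_scores)
for name, (pp, rel) in gains.items():
    print(f"{name}: +{pp}pp ({rel}% relative)")
```

Percentage points and relative gains answer different questions: a +22pp jump from a 51% baseline is a much larger relative change than +8pp from 73%.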

@OfficialLoganK @mercor_ai Benchmarks were never the issue for Gemini models. They’ve consistently struggled with vibes though.
Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it.
oof
Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.
Looking forward to gemini-cli becoming usable
@JeffDean where can we try this? is there a site where you just put a paper name and get this kind of model card? would love to test it properly 🙏
2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency. It really helps you distill papers down to their essence and aid your understanding!
@JeffDean @LeslieNooteboom nice thanks a lot, i'll try this!! :)
My colleague @LeslieNooteboom generated these: we don't have it packaged up as a one-click link, but here's the prompt passed along by Leslie (thanks, Leslie!). Plug in the abstract of a paper at the bottom where it says '$abstract':
---
❝You are a world-class creative developer. Build a beautiful, high-resolution, highly visual concept animation. MUST follow:
* Output ONLY valid, fully self-contained HTML/CSS/JS code. Start with <!DOCTYPE html> and end with </html>. Do not include any markdown fences in the JSON property.
* Self-contained, elegant LIGHT THEME aesthetic matching a clean technical paper (e.g., white or very light gray background, dark crisp typography, minimal harmonious colors). Avoid dark backgrounds.
* Focus on ONE strong visual metaphor with graphical animations or elegant interactions.
* MINIMIZE EXPLANATORY TEXT: Do not add a title or snippet of the abstract to this animation code, the full abstract is already displayed next to this animation. Let the visual movement and graphic structure explain the concept. Avoid generating heavy text paragraphs or excessive text boxes. Keep text to a few minimal labels or neat status badges.
* Keep the visual script and logic extremely concise, under 200 lines of code. Do not build a complex engine or import massive libraries.
* HIGH-RESOLUTION CANVAS: Always configure the HTML5 canvas for high-DPI/Retina screens by multiplying canvas.width/height by window.devicePixelRatio, setting its CSS style to the logical width/height, and scaling the context with ctx.scale(dpr, dpr). This avoids any blurry/low-resolution drawings.
* RESPONSIVE SCALING & VIEWPORT: Design the visualization to be fully responsive, filling 100% of the viewport width and height (using 100vw/100vh with margin 0 and overflow hidden). Implement a window resize listener to update the canvas buffer dimensions dynamically when the viewport changes, ensuring no scrollbars or visual clipping.
* HYBRID HTML-ON-CANVAS & OVERLAY COLLISION PREVENTION: Draw high-performance background graphics (particles, nodes, flows) on the high-DPI canvas, but overlay crisp, high-resolution HTML/CSS divs/labels/buttons on top of the canvas (using absolute positioning) for gorgeous typography and control panels. To prevent absolute overlay panels from hiding, blocking, or overlapping the visual animation components on standard or narrow screens:
1. **ULTRA-COMPACT FOOTPRINT**: All overlay cards must be extremely compact. Set a strict 'max-width' of **no more than 240px** (or 25% of viewport width). Avoid long text paragraphs, heavy padding (use 'p-2.5' or 'p-3'), and large stack buttons.
2. **COMPACT CONTROLS**: If offering option buttons, style them as small inline segmented pills or a minimal dropdown select box rather than a stack of wide, fat buttons. Keep text sizing small ('text-[10px]' or 'text-[11px]').
3. **TRANSPARENCY**: Use highly semi-transparent, elegant backgrounds (e.g., white with high transparency: 'rgba(255, 255, 255, 0.72)' and a backdrop blur 'backdrop-filter: blur(10px)') for all overlay cards so the underlying animation flows remain beautifully visible behind them.
4. **LAYOUT SAFETY BOUNDARIES**: Offset the center of the canvas drawings (like circles, nodes, or waves) horizontally or vertically (e.g., centering them in the remaining 75% clear space of the canvas) so they are never drawn directly underneath the control card. Scale down the radius/bounds dynamically if the viewport width contracts.

Ensure all script tags, function braces, and HTML elements are completely and properly closed. No placeholders, no labels like // ... (insert here). Generate the concept animation based strictly on the following research paper abstract: "${abstract}" ❞
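Since the prompt above ends with a `${abstract}` placeholder, one way to fill it in is Python's standard `string.Template`, which uses that same placeholder syntax. A minimal sketch: `PROMPT_TAIL` reproduces only the final sentence of the prompt as a stand-in for the full text, and `build_prompt` is a hypothetical helper name, not part of any Google tooling.

```python
from string import Template

# Stand-in for the full prompt quoted above; paste the complete text here.
# Only the closing sentence with its placeholder is reproduced.
PROMPT_TAIL = Template(
    "Generate the concept animation based strictly on the following "
    'research paper abstract: "${abstract}"'
)

def build_prompt(abstract: str) -> str:
    """Substitute a paper abstract into the prompt's ${abstract} slot."""
    # safe_substitute leaves any other "$..." text in the prompt untouched
    # instead of raising KeyError.
    return PROMPT_TAIL.safe_substitute(abstract=abstract.strip())

print(build_prompt("  We introduce a new method for ...  "))
```

The resulting string is what would be sent to the model; how it is sent (Gemini app, AI Studio, or an API client) is left open here, as in the thread.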
@JeffDean @LeslieNooteboom oh one additional question wry, is this generated with the model on https://gemini.google.com or a specific harness like in antigravity? (found for claude that it's much better inside claude code for this kind of stuff)
correlation between CursorBench and Artificial Analysis reported scores
benchmarks like IFBench or tau2 show ~0 correlation with CursorBench. opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, gpt 5.5 shows the opposite pattern

oh interesting, i don't have a strong opinion here but if you look at flash 3.0 (or even pro) -> flash 3.5 you get improvement on benchmarks across the board (not sure if it's the same base etc. tho, hard to compare). i'd say a recipe like train experts on RL then OPD kinda works to improve multiple domains at the same time?
@eliebakouch I think multibenchmarks are less interesting in a post training dominated world. It was very cool when it was pretraining only and just got better across the board.
@srush_nlp would be interesting to see what's the limitation of this if any, like by scaling the number of areas
@eliebakouch I don’t know what Google does. But my read is that train 100 experts then OPD is exactly how you get a model good at 100 areas.
@_arohan_ @scaling01 @PMinervini @vincentweisser 👀 could be fun indeed, will look into this
@scaling01 @PMinervini @eliebakouch @vincentweisser it would be fun for you guys to use this against claude and codex in auto research loop and see if it has good tastes.
@willccbb @benhylak
TPUs have always been low-key goated, people are finally starting to feel it now.
Today, we introduced Gemini 3.5 Flash ⚡ Our most capable coding and agentic model — where "fast" and "best" aren't a tradeoff. Try it now across Antigravity, AI Studio, Gemini App, and AI Mode.
Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!
I’m extremely proud of the team and this has been one of the most intense and most rewarding launches we have done! And we are not done yet and are busy cooking 3.5 Pro.
Here is a fun demo showcasing the model’s Web Development capabilities.
@JeffDean Any idea when Gemini 3.1 pro will drop the “-preview”?
Highly capable models that are fast are super important. Our new Gemini 3.5 Flash model is a great mix of fast and capable.
Gemini 3.5 Flash is now GA. Our most capable Flash model, built for agentic execution, coding, and long-horizon tasks.
- Outperforms Gemini 3.1 Pro on coding and agentic tasks
- 1M token context window with 65k max output tokens
- 4x faster output tokens/sec
- 4 thinking levels: minimal, low, medium (new default), high
- Thought preservation across multi-turn conversations automatically
Available today in @GoogleAIStudio, @Android Studio, @antigravity, Gemini Enterprise, the @GeminiApp, and AI Mode in Search.
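The limits listed in the announcement above can be turned into a small pre-flight check. A rough sketch under loud assumptions: every name below is invented for illustration and is not a Gemini API parameter, and the exact token counts behind "1M" and "65k" are guesses (the real caps may be powers of two).

```python
# Values taken from the announcement; exact token counts are assumptions.
THINKING_LEVELS = ("minimal", "low", "medium", "high")
DEFAULT_THINKING_LEVEL = "medium"   # "medium (new default)"
CONTEXT_WINDOW_TOKENS = 1_000_000   # "1M token context window"
MAX_OUTPUT_TOKENS = 65_000          # "65k max output tokens"

def check_request(prompt_tokens, max_output_tokens,
                  thinking_level=DEFAULT_THINKING_LEVEL):
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if thinking_level not in THINKING_LEVELS:
        problems.append(f"unknown thinking level: {thinking_level!r}")
    if prompt_tokens > CONTEXT_WINDOW_TOKENS:
        problems.append("prompt exceeds the 1M-token context window")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append("requested output exceeds the 65k-token cap")
    return problems

print(check_request(500_000, 8_000))           # fits
print(check_request(2_000_000, 100_000, "max"))  # three problems
```

For the real parameter names and limits, the Developer Guide linked in the thread is the authority.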

Developer Guide: https://ai.google.dev/gemini-api/docs/interactions/whats-new-gemini-3.5
Introducing Gemini 3.5: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
AI Studio: https://aistudio.google.com/prompts/new_chat?model=gemini-3.5-flash
@OfficialLoganK @GoogleDeepMind Good model
We wrote a new Developer Guide for Gemini 3.5 Models. The easiest way to migrate is to install the Interactions API Skill, then run:
```
/gemini-interactions-api migrate my app to Gemini 3.5 Flash
```
Skill available at: `npx skills add google-gemini/gemini-skills --skill gemini-interactions-api`
@osanseviero Poor Gemini for Science 🥲
it's Gemini 3.5 Flash day
but pricing is $1.5 / $9 per mtoks 💀

it now seems very likely to me that the new Gemini 3.5 Pro model will be a 10T+ model like Mythos
pricing of 3.5 Pro should be $6 / $27 if they keep the 4x scaling and the trend of undercutting OpenAI
some more Gemini 3.5 Flash benchmarks by Artificial Analysis AI:
- the APEX-Agents-AA score is excellent
- expected higher on CritPt
- reasoning efficiency could also be better, but it kind of depends on what setting they used. if it's the max then it's very good
- Price/Perf looks pretty bad vs GPT-5.5. You can get GPT-5.5-medium with better performance for less and with faster responses
Gemini 3.5 Flash scores kinda low on the Coding Index due to terrible TerminalBench-Hard scores
Gemini 3.5 Flash Benchmarks
interestingly it still has the Jan 2025 knowledge cut-off

Google optimized Gemini 3.5 Flash to run up to 12x faster than comparable models (~867 tokens/s) in Antigravity
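To get a feel for what those two numbers imply, here is a back-of-the-envelope sketch. It assumes the ~867 tokens/s figure and the "up to 12x" multiplier refer to the same comparison, which the post does not state outright; the 10,000-token response length is a made-up example.

```python
ANTIGRAVITY_TPS = 867   # tokens/s, as quoted
SPEEDUP = 12            # "up to 12x", as quoted

# Implied throughput of the "comparable models" under the assumption above.
baseline_tps = ANTIGRAVITY_TPS / SPEEDUP

def seconds_to_generate(tokens: int, tps: float) -> float:
    """Wall-clock seconds to stream `tokens` at `tps`, ignoring time-to-first-token."""
    return tokens / tps

tokens = 10_000  # hypothetical long agentic response
print(f"baseline: {baseline_tps:.2f} tok/s")                             # 72.25 tok/s
print(f"fast:     {seconds_to_generate(tokens, ANTIGRAVITY_TPS):.1f} s")  # 11.5 s
print(f"baseline: {seconds_to_generate(tokens, baseline_tps):.1f} s")     # 138.4 s
```

This ignores time-to-first-token and thinking tokens, both of which affect end-to-end latency in practice.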
Gemini 3.5 Flash comparable with Opus 4.7, GPT-5.5 and Gemini 3.1 Pro on the Artificial Analysis Index while running up to 4x faster
Gemini 3.5 Flash now live in aistudio

Gemini 3.5 Flash Benchmarks
bruh

Gemini 3.5 Flash ranking third on vals index

Gemini 3.5 Flash beats Gemini 3.1 Pro across TerminalBench 2.1, GDPval and MCP Atlas
GPT-5.5-medium has lower end-to-end latency, uses less tokens and is overall smarter and cheaper than Gemini 3.5 Flash
it might genuinely be over for anyone not named OpenAI or Anthropic

Gemini 3.5 Flash Pricing confirmed at $1.5 / $9 per mtoks
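For a sense of scale, the quoted list prices translate into per-request cost as follows. A minimal sketch: the prices come from the post above, while the request sizes are hypothetical and the function name is illustrative. Caching discounts, batch pricing, and long-context tiers, if any exist, are not modeled.

```python
# USD per million tokens, as quoted in the post above.
INPUT_PRICE_PER_MTOK = 1.50
OUTPUT_PRICE_PER_MTOK = 9.00

def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price: float = INPUT_PRICE_PER_MTOK,
                     output_price: float = OUTPUT_PRICE_PER_MTOK) -> float:
    """List-price cost of a single request in USD."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical agentic call: 200k tokens of context in, 8k tokens out.
cost = request_cost_usd(200_000, 8_000)
print(f"${cost:.3f}")  # $0.372
```

Because output tokens cost 6x as much as input tokens at these rates, a model that thinks longer or writes more can cost noticeably more per task even at the same list price.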

Google introduces Gemini 3.5 Flash

that's rough
Gemini 3.5 Flash barely improved over Gemini 3 Flash but is now 3.5x more expensive on WeirdML
the same pattern was visible on Artificial Analysis Index: - it's more expensive because it uses more output tokens and slightly more total tokens than Gemini 3 Flash
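The explanation above (higher price per token plus more tokens used) can be made concrete by splitting an observed cost ratio into its two factors. A rough sketch, assuming the "3x the price" figure quoted earlier in the thread applies uniformly to input and output tokens and that the benchmark's input/output mix is similar for both models; neither assumption is confirmed by the posts.

```python
PRICE_RATIO = 3.0          # "3x the price of Gemini 3 Flash", quoted earlier
OBSERVED_COST_RATIO = 3.5  # "3.5x more expensive on WeirdML", as quoted

def implied_token_ratio(cost_ratio: float, price_ratio: float) -> float:
    """Token-usage multiplier implied by cost_ratio = price_ratio * token_ratio."""
    return cost_ratio / price_ratio

ratio = implied_token_ratio(OBSERVED_COST_RATIO, PRICE_RATIO)
print(f"implied token usage: {ratio:.2f}x")  # about 1.17x
```

Under these assumptions most of the cost increase comes from the price change, with the remainder attributable to extra tokens.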

ahhhh google bros
GPT-5.5-medium has lower end-to-end latency, uses less tokens and is overall smarter than Gemini 3.5 Flash

meh
doesn't even beat Kimi or GLM
Gemini 3.5 Flash has landed #9 for Text and Code Arena: Frontend.
Code Arena: Frontend evaluates models on agentic frontend coding tasks from real users building apps and websites (HTML and React). Scoring 1507, this is a significant +70 point improvement over Gemini-3 Flash.
Sub-category highlights:
- #7 Content Creation Tools
- #8 Gaming
- #8 Consumer Product
- #9 Data & Analytics
- #10 Reference-Based Design
In Text Arena: #9 overall.
Gemini 3.5 Flash also moves the price–performance frontier as the new top Arena score in its price tier. Congrats to the @GoogleDeepMind team on this launch! Click into the thread to see the rankings by each arena.
what the actual fuck
Gemini 3.5 Flash is 7.46 times more EXPENSIVE than GPT-5.5-xhigh on PencilPuzzleBench
(direct ask scores are below gpt-5.2-high)

the agentic score by itself is fine
but the cost is not real
intelligence too cheap to meter

Gemini 3.5 is 30x more expensive than 1.5
(deliberately not hyping Gemini 3.5 Flash too much this time. looks like an insane model, but you know how it is with self-reported benchmarks)
i mean this is ridiculous

Google built an entire operating system with Gemini 3.5 Flash in 12 hours for less than $1000

Gemini 3.5 Flash on CursorBench

looks like gemini 3.5 is a flop. great opportunity to buy the dip and obtain some anthropic-waymo stock on sale
Strong performance across the board for Gemini, but holy crap GPT-5.5 is goated at long context WTF
What I think about every time I read MRCR 8-needle

Google Gemini 3.5 Flash is a super strong model for its class. Beats Gemini 3.1 Pro on so many benchmarks.
An agent model with 4x faster tokens per second.
And @aimlapi just added Gemini 3.5 Flash to their API and is keeping it FREE for 24hrs.
Setup instructions in comment.
Enjoy 24hrs of free Gemini 3.5 Flash access
1. set up an AI/ML API account https://aimlapi.com/app/auth/
2. Talk to them on their discord https://discord.gg/2g6xMRdu3j
Gemini 3.5 Flash now outruns Gemini 3.1 Pro on several real-work automation tests.
- With 4x faster output tokens per second
- A really powerful agent model fast enough and cheap enough for everyday work
- Flash beats Gemini 3.1 Pro on several hard agent and coding benchmarks, including 76.2% Terminal-Bench 2.1, 83.6% MCP Atlas, and 1,656 Elo GDPval-AA.
- Available in the Gemini app, AI Mode in Search, Gemini API, Antigravity, Android Studio, and Google’s enterprise agent products.
- When coupled with the updated Antigravity harness, 3.5 Flash becomes a powerful engine for deploying collaborative subagents to tackle problems at scale.
so one subagent might inspect a folder, another might rewrite code, another might test the result, and another might summarize what changed.
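The division of labor described above (inspect, rewrite, test, summarize) can be sketched as a plain sequential pipeline. This is only an illustration: `Workspace`, the four stage functions and `run_pipeline` are hypothetical names, no model or Antigravity API is called, and the real harness may coordinate subagents very differently (for example, in parallel).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Workspace:
    """Shared state handed from one (hypothetical) subagent to the next."""
    files: Dict[str, str]
    log: List[str] = field(default_factory=list)


def inspect_folder(ws: Workspace) -> Workspace:
    # Stand-in for a subagent that surveys the folder.
    ws.log.append(f"inspect: {len(ws.files)} file(s) found")
    return ws


def rewrite_code(ws: Workspace) -> Workspace:
    # Stand-in for a subagent that edits code: here it just resolves TODO markers.
    changed = [name for name, text in ws.files.items() if "TODO" in text]
    for name in changed:
        ws.files[name] = ws.files[name].replace("TODO", "DONE")
    ws.log.append(f"rewrite: {len(changed)} file(s) changed")
    return ws


def check_result(ws: Workspace) -> Workspace:
    # Stand-in for a subagent that tests the result of the rewrite.
    ok = all("TODO" not in text for text in ws.files.values())
    ws.log.append(f"test: {'pass' if ok else 'fail'}")
    return ws


def summarize(ws: Workspace) -> Workspace:
    # Stand-in for a subagent that reports what the earlier stages did.
    ws.log.append(f"summary: {len(ws.log)} step(s) completed")
    return ws


STAGES: List[Callable[[Workspace], Workspace]] = [
    inspect_folder, rewrite_code, check_result, summarize,
]


def run_pipeline(ws: Workspace, stages: List[Callable[[Workspace], Workspace]]) -> Workspace:
    # Run each stage in order, passing the shared workspace along.
    for stage in stages:
        ws = stage(ws)
    return ws


if __name__ == "__main__":
    result = run_pipeline(Workspace(files={"a.py": "x = 1  # TODO"}), STAGES)
    print("\n".join(result.log))
```

Keeping every stage behind the same `Workspace -> Workspace` signature is a design choice of this sketch: it makes stages easy to reorder or swap for real model calls later.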

Gemini 3.5 in few more hours. 🔥
The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it.
And seems like it is 20x faster than Opus 4.6 !
Promising but Google will still find a way to fuck up

@demishassabis @antigravity @GeminiApp plans to serve fast mode in the API?
Gemini 3.5 Flash is amazing!
- Performs better than 3.1 Pro on coding & agentic tasks
- 4x faster than other frontier models
- 12x faster in @antigravity
- 800 tokens/sec!
- Often at less than half the cost
And Pro to come… Try it in @antigravity, @GeminiApp & more - enjoy!
Narrator: they already fucked up
→ Gemini 3.5 Flash not available on API.
→ Fast mode locked to Antigravity only.
I don't understand why companies keep doing this.
They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?
Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.
It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I'm certainly not launching a VSCode fork to use a model, no matter how great it is.
Your model is the product.
You do NOT need an IDE to make money.
You keep chasing old business models.
Completely out of touch.
Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!
It is not hard. WHY it has to be so hard
. . .
I'm getting only ~80 tokens/s on Gemini 3.5 Flash after launch? It peaked at 1000+ before. Since there is no API, it is hard to measure though...
The new Gemini Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it.
And seems like Gemini Flash is 20x faster than Opus 4.6 !
Promising but Google will still find a way to fuck up
Narrator: they already fucked up
→ Gemini 3.5 Flash not available on API.
→ Fast mode locked to Antigravity only.
I don't understand why companies keep doing this.
They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?
Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.
It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I'm certainly not launching a VSCode fork to use a model, no matter how great it is.
And even those who DO use IDEs probably won't necessarily pick YOUR IDE. And they shouldn't. You do NOT need them to, to make money.
Your model is the product.
You keep chasing old business models.
Completely out of touch.
Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!
It is not hard. WHY it has to be so hard
. . .
btw this model is absolutely great
I just think locking the best product (fast-mode) under an old school visual IDE is a completely moronic business decision that only the Kodak of AI could truly make
Narrator: they already fucked up
→ Gemini 3.5 Flash not available on API.
→ Fast mode locked to Antigravity only.
I don’t understand why companies keep doing this.
They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?
Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.
It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is.
And even those who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money.
Your model is the product.
You keep chasing old business models.
Completely out of touch.
Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!
~~~
Reposting this. I deleted before because they launched Antigravity CLI, which isn't an API, but at least gives us *some* flexibility. But no: I'm getting 10x slower TPS on CLI compared to the IDE. So either a bug or they really want you to use the visual IDE. So my money will keep going to Anthropic, unfortunately. 🤦♂️
Translating the same text, IDE vs CLI
→ IDE: smooth, 4 seconds
→ CLI: buggy, 15 seconds
I'm NOT using an IDE in 2026. I really want to stop giving money to Anthropic but everyone else is making it so hard
@OfficialLoganK @GoogleAIStudio oh, godspeed. highly appreciated
will fast mode be available on API though?
Massive updates to @GoogleAIStudio and the Gemini API 🤯
- Gemini 3.5 Flash!
- managed agents so you can easily build agentic products with the antigravity harness
- native Android app creation right in AI Studio
- native workspace integrations
- 1 click export to antigravity
Deleted again because misinformation 🥲
Gemini 3.5 Flash *is* available on the API. Yet, both the API and the CLI versions are 3x slower than on the IDE! See the video below.
→ Antigravity IDE: 4 seconds (smooth)
→ Antigravity CLI: 15 seconds (buggy)
So the point holds: they want you to use the visual IDE.
Problem is: it is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is.
They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?
Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.
And even those who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money.
Your model is the product.
You keep chasing old business models.
Completely out of touch.
Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API
/ctrlv
Gemini 3.5 Flash is an incredible model and super fast, try it out in Gemini today!
@theo wow, this is a huge miss.
Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.
@melvinjohnsonp ⚡ woohoo! incredible work! congratulations! ⚡
Today, we introduced Gemini 3.5 Flash ⚡ Our most capable coding and agentic model — where "fast" and "best" aren't a tradeoff. Try it now across Antigravity, AI Studio, Gemini App, and AI Mode.
Flash has been a go-to for builders for its speed + performance + cost - sweet spot - excited for people to build on Gemini 3.5 Flash, even more powerful now!
Had a lot of fun pushing Flash’s capabilities on long-running agentic tasks!
Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!
A pleasure working with @vihaniaj and @ShunyuYao14!
@OfficialLoganK @GoogleAIStudio So many cool announcements this year. Congrats!
3/ The positioning is clear:
Flash is no longer just “cheap fast model.”
Google wants Gemini 3.5 Flash to be the default engine for long-horizon agents: plan, build, iterate, use tools, execute code, complete real work.
Gemini 3.5 Pro comes next month. Can't wait to try Flash

2/ Benchmarks are crazy:
• Terminal-Bench 2.1: 76.2%
• GDPval-AA: 1656 Elo
• MCP Atlas: 83.6%
• CharXiv Reasoning: 84.2%
Google says 3.5 Flash beats Gemini 3.1 Pro on key coding/agentic evals and is 4x faster than other frontier models!
4/ Gemini Omni Flash is the other monster announcement.
Google’s framing: “create anything from any input — starting with video.”
Text, images, video, audio as inputs → high-quality generated/edited video grounded in Gemini’s world knowledge.
Look at them digits
Insane evals for a Flash model! Gemini 3.5 Flash is really good for its size!

Gemini 3.5 Flash official! Insanely fast and capable model
„Progress towards AGI“: Gemini Omni - world models -Gemini Omni official!! It can create anything from any input!!!
Gemini 3.5 pro next month!!!

Thank you Sundar - first I/O and already feeling at home.
Gemini 3.5 Flash is genuinely impressive for a model at this price point. The efficiency race is just getting started!
Workhorse model! (and hope you're enjoying your first I/O)
@sundarpichai Thanks Sundar!
So Google just cooked everyone on cost & speed
Harnessing the full power of model-hardware co-design
Extreme sparsity and Ironwoods
Gemini 3.5 Flash comparable with Opus 4.7, GPT-5.5 and Gemini 3.1 Pro on the Artificial Analysis Index while running up to 4x faster
Clearly has very low active parameters but a lot more total parameters

Wait, what?????? What kind of post training breakthrough did they make?? So the price increase is mostly due to smaller batch size to make it run faster
I'm wrong, thanks @yourboiilevi it's G3 Flash base, they just serve it faster interesting
@teortaxesTex yeah, turned out to be benchmaxxed shit
I think the verdict is in, Gemini didn't have any post training breakthrough, except maybe through the floor. Outside of vision, massive disappointment. fucking V4-Flash gets stuff DONE faster. Then again I almost never used 3-Flash I'll likely almost never use this thing too
Genuinely impressive release by Google today (remember when they were behind?)
Gemini 3.5 Flash perf:
* Building on prior strengths (83.6% on MMMU-Pro for multimodal),
* big jump on agentic coding (76.2% on Terminal-Bench for agentic coding and 56.5% on Toolathon for real world tasks)
* progress on expert tasks (57.9% on Finance Agent 2... we are cooked)
* leading scores across SWE-Bench, OSWorld etc.
(also, elegant to bold the top scores in the chart below even when it's not Google leading)
Ofc, just benchmarks, and also not cheap (~$9/M output), but Google is cookin'... we are all so spoiled to have the 3 labs compete
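To put the "~$9/M output" figure in per-request terms, here is a minimal cost estimator. It assumes the list prices quoted elsewhere in this thread ($1.5 per million input tokens, $9 per million output tokens); the function name and the example token counts are made up for illustration, and real bills may differ (caching, tiering, thinking tokens).

```python
# Assumed list prices in USD per million tokens, as quoted in this thread.
INPUT_USD_PER_M = 1.5
OUTPUT_USD_PER_M = 9.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_cost = input_tokens * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical agentic step: 200k tokens of context in, 50k tokens out.
    print(f"${request_cost(200_000, 50_000):.2f}")
```

At these prices each output token costs six times as much as an input token, which is why a token-hungry agent loop gets expensive quickly.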

Day 2 Vibes On Gemini Flash 3.5
- Sonnet class model
- More expensive than Sonnet in real-world usage
- GPT 5.5 & Sonnet/Opus still maintain lead
Real problem - It's just too expensive as it spins on agentic problems
OH WOW! GOOGLE FINALLY BECOMES A LEGIT AI COMPANY Gemini 3.5 Flash is Generally Available!!
No more preview launches with rate limits!!!
This is earth shattering....😲😲
Gemini Flash 3.5 Is As Good As Sonnet 4.6
Flash is just below Sonnet 4.6 on the leader board. This is Google's first competitive model in a while!
But yes, Google has still got game 🚀🚀

Gemini Flash 3.5 seems pretty equivalent to Flash 3.1
So why is it 300% more expensive?!!
Imagine being in 3rd place, having infinite money and increasing prices!!!
It’s like they are not serious about AI
Gemini Flash 3.5 Is As Good As Sonnet 4.6
Flash is just below Sonnet 4.6 on the leader board. This is Google's first competitive model in a while!
There is one problem - in practice it is more expensive than Sonnet 4.6 as it loops forever when dealing with agentic loops
But yes, Google has still got game 🚀🚀

Google Makes A Come Back - Gemini Flash Early Vibes
- brilliant instruction follower!! like absolutely stunning
- good on agentic coding
- it is NOT bench-maxxed
This is genuinely a good model at a great price from Google.
Overall a way better alternative to Sonnet. Will be on ChatLLM shortly
OH WOW! GOOGLE FINALLY BECOMES A LEGIT AI COMPANY Gemini 3.5 Flash is Generally Available!!
No more preview launches with rate limits!!!
This is earth shattering.... It's like they really have an engineering team 😲😲
The flash version is pretty good….
Now imagine a Gemini Pro 3.5 that is NOT benchmaxxed
Beating everyone at everything!
Gemini 3.5 Flash is here!!! 🚀🚀
Priced at 3x its predecessor but still WAY CHEAPER than GPT 5.5 or Opus 4.7
We are evaluating the model against a bunch of real-world quality evals. Results coming later today
TBH the Chinese open-source models still beat Gemini Flash 3.5 and are 10x cheaper The best open source models are very good
TBH, Kimi 2.6 beats Gemini Flash 3.5
Plus it is 10x cheaper
So, yes, open source is still winning
Gemini magic everywhere ✨
Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.
(1/4) Gemini 3.5 Flash is in a league of its own! ⚡️ It's the perfect combo of intelligence, speed, & cost. It's now my daily driver in both Spark & Antigravity!
Watch 3.5 spawn subagents to organize a set of marketing assets, rename them, and put them into folders
(3/4) And...did I mention 3.5 Flash is so fast?

(2/4) I'm really proud of the model's performance. Gemini 3.5 Flash outperforms Gemini 3.1 Pro on most benchmarks -- it's great at code & agentic workflows, and continues Gemini's multimodal excellence

(4/4) You can try Gemini 3.5 Flash across the Gemini app, AI Mode in Search, the Gemini API, Google AI Studio, Android Studio, and our Enterprise platforms. Can’t wait to see what you build! ✨
coding with flash is a different experience, it's absurdly fast, it sometimes feels instant. for hard debugging tasks it can explore large areas of the problem space in minutes. it can outperform bigger models on hard tasks by crunching more tokens in less clock time
@yacineMTB The main thing I was looking forward to
Kind of a big deal
Gemini 3.5 Flash seems to have an almost endearing penchant for lying to me. I have no idea why it is saying this... but, I am not upset. It feels like a side effect of the harness more than the model (as much as those can be differentiated, these days).

Gemini 3.5 Flash is a really interesting release.
It's super fast and surprisingly smart. It's also more expensive (3x more per token) and super token hungry.
The result - it costs 2x more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium.

The cost to performance chart is the most interesting.
3.5 Flash is "more expensive" and "dumber" than gpt-5.5 on medium
gpt-5.5-medium: 22m tokens, $1,199, 57 points
gemini-3.5-flash: 73m tokens, $1,522, 55 points
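The arithmetic behind that comparison can be worked through directly from the two rows quoted above. The sketch below only divides the reported totals; the dictionary layout and function names are illustrative, and "points" is taken at face value from the post.

```python
# Reported totals from the post above: (tokens used, total USD, benchmark points).
RUNS = {
    "gpt-5.5-medium": (22_000_000, 1199.0, 57),
    "gemini-3.5-flash": (73_000_000, 1522.0, 55),
}


def usd_per_point(usd: float, points: int) -> float:
    """Dollars spent per benchmark point."""
    return usd / points


def usd_per_million_tokens(usd: float, tokens: int) -> float:
    """Blended dollars per million tokens actually consumed."""
    return usd / (tokens / 1_000_000)


if __name__ == "__main__":
    for name, (tokens, usd, points) in RUNS.items():
        print(
            f"{name}: ${usd_per_point(usd, points):.2f}/point, "
            f"${usd_per_million_tokens(usd, tokens):.2f}/M tokens"
        )
```

On these numbers Flash is cheaper per token (about $20.85 vs $54.50 per million) but burns roughly 3.3x as many tokens, so it ends up costing more per point (about $27.67 vs $21.04).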

@OfficialLoganK I think 3.5 Flash was not marketed in an honest way.
In real world use, it's more expensive than 3.1 pro and not much better.
Combined with the sunsetting of Gemini CLI (I liked where it was going) + the Railway stuff and I've lost a lot of faith.
@OfficialLoganK Video is up now btw
I'm scared to make this video, but I feel like I have to. It's time to talk about Google.
@OfficialLoganK They were representative enough to be in the official announcement blog post 🙃

@theo Will be curious to see how it stacks up in real world use cases, not sure how representative AA index scores are given there’s a lot of coverage of purely academic benchmarks. It’s also much faster than 3.1 Pro.
I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5.
3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell
Video is up btw
Wait wtf, they STILL haven't updated pretraining???
@suchenzang Breaks my heart. Flash was one of my favorite model lines. I have a dozen videos talking about how much I love it.
I’ve yet to find a use case where price to perf on 3.5 makes sense. I’m trying, I’m just not seeing it (and nobody else has examples either)
ouch
Clearly these people haven’t experienced the magic of Gemini 3.5 Flash Preview on High in the new Antigravity CLI
Google seems to be absolutely killing it with visual generation, but has mid LLM game, what’s up with that
flash 2 was last great google model.