
Gemini 3.5 Flash appears in the Google Cloud Console quota interface at $1.50 per million input tokens and $9 per million output tokens while posting a leading 47.1% on the APEX-Agents-AA leaderboard.

It records 76.2% on Terminal-Bench 2.1 and, according to Google, runs 4x faster than other frontier models.
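To make the listed prices concrete, here is a minimal sketch of a per-request cost estimate. It assumes the $1.50 and $9 per-million-token figures quoted above apply flat to every token; the function name and the example token counts are hypothetical, and real billing may differ (caching, tiering, or how reasoning tokens are counted are not modeled here).

```python
# Prices in USD per million tokens, taken from the listing quoted above.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at flat per-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Weight each token count by its price, then convert from
    # "per million tokens" to a per-request dollar amount.
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agentic request: 200k tokens of context, 50k tokens generated.
    print(f"${estimate_cost(200_000, 50_000):.2f}")
```

Because output tokens cost six times as much as input tokens at these prices, a model that generates more tokens per task can end up more expensive overall even when its per-token price is lower.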


Google is held to an unreasonably high standard. Flash would ordinarily be 10% of the cost of GPT-5.5. We're not in the age where Only Google hill-climbs hard math. Even fucking Anthropic ships models that beat… DeepSeek. Everyone is serious now.

3:12 PM · May 18, 2026

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action.

We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows.

Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models.

Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale.

Some highlights we’re excited about 🔽

5:45 PM · May 19, 2026 · 119K Views

2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency.

It really helps you distill papers down to their essence and aid your understanding!

Jeff Dean @JeffDean

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽

5:45 PM · May 19, 2026 · 119K Views
5:46 PM · May 19, 2026 · 83.1K Views

3/ Gemini 3.5 Flash is rolling out globally today. On behalf of the entire Gemini team, we're excited by what you'll be able to do with this model!

Read more here: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

Jeff Dean @JeffDean

2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency. It really helps you distill papers down to their essence and aid your understanding!

5:46 PM · May 19, 2026 · 83.1K Views
5:49 PM · May 19, 2026 · 11.8K Views

My colleague @LeslieNooteboom generated these: we don't have it packaged up as a one-click link, but here's the prompt passed along by Leslie (thanks, Leslie!). Plug in the abstract of a paper at the bottom where it says '$abstract':

--- ❝You are a world-class creative developer. Build a beautiful, high-resolution, highly visual concept animation. MUST follow:

* Output ONLY valid, fully self-contained HTML/CSS/JS code. Start with <!DOCTYPE html> and end with </html>. Do not include any markdown fences in the JSON property.
* Self-contained, elegant LIGHT THEME aesthetic matching a clean technical paper (e.g., white or very light gray background, dark crisp typography, minimal harmonious colors). Avoid dark backgrounds.
* Focus on ONE strong visual metaphor with graphical animations or elegant interactions.
* MINIMIZE EXPLANATORY TEXT: Do not add a title or snippet of the abstract to this animation code, the full abstract is already displayed next to this animation. Let the visual movement and graphic structure explain the concept. Avoid generating heavy text paragraphs or excessive text boxes. Keep text to a few minimal labels or neat status badges.
* Keep the visual script and logic extremely concise, under 200 lines of code. Do not build a complex engine or import massive libraries.
* HIGH-RESOLUTION CANVAS: Always configure the HTML5 canvas for high-DPI/Retina screens by multiplying canvas.width/height by window.devicePixelRatio, setting its CSS style to the logical width/height, and scaling the context with ctx.scale(dpr, dpr). This avoids any blurry/low-resolution drawings.
* RESPONSIVE SCALING & VIEWPORT: Design the visualization to be fully responsive, filling 100% of the viewport width and height (using 100vw/100vh with margin 0 and overflow hidden). Implement a window resize listener to update the canvas buffer dimensions dynamically when the viewport changes, ensuring no scrollbars or visual clipping.
* HYBRID HTML-ON-CANVAS & OVERLAY COLLISION PREVENTION: Draw high-performance background graphics (particles, nodes, flows) on the high-DPI canvas, but overlay crisp, high-resolution HTML/CSS divs/labels/buttons on top of the canvas (using absolute positioning) for gorgeous typography and control panels. To prevent absolute overlay panels from hiding, blocking, or overlapping the visual animation components on standard or narrow screens:
  1. **ULTRA-COMPACT FOOTPRINT**: All overlay cards must be extremely compact. Set a strict 'max-width' of **no more than 240px** (or 25% of viewport width). Avoid long text paragraphs, heavy padding (use 'p-2.5' or 'p-3'), and large stack buttons.
  2. **COMPACT CONTROLS**: If offering option buttons, style them as small inline segmented pills or a minimal dropdown select box rather than a stack of wide, fat buttons. Keep text sizing small ('text-[10px]' or 'text-[11px]').
  3. **TRANSPARENCY**: Use highly semi-transparent, elegant backgrounds (e.g., white with high transparency: 'rgba(255, 255, 255, 0.72)' and a backdrop blur 'backdrop-filter: blur(10px)') for all overlay cards so the underlying animation flows remain beautifully visible behind them.
  4. **LAYOUT SAFETY BOUNDARIES**: Offset the center of the canvas drawings (like circles, nodes, or waves) horizontally or vertically (e.g., centering them in the remaining 75% clear space of the canvas) so they are never drawn directly underneath the control card. Scale down the radius/bounds dynamically if the viewport width contracts.

Ensure all script tags, function braces, and HTML elements are completely and properly closed. No placeholders, no labels like // ... (insert here).

Generate the concept animation based strictly on the following research paper abstract: "${abstract}" ❞
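For anyone who wants to script this, the placeholder can be filled in programmatically. The sketch below uses Python's standard `string.Template`, whose `${abstract}` syntax happens to match the placeholder in the prompt. The `PROMPT_TEMPLATE` text is an abbreviated stand-in for the full prompt, and `build_prompt` is a hypothetical helper name; sending the result to a model is left out because the thread does not say which endpoint was used.

```python
from string import Template

# Abbreviated stand-in for the full prompt quoted above; paste the complete
# text here in practice. A literal "$" elsewhere in the prompt would need
# to be escaped as "$$" for string.Template.
PROMPT_TEMPLATE = Template(
    "You are a world-class creative developer. Build a beautiful, "
    "high-resolution, highly visual concept animation.\n\n"
    "Generate the concept animation based strictly on the following "
    'research paper abstract: "${abstract}"'
)


def build_prompt(abstract: str) -> str:
    """Return the prompt with the paper abstract substituted in."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which catches a misspelled placeholder name early.
    return PROMPT_TEMPLATE.substitute(abstract=abstract.strip())


if __name__ == "__main__":
    print(build_prompt("We study sparse attention for long-context models."))
```

Substituted values are inserted as-is and not re-parsed, so an abstract that itself contains a dollar sign is safe to pass in.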

elie @eliebakouch

@JeffDean where can we try this? is there a site where you just put a paper name and get this kind of model card? would love to test it properly 🙏

6:25 PM · May 19, 2026 · 2.3K Views
9:41 PM · May 19, 2026 · 1K Views

Highly capable models that are fast are super important. Our new Gemini 3.5 Flash model is a great mix of fast and capable.

Sundar Pichai @sundarpichai

Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.

5:59 PM · May 19, 2026 · 293.4K Views
9:43 PM · May 19, 2026 · 24.2K Views

Gemini 3.5 Flash is amazing!

- Performs better than 3.1 Pro on coding & agentic tasks
- 4x faster than other frontier models
- 12x faster in @antigravity
- 800 tokens/sec!
- Often at less than half the cost

And Pro to come…

Try it in @antigravity, @GeminiApp & more - enjoy!

1:05 AM · May 20, 2026 · 182.8K Views

More info on Gemini 3.5 flash model here: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

Demis Hassabis @demishassabis

Gemini 3.5 Flash is amazing! - Performs better than 3.1 Pro on coding & agentic tasks - 4x faster than other frontier models - 12x faster in @antigravity - 800 tokens/sec! - Often at less than half the cost And Pro to come… Try it in @antigravity, @GeminiApp & more - enjoy!

1:05 AM · May 20, 2026 · 182.8K Views
1:05 AM · May 20, 2026 · 24.6K Views

1/ Today at Google I/O, we’re launching Gemini 3.5 Flash ⚡️⚡️⚡️!

Our mission was clear: bring frontier-level intelligence with unprecedented speed.

3.5 Flash delivers drastic intelligence (beating 3.1 Pro on almost every benchmark), at Flash speeds. 🧵

5:53 PM · May 19, 2026 · 8.1K Views

2/ What excites me most is 3.5 Flash’s breakthrough performance on complex, multi-step agentic tasks: it excels on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)

3.5 Flash is built to be the ultra-fast reasoning engine powering your AI agents. The better the model, the better the agent you build on top of it.

Oriol Vinyals @OriolVinyalsML

1/ Today at Google I/O, we’re launching Gemini 3.5 Flash ⚡️⚡️⚡️! Our mission was clear: bring frontier-level intelligence with unprecedented speed. 3.5 Flash delivers drastic intelligence (beating 3.1 Pro on almost every benchmark), at Flash speeds. 🧵

5:53 PM · May 19, 2026 · 8.1K Views
5:53 PM · May 19, 2026 · 3.9K Views

4/ Gemini 3.5 Flash is available today globally - we can't wait to see the incredible agents and apps you all build with it!

Check out our blog for more. https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

Oriol Vinyals @OriolVinyalsML

3/ If you push Gemini 3.5 Flash with complex agents and tasks, it will shine brighter. Here, it generates fully tactile, interactive HTML/SVG hardware simulations, complete with bump-mapped metals, spring physics, and procedural audio, in a single shot. This is what happens when you connect reasoning across the full stack simultaneously at extremely low latency. 💡🔉

5:53 PM · May 19, 2026 · 4.7K Views
5:53 PM · May 19, 2026 · 1.9K Views

3/ If you push Gemini 3.5 Flash with complex agents and tasks, it will shine brighter. Here, it generates fully tactile, interactive HTML/SVG hardware simulations, complete with bump-mapped metals, spring physics, and procedural audio, in a single shot. This is what happens when you connect reasoning across the full stack simultaneously at extremely low latency. 💡🔉

Oriol Vinyals @OriolVinyalsML

2/ What excites me most is 3.5 Flash’s breakthrough performance on complex, multi-step agentic tasks: it excels on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%). 3.5 Flash is built to be the ultra-fast reasoning engine powering your AI agents. The better the model, the better the agent you build on top of it.

5:53 PM · May 19, 2026 · 3.9K Views
5:53 PM · May 19, 2026 · 4.7K Views

excited by gemini 3.5 flash but also sad that it's "pushing the frontier of ... cost" not cost effectiveness 😭

Logan Kilpatrick @OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
6:05 PM · May 19, 2026 · 2.9K Views

@eliebakouch Best twitter account.

elie @eliebakouch

correlation between CursorBench and Artificial Analysis reported scores: benchmarks like IFBench or tau2 show ~0 correlation with CursorBench. opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, gpt 5.5 shows the opposite pattern

9:31 PM · May 20, 2026 · 23K Views
9:57 PM · May 20, 2026 · 1.4K Views

@eliebakouch I think multibenchmarks are less interesting in a post training dominated world. It was very cool when it was pretraining only and just got better across the board.

Sasha Rush @srush_nlp

@eliebakouch Best twitter account.

9:57 PM · May 20, 2026 · 1.4K Views
9:59 PM · May 20, 2026 · 452 Views

@eliebakouch I don’t know what Google does. But my read is that train 100 experts then OPD is exactly how you get a model good at 100 areas.

elie @eliebakouch

oh interesting, i don't have strong opinion here but if you look at flash 3.0 (or even pro) -> flash 3.5 you get improvement on benchmark across the board (not sure if it's the same base etc. tho, hard to compare). i'd say recipe like train expert on RL then OPD kinda work to improve multiple domain at the same time?

10:11 PM · May 20, 2026 · 55 Views
10:13 PM · May 20, 2026 · 106 Views

@JeffDean @quocleix Great results! Congratulations

Jeff Dean @JeffDean

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽

5:45 PM · May 19, 2026 · 119K Views
8:20 AM · May 20, 2026 · 741 Views

@demishassabis @antigravity @GeminiApp Are you aware that running AA intelligence index cost almost 2x with 3.5 Flash than it did with 3.1 Pro?

Demis Hassabis @demishassabis

Gemini 3.5 Flash is amazing! - Performs better than 3.1 Pro on coding & agentic tasks - 4x faster than other frontier models - 12x faster in @antigravity - 800 tokens/sec! - Often at less than half the cost And Pro to come… Try it in @antigravity, @GeminiApp & more - enjoy!

1:05 AM · May 20, 2026 · 182.8K Views
5:43 PM · May 20, 2026 · 6.5K Views

@OfficialLoganK Uhm you don't increment the minor version number if you think it's a new era; doesn't add up.

Logan Kilpatrick @OfficialLoganK

Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way). The model is the product, please keep the feedback coming!

2:20 PM · May 20, 2026 · 203K Views
2:59 PM · May 20, 2026 · 11.8K Views

ouch

Theo - t3.gg @theo

I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5. 3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell

4:08 AM · May 20, 2026 · 160.1K Views
5:17 AM · May 20, 2026 · 65.2K Views

it was a third of the reason i came to gdm back in 2023 (the other two being google hardware and search). no other player had a serious cloud business and frontier lab in one, and cutting down cost/boosting speed while preserving ~90% capability was obviously critical for any interesting dev work outside of the labs.

at some point i realized "frontier capabilities" actually didn't matter for companies with real businesses (eg meta, apple, etc). the people who really need the hype to play out are the ones whose entire market value counts on this one specific thing incrementally improving + future projected light cone of value captured. plenty of reasons to hate on apple, but in retrospect them sitting out this race might not have been the worst idea.

let's see if G sees value in course correcting on its priorities. if not... guess it's all on open-source to fill these shoes.

Theo - t3.gg @theo

@suchenzang Breaks my heart. Flash was one of my favorite model lines. I have a dozen videos talking about how much I love it. I’ve yet to find a use case where price to perf on 3.5 makes sense. I’m trying, I’m just not seeing it (and nobody else has examples either)

6:41 AM · May 20, 2026 · 10.6K Views
7:02 AM · May 20, 2026 · 3.9K Views

Knowledge cutoff on this is very confusing. Is this a bug? Does Flash not know that vibecoding is a thing now? Does it not know about claude code!?

Lisan al Gaib @scaling01

Gemini 3.5 Flash now live in aistudio

5:47 PM · May 19, 2026 · 14K Views
6:08 AM · May 20, 2026 · 56.8K Views

I miss the old flashes too, I didn’t make it to its retirement party, it flashed by - work of love dedication to the pursuit of algorithmic efficiency.

Theo - t3.gg @theo

I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5. 3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell

4:08 AM · May 20, 2026 · 160.1K Views
5:32 AM · May 20, 2026 · 10.1K Views

I think 3.5 is fine just not good enough to be a code model.

rohan anil @_arohan_

I miss the old flashes too, I didn’t make it to its retirement party, it flashed by - work of love dedication to the pursuit of algorithmic efficiency.

5:32 AM · May 20, 2026 · 10.1K Views
5:33 AM · May 20, 2026 · 1.3K Views

I have to mute the word Flash; it's quite triggering to see the fall from grace of what was once the king of efficient models.

Flash probably was reason Haiku 3.5 was displaced, now that is fully relinquished to Haiku’s domain.

Current timeline is developers showing all the problems with it, and other half the timeline is people selling it as the best agentic model yet. I wonder if they are reading any of the feedback.

3:39 PM · May 20, 2026 · 457 Views

I am actually quite confused.

What went wrong here: Is all 2025 + 2026 data all slop and compute inefficient? Why would you train a model in 2026 that misses an entire year+ of data and take it to market?

rohan anil @_arohan_

Knowledge cutoff on this is very confusing. Is this a bug? Does Flash not know that vibecoding is a thing now? Does it not know about claude code!?

6:08 AM · May 20, 2026 · 56.8K Views
6:38 AM · May 20, 2026 · 45.1K Views

@scaling01 @PMinervini Have you tried it on Antigravity? It's a slight improvement on tool calls, but not a daily model to use for coding.

Lisan al Gaib @scaling01

meh doesn't even beat Kimi or GLM

5:54 PM · May 19, 2026 · 50.3K Views
6:44 PM · May 19, 2026 · 3.7K Views

@scaling01 @PMinervini @eliebakouch @vincentweisser it would be fun for you guys to use this against claude and codex in auto research loop and see if it has good tastes.

rohan anil @_arohan_

@scaling01 @PMinervini Have you tried it on Antigravity? It's a slight improvement on tool calls, but not a daily model to use for coding.

6:44 PM · May 19, 2026 · 3.7K Views
7:01 PM · May 19, 2026 · 1.5K Views

@scaling01 @PMinervini @eliebakouch @vincentweisser In some sense, claude and codex both used human ingenuity and put them together in clever ways. While models lack taste in research, with the right prompting they can be driven to really amazing outcomes. This itself can be an eval if you run it, and compare outcomes.

rohan anil @_arohan_

@scaling01 @PMinervini @eliebakouch @vincentweisser it would be fun for you guys to use this against claude and codex in auto research loop and see if it has good tastes.

7:01 PM · May 19, 2026 · 1.5K Views
7:03 PM · May 19, 2026 · 331 Views

@zephyr_z9 I am curious why you say this. MRCR? This is a good guess/deduction

Zephyr @zephyr_z9

Clearly has very low active parameters but a lot more total parameters

5:38 PM · May 19, 2026 · 40.9K Views
5:58 PM · May 19, 2026 · 5.8K Views

Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way).

The model is the product, please keep the feedback coming!

2:20 PM · May 20, 2026 · 203K Views

Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it.

1:56 PM · May 21, 2026 · 98.4K Views

@tunguz @mercor_ai we are deeply focused on real world use cases for Gemini, it's also exciting to see so many benchmarks get better at capturing these use cases

Bojan Tunguz @tunguz

@OfficialLoganK @mercor_ai Benchmarks were never the issue for Gemini models. They’ve consistently struggled with vibes though.

2:21 PM · May 21, 2026 · 3.1K Views
2:36 PM · May 21, 2026 · 2.2K Views

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own.

We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views

Try it in the Gemini API, Google AI Studio, Antigravity, AI Mode, Gemini App, and wherever else you use Gemini!

Logan Kilpatrick @OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
5:45 PM · May 19, 2026 · 37.7K Views

@demishassabis @rseroter @antigravity @GeminiApp seeing us continue to pack the intelligence of pro (and more) into a flash model year after year is still one of the most impressive things to witness

Demis Hassabis @demishassabis

Gemini 3.5 Flash is amazing! - Performs better than 3.1 Pro on coding & agentic tasks - 4x faster than other frontier models - 12x faster in @antigravity - 800 tokens/sec! - Often at less than half the cost And Pro to come… Try it in @antigravity, @GeminiApp & more - enjoy!

1:05 AM · May 20, 2026 · 182.8K Views
2:50 AM · May 20, 2026 · 9K Views

@theo Will be curious to see how it stacks up in real world use cases, not sure how representative AA index scores are given there’s a lot of coverage of purely academic benchmarks. It’s also much faster than 3.1 Pro.

Theo - t3.gg @theo

@OfficialLoganK I think 3.5 Flash was not marketed in an honest way. In real world use, it's more expensive than 3.1 pro and not much better. Combined with the sunsetting of Gemini CLI (I liked where it was going) + the Railway stuff and I've lost a lot of faith.

2:59 AM · May 20, 2026 · 26.8K Views
4:47 AM · May 20, 2026 · 3.2K Views

Massive updates to @GoogleAIStudio and the Gemini API 🤯

- Gemini 3.5 Flash!
- managed agents so you can easily build agentic products with the antigravity harness
- native Android app creation right in AI Studio
- native workspace integrations
- 1 click export to antigravity

8:53 PM · May 19, 2026 · 67.1K Views

And so much more coming soon, the agentic era of Gemini is in full swing and the team is shipping at full pace.

Can’t wait for you all to try this all out!!

Logan Kilpatrick @OfficialLoganK

Massive updates to @GoogleAIStudio and the Gemini API 🤯 - Gemini 3.5 Flash! - managed agents so you can easily build agentic products with the antigravity harness - native Android app creation right in AI Studio - native workspace integrations - 1 click export to antigravity

8:53 PM · May 19, 2026 · 67.1K Views
8:55 PM · May 19, 2026 · 23.4K Views

The tension with training models over many generations is it becomes impossible to keep the same characteristics in the way you mention.

Flash used to mean just fast and cheap, now it means intelligent, fast, and super strong value. The nuance of this is difficult, but we are going to continue pushing the frontier of intelligence while trying to pack as much value into a model as possible.

It’s also worth noting that 3.1 flash-lite is super strong!

Theo - t3.gg @theo

I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5. 3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell

4:08 AM · May 20, 2026 · 160.1K Views
1:56 PM · May 20, 2026 · 8.5K Views

Just off stage at #GoogleIO, some highlights from this morning 🧵

Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs.

Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.

5:59 PM · May 19, 2026 · 293.4K Views

Google Antigravity is expanding, including a new standalone desktop app that acts as a central home for agent interaction. We’re also introducing a new Antigravity CLI providing a fast, lightweight way to deploy new agents instantly without a graphical user interface, and a new Antigravity SDK, to give direct access to the same agent harness powering our own products to customize and host agents on your own infrastructure.

With 3.5 Flash in Antigravity, developers can now do so much more. The new Antigravity ecosystem is coming to developers today.

Sundar Pichai @sundarpichai

Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.

5:59 PM · May 19, 2026 · 293.4K Views
5:59 PM · May 19, 2026 · 49.9K Views

Gemini Spark is your personal AI agent in the @GeminiApp that gets things done on your behalf, under your direction. It runs 24/7 (and yes - you can close your laptop). It’s powered by Gemini 3.5 and built on the Google Antigravity harness so it can complete long horizon tasks.

Spark will integrate seamlessly with tools, starting with ours, and soon with 3P tools with MCP. You’ll also be able to work with it through email + chat. Available to trusted testers this week and next week in Beta to AI Ultra users in the US.

Sundar Pichai @sundarpichai

Google Antigravity is expanding, including a new standalone desktop app that acts as a central home for agent interaction. We’re also introducing a new Antigravity CLI providing a fast, lightweight way to deploy new agents instantly without a graphical user interface, and a new Antigravity SDK, to give direct access to the same agent harness powering our own products to customize and host agents on your own infrastructure. With 3.5 Flash in Antigravity, developers can now do so much more. The new Antigravity ecosystem is coming to developers today.

5:59 PM · May 19, 2026 · 49.9K Views
5:59 PM · May 19, 2026 · 35K Views

Gemini Omni is our new model that can create anything from any input - starting with video. It combines Gemini’s intelligence with our generative media models, for a new level of world understanding, multimodality, and editing.

Gemini Omni Flash is rolling out today to Google AI Plus, Pro and Ultra subscribers globally through the @Geminiapp and Google Flow, to @YouTube Shorts this week, and to our developer and enterprise APIs in the coming weeks.

Sundar Pichai @sundarpichai

Gemini Spark is your personal AI agent in the @GeminiApp that gets things done on your behalf, under your direction. It runs 24/7 (and yes - you can close your laptop). It’s powered by Gemini 3.5 and built on the Google Antigravity harness so it can complete long horizon tasks. Spark will integrate seamlessly with tools, starting with ours, and soon with 3P tools with MCP. You’ll also be able to work with it through email + chat. Available to trusted testers this week and next week in Beta to AI Ultra users in the US.

5:59 PM · May 19, 2026 · 35K Views
5:59 PM · May 19, 2026 · 20.2K Views

As models get even better, the need for transparency grows. Last year @nvidia adopted our SynthID invisible watermark, and today we’re excited to announce @OpenAI, Kakao, and @ElevenLabs will join them.

We’re also going further by adding C2PA Content Credentials verification to @GeminiApp alongside SynthID detection and bringing both to Search and Chrome so you can easily check whether content was captured by a camera, or created/edited with gen AI tools.

Sundar Pichai @sundarpichai

Gemini Omni is our new model that can create anything from any input - starting with video. It combines Gemini’s intelligence with our generative media models, for a new level of world understanding, multimodality, and editing. Gemini Omni Flash is rolling out today to Google AI Plus, Pro and Ultra subscribers globally through the @Geminiapp and Google Flow, to @YouTube Shorts this week, and to our developer and enterprise APIs in the coming weeks.

5:59 PM · May 19, 2026 · 20.2K Views
5:59 PM · May 19, 2026 · 14.3K Views

Gemini 3.5 Flash is transforming what you can do in Google Search with new agentic capabilities. A few things we’re introducing:

A new intelligent AI-powered Search box, our biggest upgrade in 25 years — rolling out globally.

New information agents that work in the background 24/7 to find exactly what you need at the right moment, and help you take action (coming this summer).

And with the power of Google Antigravity, we’re unlocking new agentic coding capabilities, so Search can build custom interactive experiences – like visual simulations, dashboards or trackers – for your individual questions.

Read more: https://blog.google/products-and-platforms/products/search/search-io-2026

Sundar Pichai @sundarpichai

As models get even better, the need for transparency grows. Last year @nvidia adopted our SynthID invisible watermark, and today we’re excited to announce @OpenAI, Kakao, and @ElevenLabs will join them. We’re also going further by adding C2PA Content Credentials verification to @GeminiApp alongside SynthID detection and bringing both to Search and Chrome so you can easily check whether content was captured by a camera, or created/edited with gen AI tools.

5:59 PM · May 19, 2026 · 14.3K Views
5:59 PM · May 19, 2026 · 20K Views

We’re having much more natural conversations with Gemini directly inside many of our products, so we’re bringing this to two more:

Ask YouTube is a new experience for searching for content on YouTube. It gives you information in an easy to navigate layout, with videos best matched to what you’re looking for, and jumps right to the part most relevant to your query.

With voice-powered Docs Live you can brain dump whatever is on your mind, and let Gemini do the rest. Rolling out this summer, and the same voice capability is coming to Gmail and Keep then too.

Sundar Pichai @sundarpichai

Gemini 3.5 Flash is transforming what you can do in Google Search with new agentic capabilities. A few things we’re introducing: A new intelligent AI-powered Search box, our biggest upgrade in 25 years — rolling out globally. New information agents that work in the background 24/7 to find exactly what you need at the right moment, and help you take action (coming this summer). And with the power of Google Antigravity, we’re unlocking new agentic coding capabilities, so Search can build custom interactive experiences – like visual simulations, dashboards or trackers – for your individual questions. Read more: https://blog.google/products-and-platforms/products/search/search-io-2026

5:59 PM · May 19, 2026 · 20K Views
5:59 PM · May 19, 2026 · 34.7K Views

Read my remarks: https://blog.google/innovation-and-ai/sundar-pichai-io-2026/

Sundar Pichai @sundarpichai

We’re having much more natural conversations with Gemini directly inside many of our products, so we’re bringing this to two more: Ask YouTube is a new experience for searching for content on YouTube. It gives you information in an easy to navigate layout, with videos best matched to what you’re looking for, and jumps right to the part most relevant to your query. With voice-powered Docs Live you can brain dump whatever is on your mind, and let Gemini do the rest. Rolling out this summer, and the same voice capability is coming to Gmail and Keep then too.

5:59 PM · May 19, 2026 · 34.7K Views
6:51 PM · May 19, 2026 · 24.9K Views

Workhorse model! (and hope you're enjoying your first I/O)

Chubby♨️ @kimmonismus

Insane evals for a Flash model! Gemini 3.5 Flash is really good for its size!

5:38 PM · May 19, 2026 · 161.6K Views
6:33 PM · May 19, 2026 · 140K Views

... and it is *so* fast ⚡️⚡️⚡️

Logan Kilpatrick @OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
6:26 PM · May 19, 2026 · 3.5K Views

Also had some early access to Gemini 3.5 Flash. Very fast for a flash model and very capable, though not as powerful as a full frontier model.

I added it to the gallery of procedurally generated one-shot towns (it made one error that it corrected): https://hg-20f7d1a3ce.netlify.app/#gemini-3-5-flash

6:05 PM · May 19, 2026 · 13.1K Views

Checking out Gemini 3.5 Flash, available today, which helped power Antigravity 2.0, Gemini Spark, and many more! https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5

Jeff Dean @JeffDean

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽

5:45 PM · May 19, 2026 · 119K Views
6:54 PM · May 19, 2026 · 2.2K Views

Pretty wild to be able to automatically train AlphaZero and serve a playable Go demo with just 2 prompts using Gemini 3.5 Flash and Antigravity 2.0 ...

koray kavukcuoglu @koraykv

Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵

5:54 PM · May 19, 2026 · 80.9K Views
1:30 AM · May 20, 2026 · 12K Views

My notes on Gemini 3.5 Flash - 3x the price of Gemini 3 Flash but Google are planning to use it for many of their own products https://simonwillison.net/2026/May/19/gemini-35-flash/
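To make the pricing comparison concrete, here is a minimal sketch of the per-request arithmetic. It assumes the $1.5 (input) and $9 (output) per-million-token figures reported at the top of this page, and it assumes the "3x" multiple applies equally to input and output tokens, which the post does not state. The function names and the example workload are illustrative, not part of any Google API.

```python
# Assumed list prices in USD per million tokens (figures reported on this page).
FLASH_35_INPUT_PER_M = 1.5
FLASH_35_OUTPUT_PER_M = 9.0

# The "3x the price" claim, assumed to apply uniformly to input and output.
PRICE_MULTIPLE = 3.0


def request_cost_usd(input_tokens, output_tokens,
                     input_per_m=FLASH_35_INPUT_PER_M,
                     output_per_m=FLASH_35_OUTPUT_PER_M):
    """Cost of one request given token counts and per-million-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def implied_previous_prices():
    """Prices implied for the older model if the 3x multiple is uniform."""
    return (FLASH_35_INPUT_PER_M / PRICE_MULTIPLE,
            FLASH_35_OUTPUT_PER_M / PRICE_MULTIPLE)


if __name__ == "__main__":
    # Hypothetical agentic request: 200k tokens in, 20k tokens out.
    new_cost = request_cost_usd(200_000, 20_000)
    old_in, old_out = implied_previous_prices()
    old_cost = request_cost_usd(200_000, 20_000, old_in, old_out)
    print(f"3.5 Flash: ${new_cost:.2f}, implied previous: ${old_cost:.2f}")
```

Because output tokens cost six times as much as input tokens under these figures, long generations dominate the bill well before long prompts do.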

10:41 PM · May 19, 2026 · 22.1K Views

The metrics on 3.5 Flash are major:
- 4x faster than other frontier models in output tokens per second
- Outperforms our previous 3.1 Pro model on nearly all benchmarks
- Shows massive improvement on coding and agentic benchmarks like Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)

Noam Shazeer @NoamShazeer

Live from #GoogleIO, we’re introducing Gemini 3.5 Flash, our latest model with frontier performance for agents and coding. We’re rolling it out globally today, delivering frontier-level performance for real-world agentic workflows at 4x the speed of other frontier models. 🧵

5:54 PM · May 19, 2026 · 3.4K Views
5:54 PM · May 19, 2026 · 3.5K Views

Live from #GoogleIO, we’re introducing Gemini 3.5 Flash, our latest model with frontier performance for agents and coding. We’re rolling it out globally today, delivering frontier-level performance for real-world agentic workflows at 4x the speed of other frontier models. 🧵

5:54 PM · May 19, 2026 · 3.4K Views

The combination of performance and speed makes this model ideal for long-horizon and agentic tasks. Our enterprise partners are already seeing real-world results and accelerating their daily workflows. Read more about Gemini 3.5 in our blog: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

Noam Shazeer @NoamShazeer

The metrics on 3.5 Flash are major:
- 4x faster than other frontier models in output tokens per second
- Outperforms our previous 3.1 Pro model on nearly all benchmarks
- Shows massive improvement on coding and agentic benchmarks like Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)

5:54 PM · May 19, 2026 · 3.5K Views
5:54 PM · May 19, 2026 · 1.1K Views

@suchenzang At least we still have principles

Susan Zhang @suchenzang

ouch

5:17 AM · May 20, 2026 · 65.2K Views
6:11 AM · May 20, 2026 · 92.3K Views

Quite the flex, you love to see it

Michael Truell @mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. https://cursor.com/evals

3:30 AM · May 20, 2026 · 1.3M Views
5:01 AM · May 20, 2026 · 10.8K Views

Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵

5:54 PM · May 19, 2026 · 80.9K Views

Gemini 3.5 Flash delivers sustained frontier-level performance at lightning-quick speeds:
- Beats 3.1 Pro on coding & agentic benchmarks: Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)
- 4x faster than other frontier models (12x in Antigravity!)
- SOTA on multimodality with 83.6% on MMMU-Pro

koray kavukcuoglu @koraykv

Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵

5:54 PM · May 19, 2026 · 80.9K Views
5:54 PM · May 19, 2026 · 3.6K Views

Gemini 3.5 Flash is rolling out globally for consumers in the Gemini app and Search AI Mode, for developers via the Gemini API, Google Antigravity, and Google AI Studio, and for businesses on the Gemini Enterprise Agent Platform. Read more about the Gemini 3.5 era: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

koray kavukcuoglu @koraykv

Gemini 3.5 Flash delivers sustained frontier-level performance at lightning-quick speeds:
- Beats 3.1 Pro on coding & agentic benchmarks: Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%)
- 4x faster than other frontier models (12x in Antigravity!)
- SOTA on multimodality with 83.6% on MMMU-Pro

5:54 PM · May 19, 2026 · 3.6K Views
5:54 PM · May 19, 2026 · 2.5K Views

Following our Gemini 3.5 Flash launch at #GoogleIO, check out this demo of what it can do right inside the Gemini app.

Watch 3.5 Flash build an interactive circuit helper, outputting a step-by-step physical build guide alongside a working simulation. It’s a great example of how the Gemini app can help students learn visually, breaking down new or complex concepts and guiding them through hands-on subjects interactively.

3.5 Flash is available globally today!

11:32 PM · May 19, 2026 · 34.2K Views

@benhylak 2.5-pro-exp-0325

ben hylak @benhylak

flash 2 was last great google model.

6:50 AM · May 20, 2026 · 15.1K Views
8:33 AM · May 20, 2026 · 5.1K Views

Probably true. I think the reality is that we’re going to have a de facto licensing regime (“voluntary”), where the government will give green lights to the labs on releases. That is fine as a very temporary solution, but it’s opaque and essentially lawless. We will need to institutionalize this stuff and create transparent, objective, and predictable protocols to structure it.

I support having private bodies, overseen by government, having a major role in evaluating and setting technical standards, internal governance practices within labs, and similar. But more important than my specific idea is that we operate according to the rule of law, rather than create a de facto, opaque licensing regime in the name of maintaining the illusion that pre-deployment review is “voluntary” and that we are Not Regulating AI.

Andrew Curran @AndrewCurran_

I want to make this prediction now so I can quote it later. Gemini Pro 3.5 and GPT-5.6 are both ready now, and both labs want to release them, but they are being held back for safety testing in a test flight of the new regulations in the forthcoming executive order.

8:19 PM · May 20, 2026 · 58.5K Views
8:52 PM · May 20, 2026 · 53.4K Views

RL roughly on trend, multimodality on trend, strange to see them report mediocre MRCR and ARC-AGI-2. Given the speed, it might well have fewer active parameters than Flash-3 (so they both shrink the batch and grow margin). Will be a successful model until we get some 5.6-Mini.

Lisan al Gaib @scaling01

Gemini 3.5 Flash Benchmarks

5:42 PM · May 19, 2026 · 24.4K Views
5:50 PM · May 19, 2026 · 7K Views

I'm wrong, thanks @yourboiilevi it's G3 Flash base, they just serve it faster interesting

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

RL roughly on trend, multimodality on trend, strange to see them report mediocre MRCR and ARC-AGI-2. Given the speed, it might well have fewer active parameters than Flash-3 (so they both shrink the batch and grow margin). Will be a successful model until we get some 5.6-Mini.

5:50 PM · May 19, 2026 · 7K Views
5:58 PM · May 19, 2026 · 49.1K Views

Just call it Gemini 4 Pro at this point

Logan Kilpatrick @OfficialLoganK

Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way). The model is the product, please keep the feedback coming!

2:20 PM · May 20, 2026 · 203K Views
3:07 PM · May 20, 2026 · 16.4K Views

Thinking again about it
> we spent the last 2.5 years putting the infrastructure, products, team, etc in place
yeah makes sense they have NOT pretrained a new model, or even refreshed the data. (There has been a bit of an update, eg it knows me) Google is very… longtermist.

Logan Kilpatrick @OfficialLoganK

Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way). The model is the product, please keep the feedback coming!

2:20 PM · May 20, 2026 · 203K Views
4:03 PM · May 20, 2026 · 11.5K Views
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Thinking again about it
> we spent the last 2.5 years putting the infrastructure, products, team, etc in place
yeah makes sense they have NOT pretrained a new model, or even refreshed the data. (There has been a bit of an update, eg it knows me) Google is very… longtermist.

4:03 PM · May 20, 2026 · 11.5K Views
4:07 PM · May 20, 2026 · 1.8K Views

“On what do I base the belief that the 2026 results were a "hypothetical projection"?

I base this on the fundamental structure of my data.” fair enough

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex
4:07 PM · May 20, 2026 · 1.8K Views
4:10 PM · May 20, 2026 · 1.6K Views

I think the verdict is in, Gemini didn't have any post training breakthrough, except maybe through the floor. Outside of vision, massive disappointment. fucking V4-Flash gets stuff DONE faster. Then again I almost never used 3-Flash I'll likely almost never use this thing too

Zephyr @zephyr_z9

Wait, what?????? What kind of post training breakthrough did they make?? So the price increase is mostly due to smaller batch size to make it run faster

9:50 PM · May 19, 2026 · 59.8K Views
6:42 AM · May 20, 2026 · 17.1K Views

@zephyr_z9 well, if V4-Flash is possible then this is also possible as far as we know G3-Flash has like 1.2T params, no? plenty of juice left if you're Google

Zephyr @zephyr_z9

Wait, what?????? What kind of post training breakthrough did they make?? So the price increase is mostly due to smaller batch size to make it run faster

9:50 PM · May 19, 2026 · 59.8K Views
9:53 PM · May 19, 2026 · 9.9K Views

I think one neglected area in model evals is case studies of LLM-Hard questions. Like, here we see that literally nothing can crack #10 and #12 ArXivMath in a few shots. (somehow #6 yields to… Qwen-2B). If we aren't just training on test, CoTs of such problems deserve scrutiny.

Jasper Dekoninck @j_dekoninck

Meh On MathArena, Gemini 3.5 Flash is neither bad nor great. It is very fast though: I ran 1000 queries in 30 minutes.

8:28 PM · May 19, 2026 · 13.7K Views
9:12 PM · May 19, 2026 · 5.7K Views

now admittedly it got 4 tries vs 3-4 for others, but still, lmao on apex-shortlist we see that top models struggle with #18 but those below them do not. Might it just be a ground truth failure? @j_dekoninck

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

I think one neglected area in model evals is case studies of LLM-Hard questions. Like, here we see that literally nothing can crack #10 and #12 ArXivMath in a few shots. (somehow #6 yields to… Qwen-2B). If we aren't just training on test, CoTs of such problems deserve scrutiny.

9:12 PM · May 19, 2026 · 5.7K Views
9:15 PM · May 19, 2026 · 2.3K Views

IMO #6 was a famous one. Recently got cracked with GPT 5.5 Pro. But that's not very interesting. A year later OpenAI's best can do this hard thing everyone was aware of, Duh. Tells us little.

The recent OpenDeepThink from @wenhaocha1 et al (which I kinda reproduced) boasts +400 Elo on CF, but they also say this: "Of the seventeen unsolved problems across the Flash and 2.5 Pro runs, none crosses 5% pass@1 in any generation". I do not think models of this era literally do not have the "competence" for solving any particular programming task, it's all compositional. So I am generally much more intrigued by techniques that can break through this barrier than by amplifying pass@k by changing how we fiddle with K and partition it into l, m, n. Likewise for methods like PaCoRe from @StepFun_ai or the new MCTS from @ZyphraAI.

How do we get unsolvable things solved by trading compute for intelligence rather than "performance"? Ultimately that's the whole promise of this journey to AGI via scaling, isn't it. Is there a way that doesn't just rely on iterative training of models on synthetic data? If that were all, we're at risk of having to do exponentially costly search for data recipes that do not exceed the inherent capability of models and thus lead to narrow-generalizing memorization of patterns and more false promises. Yes, we can evidently stack these chairs to a dizzying height if money is no issue, but could we at least evolve to processing them into plywood already?

Might be a reason GDM is so calm in the face of two "startups"; why Gemini is half-assing this main vector of market competition that is agentic SWE. Demis suspects that AGI has to be done the hard way, from the ground up. Raw bytes, universal predictors, world models; removing layers of human-digested slop between downstream outputs and bare metal as your stockpile of metal grows. The "synthetic data" stockpile might prove to be fairy gold if he's right.
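The pass@1 and pass@k figures discussed above are commonly computed with the unbiased estimator popularized by the HumanEval/Codex evaluation: draw n samples per problem, count the c that pass, and estimate pass@k as 1 - C(n-c, k) / C(n, k). The sketch below assumes that definition; the post does not say how OpenDeepThink computes its numbers, and the sample counts in the usage note are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples of which c are correct.

    Equals the probability that at least one of k samples, drawn without
    replacement from the n generated samples, is correct.
    """
    if n <= 0 or not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require n > 0, 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k contains a hit.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical problem: 1 correct sample out of 20.
    for k in (1, 5, 10):
        print(k, round(pass_at_k(20, 1, k), 3))
```

For instance, a problem solved in 1 of 20 samples has an estimated pass@1 of 5%, while pass@10 on the same samples is 50%. That is the sense in which raising k lifts the score without the model getting any better at the problem, and a problem with zero correct samples stays at 0 for every k.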

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

now admittedly it got 4 tries vs 3-4 for others, but still, lmao on apex-shortlist we see that top models struggle with #18 but those below them do not. Might it just be a ground truth failure? @j_dekoninck

9:15 PM · May 19, 2026 · 2.3K Views
9:32 PM · May 19, 2026 · 1.6K Views

@suchenzang @theo I wonder how 3.5 Pro fares while I have no love for 3.1 Pro it was a powerful model for its time

Susan Zhang @suchenzang

it was a third of the reason i came to gdm back in 2023 (the other two being google hardware and search). no other player had a serious cloud business and frontier lab in one, and cutting down cost/boosting speed while preserving ~90% capability was obviously critical for any interesting dev work outside of the labs. at some point i realized "frontier capabilities" actually didn't matter for companies with real businesses (eg meta, apple, etc). the people who really need the hype to play out are the ones whose entire market value counts on this one specific thing incrementally improving + future projected light cone of value captured. plenty of reasons to hate on apple, but in retrospect them sitting out this race might not have been the worst idea. let's see if G sees value in course correcting on its priorities. if not... guess it's all on open-source to fill these shoes.

7:02 AM · May 20, 2026 · 3.9K Views
7:19 AM · May 20, 2026 · 567 Views

@VictorTaelin available on openrouter

Taelin @VictorTaelin

Narrator: they already fucked up
→ Gemini 3.5 Flash not available on API.
→ Fast mode locked to Antigravity only.

I don’t understand why companies keep doing this. They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?

Cursor is a great example of a company that (sadly) is very likely to fail because of that mindset. Composer is actually a surprisingly good model. They should put all efforts into serving it. Yet, they keep locking it under an old school product that nobody wants to use.

It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is. And even those who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money. Your model is the product. You keep chasing old business models. Completely out of touch.

Meanwhile Anthropic is charging at full speed to sooner or later surpass Google by just serving great models under an API!!!

~~~

Reposting this. I deleted before because they launched Antigravity CLI, which isn't an API, but at least gives us *some* flexibility. But no: I'm getting 10x slower TPS on CLI compared to the IDE. So either a bug or they really want you to use the visual IDE. So my money will keep going to Anthropic, unfortunately. 🤦‍♂️

7:08 PM · May 19, 2026 · 4.8K Views
7:11 PM · May 19, 2026 · 1K Views

Kind of a big deal

Google @Google

Meet Gemini 3.5 Flash — our strongest agentic and coding model yet. It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost. Generally available, starting today. 🧵 #GoogleIO

5:25 PM · May 19, 2026 · 776.7K Views
5:32 PM · May 19, 2026 · 59.2K Views

Holy shit man

Google @Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

5:25 PM · May 19, 2026 · 935.9K Views
5:32 PM · May 19, 2026 · 435.2K Views

nevermind

kache @yacineMTB

Holy shit man

5:32 PM · May 19, 2026 · 435.2K Views
9:11 AM · May 20, 2026 · 2.1K Views

Everything AI released at Google I/O 2026

- Gemini Omni Flash
- Gemini 3.5 Flash (and in GA)
- Antigravity 2.0
- Managed Agents in the Gemini API
- AI Studio app in pre-order
- New SynthID partnerships
- AI Studio: native Android support, Workspace Integrations, and export to AGY
- Antigravity SDK and CLI
- Gemini Spark
- New Google AI Ultra subscription

And stay tuned, so much more to come!

4:23 AM · May 20, 2026 · 10.7K Views

@teortaxesTex That's for Q4 against GPT-6 and Claude 5.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Just call it Gemini 4 Pro at this point

3:07 PM · May 20, 2026 · 16.4K Views
3:10 PM · May 20, 2026 · 1.4K Views

I want to make this prediction now so I can quote it later. Gemini Pro 3.5 and GPT-5.6 are both ready now, and both labs want to release them, but they are being held back for safety testing in a test flight of the new regulations in the forthcoming executive order.

8:19 PM · May 20, 2026 · 58.5K Views

@scaling01 I hope so.

Lisan al Gaib @scaling01

Gemini 3.5 Pro is the Gemini Ultra we always wanted

12:18 PM · May 19, 2026 · 29.8K Views
12:23 PM · May 19, 2026 · 1.3K Views

Proud to have worked on recreating AlphaZero. The future is super super exciting 🔥!

koray kavukcuoglu @koraykv

Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵

5:54 PM · May 19, 2026 · 80.9K Views
6:42 PM · May 19, 2026 · 9K Views

Gemini 3.5 Flash is out, and it's a major jump over Gemini 3 Flash in model capability for knowledge work. We've been evaluating it on our Box AI Complex Work Eval in early release, and the model delivers a 12 percentage point jump on complex document tasks.

For testing this model, we give the Box AI Agent (using Gemini 3.5) complex problems to solve that represent common but difficult knowledge worker tasks in banking, consulting, public sector, healthcare, and other industries. These tasks can be things like drafting reports, doing due diligence, and more, given a set of relevant documents.

In our tests, Gemini 3.5 Flash delivered jumps across every industry, including:

* Financial services: 81% vs 73% (+8pp)
* Public sector: 76% vs 59% (+17pp)
* Healthcare: 73% vs 51% (+22pp)
* Life Sciences: 67% vs 47% (+20pp)
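The gains above are quoted in percentage points (pp), which is a plain difference between two scores and not a relative improvement. A small sketch of the distinction, using the four industry scores listed in the post; it assumes the first number in each pair is Gemini 3.5 Flash and the second is the earlier model, and the helper names are illustrative.

```python
# Scores transcribed from the post as (new score, old score), in percent.
# Assumption: the first value is Gemini 3.5 Flash, the second the prior model.
BOX_EVAL = {
    "Financial services": (81, 73),
    "Public sector": (76, 59),
    "Healthcare": (73, 51),
    "Life Sciences": (67, 47),
}


def percentage_point_gain(new: float, old: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new - old


def relative_gain_percent(new: float, old: float) -> float:
    """Improvement relative to the old score, in percent."""
    return (new - old) / old * 100


def summarize(scores):
    """Map each industry to (pp gain, relative gain in percent)."""
    return {
        name: (percentage_point_gain(new, old), relative_gain_percent(new, old))
        for name, (new, old) in scores.items()
    }


if __name__ == "__main__":
    for name, (pp, rel) in summarize(BOX_EVAL).items():
        print(f"{name}: +{pp}pp, {rel:.1f}% relative")
```

The healthcare figure, for example, is a 22pp gain but roughly a 43% relative improvement over the earlier score, so the two framings should not be mixed when comparing against other reported results.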

Incredible to see the continued performance gains.

Gemini 3.5 Flash will be available soon in Box AI Studio and through the Box API. The Box MCP Server will soon be available in the Gemini app with more details to come.

6:29 PM · May 19, 2026 · 32.7K Views

@OfficialLoganK @mercor_ai Benchmarks were never the issue for Gemini models. They’ve consistently struggled with vibes though.

Logan Kilpatrick @OfficialLoganK

Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it.

1:56 PM · May 21, 2026 · 98.4K Views
2:21 PM · May 21, 2026 · 3.1K Views

oof

Theo - t3.gg @theo

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

4:04 AM · May 20, 2026 · 1M Views
11:35 AM · May 20, 2026 · 11.7K Views

Looking forward to gemini-cli becoming usable

Google @Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

5:25 PM · May 19, 2026 · 935.9K Views
6:13 PM · May 19, 2026 · 374 Views

@JeffDean where can we try this? is there a site where you just put a paper name and get this kind of model card? would love to test it properly 🙏

Jeff Dean @JeffDean

2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency. It really helps you distill papers down to their essence and aid your understanding!

5:46 PM · May 19, 2026 · 83.1K Views
6:25 PM · May 19, 2026 · 2.3K Views

@JeffDean @LeslieNooteboom nice thanks a lot, i'll try this!! :)

Jeff Dean @JeffDean

My colleague @LeslieNooteboom generated these: we don't have it packaged up as a one-click link, but here's the prompt passed along by Leslie (thanks, Leslie!). Plug in the abstract of a paper at the bottom where it says '$abstract':

---

❝You are a world-class creative developer. Build a beautiful, high-resolution, highly visual concept animation.

MUST follow:
* Output ONLY valid, fully self-contained HTML/CSS/JS code. Start with <!DOCTYPE html> and end with </html>. Do not include any markdown fences in the JSON property.
* Self-contained, elegant LIGHT THEME aesthetic matching a clean technical paper (e.g., white or very light gray background, dark crisp typography, minimal harmonious colors). Avoid dark backgrounds.
* Focus on ONE strong visual metaphor with graphical animations or elegant interactions.
* MINIMIZE EXPLANATORY TEXT: Do not add a title or snippet of the abstract to this animation code, the full abstract is already displayed next to this animation. Let the visual movement and graphic structure explain the concept. Avoid generating heavy text paragraphs or excessive text boxes. Keep text to a few minimal labels or neat status badges.
* Keep the visual script and logic extremely concise, under 200 lines of code. Do not build a complex engine or import massive libraries.
* HIGH-RESOLUTION CANVAS: Always configure the HTML5 canvas for high-DPI/Retina screens by multiplying canvas.width/height by window.devicePixelRatio, setting its CSS style to the logical width/height, and scaling the context with ctx.scale(dpr, dpr). This avoids any blurry/low-resolution drawings.
* RESPONSIVE SCALING & VIEWPORT: Design the visualization to be fully responsive, filling 100% of the viewport width and height (using 100vw/100vh with margin 0 and overflow hidden). Implement a window resize listener to update the canvas buffer dimensions dynamically when the viewport changes, ensuring no scrollbars or visual clipping.
* HYBRID HTML-ON-CANVAS & OVERLAY COLLISION PREVENTION: Draw high-performance background graphics (particles, nodes, flows) on the high-DPI canvas, but overlay crisp, high-resolution HTML/CSS divs/labels/buttons on top of the canvas (using absolute positioning) for gorgeous typography and control panels. To prevent absolute overlay panels from hiding, blocking, or overlapping the visual animation components on standard or narrow screens:
  1. **ULTRA-COMPACT FOOTPRINT**: All overlay cards must be extremely compact. Set a strict 'max-width' of **no more than 240px** (or 25% of viewport width). Avoid long text paragraphs, heavy padding (use 'p-2.5' or 'p-3'), and large stack buttons.
  2. **COMPACT CONTROLS**: If offering option buttons, style them as small inline segmented pills or a minimal dropdown select box rather than a stack of wide, fat buttons. Keep text sizing small ('text-[10px]' or 'text-[11px]').
  3. **TRANSPARENCY**: Use highly semi-transparent, elegant backgrounds (e.g., white with high transparency: 'rgba(255, 255, 255, 0.72)' and a backdrop blur 'backdrop-filter: blur(10px)') for all overlay cards so the underlying animation flows remain beautifully visible behind them.
  4. **LAYOUT SAFETY BOUNDARIES**: Offset the center of the canvas drawings (like circles, nodes, or waves) horizontally or vertically (e.g., centering them in the remaining 75% clear space of the canvas) so they are never drawn directly underneath the control card. Scale down the radius/bounds dynamically if the viewport width contracts.

Ensure all script tags, function braces, and HTML elements are completely and properly closed. No placeholders, no labels like // ... (insert here).

Generate the concept animation based strictly on the following research paper abstract: "${abstract}" ❞

9:41 PM · May 19, 2026 · 1K Views
10:16 PM · May 19, 2026 · 291 Views

@JeffDean @LeslieNooteboom oh one additional question wry, is this generated with the model on https://gemini.google.com or a specific harness like in antigravity? (found for claude that it's much better inside claude code for this kind of stuff)

elie @eliebakouch

@JeffDean @LeslieNooteboom nice thanks a lot, i'll try this!! :)

10:16 PM · May 19, 2026 · 291 Views
10:18 PM · May 19, 2026 · 227 Views

correlation between CursorBench and Artificial Analysis reported scores

benchmarks like IFBench or tau2 show ~0 correlation with CursorBench. opus 4.7 (max effort) performs relatively better on CursorBench than on other benchmarks, gpt 5.5 shows the opposite pattern

Michael Truell @mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. https://cursor.com/evals

3:30 AM · May 20, 2026 · 1.3M Views
9:31 PM · May 20, 2026 · 23K Views

oh interesting, i don't have a strong opinion here but if you look at flash 3.0 (or even pro) -> flash 3.5 you get improvement on benchmarks across the board (not sure if it's the same base etc. tho, hard to compare). i'd say a recipe like train experts on RL then OPD kinda works to improve multiple domains at the same time?

Sasha Rush @srush_nlp

@eliebakouch I think multibenchmarks are less interesting in a post training dominated world. It was very cool when it was pretraining only and just got better across the board.

9:59 PM · May 20, 2026 · 452 Views
10:11 PM · May 20, 2026 · 55 Views

@srush_nlp would be interesting to see what's the limitation of this if any, like by scaling the number of areas

Sasha Rush @srush_nlp

@eliebakouch I don’t know what Google does. But my read is that train 100 experts then OPD is exactly how you get a model good at 100 areas.

10:13 PM · May 20, 2026 · 106 Views
10:16 PM · May 20, 2026 · 82 Views

@_arohan_ @scaling01 @PMinervini @vincentweisser 👀 could be fun indeed, will look into this

rohan anil @_arohan_

@scaling01 @PMinervini @eliebakouch @vincentweisser it would be fun for you guys to use this against claude and codex in auto research loop and see if it has good tastes.

7:01 PM · May 19, 2026 · 1.5K Views
8:38 PM · May 19, 2026 · 148 Views

@willccbb @benhylak

will brown @willccbb

@benhylak 2.5-pro-exp-0325

8:33 AM · May 20, 2026 · 5.1K Views
9:23 AM · May 20, 2026 · 297 Views

TPUs have always been low-key goated, people are finally starting to feel it now.

2:50 AM · May 20, 2026 · 12.3K Views

Today, we introduced Gemini 3.5 Flash ⚡ Our most capable coding and agentic model — where "fast" and "best" aren't a tradeoff. Try it now across Antigravity, AI Studio, Gemini App, and AI Mode.

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
5:48 PM · May 19, 2026 · 4.5K Views

I’m extremely proud of the team and this has been one of the most intense and most rewarding launches we have done! And we are not done yet and are busy cooking 3.5 Pro.

Melvin JohnsonMelvin Johnson@melvinjohnsonp

Today, we introduced Gemini 3.5 Flash ⚡ Our most capable coding and agentic model — where "fast" and "best" aren't a tradeoff. Try it now across Antigravity, AI Studio, Gemini App, and AI Mode.

5:48 PM · May 19, 2026 · 4.5K Views
5:48 PM · May 19, 2026 · 818 Views

Here is a fun demo showcasing the model’s Web Development capabilities.

Melvin JohnsonMelvin Johnson@melvinjohnsonp

I’m extremely proud of the team and this has been one of the most intense and most rewarding launches we have done! And we are not done yet and are busy cooking 3.5 Pro.

5:48 PM · May 19, 2026 · 818 Views
5:48 PM · May 19, 2026 · 221 Views

@JeffDean Any idea when Gemini 3.1 pro will drop the “-preview”?

Jeff DeanJeff Dean@JeffDean

Highly capable models that are fast are super important. Our new Gemini 3.5 Flash model is a great mix of fast and capable.

9:43 PM · May 19, 2026 · 24.2K Views
6:46 AM · May 20, 2026 · 885 Views

Gemini 3.5 Flash is now GA. Our most capable Flash model, built for agentic execution, coding, and long-horizon tasks.

- Outperforms Gemini 3.1 Pro on coding and agentic tasks

- 1M token context window with 65k max output tokens

- 4x faster output tokens/sec

- 4 thinking levels: minimal, low, medium (new default), high

- Thought preservation across multi-turn conversations automatically

Available today in @GoogleAIStudio, @Android Studio, @antigravity, Gemini Enterprise, the @GeminiApp, and AI Mode in Search.

5:51 PM · May 19, 2026 · 7.2K Views

Developer Guide: https://ai.google.dev/gemini-api/docs/interactions/whats-new-gemini-3.5

Introducing Gemini 3.5: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

AI Studio: https://aistudio.google.com/prompts/new_chat?model=gemini-3.5-flash

Philipp SchmidPhilipp Schmid@_philschmid

Gemini 3.5 Flash is now GA. Our most capable Flash model, built for agentic execution, coding, and long-horizon tasks. - Outperforms Gemini 3.1 Pro on coding and agentic tasks - 1M token context window with 65k max output tokens - 4x faster output tokens/sec - 4 thinking levels: minimal, low, medium (new default), high - Thought preservation across multi-turn conversations automatically Available today in @GoogleAIStudio, @Android Studio, @antigravity, Gemini Enterprise, the @GeminiApp, and AI Mode in Search.

5:51 PM · May 19, 2026 · 7.2K Views
5:51 PM · May 19, 2026 · 1.5K Views

@OfficialLoganK @GoogleDeepMind Good model

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
5:52 PM · May 19, 2026 · 2.8K Views

We wrote a new Developer Guide for Gemini 3.5 Models. Easiest way to migrate is install the Interactions API Skill. Then run

```
/gemini-interactions-api migrate my app to Gemini 3.5 Flash
```

Skill available at: `npx skills add google-gemini/gemini-skills --skill gemini-interactions-api`

6:08 PM · May 19, 2026 · 8.9K Views

@osanseviero Poor Gemini for Science 🥲

Omar SansevieroOmar Sanseviero@osanseviero

Everything AI released at Google I/O 2026 - Gemini Omni Flash - Gemini 3.5 Flash (and in GA) - Antigravity 2.0 - Managed Agents in the Gemini API - AI Studio app in pre-order - New SynthID partnerships - AI Studio: native Android support, Workspace Integrations, and export to AGY - Antigravity SDK and CLI - Gemini Spark - New Google AI Ultra subscription And stay tuned, so much more to come!

4:23 AM · May 20, 2026 · 10.7K Views
5:09 AM · May 20, 2026 · 342 Views

it now seems very likely to me that the new Gemini 3.5 Pro model will be a 10T+ model like Mythos

Lisan al GaibLisan al Gaib@scaling01

it's Gemini 3.5 Flash day but pricing is $1.5 / $9 per mtoks 💀

12:13 PM · May 19, 2026 · 158.6K Views
12:16 PM · May 19, 2026 · 68.1K Views

pricing of 3.5 Pro should be $6 / $27 if they keep the 4x scaling and the trend of undercutting OpenAI

Lisan al GaibLisan al Gaib@scaling01

it now seems very likely to me that the new Gemini 3.5 Pro model will be a 10T+ model like Mythos

12:16 PM · May 19, 2026 · 68.1K Views
12:17 PM · May 19, 2026 · 4.5K Views

Gemini 3.5 Pro is the Gemini Ultra we always wanted

Lisan al GaibLisan al Gaib@scaling01

it now seems very likely to me that the new Gemini 3.5 Pro model will be a 10T+ model like Mythos

12:16 PM · May 19, 2026 · 68.1K Views
12:18 PM · May 19, 2026 · 29.8K Views

some more Gemini 3.5 Flash benchmarks by Artificial Analysis AI:

- the APEX-Agents-AA score is excellent

- expected higher on CritPt

- reasoning efficiency could also be better, but it kind of depends on what setting they used. if it's the max then it's very good

- Price/Perf looks pretty bad vs GPT-5.5. You can get GPT-5.5-medium with better performance for less and with faster responses

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash scores kinda low on the Coding Index due to terrible TerminalBench-Hard scores

5:57 PM · May 19, 2026 · 31.7K Views
6:06 PM · May 19, 2026 · 159.7K Views
Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash Benchmarks

5:42 PM · May 19, 2026 · 24.4K Views
5:44 PM · May 19, 2026 · 1.5K Views

interestingly it still has the Jan 2025 knowledge cut-off

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash Benchmarks

5:42 PM · May 19, 2026 · 24.4K Views
5:46 PM · May 19, 2026 · 2K Views

Google optimized Gemini 3.5 Flash to make it run up to 12x faster (~867 tokens/s) than comparable models in AntiGravity

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash comparable with Opus 4.7, GPT-5.5 and Gemini 3.1 Pro on the Artificial Analysis Index while running up to 4x faster

5:26 PM · May 19, 2026 · 97.2K Views
5:34 PM · May 19, 2026 · 5.2K Views

Gemini 3.5 Flash now live in aistudio

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash Benchmarks

5:42 PM · May 19, 2026 · 24.4K Views
5:47 PM · May 19, 2026 · 14K Views

Gemini 3.5 Flash comparable with Opus 4.7, GPT-5.5 and Gemini 3.1 Pro on the Artificial Analysis Index while running up to 4x faster

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash beats Gemini 3.1 Pro across TerminalBench 2.1, GDPval and MCP Atlas

5:25 PM · May 19, 2026 · 24.1K Views
5:26 PM · May 19, 2026 · 97.2K Views

GPT-5.5-medium has lower end-to-end latency, uses less tokens and is overall smarter and cheaper than Gemini 3.5 Flash

it might genuinely be over for anyone not named OpenAI or Anthropic

Lisan al GaibLisan al Gaib@scaling01

some more Gemini 3.5 Flash benchmarks by Artificial Analysis AI: - the APEX-Agents-AA score is excellent - expected higher on CritPt - reasoning efficiency could also be better, but it kind of depends on what setting they used. if it's the max then it's very good - Price/Perf looks pretty bad vs GPT-5.5. You can get GPT-5.5-medium with better performance for less and with faster responses

6:06 PM · May 19, 2026 · 159.7K Views
6:24 PM · May 19, 2026 · 172.1K Views

Gemini 3.5 Flash Pricing confirmed at $1.5 / $9 per mtoks

Lisan al GaibLisan al Gaib@scaling01

Gemini 3.5 Flash Benchmarks

5:42 PM · May 19, 2026 · 24.4K Views
5:45 PM · May 19, 2026 · 8.8K Views

Gemini 3.5 Flash scores kinda low on the Coding Index due to terrible TerminalBench-Hard scores

5:57 PM · May 19, 2026 · 31.7K Views

that's rough

Gemini 3.5 Flash barely improved over Gemini 3 Flash but is now 3.5x more expensive on WeirdML

the same pattern was visible on Artificial Analysis Index: it's more expensive because it uses more output tokens and slightly more total tokens than Gemini 3 Flash

10:16 AM · May 20, 2026 · 13.5K Views
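
To make the cost point above concrete, here is a rough sketch of how a per-run bill follows from per-million-token prices. The $1.5 / $9 figures are the ones quoted in this thread. The `Pricing` and `run_cost` names and all token counts are hypothetical, chosen only to show why extra output tokens dominate the total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD prices per one million tokens."""
    input_per_mtok: float
    output_per_mtok: float


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of one run given its input and output token counts."""
    total = (input_tokens * pricing.input_per_mtok
             + output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


# Prices as quoted in the thread for Gemini 3.5 Flash.
FLASH_3_5 = Pricing(input_per_mtok=1.5, output_per_mtok=9.0)

# Hypothetical run: same prompt, but the second run emits 3x the output tokens.
short_answer = run_cost(FLASH_3_5, input_tokens=100_000, output_tokens=20_000)
long_answer = run_cost(FLASH_3_5, input_tokens=100_000, output_tokens=60_000)
print(f"short: ${short_answer:.2f}, long: ${long_answer:.2f}")
```

Because output is priced at 6x input here, a benchmark bill can grow much faster than the list price alone suggests when the model writes or reasons at greater length.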

Gemini 3.5 Flash beats Gemini 3.1 Pro across TerminalBench 2.1, GDPval and MCP Atlas

5:25 PM · May 19, 2026 · 24.1K Views

GPT-5.5-medium has lower end-to-end latency, uses less tokens and is overall smarter than Gemini 3.5 Flash

it might genuinely be over for anyone not named OpenAI or Anthropic

Lisan al GaibLisan al Gaib@scaling01

some more Gemini 3.5 Flash benchmarks by Artificial Analysis AI: - the APEX-Agents-AA score is excellent - expected higher on CritPt - reasoning efficiency could also be better, but it kind of depends on what setting they used. if it's the max then it's very good - Price/Perf looks pretty bad vs GPT-5.5. You can get GPT-5.5-medium with better performance for less and with faster responses

6:06 PM · May 19, 2026 · 159.7K Views
6:11 PM · May 19, 2026 · 3K Views

meh

doesn't even beat Kimi or GLM

Arena.aiArena.ai@arena

Gemini 3.5 Flash has landed #9 for Text and Code Arena: Frontend. Code Arena: Frontend evaluates models on agentic frontend coding tasks from real users building apps and websites (HTML and React). Scoring 1507, this is a significant +70 point improvement over Gemini-3 Flash. Sub-category highlights: - #7 Content Creation Tools - #8 Gaming - #8 Consumer Product - #9 Data & Analytics - #10 Reference-Based Design In Text Arena: #9 overall. Gemini 3.5 Flash also moves the price–performance frontier as the new top Arena score in its price tier. Congrats to the @GoogleDeepMind team on this launch! Click into the thread to see the rankings by each arena.

5:44 PM · May 19, 2026 · 173.2K Views
5:54 PM · May 19, 2026 · 50.3K Views

what the actual fuck

Gemini 3.5 Flash is 7.46 times more EXPENSIVE than GPT-5.5-xhigh on PencilPuzzleBench

(direct ask scores are below gpt-5.2-high)

7:11 PM · May 20, 2026 · 36.3K Views

the agentic score by itself is fine

but the cost is not real

Lisan al GaibLisan al Gaib@scaling01

what the actual fuck Gemini 3.5 Flash is 7.46 times more EXPENSIVE than GPT-5.5-xhigh on PencilPuzzleBench (direct ask scores are below gpt-5.2-high)

7:11 PM · May 20, 2026 · 36.3K Views
7:17 PM · May 20, 2026 · 2.3K Views

intelligence too cheap to meter

INIYSAINIYSA@lafaiel

Gemini 3.5 is 30x more expensive than 1.5

1:24 PM · May 19, 2026 · 306.7K Views
3:24 PM · May 19, 2026 · 28.6K Views

(deliberately not hyping Gemini 3.5 Flash too much this time. looks like an insane model, but you know how it is with self-reported benchmarks)

5:49 PM · May 19, 2026 · 6.5K Views

i mean this is ridiculous

Lisan al GaibLisan al Gaib@scaling01

(deliberately not hyping Gemini 3.5 Flash too much this time. looks like an insane model, but you know how it is with self-reported benchmarks)

5:49 PM · May 19, 2026 · 6.5K Views
5:50 PM · May 19, 2026 · 2.9K Views

Google built an entire operating system with Gemini 3.5 Flash in 12 hours for less than $1000

5:30 PM · May 19, 2026 · 16.5K Views

Gemini 3.5 Flash on CursorBench

Michael TruellMichael Truell@mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. https://cursor.com/evals

3:30 AM · May 20, 2026 · 1.3M Views
8:34 AM · May 20, 2026 · 9.3K Views

looks like gemini 3.5 is a flop. great opportunity to buy the dip and obtain some anthropic-waymo stock on sale

5:39 PM · May 20, 2026 · 6.7K Views

Strong performance across the board for Gemini, but holy crap GPT-5.5 is goated at long context WTF

GoogleGoogle@Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

5:25 PM · May 19, 2026 · 935.9K Views
6:02 PM · May 19, 2026 · 929 Views

What I think about every time I read MRCR 8-needle

GoogleGoogle@Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

5:25 PM · May 19, 2026 · 935.9K Views
6:23 PM · May 19, 2026 · 1.2K Views

Google Gemini 3.5 Flash is super strong model for its class. Beats Gemini 3.1 Pro on so many benchmarks.

An agent model with 4x faster tokens per second.

And @aimlapi just added gemini 3.5 Flash to their API and keeping it FREE for 24hrs.

Setup instructions in comment.

12:45 AM · May 20, 2026 · 12.8K Views

Enjoy 24hrs of free Gemini 3.5 Flash access

1. set up an AI/ML API account https://aimlapi.com/app/auth/

2. Talk to them on their discord https://discord.gg/2g6xMRdu3j

Rohan PaulRohan Paul@rohanpaul_ai

Google Gemini 3.5 Flash is super strong model for its class. Beats Gemini 3.1 Pro on so many benchmarks. An agent model with 4x faster tokens per second. And @aimlapi just added gemini 3.5 Flash to their API and keeping it FREE for 24hrs. Setup instructions in comment.

12:45 AM · May 20, 2026 · 12.8K Views
12:45 AM · May 20, 2026 · 1.2K Views

Gemini 3.5 Flash now outruns Gemini 3.1 Pro on several real-work automation tests.

- With 4x faster output tokens per second

- A really powerful agent model fast enough and cheap enough for everyday work

- Flash beats Gemini 3.1 Pro on several hard agent and coding benchmarks, including 76.2% Terminal-Bench 2.1, 83.6% MCP Atlas, and 1,656 Elo GDPval-AA.

- Available in the Gemini app, AI Mode in Search, Gemini API, Antigravity, Android Studio, and Google’s enterprise agent products.

- When coupled with the updated Antigravity harness, 3.5 Flash becomes a powerful engine for deploying collaborative subagents to tackle problems at scale.

so one subagent might inspect a folder, another might rewrite code, another might test the result, and another might summarize what changed.

Rohan PaulRohan Paul@rohanpaul_ai

Gemini 3.5 in few more hours. 🔥

9:40 AM · May 19, 2026 · 8.8K Views
1:36 AM · May 20, 2026 · 6.9K Views
Rohan PaulRohan Paul@rohanpaul_ai

Gemini 3.5 Flash now outruns Gemini 3.1 Pro on several real-work automation tests. - With 4x faster output tokens per second - A really powerful agent model fast enough and cheap enough for everyday work - Flash beats Gemini 3.1 Pro on several hard agent and coding benchmarks, including 76.2% Terminal-Bench 2.1, 83.6% MCP Atlas, and 1,656 Elo GDPval-AA. - Available in the Gemini app, AI Mode in Search, Gemini API, Antigravity, Android Studio, and Google’s enterprise agent products. - When coupled with the updated Antigravity harness, 3.5 Flash becomes a powerful engine for deploying collaborative subagents to tackle problems at scale. so one subagent might inspect a folder, another might rewrite code, another might test the result, and another might summarize what changed.

1:36 AM · May 20, 2026 · 6.9K Views
1:36 AM · May 20, 2026 · 1.1K Views

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it.

And seems like it is 20x faster than Opus 4.6 !

Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
TaelinTaelin@VictorTaelin

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it. And seems like it is 20x faster than Opus 4.6 ! Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
2:54 PM · May 19, 2026 · 7.8K Views

@demishassabis @antigravity @GeminiApp plans to serve fast mode in the API?

Demis HassabisDemis Hassabis@demishassabis

Gemini 3.5 Flash is amazing! - Performs better than 3.1 Pro on coding & agentic tasks - 4x faster than other frontier models - 12x faster in @antigravity - 800 tokens/sec! - Often at less than half the cost And Pro to come… Try it in @antigravity, @GeminiApp & more - enjoy!

1:05 AM · May 20, 2026 · 182.8K Views
2:05 AM · May 20, 2026 · 2.7K Views

Narrator: they already fucked up

→ Gemini 3.5 Flash not available on API.

→ Fast mode locked to Antigravity only.

I don't understand why companies keep doing this.

They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?

Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking under a old school product that nobody wants to use.

It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I'm certainly not launching a VSCode fork to use a model, no matter how great it is.

Your model is the product.

You do NOT need an IDE to make money.

You keep chasing old business models.

Completely out of touch.

Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!

It is not hard. WHY it has to be so hard

. . .

TaelinTaelin@VictorTaelin

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it. And seems like it is 20x faster than Opus 4.6 ! Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
5:49 PM · May 19, 2026 · 298 Views

I'm getting only ~80 tokens/s on Gemini 3.5 Flash after launch? It peaked at 1000+ before. Since there is no API, it is hard to measure though...

6:56 PM · May 19, 2026 · 2K Views
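
For anyone trying to reproduce tokens/s numbers like the ones above, a minimal sketch of timing a streamed response follows. It assumes you already have some iterable of text chunks from whatever client you use. `measure_tps` and the whitespace-based token count are hypothetical stand-ins, not part of any Gemini or Antigravity API, and a real tokenizer would give different counts.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_tps(
    chunks: Iterable[str],
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float, float]:
    """Consume a stream of text chunks and return (tokens, seconds, tokens/sec).

    Token counting defaults to a crude whitespace split (an assumption);
    pass a real tokenizer's counting function for meaningful numbers.
    """
    start = clock()
    total_tokens = 0
    for chunk in chunks:
        total_tokens += count_tokens(chunk)
    elapsed = clock() - start
    # Guard against a zero-length interval on very short streams.
    tps = total_tokens / elapsed if elapsed > 0 else float("inf")
    return total_tokens, elapsed, tps


def fake_stream():
    """Stand-in for a streaming response; yields chunks with a small delay."""
    for piece in ["hello world", "this is a", "streamed reply"]:
        time.sleep(0.01)
        yield piece


tokens, seconds, tps = measure_tps(fake_stream())
print(f"{tokens} tokens in {seconds:.3f}s -> {tps:.1f} tokens/s")
```

This measures end-to-end wall time including time to first chunk, so it will read lower than a provider's advertised peak decode speed.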

The new Gemini Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it.

And seems like Gemini Flash is 20x faster than Opus 4.6 !

Promising but Google will still find a way to fuck up

2:39 PM · May 19, 2026 · 764 Views

Narrator: they already fucked up

→ Gemini 3.5 Flash not available on API.

→ Fast mode locked to Antigravity only.

I don't understand why companies keep doing this.

They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?

Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.

It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I'm certainly not launching a VSCode fork to use a model, no matter how great it is.

And even these who DO use IDEs probably won't necessarily pick YOUR IDE. And they shouldn't. You do NOT need them to, to make money.

Your model is the product.

You keep chasing old business models.

Completely out of touch.

Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!

It is not hard. WHY it has to be so hard

. . .

TaelinTaelin@VictorTaelin

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it. And seems like it is 20x faster than Opus 4.6 ! Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
5:51 PM · May 19, 2026 · 68 Views

btw this model is absolutely great

I just think locking the best product (fast-mode) under an old school visual IDE is a completely moronic business decision that only the Kodak of AI could truly make

TaelinTaelin@VictorTaelin

Deleted again because misinformation 🥲 Gemini 3.5 Flash *is* available on the API. Yet, both the API and the CLI versions are 3x slower than on the IDE! See the video below. → Antigravity IDE: 4 seconds (smooth) → Antigravity CLI: 15 seconds (buggy) So the point holds: they want you to use the visual IDE. Problem is: it is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is. They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money? Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use. And even these who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money. Your model is the product. You keep chasing old business models. Completely out of touch. Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API /ctrlv

7:45 PM · May 19, 2026 · 63K Views
7:47 PM · May 19, 2026 · 263 Views

Narrator: they already fucked up

→ Gemini 3.5 Flash not available on API.

→ Fast mode locked to Antigravity only.

I don’t understand why companies keep doing this.

They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?

Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.

It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is.

And even these who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money.

Your model is the product.

You keep chasing old business models.

Completely out of touch.

Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!!

~~~

Reposting this. I deleted before because they launched Antigravity CLI, which isn't an API, but at least gives us *some* flexibility. But no: I'm getting 10x slower TPS on CLI compared to the IDE. So either a bug or they really want you to use the visual IDE. So my money will keep going to Anthropic, unfortunately. 🤦‍♂️

TaelinTaelin@VictorTaelin

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it. And seems like it is 20x faster than Opus 4.6 ! Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
7:08 PM · May 19, 2026 · 4.8K Views

Translating the same text, IDE vs CLI

→ IDE: smooth, 4 seconds

→ CLI: buggy, 15 seconds

I'm NOT using an IDE in 2026. I really want to stop giving money to Anthropic but everyone else is making it so hard

TaelinTaelin@VictorTaelin

Narrator: they already fucked up → Gemini 3.5 Flash not available on API. → Fast mode locked to Antigravity only. I don’t understand why companies keep doing this. They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money? Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use. It is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is. And even these who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money. Your model is the product. You keep chasing old business models. Completely out of touch. Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API!!! ~~~ Reposting this. I deleted before because they launched Antigravity CLI, which isn't an API, but at least gives us *some* flexibility. But no: I'm getting 10x slower TPS on CLI compared to the IDE. So either a bug or they really want you to use the visual IDE. So my money will keep going to Anthropic, unfortunately. 🤦‍♂️

7:08 PM · May 19, 2026 · 4.8K Views
7:29 PM · May 19, 2026 · 180 Views

@OfficialLoganK @GoogleAIStudio oh, godspeed. highly appreciated

will fast mode available on API though?

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Massive updates to @GoogleAIStudio and the Gemini API 🤯 - Gemini 3.5 Flash! - managed agents so you can easily build agentic products with the antigravity harness - native Android app creation right in AI Studio - native workspace integrations - 1 click export to antigravity

8:53 PM · May 19, 2026 · 67.1K Views
9:03 PM · May 19, 2026 · 1.1K Views

Deleted again because misinformation 🥲

Gemini 3.5 Flash *is* available on the API. Yet, both the API and the CLI versions are 3x slower than on the IDE! See the video below.

→ Antigravity IDE: 4 seconds (smooth)

→ Antigravity CLI: 15 seconds (buggy)

So the point holds: they want you to use the visual IDE.

Problem is: it is 2026. NOBODY should be using IDEs anymore. Get over it. Let it GO. I’m certainly not launching a VSCode fork to use a model, no matter how great it is.

They invent a portal gun, only to lock it behind a taxi subscription, because they completely fail to realize their very product deprecates that other thing they think will make them money?

Cursor is a great example of a company that (sadly) is very likely fail because of that mindset. Composer is actually surprisingly good model. They should put all efforts in serving it. Yet, they keep locking it under an old school product that nobody wants to use.

And even these who DO use IDEs probably won’t necessarily pick YOUR IDE. And they shouldn’t. You do NOT need them to, to make money.

Your model is the product.

You keep chasing old business models.

Completely out of touch.

Meanwhile Anthropic is all charging at full speed to sooner or later surpass Google by just serving great models under an API

/ctrlv

TaelinTaelin@VictorTaelin

The new Gemini 3.5 Flash solved the HVM3's wnf bug in 1/3 attempts. This is my main test to take a model seriously. So far only the big models like GPT 5.5 solved it. And seems like it is 20x faster than Opus 4.6 ! Promising but Google will still find a way to fuck up

2:44 PM · May 19, 2026 · 137.7K Views
7:45 PM · May 19, 2026 · 63K Views

@theo wow, this is a huge miss.

Theo - t3.ggTheo - t3.gg@theo

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

4:04 AM · May 20, 2026 · 1M Views
5:49 PM · May 20, 2026 · 814 Views

@melvinjohnsonp ⚡ woohoo! incredible work! congratulations! ⚡

Melvin JohnsonMelvin Johnson@melvinjohnsonp

Today, we introduced Gemini 3.5 Flash ⚡ Our most capable coding and agentic model — where "fast" and "best" aren't a tradeoff. Try it now across Antigravity, AI Studio, Gemini App, and AI Mode.

5:48 PM · May 19, 2026 · 4.5K Views
6:19 PM · May 19, 2026 · 87 Views

Flash has been a go-to for builders for its speed + performance + cost - sweet spot - excited for people to build on Gemini 3.5 Flash, even more powerful now!

Jeff DeanJeff Dean@JeffDean

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽

5:45 PM · May 19, 2026 · 119K Views
6:24 PM · May 19, 2026 · 669 Views

Had a lot of fun pushing Flash’s capabilities on long-running agentic tasks!

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
9:14 PM · May 19, 2026 · 376 Views

@OfficialLoganK @GoogleAIStudio So many cool announcements this year. Congrats!

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Massive updates to @GoogleAIStudio and the Gemini API 🤯 - Gemini 3.5 Flash! - managed agents so you can easily build agentic products with the antigravity harness - native Android app creation right in AI Studio - native workspace integrations - 1 click export to antigravity

8:53 PM · May 19, 2026 · 67.1K Views
9:22 PM · May 19, 2026 · 213 Views

3/ The positioning is clear:

Flash is no longer just “cheap fast model.”

Google wants Gemini 3.5 Flash to be the default engine for long-horizon agents: plan, build, iterate, use tools, execute code, complete real work.

Gemini 3.5 Pro comes next month. Can't wait to try Flash

Alex VolkovAlex Volkov@altryne

2/ Benchmarks are crazy: • Terminal-Bench 2.1: 76.2% • GDPval-AA: 1656 Elo • MCP Atlas: 83.6% • CharXiv Reasoning: 84.2% Google says 3.5 Flash beats Gemini 3.1 Pro on key coding/agentic evals and is 4x faster than other frontier models!

5:46 PM · May 19, 2026 · 300 Views
5:46 PM · May 19, 2026 · 330 Views

4/ Gemini Omni Flash is the other monster announcement.

Google’s framing: “create anything from any input — starting with video.”

Text, images, video, audio as inputs → high-quality generated/edited video grounded in Gemini’s world knowledge.

Alex VolkovAlex Volkov@altryne

3/ The positioning is clear: Flash is no longer just “cheap fast model.” Google wants Gemini 3.5 Flash to be the default engine for long-horizon agents: plan, build, iterate, use tools, execute code, complete real work. Gemini 3.5 Pro comes next month. Can't wait to try Flash

5:46 PM · May 19, 2026 · 330 Views
5:46 PM · May 19, 2026 · 250 Views

Look at them digits

Logan KilpatrickLogan Kilpatrick@OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
5:44 PM · May 19, 2026 · 851 Views

Insane evals for a Flash model! Gemini 3.5 Flash is really good for its size!

Chubby♨️Chubby♨️@kimmonismus

Gemini 3.5 Flash official! Insanely fast and capable model

5:25 PM · May 19, 2026 · 77.3K Views
5:38 PM · May 19, 2026 · 161.6K Views

Gemini 3.5 Flash official! Insanely fast and capable model

Chubby♨️ @kimmonismus

„Progress towards AGI“: Gemini Omni - world models -Gemini Omni official!! It can create anything from any input!!!

5:17 PM · May 19, 2026 · 69.5K Views
5:25 PM · May 19, 2026 · 77.3K Views

Gemini 3.5 pro next month!!!

Chubby♨️ @kimmonismus

Gemini 3.5 Flash official! Insanely fast and capable model

5:25 PM · May 19, 2026 · 77.3K Views
5:35 PM · May 19, 2026 · 13.7K Views

Thank you Sundar - first I/O and already feeling at home.

Gemini 3.5 Flash is genuinely impressive for a model at this price point. The efficiency race is just getting started!

Sundar Pichai @sundarpichai

Workhorse model! (and hope you're enjoying your first I/O)

6:33 PM · May 19, 2026 · 140K Views
6:36 PM · May 19, 2026 · 21.4K Views

@sundarpichai Thanks Sundar!

Sundar Pichai @sundarpichai

Workhorse model! (and hope you're enjoying your first I/O)

6:33 PM · May 19, 2026 · 140K Views
9:31 PM · May 19, 2026 · 1.2K Views

So Google just cooked everyone on cost & speed

Harnessing the full power of model-hardware co-design

Extreme sparsity and Ironwoods

Lisan al Gaib @scaling01

Gemini 3.5 Flash comparable with Opus 4.7, GPT-5.5 and Gemini 3.1 Pro on the Artificial Analysis Index while running up to 4x faster

5:26 PM · May 19, 2026 · 97.2K Views
5:33 PM · May 19, 2026 · 65K Views

Clearly has very low active parameters but a lot more total parameters

Google @Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

5:25 PM · May 19, 2026 · 935.9K Views
5:38 PM · May 19, 2026 · 40.9K Views

Wait, what?????? What kind of post training breakthrough did they make?? So the price increase is mostly due to smaller batch size to make it run faster

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

I'm wrong, thanks @yourboiilevi it's G3 Flash base, they just serve it faster interesting

5:58 PM · May 19, 2026 · 49.1K Views
9:50 PM · May 19, 2026 · 59.8K Views

@teortaxesTex yeah, turned out to be benchmaxxed shit

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

I think the verdict is in, Gemini didn't have any post training breakthrough, except maybe through the floor. Outside of vision, massive disappointment. fucking V4-Flash gets stuff DONE faster. Then again I almost never used 3-Flash I'll likely almost never use this thing too

6:42 AM · May 20, 2026 · 17.1K Views
6:46 AM · May 20, 2026 · 4.3K Views

Genuinely impressive release by Google today (remember when they were behind?)

Gemini 3.5 Flash perf:
* building on prior strengths (83.6% on MMMU-Pro for multimodal)
* big jump on agentic coding (76.2% on Terminal-Bench for agentic coding and 56.5% on Toolathon for real world tasks)
* progress on expert tasks (57.9% on Finance Agent 2... we are cooked)
* leading scores across SWE-Bench, OSWorld etc.

(also, elegant to bold the top scores in the chart below even when it's not Google leading)

Ofc, just benchmarks, and also not cheap (~$9/M output), but Google is cookin'... we are all so spoiled to have the 3 labs compete

8:27 PM · May 19, 2026 · 7.3K Views

Day 2 Vibes On Gemini Flash 3.5

- Sonnet class model
- More expensive than Sonnet in real-world usage
- GPT 5.5 & Sonnet/Opus still maintain lead

Real problem - It's just too expensive as it spins on agentic problems

5:21 PM · May 20, 2026 · 832 Views

OH WOW! GOOGLE FINALLY BECOMES A LEGIT AI COMPANY Gemini 3.5 Flash is Generally Available!!

No more preview launches with rate limits!!!

This is earth shattering....😲😲

5:30 PM · May 19, 2026 · 11K Views

Gemini Flash 3.5 Is As Good As Sonnet 4.6

Flash is just below Sonnet 4.6 on the leader board. This is Google's first competitive model in a while!

But yes, Google has still got game 🚀🚀

7:32 PM · May 20, 2026 · 10.3K Views

Gemini Flash 3.5 seems pretty equivalent to Flash 3.1

So why is it 300% more expensive?!!

Imagine being in 3rd place, having infinite money and increasing prices!!!

It’s like they are not serious about AI

12:17 PM · May 20, 2026 · 202 Views

Gemini Flash 3.5 Is As Good As Sonnet 4.6

Flash is just below Sonnet 4.6 on the leader board. This is Google's first competitive model in a while!

There is one problem - in practice it is more expensive than Sonnet 4.6 as it loops forever when dealing with agentic loops

But yes, Google has still got game 🚀🚀

7:16 PM · May 20, 2026 · 1.4K Views

Google Makes A Comeback - Gemini Flash Early Vibes

- brilliant instruction follower!! like absolutely stunning
- good on agentic coding
- it is NOT bench-maxxed

This is genuinely a good model at a great price from Google.

Overall a way better alternative to Sonnet. Will be on ChatLLM shortly

7:55 PM · May 19, 2026 · 17.6K Views

OH WOW! GOOGLE FINALLY BECOMES A LEGIT AI COMPANY Gemini 3.5 Flash is Generally Available!!

No more preview launches with rate limits!!!

This is earth shattering.... It's like they really have an engineering team 😲😲

5:22 PM · May 19, 2026 · 1.2K Views

The flash version is pretty good….

Now imagine a Gemini Pro 3.5 that is NOT benchmaxxed

Beating everyone at everything!

12:28 AM · May 20, 2026 · 7.6K Views

Gemini 3.5 Flash is here!!! 🚀🚀

Priced at 3x its predecessor but still WAY CHEAPER than GPT 5.5 or Opus 4.7

We are evaluating the model against a bunch of real-world quality evals. Results coming later today

2:21 PM · May 19, 2026 · 45.7K Views

TBH the Chinese open-source models still beat Gemini Flash 3.5 and are 10x cheaper

The best open source models are very good

1:09 AM · May 20, 2026 · 17 Views

TBH, Kimi 2.6 beats Gemini Flash 3.5

Plus it is 10x cheaper

So, yes, open source is still winning

4:47 AM · May 20, 2026 · 26.8K Views

Gemini magic everywhere ✨

Sundar Pichai @sundarpichai

Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.

5:59 PM · May 19, 2026 · 293.4K Views
7:02 PM · May 19, 2026 · 1.2K Views

(1/4) Gemini 3.5 Flash is in a league of its own! ⚡️ It's the perfect combo of intelligence, speed, & cost. It's now my daily driver in both Spark & Antigravity!

Watch 3.5 spawn subagents organize a set of marketing assets, rename them, and put them into folders

9:51 PM · May 19, 2026 · 36.2K Views

(3/4) And...did I mention 3.5 Flash is so fast?

Tulsee Doshi @tulseedoshi

(2/4) I'm really proud of the model's performance. Gemini 3.5 Flash outperforms Gemini 3.1 Pro on most benchmarks -- it's great at code & agentic workflows, and continues Gemini's multimodal excellence

9:51 PM · May 19, 2026 · 1.2K Views
9:51 PM · May 19, 2026 · 1.4K Views

(2/4) I'm really proud of the model's performance. Gemini 3.5 Flash outperforms Gemini 3.1 Pro on most benchmarks -- it's great at code & agentic workflows, and continues Gemini's multimodal excellence

Tulsee Doshi @tulseedoshi

(1/4) Gemini 3.5 Flash is in a league of its own! ⚡️ It's the perfect combo of intelligence, speed, & cost. It's now my daily driver in both Spark & Antigravity! Watch 3.5 spawn subagents organize a set of marketing assets, rename them, and put them into folders

9:51 PM · May 19, 2026 · 36.2K Views
9:51 PM · May 19, 2026 · 1.2K Views

(4/4) You can try Gemini 3.5 Flash across the Gemini app, AI Mode in Search, the Gemini API, Google AI Studio, Android Studio, and our Enterprise platforms. Can’t wait to see what you build! ✨

blog.google
Gemini 3.5: frontier intelligence with action
At Google I/O we released Gemini 3.5, our latest series of models combining frontier intelligence with action.

Tulsee Doshi @tulseedoshi

(3/4) And...did I mention 3.5 Flash is so fast?

9:51 PM · May 19, 2026 · 1.4K Views
9:51 PM · May 19, 2026 · 1.1K Views

coding with flash is a different experience, it's absurdly fast, it sometimes feels instant. for hard debugging tasks it can explore large areas of the problem space in minutes. it can outperform bigger models on hard tasks by crunching more tokens in less clock time

Logan Kilpatrick @OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

5:41 PM · May 19, 2026 · 546K Views
6:10 PM · May 19, 2026 · 58 Views

@yacineMTB The main thing I was looking forward to

kache @yacineMTB

Kind of a big deal

5:32 PM · May 19, 2026 · 59.2K Views
6:13 PM · May 19, 2026 · 366 Views

Gemini 3.5 Flash seems to have an almost endearing penchant for lying to me. I have no idea why it is saying this... but, I am not upset. It feels like a side effect of the harness more than the model (as much as those can be differentiated, these days).

5:01 AM · May 20, 2026 · 3.1K Views

Gemini 3.5 Flash is a really interesting release.

It's super fast and surprisingly smart. It's also more expensive (3x more per token) and super token hungry.

The result - it costs 2x more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium.

11:21 PM · May 19, 2026 · 153.2K Views

The cost to performance chart is the most interesting.

3.5 Flash is "more expensive" and "dumber" than gpt-5.5 on medium

gpt-5.5-medium: 22m tokens, $1,199, 57 points
gemini-3.5-flash: 73m tokens, $1,522, 55 points

Theo - t3.gg @theo

Gemini 3.5 Flash is a really interesting release. It's super fast and surprisingly smart. It's also more expensive (3x more per token) and super token hungry. The result - it costs 2x more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium.

11:21 PM · May 19, 2026 · 153.2K Views
11:25 PM · May 19, 2026 · 23.1K Views
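
The cost-to-performance figures quoted in the post above (22M tokens, $1,199, 57 points for gpt-5.5-medium; 73M tokens, $1,522, 55 points for gemini-3.5-flash) can be put on a per-point basis with a few lines of Python. This is only a rough sketch: the `RunStats` class and its field names are made up for illustration, and the inputs are the numbers as posted, not independently verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Totals for one model's benchmark run, as quoted in the post (hypothetical container)."""
    name: str
    tokens_millions: float  # total tokens used, in millions
    cost_usd: float         # total cost of the run in US dollars
    points: float           # benchmark score

    @property
    def cost_per_point(self) -> float:
        # Dollars spent for each benchmark point earned.
        return self.cost_usd / self.points

    @property
    def cost_per_million_tokens(self) -> float:
        # Effective blended price across the whole run.
        return self.cost_usd / self.tokens_millions


# Figures copied from the post; nothing here is measured independently.
runs = [
    RunStats("gpt-5.5-medium", tokens_millions=22, cost_usd=1199, points=57),
    RunStats("gemini-3.5-flash", tokens_millions=73, cost_usd=1522, points=55),
]

for run in runs:
    print(
        f"{run.name}: ${run.cost_per_point:.2f} per point, "
        f"${run.cost_per_million_tokens:.2f} per million tokens"
    )

# How many times more tokens the second run consumed than the first.
token_ratio = runs[1].tokens_millions / runs[0].tokens_millions
print(f"token ratio: {token_ratio:.2f}x")
```

On these numbers, the Flash run is cheaper per token but uses roughly three times as many tokens, which is why its total cost and cost per point both come out higher.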

@OfficialLoganK I think 3.5 Flash was not marketed in an honest way.

In real world use, it's more expensive than 3.1 pro and not much better.

Combine that with the sunsetting of Gemini CLI (I liked where it was going) and the Railway stuff, and I've lost a lot of faith.

Theo - t3.gg @theo

Gemini 3.5 Flash is a really interesting release. It's super fast and surprisingly smart. It's also more expensive (3x more per token) and super token hungry. The result - it costs 2x more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium.

11:21 PM · May 19, 2026 · 153.2K Views
2:59 AM · May 20, 2026 · 26.8K Views

@OfficialLoganK Video is up now btw

Theo - t3.gg @theo

I'm scared to make this video, but I feel like I have to. It's time to talk about Google.

3:55 AM · May 20, 2026 · 1.9M Views
4:00 AM · May 20, 2026 · 2.6K Views

@OfficialLoganK They were representative enough to be in the official announcement blog post 🙃

Logan Kilpatrick @OfficialLoganK

@theo Will be curious to see how it stacks up in real world use cases, not sure how representative AA index scores are given there’s a lot of coverage of purely academic benchmarks. It’s also much faster than 3.1 Pro.

4:47 AM · May 20, 2026 · 3.2K Views
4:57 AM · May 20, 2026 · 2.6K Views

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!!

This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

Michael Truell @mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. https://cursor.com/evals

3:30 AM · May 20, 2026 · 1.3M Views
4:04 AM · May 20, 2026 · 1M Views

I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5.

3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell

Theo - t3.gg @theo

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

4:04 AM · May 20, 2026 · 1M Views
4:08 AM · May 20, 2026 · 160.1K Views

Video is up btw

Theo - t3.gg @theo

I'm scared to make this video, but I feel like I have to. It's time to talk about Google.

3:55 AM · May 20, 2026 · 1.9M Views
6:38 AM · May 20, 2026 · 64.8K Views

Wait wtf, they STILL haven't updated pretraining???

9:23 AM · May 20, 2026 · 257.9K Views

@suchenzang Breaks my heart. Flash was one of my favorite model lines. I have a dozen videos talking about how much I love it.

I’ve yet to find a use case where price to perf on 3.5 makes sense. I’m trying, I’m just not seeing it (and nobody else has examples either)

Susan Zhang @suchenzang

ouch

5:17 AM · May 20, 2026 · 65.2K Views
6:41 AM · May 20, 2026 · 10.6K Views

Clearly these people haven’t experienced the magic of Gemini 3.5 Flash Preview on High in the new Antigravity CLI

7:55 PM · May 19, 2026 · 157.1K Views

Google seems to be absolutely killing it with visual generation, but has mid LLM game, what’s up with that

Theo - t3.gg @theo

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!! This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

4:04 AM · May 20, 2026 · 1M Views
4:20 PM · May 20, 2026 · 4.5K Views

Flash 2 was the last great Google model.

Theo - t3.gg @theo

I miss when Flash was the underrated goat model. I genuinely loved Flash 2 and genuinely tolerated 2.5. 3 was the start of the end. 3.5 is a useless model that should not be used for, well, anything as far as I can tell

4:08 AM · May 20, 2026 · 160.1K Views
6:50 AM · May 20, 2026 · 15.1K Views