Gemini 3.5 and Gemini Omni: Everything Google I/O 2026 Changed for Users and Creators
Google I/O 2026, held May 19-20 in Mountain View, was the most aggressive edition in the event’s history when it comes to AI launches. Google itself rounded up exactly 100 announcements, but two stole the show: Gemini 3.5 Flash, a new agentic model that shipped generally available (GA) on day one, and Gemini Omni, a model that creates video from almost any input (image, audio, video, or text). Add Gemini Spark (a 24/7 personal agent), the evolution of Antigravity, and a full redesign of the Gemini app, and it’s clear Google decided to compete on every front in 2026.
For anyone who uses AI daily, makes content, or runs a digital business, this bundle isn’t just keynote noise. There are concrete changes in pricing, capability, and workflow — and some of them are already live. In this deep-dive, we cut to what matters: the real (confirmed) benchmark numbers, what each product does, what it costs, and what changes for you right now.
Gemini 3.5 Flash: The “Flash” That Beat Last Year’s “Pro”
The technical headline of I/O was an unusual one: a model from the Flash line (Google’s fast, cheap tier) beating Gemini 3.1 Pro, the previous generation’s flagship, on benchmarks that simulate real work. The figures are Google’s own, but independent analyses echoed them.
The benchmarks (confirmed)
- Terminal-Bench 2.1: 76.2%. Measures the model’s ability to operate inside a terminal as an autonomous agent.
- GDPval-AA: 1656 Elo. Evaluates performance on real economic tasks.
- MCP Atlas: 83.6%, as reported by Google. Independent analyses also put 3.5 Flash ahead of competitors on tool use via MCP (Model Context Protocol), with Claude Opus 4.7 at around 79% and GPT-5.5 at around 75%.
- CharXiv Reasoning (multimodal): 84.2%, showing the leap isn’t limited to code.
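To put the MCP Atlas comparison in concrete terms, here is a minimal Python sketch that computes the lead in percentage points. It uses only the figures quoted above; the two competitor scores are approximate values from independent analyses, and the helper name lead_in_points is ours, not part of any benchmark tooling.

```python
# MCP Atlas scores quoted in the list above. The competitor values are
# approximate ("around") figures from independent analyses.
mcp_atlas = {
    "Gemini 3.5 Flash": 83.6,
    "Claude Opus 4.7": 79.0,
    "GPT-5.5": 75.0,
}


def lead_in_points(scores, model):
    """Return the lead of `model` over every other entry, in percentage points."""
    base = scores[model]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != model
    }


gaps = lead_in_points(mcp_atlas, "Gemini 3.5 Flash")
for name, gap in gaps.items():
    print(f"3.5 Flash leads {name} by about {gap} points")
```

Because the competitor scores are rounded, the gaps are rough indications rather than precise margins.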
Beyond intelligence, there’s speed: Google says 3.5 Flash generates output tokens roughly 4x faster than other frontier models, at a cost “often less than half” that of competitors. In Google’s own words: “What used to take a developer days or an auditor weeks, 3.5 Flash can now help complete in a fraction of the time, often at less than half the cost.”
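As a rough illustration of what "4x faster at less than half the cost" could mean for a single job, the Python sketch below compares a baseline model with 3.5 Flash. The baseline throughput and price are hypothetical placeholders, not published figures; only the 4x speedup and the 0.5 cost ratio (treated here as an upper bound) come from Google’s claims.

```python
def estimate(output_tokens, tokens_per_second, usd_per_million_tokens):
    """Return (seconds, cost in USD) to generate `output_tokens`."""
    seconds = output_tokens / tokens_per_second
    cost = output_tokens / 1_000_000 * usd_per_million_tokens
    return seconds, cost


# Hypothetical baseline for "another frontier model" (assumed values).
BASELINE_TOKENS_PER_SECOND = 100.0
BASELINE_USD_PER_MILLION = 10.0

# Ratios taken from Google's claims in the article.
SPEEDUP = 4.0      # "roughly 4x faster"
COST_RATIO = 0.5   # "often less than half", used as an upper bound

job_tokens = 2_000_000  # hypothetical agentic job

base_seconds, base_cost = estimate(
    job_tokens, BASELINE_TOKENS_PER_SECOND, BASELINE_USD_PER_MILLION
)
flash_seconds, flash_cost = estimate(
    job_tokens,
    BASELINE_TOKENS_PER_SECOND * SPEEDUP,
    BASELINE_USD_PER_MILLION * COST_RATIO,
)

print(f"Baseline:  {base_seconds / 3600:.2f} h, ${base_cost:.2f}")
print(f"3.5 Flash: {flash_seconds / 3600:.2f} h, ${flash_cost:.2f}")
```

Swap in real throughput and pricing for the models you are comparing; the ratios only hold to the extent Google’s claims hold for your workload.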
Where it already lives
3.5 Flash hit general availability (GA) on the day of the announcement, accessible via the Gemini API in Google AI Studio, Android Studio, and the Antigravity platform. More important for everyday users: it became the default model for AI Mode in Google Search, globally. So when you ask a conversational question on Google and get an AI-generated answer, it’s 3.5 Flash answering.
What about Gemini 3.5 Pro?
Google confirmed that 3.5 Pro is already in internal use, with a rollout planned for the month after I/O — meaning it should appear during June 2026. The logic is the same as always: Flash delivers speed and cost for agentic tasks, while Pro targets deep reasoning and the hardest cases.
The practical read: if you build agents, automations, or products that need to “get something done” (rather than “answer a hard research question”), 3.5 Flash is likely already the best price-performance option on the market. If you’re weighing your options, check our updated rundown on the best AI model in 2026.
Gemini Omni: Video From Anything
If 3.5 Flash was the announcement for developers, Gemini Omni was the one that made the room applaud. It’s a new model (or rather, a new series) that creates video from any input: feed it an image, a clip of audio, a reference video, or simply text, and it generates high-quality video — editable by conversation.
The differentiator: world understanding
The point Google stressed is that Omni isn’t just “pasting pretty pixels.” It combines Gemini’s intelligence with generative media models for a higher level of world understanding. In practice, that means when it edits a scene, the model keeps physical and narrative coherence: in Google’s words, “your characters stay consistent, the physics hold up, and the scene remembers what came before.”
That notion of physics is what sets Omni apart from older video generators. The model is better at handling how objects move, fall, and interact (gravity, motion, fluids), instead of producing the weird hand-and-object distortions that marked the first generation of AI video. One honest caveat: Google did not back the “physics understanding” claim with a published benchmark figure; what it demonstrated were practical examples of consistency.
Character consistency
A specific feature Google highlighted is character consistency in Gemini Omni Flash: a character’s identity and voice are preserved from one scene to the next. This has historically been AI video’s biggest Achilles’ heel — keeping the same face and the same voice across multiple cuts without “swapping people.” For creators, it’s exactly what was missing to produce narratives with a beginning, middle, and end.
All generated content carries an imperceptible SynthID watermark, part of Google’s effort to label synthetic media.
Where it lives and what it costs
Here’s the part that hits the wallet:
- Gemini Omni is rolling out to Google AI Plus, Pro, and Ultra subscribers, globally, via the Gemini app and Google Flow (Google’s audiovisual creation tool).
- Gemini Omni Flash arrives for free in YouTube Shorts Remix and the YouTube Create app (limited to users 18+).
That “free on YouTube” move is strategic. Google is putting AI video generation in front of billions of people at no cost — and anchoring it on the video platform where creators already work. If you make Shorts, this changes the production equation almost immediately. To understand the concept behind all of this, it’s worth revisiting our guide on multimodal AI explained. And if you want to start making AI video today, our Google Veo 3.1 tutorial walks through it step by step.
Gemini Spark: The 24/7 Personal Agent
The third major announcement was Gemini Spark, which Google describes as a personal agent that “takes actions on your behalf” to navigate your digital life. The key detail: it runs on dedicated virtual machines on Google Cloud and works 24/7, so it keeps going in the background even with your laptop closed. It’s built on Gemini 3.5 and Antigravity.
In practice, Spark integrates with Gmail, Docs, and other Workspace apps, and is set to expand to third-party tools via MCP over the Northern Hemisphere summer, that is, in the second half of 2026. The roadmap includes sending texts and emails directly, creating custom sub-agents, and even running authorized payments with budget controls.
Access started restricted: an early beta for Google AI Ultra subscribers in the U.S. only, the week after I/O. International rollouts will be slower — as usual with agentic features that touch payments and personal data. Still, Spark is the clearest signal of where the industry is headed: from “chatbot that answers” to “agent that does.” We dig deeper into this shift in our analysis of autonomous AI agents in 2026.
Antigravity and Stitch: Tools for the Builders
For developers and designers, I/O brought the evolution of two platforms:
- Antigravity now comes in three forms: Antigravity 2.0 (a desktop app to orchestrate multiple agents, now with voice commands), the Antigravity CLI (a terminal interface rebuilt in Go to be faster and to spin up agents instantly), and the Antigravity SDK (programmatic access to build custom agents; for enterprises, it can connect directly to Google Cloud projects). Google says engineering tasks that used to take days now collapse into hours or minutes through subagent collaboration.
- Stitch, Google’s design tool, now allows real-time design steered by text or voice, and imports existing code and design files to maintain brand consistency.
For the ecosystem of small studios, freelancers, and solo devs, these tools lower the barrier to entry: you can prototype product, design, and backend with a team of agents instead of a team of people. It doesn’t replace talent — but it multiplies the reach of those who already have it.
The Pricing Change That Almost Slipped By
Amid the flashy launches, Google reworked its pricing structure in a way that matters:
- The Google AI Ultra plan now has an entry tier at $100/month (with 5x more usage than AI Pro and 20 TB of storage). The old $250 plan dropped to $200 with the same capabilities.
- More significantly: Google is moving from a “daily prompt limit” model to a compute-used model, based on prompt complexity and the features you use. Limits refresh every five hours until you hit a weekly cap.
Translation for creators and professionals: the cost of heavy AI use will be more tied to how much processing you actually consume. For light use, you’ll likely have headroom. For those generating video and running agents all day, it’s time to watch consumption — just as you already do with cloud.
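To make the "refresh every five hours until you hit a weekly cap" idea concrete, here is a minimal Python sketch of how such a quota could behave. Google has not published the actual mechanics, so everything here is an assumption: the ComputeQuota class, the abstract "units", the budget numbers, and the use of fixed five-hour windows counted from the start of the week. The weekly reset itself is not modeled.

```python
from dataclasses import dataclass

WINDOW_HOURS = 5  # from the article: limits refresh every five hours


@dataclass
class ComputeQuota:
    """Hypothetical tracker: a per-window budget plus a weekly cap."""

    window_budget: float        # assumed units allowed per five-hour window
    weekly_cap: float           # assumed units allowed per week
    window_used: float = 0.0
    week_used: float = 0.0
    window_start: float = 0.0   # hours since the week began

    def try_spend(self, now_hours: float, units: float) -> bool:
        """Record a request costing `units`; return False if it is blocked."""
        elapsed = now_hours - self.window_start
        if elapsed >= WINDOW_HOURS:
            # Move to the window that contains `now_hours` and refresh it.
            self.window_start += (elapsed // WINDOW_HOURS) * WINDOW_HOURS
            self.window_used = 0.0
        if self.window_used + units > self.window_budget:
            return False  # current five-hour window is exhausted
        if self.week_used + units > self.weekly_cap:
            return False  # weekly cap reached, refreshes no longer help
        self.window_used += units
        self.week_used += units
        return True


quota = ComputeQuota(window_budget=100, weekly_cap=250)
results = [
    quota.try_spend(0, 80),   # fits in the first window
    quota.try_spend(1, 30),   # blocked: would exceed the window budget
    quota.try_spend(5, 90),   # window refreshed
    quota.try_spend(10, 90),  # blocked: would exceed the weekly cap
    quota.try_spend(15, 10),  # small request still fits under the cap
]
print(results)
```

The fourth call shows the effect that matters for heavy users: the window has refreshed, but the weekly cap still blocks the request.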
What This Actually Changes
Pulling it all together, the concrete impact splits into three fronts:
For everyday users. Search’s AI Mode already runs on 3.5 Flash globally, so AI answers on Google got faster and more capable without you doing a thing. The Gemini app also got a redesign, which Google calls “Neural Expressive Design,” and it is already live on Android, iOS, and web.
For creators. Gemini Omni Flash, free in YouTube Shorts Remix and YouTube Create, is the announcement with the most immediate impact: video generation with character consistency, at no cost, on the platform where you already publish. Whoever can script and direct the AI will produce faster and cheaper, and those who learn early will have the advantage.
For founders and devs. 3.5 Flash via API, at a cost “often less than half” that of competitors and with top-tier agentic performance, cuts the cost of building AI products. Antigravity and Stitch shorten the prototyping cycle. For those who want to turn this into a content business, our step-by-step on how to create an AI blog that ranks in 2026 still holds.
Frequently Asked Questions
Is Gemini 3.5 Flash available now?
Yes. 3.5 Flash hit general availability (GA) on the day of the announcement, accessible via the Gemini API in Google AI Studio, Android Studio, and Antigravity. It’s also already the default model for Google Search’s AI Mode, globally.
Is Gemini Omni free?
Partly. Gemini Omni Flash is free in YouTube Shorts Remix and the YouTube Create app (for users 18+). The full Gemini Omni is rolling out to Google AI Plus, Pro, and Ultra subscribers, via the Gemini app and Google Flow.
Is Gemini 3.5 Flash really better than Gemini 3.1 Pro?
On agentic and coding benchmarks, that’s what Google claims: the company says 3.5 Flash outperforms 3.1 Pro on these tests, and published figures such as Terminal-Bench 2.1 at 76.2%, GDPval-AA at 1656 Elo, and MCP Atlas at 83.6%. The caveat: those tests measure “agent work,” not necessarily very hard research questions — for those, 3.5 Pro, arriving in June 2026, should be the stronger pick. For a direct head-to-head, see our Gemini 3.5 Flash vs GPT-5.5 vs Claude Opus comparison.
When will Gemini 3.5 Pro launch?
Google confirmed 3.5 Pro is already in internal use, with a rollout planned for the month after I/O — meaning sometime in June 2026. No exact public date has been given as of this edition.
Does Gemini Omni actually understand physics?
Google demonstrated that Omni maintains physical and narrative coherence when editing scenes — objects move and fall plausibly, and characters stay consistent across cuts. That’s a leap over earlier generators. In fairness: Google showed practical examples and the “world understanding” framing, but did not publish a single closed physics benchmark with an exact number.
Does Gemini Spark work outside the U.S.?
Not yet. The initial Spark beta was restricted to Google AI Ultra subscribers in the U.S., starting the week after I/O. Expansion to third-party tools (via MCP) and to other countries is expected over the second half of 2026.
What to Watch From Here
Google I/O 2026 made one thing clear: the company’s strategy is to win on distribution — placing frontier models inside Search, Android, Workspace, and YouTube, where billions of people already are, rather than relying on a standalone app. Three things are worth monitoring in the coming weeks:
- The launch of Gemini 3.5 Pro (expected June 2026) and how it stacks up against GPT-5.5 and Claude Opus on deep reasoning.
- The international expansion of Gemini Omni and Spark, especially the payment and third-party integration features.
- The real effect of the new compute-based billing model on the monthly cost for heavy AI users.