Google has rolled out several changes to Gemini’s new usage limits following user complaints. The adjustments come just over a week after the company introduced compute-based limits at I/O 2026.
The new system measures “compute used” rather than counting prompts. Usage refreshes every five hours until a weekly limit is reached. The approach factors in prompt complexity, the tools used, and chat length. Google explained that a simple text prompt consumes far less compute than a complex video or coding request. Many users, however, reported hitting their limits far too quickly.
Gemini lead Josh Woodward said Google now caps how much quota a single prompt can use. This directly addresses complaints that complex prompts with large files drained limits almost instantly. By capping per-prompt usage, Google says users get more out of the Pro model.
Google also confirmed that errors no longer count against limits. If a request fails, users will not be charged. Quota now applies only to successful completions, and the company called system mistakes its own responsibility rather than the user’s.
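To make the accounting concrete, here is a minimal Python sketch of how a per-prompt cap and a "failures are free" rule could interact. Google has not published its implementation; the class name, the compute units, and the numbers below are all hypothetical, and the sketch assumes an over-cap prompt is simply charged at the cap.

```python
from dataclasses import dataclass


@dataclass
class QuotaWindow:
    """Hypothetical compute budget for one refresh window (not Google's API)."""

    budget: float          # assumed total compute units available in the window
    per_prompt_cap: float  # assumed maximum a single prompt may be charged
    used: float = 0.0

    def charge(self, compute_cost: float, succeeded: bool) -> float:
        """Record one request and return the amount charged for it."""
        if not succeeded:
            # Failed requests are not charged.
            return 0.0
        # Assumption: a prompt costing more than the cap is charged the cap.
        charged = min(compute_cost, self.per_prompt_cap)
        self.used += charged
        return charged

    @property
    def remaining(self) -> float:
        """Compute units left in the window, never below zero."""
        return max(self.budget - self.used, 0.0)


window = QuotaWindow(budget=100.0, per_prompt_cap=20.0)
window.charge(5.0, succeeded=True)    # simple text prompt: charged in full
window.charge(80.0, succeeded=True)   # heavy prompt: charged only the cap
window.charge(50.0, succeeded=False)  # failed request: charged nothing
print(window.remaining)
```

Under these made-up numbers, the heavy prompt uses 20 units instead of 80, and the failed request leaves the balance untouched.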
Prompts using Gemini 3.1 Flash-Lite are now completely free and do not count against quota. Google also clarified its model selection behavior. When a user picks a specific model, Gemini remembers that choice across all future sessions. It changes only if the user adjusts it manually or hits a cap that triggers an automatic fallback to a lighter model.
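The model selection behavior described above can be sketched in a few lines of Python. This is an illustration only: the class and function names are hypothetical, the model labels are plain strings rather than real API identifiers, and the choice of Flash-Lite as the fallback is an assumption, since Google said only that the fallback is "a lighter model."

```python
# Models whose prompts do not count against quota, per Google's statement.
FREE_MODELS = {"3.1 Flash-Lite"}


def counts_against_quota(model: str) -> bool:
    """Return True if prompts sent to this model consume quota."""
    return model not in FREE_MODELS


class ModelSelection:
    """Hypothetical sticky model choice that persists across sessions."""

    def __init__(self, initial: str, fallback: str) -> None:
        self.current = initial
        self.fallback = fallback  # assumed lighter model used after a cap

    def pick(self, model: str) -> None:
        """The user changes the model manually."""
        self.current = model

    def on_cap_reached(self) -> None:
        """Hitting a cap triggers an automatic switch to the lighter model."""
        self.current = self.fallback


selection = ModelSelection(initial="Pro", fallback="3.1 Flash-Lite")
print(selection.current, counts_against_quota(selection.current))
selection.on_cap_reached()
print(selection.current, counts_against_quota(selection.current))
```

The stored choice changes in only two places, `pick` and `on_cap_reached`, mirroring the two triggers Google described.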
Heavy tasks like Deep Research require more compute. To help users manage that, Google promised more detailed usage breakdowns and notifications. The usage dashboard at gemini.google.com/usage currently offers only a high-level overview.
The company also fixed a bug where just one or two Omni videos drained quotas for certain users. Google AI Ultra subscribers now receive double the previous number of Omni generations. Google said it will keep looking for ways to increase Omni access further.
Looking ahead, Google plans to let Gemini app users buy pay-as-you-go top-up AI credits. This would give heavy users a way to extend their limits without upgrading their entire plan.
