Google Unveils Gemini 3.5 Flash, Making Large-Scale AI Operations More Efficient

Wed, 20 May 2026, 11:27 WIB | By Yudha Pratomo | Source: KOMPAS | Technology

Google has officially introduced its latest artificial intelligence (AI) model, Gemini 3.5 Flash. The model was unveiled at the Google I/O 2026 developer conference, held on Tuesday (19 May 2026) US time, or early Wednesday morning (20 May 2026) Indonesian time. Gemini 3.5 Flash is claimed to deliver performance on par with top AI models, but with lower operating costs and shorter processing times.

Google CEO Sundar Pichai said that Gemini 3.5 Flash is designed to help companies curb the high costs of using AI at scale, particularly those related to token consumption. In generative AI systems, tokens are the basic units of data that the model processes. A token can be a word fragment, number, symbol, or character that the model handles as it interprets a question and generates an answer. Because most commercial AI services charge by the number of tokens used, operating costs rise as usage grows.

According to Pichai, a company processing around one trillion AI tokens per day on Google Cloud could save more than $1 billion per year (about Rp 17.7 trillion) by moving most of its workloads to Gemini 3.5 Flash. “Gemini 3.5 Flash arrives at a time when many CIOs are starting to exhaust their annual AI token budgets, even though it’s only May (less than halfway through the year),” Pichai told VentureBeat.

Google says that Gemini 3.5 Flash is designed for “agentic AI” needs, meaning AI that can work more independently to carry out multi-step tasks, from coding, to tool usage, to automated decision-making. The model can also be used to increase user productivity, depending on each user's needs and tasks.

On its official DeepMind page, Google states that Gemini 3.5 Flash has multimodal capabilities. This means the model can process various input types such as text, images, videos, audio, and PDF documents. Google claims the model is suitable for agentic coding, advanced reasoning, multimodal understanding, long document analysis, and everyday AI tasks.
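To put the savings figure Pichai cites in per-token terms, the short sketch below works backward from the two numbers in the article: roughly one trillion tokens per day and more than $1 billion saved per year. It is only a back-of-the-envelope illustration, not Google's pricing. The function name and the assumption of a constant daily volume over a 365-day year are ours, and because the article says "more than" $1 billion, the result should be read as a lower bound.

```python
# Figures taken from the article.
TOKENS_PER_DAY = 1_000_000_000_000   # "around one trillion" tokens per day
ANNUAL_SAVINGS_USD = 1_000_000_000   # "more than $1 billion" per year (lower bound)

# Assumption (not from the article): the same volume every day of a 365-day year.
DAYS_PER_YEAR = 365


def implied_savings_per_million_tokens(tokens_per_day, annual_savings_usd,
                                       days_per_year=DAYS_PER_YEAR):
    """Return the average saving in USD per one million tokens.

    Hypothetical helper for illustration: divides the yearly saving by the
    yearly token volume, then scales the result to one million tokens.
    """
    tokens_per_year = tokens_per_day * days_per_year
    return annual_savings_usd / tokens_per_year * 1_000_000


savings = implied_savings_per_million_tokens(TOKENS_PER_DAY, ANNUAL_SAVINGS_USD)
print(f"Implied saving: at least ${savings:.2f} per million tokens")
# prints: Implied saving: at least $2.74 per million tokens
```

In other words, the claim amounts to an average saving of a little under three dollars for every million tokens processed, which only becomes a ten-figure sum at the scale Pichai describes.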
