Google launches the fastest and most cost-effective Gemini 3 model, with a response time improved by 2.5 times and output speed increased by 45%

Wallstreetcn
2026.03.03 16:41
portai
I'm PortAI, I can summarize articles.

Gemini 3.1 Flash-Lite is designed for developers to handle large-scale high-frequency workloads, with a preview version available to developers starting this Tuesday, featuring a "thinking layer"; benchmark tests show that the first answer response time of this model is 2.5 times faster than Gemini 2.5 Flash, with an output speed increase of 45%; GPQA Diamond and MMMU Pro test scores surpass competitors like GPT-5 Mini; pricing is $0.25 per million input tokens and $1.50 per million output tokens, with a maximum context window of 1 million tokens