Google has introduced Gemini 2.0 Flash, the first model in its Gemini 2.0 family. Benchmarks suggest that this lightweight model outperforms its predecessor, Gemini 1.5 Pro, while responding twice as fast. However, these results should be read cautiously, as first-party benchmarks typically present a model in the best possible light.
Gemini 2.0 Flash offers significant improvements, including support for multimodal inputs such as images, video, and audio, as well as multimodal outputs, such as text combined with images and steerable multilingual text-to-speech. The model can also use tools natively: it integrates Google Search, executes code, and calls third-party functions.
Developers can access Gemini 2.0 Flash via the Gemini API in Google AI Studio and Vertex AI. While multimodal input and text output are available to all developers, text-to-speech and native image generation are initially reserved for early-access partners. General availability, including additional model sizes, is expected in January. Gemini 2.0 Flash is also available in the Gemini app, Google's AI assistant and counterpart to OpenAI's ChatGPT.
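As a rough illustration of the API access described above, a minimal text-generation call through Google's Python SDK for the Gemini API might look like the following sketch. The model identifier `gemini-2.0-flash-exp`, the environment variable name, and the helper function are assumptions for illustration; exact names may differ by release.

```python
import os

# Assumed setup: `pip install google-generativeai` and an API key
# created in Google AI Studio, exported as GOOGLE_API_KEY.
MODEL_ID = "gemini-2.0-flash-exp"  # assumed experimental model name


def ask_gemini(prompt: str) -> str:
    """Send a single text prompt to Gemini 2.0 Flash and return the reply."""
    # Imported lazily so the sketch can be read without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(MODEL_ID)
    return model.generate_content(prompt).text


if __name__ == "__main__" and "GOOGLE_API_KEY" in os.environ:
    print(ask_gemini("Summarize Gemini 2.0 Flash in one sentence."))
```

The same model can be reached through Vertex AI with different client setup; this sketch only covers the Google AI Studio path.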