
Google releases its first native multimodal embedding model, Gemini Embedding 2

Google DeepMind launched its first native multimodal embedding model, Gemini Embedding 2, on March 10. The model unifies text, images, videos, audio, and documents into a single embedding space, supports over 100 languages, and introduces native voice embedding for the first time, eliminating the intermediate step of converting speech to text. It uses MRL (Matryoshka Representation Learning) to support flexible compression of vector dimensions, balancing retrieval performance against storage cost.
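To illustrate the dimension-compression idea: MRL-trained embeddings are constructed so that a leading prefix of the vector is itself a usable embedding, letting you shrink storage by keeping fewer dimensions. The sketch below shows the general truncate-and-renormalize pattern; the function name and the 3072-dimension full size are illustrative assumptions, not the model's actual API or output size.

```python
import numpy as np

def truncate_embedding(vec, dim):
    """Keep the first `dim` dimensions of an MRL-trained embedding
    and renormalize to unit length for cosine-similarity search.

    With MRL, vector prefixes are trained to remain useful on their
    own, so truncation trades a little accuracy for less storage.
    """
    prefix = np.asarray(vec, dtype=np.float64)[:dim]
    norm = np.linalg.norm(prefix)
    return prefix / norm if norm > 0 else prefix

# Hypothetical full-size embedding; the article does not state the
# model's output dimensionality.
full = np.random.default_rng(0).normal(size=3072)
small = truncate_embedding(full, 256)  # keep first 256 dims
```

A vector store could then index the 256-dimension prefixes for cheap first-pass retrieval and rerank with the full vectors, a common pattern with MRL-style embeddings.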

