Google's new Gemma 3n model is a multimodal language model, available in preview on Hugging Face through the LiteRT runtime. It accepts text, image, audio, and video inputs, and can be customized through retrieval-augmented generation (RAG) and function calling via the AI Edge SDKs. Available in 2B and 4B parameter variants, it uses selective parameter activation to manage resources efficiently during inference, making larger models practical for on-device enterprise applications, and new quantization tools promise to significantly reduce model footprint while improving latency and memory consumption.
Gemma 3n represents a significant advancement in multimodal AI, offering increased scalability and efficiency for on-device applications, especially in enterprise scenarios.
With selective parameter activation, the 2B and 4B variants of Gemma 3n improve efficiency by activating only a subset of the model's parameters during inference, reducing compute and memory demands.
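The idea behind selective activation can be sketched with a toy routed model. This is only an illustrative sketch, not Gemma 3n's actual mechanism: the `router` matrix, block shapes, and top-k selection here are all hypothetical stand-ins for whatever routing Gemma 3n performs internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": 4 parameter blocks, each a 3x3 weight matrix (36 weights total).
blocks = rng.normal(size=(4, 3, 3))
# Hypothetical cheap routing weights, one row per block.
router = rng.normal(size=(4, 3))

def selective_forward(x, k=2):
    """Evaluate only the k most relevant blocks for this input."""
    scores = router @ x               # cheap routing pass over all blocks
    active = np.argsort(scores)[-k:]  # indices of the k highest-scoring blocks
    # Only the active blocks are evaluated; the rest are never run,
    # so per-step compute scales with k rather than the full model size.
    out = sum(blocks[i] @ x for i in active)
    return out, active
```

Here the full model holds four blocks, but each inference step touches only two of them, which is the resource-management effect the paragraph describes.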
Gemma 3n supports both fine-tuning and on-device retrieval-augmented generation (RAG), letting developers adapt the model to specific applications.
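The on-device RAG pattern can be sketched in a few lines. This is not the AI Edge RAG SDK's API; the bag-of-words `embed` function and the two-document corpus are hypothetical placeholders for a real embedding model and vector store, kept minimal to show the retrieve-then-augment flow.

```python
import math
from collections import Counter

# Tiny illustrative corpus; a real on-device setup would index documents
# with a proper embedding model instead of word counts.
docs = [
    "Gemma 3n accepts text, image, audio, and video inputs.",
    "Quantization can shrink model size by 2.5-4x.",
]

def embed(text):
    # Stand-in embedding: bag-of-words counts (hypothetical, for the sketch).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query))
    # Retrieved context is prepended so the model answers grounded in it.
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The augmented prompt, rather than a retrained model, is what injects application-specific knowledge at inference time.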
The new quantization tools introduced by Google can reduce model sizes by 2.5-4X while improving performance metrics such as latency and memory consumption.
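The claimed 2.5-4x reduction follows directly from bit-width arithmetic. The sketch below is a back-of-envelope check, not the actual tooling; the parameter counts and bit widths are illustrative assumptions, since real quantized layouts mix precisions and add overhead.

```python
def model_size_gb(params_billion, bits_per_weight):
    """Approximate on-disk size: parameters x bits per weight, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

fp16_size = model_size_gb(4, 16)  # 4B weights at 16 bits -> 8.0 GB
int4_size = model_size_gb(4, 4)   # same weights at 4 bits -> 2.0 GB
ratio = fp16_size / int4_size     # 4.0, the upper end of the 2.5-4x range
```

Int8 quantization of the same model would give a 2x ratio, so the quoted 2.5-4x range is consistent with mixed 4- and 8-bit schemes.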