Google has released Gemma, a family of AI models based on the same research as Gemini. Developers can’t quite get their hands into the engine of Google Gemini yet, but what the tech giant released on Feb. 21 is a smaller, open source model for researchers and developers to experiment with.
Although generative AI is trendy, organizations may struggle to prove ROI from it; open source models let them experiment cheaply to find practical use cases.
Smaller AI models like this don’t quite match the performance of larger ones like Gemini or GPT-4, but they are flexible enough to let organizations build custom bots for customers or employees. In particular, the fact that Gemma can run on a workstation reflects a continuing trend among generative AI makers: giving organizations ChatGPT-like functionality without the heavy infrastructure workload.
SEE: OpenAI’s newest model Sora creates impressive photorealistic videos that still often look unreal. (TechRepublic)
What is Google’s Gemma?
Google Gemma is a family of generative AI models that can be used to build chatbots or tools that summarize content. Google Gemma models can run on a developer laptop, a workstation or through Google Cloud. Two sizes are available: 2 billion and 7 billion parameters.
For developers, Google is providing a variety of tools for Gemma deployment, including toolchains for inference and supervised fine-tuning in JAX, PyTorch and TensorFlow.
For now, Gemma only works in English.
How do I access Google Gemma?
Google Gemma can be accessed through Colab, Hugging Face, Kaggle, Google’s Kubernetes Engine and Vertex AI, and NVIDIA’s NeMo.
Google Gemma can be accessed for free for research and development in Kaggle and through a free tier for Colab notebooks. First-time Google Cloud users can receive $300 in credits toward Gemma. Google Cloud credits of up to $500,000 are available for researchers who apply. Pricing and availability in other cases may depend on your organization’s particular subscriptions and needs.
Since Google Gemma is open source, commercial use is permitted, as long as that use is in accordance with the Terms of Service. Google also released a Responsible Generative AI Toolkit with which developers can provide guidelines around their AI projects.
“It’s great to see Google reinforcing its commitment to open-source AI, and we’re excited to fully support the launch with comprehensive integration in Hugging Face,” said Hugging Face’s Technical Lead Philipp Schmid, Head of Platform and Community Omar Sanseviero and Machine Learning Engineer Pedro Cuenca in a blog post.
How does Google Gemma work?
Like other generative AI models, Gemma is software that responds to natural language prompts, as opposed to conventional programming languages or commands. Google Gemma was trained on publicly available information, with personally identifiable information and “sensitive” material filtered out.
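To illustrate what a natural language prompt looks like in practice, here is a minimal sketch of the turn-based chat template used by Gemma’s instruction-tuned (“-it”) checkpoints. The `format_gemma_prompt` helper is hypothetical, and the exact `<start_of_turn>`/`<end_of_turn>` control tokens should be verified against the official model card before use:

```python
# Sketch of Gemma's turn-based chat prompt format (instruction-tuned variants).
# Assumption: the "-it" checkpoints use <start_of_turn>/<end_of_turn> markers
# around each conversational turn; check the model card for the exact tokens.

def format_gemma_prompt(user_message: str) -> str:
    """Wrap a plain-English request in Gemma's chat turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Summarize this paragraph in one sentence.")
print(prompt)
```

The trailing `<start_of_turn>model` line cues the model to generate its reply; tooling such as a tokenizer’s built-in chat template can also produce this string automatically.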
Google worked with NVIDIA to optimize Gemma for NVIDIA products, in particular by offering acceleration on NVIDIA’s TensorRT-LLM, a library for large language model inference. Gemma can be fine-tuned in NVIDIA AI Enterprise.
What are the main competitors to Google Gemma?
Gemma competes with other small generative AI models meant to run on an organization’s own hardware, such as Meta’s open source Llama 2, Mistral AI’s Mistral 7B, Deci’s DeciLM and Microsoft’s Phi-2.
Hugging Face noted that Gemma outperforms many other small AI models on its leaderboard, which evaluates pre-trained models on basic factual questions, commonsense reasoning and trustworthiness. Only Llama 2 70B, the model included as a reference benchmark, earned a higher score than Gemma 7B. Gemma 2B, on the other hand, performed relatively poorly compared to other small, open AI models.
Gemma also invites comparison with Gemini Nano, the smallest version of Google’s flagship Gemini model, which comes in 1.8B and 3.25B parameter versions and is designed to run on Android phones.