Gemma: Google Delivers Advanced AI Capabilities through Open Source

DeepMind's launch of Gemma heralds a new era of open-source AI – one that moves beyond narrow benchmarks toward general-purpose capabilities. Extensively safety-tested and widely accessible, Gemma sets a new standard for responsible open-source AI.

The field of artificial intelligence (AI) has experienced tremendous progress in recent years, driven in large part by advances in deep learning and natural language processing (NLP). At the forefront of these advances are large language models (LLMs) – AI systems trained on vast amounts of text data that can generate human-like text and engage in conversational tasks.

LLMs like Google's PaLM, Anthropic's Claude, and DeepMind's Gopher have demonstrated extraordinary abilities, from coding to common-sense reasoning. However, most of these models have not been released openly, limiting their use for research, development, and practical applications.

This changes with the recent open-source release of Gemma – a family of LLMs from Google DeepMind built from the same research and technology behind the powerful, proprietary Gemini models. In this blog post, we'll dive into Gemma, analyzing its architecture, training process, performance, and responsible release practices.

A glimpse of Gemma


In February 2024, DeepMind open-sourced two sizes of Gemma models – a 2 billion parameter version optimized for on-device deployment, and a larger 7 billion parameter version designed for GPU/TPU use.
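To see why the 2B model suits on-device deployment, a back-of-envelope estimate of weight memory from parameter count and numeric precision is instructive (a rough sketch only; real memory use also includes activations and the KV cache):

```python
def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Weights-only estimates (activations and the KV cache add more):
print(f"2B @ fp16: {model_memory_gb(2e9, 2):.1f} GiB")   # fits on many phones when quantized
print(f"2B @ int8: {model_memory_gb(2e9, 1):.1f} GiB")
print(f"7B @ fp16: {model_memory_gb(7e9, 2):.1f} GiB")   # comfortable on a data-center GPU
```

At half precision the 2B model needs roughly 3.7 GiB for weights alone, while the 7B model needs about 13 GiB – which is why the smaller variant targets devices and the larger one targets GPUs/TPUs.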

Gemma leverages a transformer-based architecture and training methodology similar to DeepMind's leading Gemini model. It was trained on up to 6 trillion text tokens from web documents, math, and code.

DeepMind released pre-trained Gemma checkpoints, as well as instruction-tuned versions refined with supervised fine-tuning and reinforcement learning from human feedback to improve capabilities in areas such as dialogue, instruction following, and coding.

Starting with Gemma


Gemma's open release makes its advanced AI capabilities accessible to developers, researchers, and enthusiasts. Here's a quick guide to getting started:

Platform Agnostic Deployment


Gemma's main strength is its flexibility – you can run it on a CPU, GPU, or TPU. On CPU, frameworks such as Hugging Face Transformers work well; for GPU/TPU acceleration, frameworks including JAX, PyTorch, and TensorFlow are supported. Cloud services like Vertex AI from Google Cloud also provide seamless scaling.

Access Trained Models


Gemma comes in different trained variants depending on your needs. The pre-trained 2B and 7B models offer strong generative capabilities, while the instruction-tuned 2B and 7B variants are an ideal starting point for dialogue and instruction-following applications.

Build Interesting Apps


You can build a variety of applications with Gemma, such as story creation, language translation, question answering, and creative content production. The key is to harness the power of Gemma by fine-tuning it on your own data set.
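When preparing your own data, the instruction-tuned Gemma checkpoints expect dialogue wrapped in `<start_of_turn>` / `<end_of_turn>` control tokens. A minimal helper for formatting (prompt, response) pairs into that layout might look like this (a sketch; in practice you would apply the tokenizer's built-in chat template):

```python
def format_gemma_turns(user_msg: str, model_msg: str = "") -> str:
    """Format one exchange using Gemma's chat control tokens."""
    prompt = (
        f"<start_of_turn>user\n{user_msg}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )
    if model_msg:  # include the target response when building training examples
        prompt += f"{model_msg}<end_of_turn>\n"
    return prompt

example = format_gemma_turns("Translate 'hello' to French.", "Bonjour.")
```

Leaving `model_msg` empty produces an inference prompt that ends at the model's turn, ready for generation.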

Architecture


Gemma uses a decoder-only transformer architecture, built on advances such as multi-query attention and rotary position embeddings:

Transformer: Introduced in 2017, transformer architectures based solely on attention mechanisms have become ubiquitous in NLP. Gemma inherits the transformer's ability to model long-term dependencies in text.
Decoder only: Gemma uses only a transformer decoder stack, unlike encoder-decoder models like BART or T5. This provides powerful generative capabilities for tasks such as text generation.
Multi-query attention: Gemma's smaller 2B model uses multi-query attention, in which all query heads share a single key/value head, reducing memory traffic and speeding up inference on constrained hardware.
Rotary position embeddings: Gemma represents position information using rotary position embeddings (RoPE) instead of absolute position encodings, capturing relative positions directly in the attention computation. Gemma also shares embeddings between its input and output layers to reduce model size.
Techniques such as multi-query attention and rotary position embeddings allow Gemma models to strike an optimal balance between quality, inference speed, and model size.
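To make rotary position embeddings concrete, here is a minimal NumPy sketch (illustrative only, not Gemma's actual implementation): each pair of dimensions in a query or key vector is rotated by an angle proportional to the token's position, so attention scores end up depending only on relative positions.

```python
import numpy as np

def rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Apply a rotary position embedding to a vector of even dimension."""
    half = x.shape[0] // 2
    theta = base ** (-2.0 * np.arange(half) / x.shape[0])  # per-pair frequencies
    ang = pos * theta
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:half], x[half:]
    # Rotate each (x1[i], x2[i]) pair by angle pos * theta[i]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])

rng = np.random.default_rng(0)
q, k = rng.standard_normal(8), rng.standard_normal(8)
# Dot products depend only on the relative offset between positions:
a = rope(q, 3) @ rope(k, 7)    # offset 4
b = rope(q, 13) @ rope(k, 17)  # same offset 4 -> same attention score
```

Because each rotation is norm-preserving, RoPE injects position information without distorting vector magnitudes, which is one reason it works well in deep stacks.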

Data Process and Training


Gemma was trained on up to 6 trillion tokens of primarily English text, including web documents, mathematical text, and source code. DeepMind invested significant effort in data filtering, removing toxic or sensitive content using classifiers and heuristics.
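DeepMind has not published its exact filtering pipeline, but a heuristic pre-filter of the kind described often looks like the toy sketch below: documents are dropped if they contain blocklisted terms or have too little alphanumeric content (the thresholds and blocklist here are illustrative assumptions, not Gemma's actual rules):

```python
def passes_filters(doc: str, blocklist: set[str], min_alnum_ratio: float = 0.6) -> bool:
    """Toy heuristic filter: reject blocklisted terms and symbol-heavy noise."""
    if not doc:
        return False
    lowered = doc.lower()
    if any(term in lowered for term in blocklist):
        return False  # potentially toxic or otherwise unwanted content
    alnum = sum(c.isalnum() or c.isspace() for c in doc)
    return alnum / len(doc) >= min_alnum_ratio  # drop markup/symbol debris

docs = ["A clean sentence about math.", "@@@###$$$ %%% !!!", ""]
kept = [d for d in docs if passes_filters(d, blocklist={"badword"})]
```

In production pipelines, cheap heuristics like these typically run first, with learned classifiers applied afterward to the documents that survive.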

Training was conducted on Google's TPUv5e infrastructure, with up to 4,096 chips used to train Gemma 7B. Model-sharding and data-parallelism techniques make training models of this scale efficient across TPU pods.
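Data parallelism, one of the techniques mentioned, splits each batch across accelerators: every replica computes gradients on its own shard, and the gradients are averaged before the weight update. A NumPy sketch of the idea for a simple linear model (illustrative only, not DeepMind's actual training stack):

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean squared error for a linear model y_hat = X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(42)
X, y = rng.standard_normal((8, 3)), rng.standard_normal(8)
w = rng.standard_normal(3)

# Two "replicas" each process one equally sized shard of the batch...
shards = [(X[:4], y[:4]), (X[4:], y[4:])]
local_grads = [grad_mse(w, Xs, ys) for Xs, ys in shards]
# ...then the gradients are averaged, recovering the full-batch gradient.
avg_grad = np.mean(local_grads, axis=0)
```

With equal shard sizes, the averaged gradient is mathematically identical to the full-batch gradient, which is what lets thousands of TPU chips cooperate on a single optimization step.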

Staged training is used, with the data mixture adjusted over the course of training to emphasize relevant, high-quality text. A final fine-tuning stage uses a mix of human-generated and synthetic instruction examples to improve capabilities.

Model Performance


DeepMind rigorously evaluated the Gemma models on more than 25 benchmarks covering question answering, reasoning, mathematics, coding, common sense, and dialogue.

Gemma achieves state-of-the-art results among similarly sized open models on most benchmarks. Some highlights:

Mathematics: Gemma excels on mathematical reasoning tests such as GSM8K and MATH, outperforming models such as Codex and Anthropic's Claude by more than 10 points.
Coding: Gemma equals or exceeds Codex's performance on programming benchmarks like MBPP, despite not being trained specifically on code.
Dialogue: Gemma demonstrated strong conversational ability, with a 51.7% win rate over Mistral-7B in human preference tests.
Reasoning: On reasoning tasks such as ARC and WinoGrande, Gemma outperforms other 7B models by 5-10 points.
Gemma's versatility across such a wide range of disciplines points to strong general capabilities. Although gaps to human-level performance remain, Gemma represents a leap forward for open-source NLP.
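Head-to-head numbers like the 51.7% win rate are estimates from a finite sample of human judgments, so it is worth attaching a confidence interval. A small sketch using the Wilson score interval (the sample size of 1,000 below is an illustrative assumption; DeepMind's actual study size may differ):

```python
import math

def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a binomial win-rate estimate."""
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# E.g. 517 wins out of a hypothetical 1,000 pairwise comparisons:
lo, hi = wilson_interval(517, 1000)
```

At that sample size the interval spans roughly 0.49 to 0.55, i.e. it includes 50% – a reminder that a 51.7% win rate signals rough parity rather than clear dominance unless the sample is large.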

Security and Responsibility


Releasing the weights of large models openly raises challenges around intentional misuse and inherent model bias. DeepMind has taken steps to mitigate these risks:

Data filtering: Potentially toxic, illegal, or biased text was removed from the training data using classifiers and heuristics.
Evaluations: Gemma was tested on 30+ curated benchmarks assessing safety, fairness, and robustness, on which it matches or exceeds comparable models.
Tuning: Fine-tuning focused on safety behaviors such as filtering sensitive information and appropriate hedging and refusal behavior.
Terms of use: The terms of use prohibit offensive, illegal, or unethical applications of Gemma models, though enforcement remains a challenge.
Model cards: Cards detailing model capabilities, limitations, and biases are released to encourage transparency.
Despite the risks of an open release, DeepMind judged that releasing Gemma provides a net social benefit given its safety profile and the research it enables. Careful monitoring for potential harms will nonetheless remain important.

Enabling the Next Wave of AI Innovation


Releasing Gemma as an open-source model family will unlock advances across the AI community:

Accessibility: Gemma lowers the barrier for organizations that previously faced prohibitive compute and data costs to train their own LLMs, letting them build on the cutting edge of NLP.
New applications: With pre-trained and customized open source checkpoints, DeepMind enables easier development of useful applications in areas such as education, science, and accessibility.
Customization: Developers can further customize Gemma for industry- or domain-specific applications through ongoing training on proprietary data.
Research: Open models like Gemma encourage greater transparency and auditing of current NLP systems, thereby illuminating future research directions.
Innovation: The availability of powerful underlying models like Gemma will accelerate progress in areas such as bias mitigation, factuality, and AI safety.
By bringing Gemma's capabilities to everyone through open source, DeepMind hopes to spur the development of responsible AI for social good.
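One popular lightweight way to customize a model on proprietary data, as described above, is low-rank adaptation (LoRA), which freezes the base weights and learns only a small low-rank update. A NumPy sketch of the core idea (a generic illustration, not a Gemma-specific API):

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A (LoRA-style)."""
    def __init__(self, W: np.ndarray, rank: int = 4, alpha: float = 8.0):
        d_out, d_in = W.shape
        self.W = W                                 # frozen base weights
        rng = np.random.default_rng(0)
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # small random init
        self.B = np.zeros((d_out, rank))           # zero init: starts as base model
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Only A and B would be updated during fine-tuning.
        return (self.W + self.scale * self.B @ self.A) @ x

W = np.random.default_rng(1).standard_normal((4, 4))
layer = LoRALinear(W)
x = np.ones(4)
# With B zero-initialized, the adapted layer matches the base layer exactly.
```

Because only the small A and B matrices are trained, adapters like this can be fine-tuned and stored at a fraction of the cost of updating the full weight matrices.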

The road ahead


With every leap in AI, we move closer to models that rival or surpass human intelligence in many areas. Systems like Gemma underscore how rapid advances in self-supervised models are unlocking increasingly sophisticated cognitive capabilities.

However, much work remains to improve the reliability, interpretability, and controllability of AI – areas where human intelligence still holds the advantage. Benchmarks such as MMLU highlight this persistent gap, with Gemma scoring about 64% against an estimated human expert performance of 89%.

Closing this gap while ensuring the safety and ethics of increasingly sophisticated AI systems will be a key challenge in the years to come. Striking the right balance between openness and caution is critical, as DeepMind aims to democratize access to the benefits of AI while managing emerging risks.

Initiatives to promote AI safety – such as DeepMind's Ethics & Society work and Anthropic's Constitutional AI – signal growing awareness of the need for this nuance. Meaningful progress requires open, evidence-based dialogue between researchers, developers, policymakers, and society.

If navigated responsibly, Gemma does not represent the pinnacle of AI, but rather a basecamp for the next generation of AI researchers following in DeepMind's footsteps towards fair and beneficial artificial general intelligence.

Conclusion


Gemma's launch marks a turning point for open-source AI: a model family that moves beyond narrow benchmarks toward general-purpose capability, extensively safety-tested and widely accessible – a new standard for responsible open-source releases.

Driven by a competitive spirit combined with cooperative values, sharing breakthroughs like Gemma strengthens the entire AI ecosystem. The whole community now has access to a versatile LLM family to power its initiatives.

While risks remain, DeepMind's technical diligence and ethical safeguards give confidence that Gemma's benefits outweigh its potential harms. As AI capabilities grow more sophisticated, maintaining a careful balance between openness and caution is critical.

Gemma takes us one step closer to AI that benefits all of humanity. But many major challenges still await on the road to benevolent artificial general intelligence. If AI researchers, developers, and society at large can sustain collaborative progress, Gemma may one day be seen as a historic basecamp, and not the ultimate summit.


IRFAN SHAH
