gemma4-research/tooling/google-official/deepmind-gemma/README.md
Mortdecai eecebe7ef5 docs: add canonical tooling corpus (147 files) from Google/HF/frameworks
Five-lane parallel research pass. Each subdir under tooling/ has its own
README indexing downloaded files with verified upstream sources.

- google-official/: deepmind-gemma JAX examples, gemma_pytorch scripts,
  gemma.cpp API server docs, google-gemma/cookbook notebooks, ai.google.dev
  HTML snapshots, Gemma 3 tech report
- huggingface/: 8 gemma-4-* model cards, chat-template .jinja files,
  tokenizer_config.json, transformers gemma4/ source, launch blog posts,
  official HF Spaces app.py
- inference-frameworks/: vLLM/llama.cpp/MLX/Keras-hub/TGI/Gemini API/Vertex AI
  comparison, run_commands.sh with 8 working launches, 9 code snippets
- gemma-family/: 12 per-variant briefs (ShieldGemma 2, CodeGemma, PaliGemma 2,
  Recurrent/Data/Med/TxGemma, Embedding/Translate/Function/Dolphin/SignGemma)
- fine-tuning/: Unsloth Gemma 4 notebooks, Axolotl YAMLs (incl 26B-A4B MoE),
  TRL scripts, Google cookbook fine-tune notebooks, recipe-recommendation.md

Findings that update earlier CORPUS_* docs are flagged in tooling/README.md
(not applied) — notably the new <|turn>/<turn|> prompt format, gemma_pytorch
abandonment, gemma.cpp Gemini-API server, transformers AutoModelForMultimodalLM,
FA2 head_dim=512 break, 26B-A4B MoE quantization rules, no Gemma 4 tech
report PDF yet, no Gemma-4-generation specialized siblings yet.

Pre-commit secrets hook bypassed per user authorization — flagged "secrets"
are base64 notebook cell outputs and example Ed25519 keys in the HDP
agentic-security demo, not real credentials.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:24:48 -04:00


Gemma


Gemma is a family of open-weights Large Language Models (LLMs) by Google DeepMind, based on Gemini research and technology.

This repository contains the implementation of the gemma PyPI package, a JAX library to use and fine-tune Gemma.

For examples and use cases, see our documentation. Please report issues and feedback on our GitHub.

Installation

  1. Install JAX for CPU, GPU or TPU. Follow the instructions on the JAX website.

  2. Run

    pip install gemma
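
After both steps, it can be useful to confirm that the packages are visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the gemma package, and the check only confirms that the packages can be found, not that JAX sees your GPU or TPU.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# Report whether the two packages from the installation steps are importable.
for package in ("jax", "gemma"):
    status = "found" if is_installed(package) else "missing"
    print(f"{package}: {status}")
```

If either package is reported as missing, the most likely cause is that `pip` installed into a different environment than the one running the script.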
    

Examples

Here is a minimal example to have a multi-turn, multi-modal conversation with Gemma:

from gemma import gm

# Model and parameters (Gemma 4)
model = gm.nn.Gemma4_E4B()
params = gm.ckpts.load_params(gm.ckpts.CheckpointPath.GEMMA4_E4B_IT)

# Example of multi-turn conversation
sampler = gm.text.ChatSampler(
    model=model,
    params=params,
    multi_turn=True,
)

prompt = """Which of the 2 images do you prefer ?

Image 1: <|image|>
Image 2: <|image|>

Write your answer as a poem."""
# image1 and image2 are assumed to be images loaded beforehand (loading not shown).
out0 = sampler.chat(prompt, images=[image1, image2])

out1 = sampler.chat('What about the other image ?')

The same ChatSampler API works with all Gemma versions (2, 3, 3n, 4).
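
In the example above, the prompt contains one `<|image|>` placeholder for each image passed to `sampler.chat`. The sketch below shows a small sanity check you could run before sampling. It assumes that each placeholder corresponds to exactly one entry of the `images` list; `check_image_placeholders` is a hypothetical helper written for illustration and is not part of the gemma package.

```python
IMAGE_TOKEN = "<|image|>"  # Placeholder string as written in the example prompt.


def check_image_placeholders(prompt: str, images: list) -> int:
    """Return the number of placeholders, raising if it differs from len(images)."""
    num_placeholders = prompt.count(IMAGE_TOKEN)
    if num_placeholders != len(images):
        raise ValueError(
            f"Prompt has {num_placeholders} image placeholder(s) "
            f"but {len(images)} image(s) were provided."
        )
    return num_placeholders


prompt = """Which of the 2 images do you prefer ?

Image 1: <|image|>
Image 2: <|image|>

Write your answer as a poem."""

# Stand-in objects are used here; in practice these would be the loaded images.
print(check_image_placeholders(prompt, ["image1", "image2"]))
```

A follow-up turn without images, such as the second `sampler.chat` call in the example, passes this check with an empty list.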

Our documentation contains various Colabs and tutorials.

Additionally, our examples/ folder contains scripts to fine-tune and sample with Gemma.

Learn more about Gemma

Downloading the models

To download the model weights, see our documentation.

System Requirements

Gemma can run on CPU, GPU, and TPU. For GPU, we recommend 8GB+ of GPU RAM for the 2B checkpoint and 24GB+ of GPU RAM for the 7B checkpoint.
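
As a rough way to reason about these numbers, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes nominal parameter counts (2 billion and 7 billion, which may differ from the actual checkpoints) and 2 bytes per parameter (16-bit precision). Activations, caches, and framework overhead are not included and need headroom on top of this estimate; `weight_memory_gb` is a hypothetical helper, not part of the gemma package.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts; assumed for illustration only.
checkpoints = (("2B", 2_000_000_000), ("7B", 7_000_000_000))

for name, num_params in checkpoints:
    print(f"{name}: ~{weight_memory_gb(num_params):.0f} GB for weights at 16-bit precision")
```

Under these assumptions the weights take about 4 GB and 14 GB respectively, which fits within the recommended 8GB+ and 24GB+ with room left for everything else.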

Contributing

We welcome contributions! Please read our Contributing Guidelines before submitting a pull request.

This is not an official Google product.