gemma4-research/tooling/gemma-family/medgemma.md
Mortdecai eecebe7ef5 docs: add canonical tooling corpus (147 files) from Google/HF/frameworks
Five-lane parallel research pass. Each subdir under tooling/ has its own
README indexing downloaded files with verified upstream sources.

- google-official/: deepmind-gemma JAX examples, gemma_pytorch scripts,
  gemma.cpp API server docs, google-gemma/cookbook notebooks, ai.google.dev
  HTML snapshots, Gemma 3 tech report
- huggingface/: 8 gemma-4-* model cards, chat-template .jinja files,
  tokenizer_config.json, transformers gemma4/ source, launch blog posts,
  official HF Spaces app.py
- inference-frameworks/: vLLM/llama.cpp/MLX/Keras-hub/TGI/Gemini API/Vertex AI
  comparison, run_commands.sh with 8 working launches, 9 code snippets
- gemma-family/: 12 per-variant briefs (ShieldGemma 2, CodeGemma, PaliGemma 2,
  Recurrent/Data/Med/TxGemma, Embedding/Translate/Function/Dolphin/SignGemma)
- fine-tuning/: Unsloth Gemma 4 notebooks, Axolotl YAMLs (incl 26B-A4B MoE),
  TRL scripts, Google cookbook fine-tune notebooks, recipe-recommendation.md

Findings that update earlier CORPUS_* docs are flagged in tooling/README.md
(not applied) — notably the new <|turn>/<turn|> prompt format, gemma_pytorch
abandonment, gemma.cpp Gemini-API server, transformers AutoModelForMultimodalLM,
FA2 head_dim=512 break, 26B-A4B MoE quantization rules, no Gemma 4 tech
report PDF yet, no Gemma-4-generation specialized siblings yet.

Pre-commit secrets hook bypassed per user authorization — flagged "secrets"
are base64 notebook cell outputs and example Ed25519 keys in the HDP
agentic-security demo, not real credentials.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:24:48 -04:00

MedGemma

Medical-domain Gemma variant for text and image comprehension. The current release is MedGemma 1.5 (Jan 13, 2026), built on Gemma 3. There is no Gemma 4 generation of MedGemma yet.

What it is

Gemma 3 fine-tuned on de-identified medical corpora: clinical notes, radiology images, dermatology images, histopathology, etc. The multimodal variants use a SigLIP image encoder pre-trained specifically on medical imagery, not the base SigLIP checkpoint.

Sizes

MedGemma 1.5 (current): 4B multimodal, instruction-tuned (IT) only. It improves on MedGemma 1 in medical reasoning, records interpretation, and image interpretation. The 27B variants exist only in MedGemma 1.

MedGemma 1 (prior): 4B multimodal, 27B text-only, 27B multimodal.

Model card

Intended use

"A starting point that enables more efficient development of downstream healthcare applications involving medical text and images." Developer tool, not a clinical product.

Disclaimer (near-verbatim from model card)

The outputs generated by MedGemma are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications. All outputs require independent verification and clinical correlation.

Use is governed by the Health AI Developer Foundations terms of use, a separate license from base Gemma's. Read it before shipping anything.

Prompt format

Standard Gemma 3 chat template. Each message's content is a list of typed parts: {"type": "image"} for images and {"type": "text"} for text.
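A minimal sketch of that message shape, for callers who assemble prompts programmatically. The helper name `build_messages` and its parameters are hypothetical (they are not part of transformers or the model card); only the role/content/part layout follows the format described above. The optional system turn assumes the Gemma 3 template's handling of a `system` role.

```python
from typing import Any, Optional


def build_messages(
    prompt: str,
    image: Optional[Any] = None,
    system: Optional[str] = None,
) -> list[dict]:
    """Build a chat-template message list with typed content parts.

    Hypothetical helper: `image` is passed through untouched, so it can be
    a PIL image, a URL string, or whatever the caller's processor accepts.
    """
    messages = []
    if system is not None:
        # System turn carries a single text part.
        messages.append(
            {"role": "system", "content": [{"type": "text", "text": system}]}
        )
    parts = []
    if image is not None:
        # Image part goes first, followed by the text instruction.
        parts.append({"type": "image", "image": image})
    parts.append({"type": "text", "text": prompt})
    messages.append({"role": "user", "content": parts})
    return messages


# Text-only request with a system instruction.
msgs = build_messages(
    "List the sections of a typical discharge summary.",
    system="You are a helpful medical assistant.",
)
```

Text-only requests simply omit the image part; the user turn then contains one text part.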

Minimum invocation

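import requests
import torch
from PIL import Image
from transformers import pipeline

# Load the multimodal pipeline in bfloat16 on a single GPU.
pipe = pipeline(
    "image-text-to-text",
    model="google/medgemma-1.5-4b-it",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

# Wikimedia may reject requests that lack a User-Agent header, so send one.
img_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"
response = requests.get(img_url, headers={"User-Agent": "example"}, stream=True)
response.raise_for_status()
image = Image.open(response.raw)

messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Describe this chest X-ray. What anatomical structures are visible?"},
]}]

# The pipeline returns the full conversation; the last message is the model's reply.
out = pipe(text=messages, max_new_tokens=512)
print(out[0]["generated_text"][-1]["content"])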
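When more control over generation is needed than the pipeline offers, the same checkpoint can probably be driven through the model and processor classes directly. The following is a sketch under assumptions: it presumes MedGemma 1.5 loads through `AutoModelForImageTextToText` as Gemma 3 multimodal checkpoints do, that `accelerate` is installed for `device_map="auto"`, and the function names `load` and `generate` are illustrative only.

```python
MODEL_ID = "google/medgemma-1.5-4b-it"  # same checkpoint as the pipeline example


def load(model_id: str = MODEL_ID):
    """Load model and processor (assumes a GPU and `accelerate` are available)."""
    # Heavy imports are deferred so that importing this module stays cheap.
    import torch
    from transformers import AutoModelForImageTextToText, AutoProcessor

    model = AutoModelForImageTextToText.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)
    return model, processor


def generate(model, processor, messages, max_new_tokens: int = 512) -> str:
    """Apply the chat template, run greedy decoding, return only the new text."""
    import torch

    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device, dtype=torch.bfloat16)
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens so only the model's reply is decoded.
    return processor.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


# Usage (downloads the weights on first call):
#   model, processor = load()
#   print(generate(model, processor, messages))
```

Slicing off the prompt tokens before decoding plays the role that indexing the last message plays in the pipeline output.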

When to choose it over base Gemma 4

  • You're building healthcare dev tools (medical image triage assistant, doctor-facing records summarizer, clinician education) and want the SigLIP-medical image encoder.
  • You can accept the Health AI Developer Foundations license and embed the disclaimers.
  • You need medical-vocabulary fluency (SNOMED, ICD, RxNorm) that base Gemma 4 doesn't have at the 4B size.

Use base Gemma 4 otherwise, including for health-adjacent content that isn't clinical (fitness logs, nutrition, sleep data).
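The criteria above can be read as a small decision rule. The sketch below is one interpretation, not a rule stated by the model card: it assumes MedGemma is chosen only when the workload is clinical, the license is acceptable, and at least one medical capability (imaging or vocabulary) is actually needed. `Workload`, `pick_model`, and the returned labels are hypothetical names, not model IDs.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    clinical: bool                  # healthcare dev tool, not just health-adjacent content
    needs_medical_imaging: bool     # wants the medically tuned image encoder
    needs_medical_vocabulary: bool  # SNOMED / ICD / RxNorm fluency
    accepts_hai_def_terms: bool     # can ship under the HAI-DEF terms with disclaimers


def pick_model(w: Workload) -> str:
    """Return a label ("medgemma" or "base-gemma-4") for the given workload."""
    # Non-clinical content or an unacceptable license rules MedGemma out.
    if not w.clinical or not w.accepts_hai_def_terms:
        return "base-gemma-4"
    # MedGemma only pays off when a medical capability is actually required.
    if w.needs_medical_imaging or w.needs_medical_vocabulary:
        return "medgemma"
    return "base-gemma-4"


# A sleep-data dashboard is health-adjacent, not clinical.
print(pick_model(Workload(False, False, False, True)))
```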

Homelab fit

Zero. Seth is not running medical apps. Noted for completeness only.