docs: add canonical tooling corpus (147 files) from Google/HF/frameworks

Five-lane parallel research pass. Each subdir under tooling/ has its own README indexing downloaded files with verified upstream sources.

- google-official/: deepmind-gemma JAX examples, gemma_pytorch scripts, gemma.cpp API server docs, google-gemma/cookbook notebooks, ai.google.dev HTML snapshots, Gemma 3 tech report
- huggingface/: 8 gemma-4-* model cards, chat-template .jinja files, tokenizer_config.json, transformers gemma4/ source, launch blog posts, official HF Spaces app.py
- inference-frameworks/: vLLM/llama.cpp/MLX/Keras-hub/TGI/Gemini API/Vertex AI comparison, run_commands.sh with 8 working launches, 9 code snippets
- gemma-family/: 12 per-variant briefs (ShieldGemma 2, CodeGemma, PaliGemma 2, Recurrent/Data/Med/TxGemma, Embedding/Translate/Function/Dolphin/SignGemma)
- fine-tuning/: Unsloth Gemma 4 notebooks, Axolotl YAMLs (incl 26B-A4B MoE), TRL scripts, Google cookbook fine-tune notebooks, recipe-recommendation.md

Findings that update earlier CORPUS_* docs are flagged in tooling/README.md (not applied), notably the new <|turn>/<turn|> prompt format, gemma_pytorch abandonment, gemma.cpp Gemini-API server, transformers AutoModelForMultimodalLM, FA2 head_dim=512 break, 26B-A4B MoE quantization rules, no Gemma 4 tech report PDF yet, no Gemma-4-generation specialized siblings yet.

Pre-commit secrets hook bypassed per user authorization. Flagged "secrets" are base64 notebook cell outputs and example Ed25519 keys in the HDP agentic-security demo, not real credentials.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,105 @@
# TranslateGemma

Multilingual text + image translation. Released **January 15, 2026**. Built on **Gemma 3** (not Gemma 4, despite being the newest variant at time of writing).

## What it is

Gemma 3 fine-tuned for translation across **55 languages**, using a two-stage distillation from Gemini. It retains Gemma 3's multimodal capability, so it can translate text embedded in images.

## Sizes

- **4B IT**
- **12B IT**
- **27B IT**

Google's headline claim: the 12B model beats the Gemma 3 27B baseline on translation quality with less than half the parameters.

## Model card

- HF: https://huggingface.co/google/translategemma-4b-it
- Blog: https://blog.google/innovation-and-ai/technology/developers-tools/translategemma/
- InfoQ: https://www.infoq.com/news/2026/01/google-translategemma-models/

## Supported languages

55 languages via ISO 639-1 codes (`en`, `de`, `es`, `fr`, `pl`, `ja`, `zh`, `ar`, `hi`, etc.) plus regional variants (`en-US`, `en-GB`, `pt-BR`, `pt-PT`, `de-DE`, `de-AT`, `de-CH`, `zh-CN`, `zh-TW`, etc.).
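
A pipeline can reject malformed codes before they reach the chat template. The following is a minimal sketch, assuming every code is either a two-letter lowercase language code or that code plus a two-letter uppercase region, as in the examples above. The function name is hypothetical, and the check is purely syntactic: it does not confirm that a code is one of the 55 supported languages, and the pattern would need widening if the model card lists codes of another shape.

```python
import re

# Assumed shape: "xx" or "xx-YY" (e.g. "pl", "pt-BR"). Not taken from the model card.
_LANG_CODE = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def looks_like_lang_code(code: str) -> bool:
    """Return True if `code` has the shape 'xx' or 'xx-YY'."""
    return _LANG_CODE.fullmatch(code) is not None
```

For example, `looks_like_lang_code("de-CH")` passes while `looks_like_lang_code("german")` does not.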

## Prompt format

**Strict chat-template format.** Content list must contain exactly **one entry**, with mandatory `source_lang_code` and `target_lang_code`.

### Text translation

```python
messages = [{
    "role": "user",
    "content": [{
        "type": "text",
        "source_lang_code": "cs",
        "target_lang_code": "de-DE",
        "text": "V nejhorším případě i k prasknutí čočky.",
    }],
}]
```

### Image translation (translates text inside the image)

```python
messages = [{
    "role": "user",
    "content": [{
        "type": "image",
        "source_lang_code": "ja",
        "target_lang_code": "en",
        "url": "https://example.com/japanese-sign.jpg",
    }],
}]
```

Only `"text"` and `"image"` types are supported. Only `user` and `assistant` roles. Image input is normalized to 896×896 (256 vision tokens).
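
Because the contract is strict, it can help to check a `messages` list before handing it to the model. The sketch below encodes only the rules stated above; the function name, constants, and error messages are hypothetical. It assumes the payload key is `text` for text entries and `url` for image entries, as in the two examples, and it leaves assistant turns unchecked because their content shape is not described here.

```python
ALLOWED_ROLES = {"user", "assistant"}
# Payload key per entry type, inferred from the two examples above (an assumption).
PAYLOAD_KEY = {"text": "text", "image": "url"}


def check_messages(messages: list) -> None:
    """Raise ValueError if `messages` breaks the prompt contract described above."""
    for msg in messages:
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        if role != "user":
            continue  # assistant content shape is not specified in this brief
        content = msg.get("content")
        if not isinstance(content, list) or len(content) != 1:
            raise ValueError("user content must be a list with exactly one entry")
        entry = content[0]
        entry_type = entry.get("type")
        if entry_type not in PAYLOAD_KEY:
            raise ValueError(f"unsupported content type: {entry_type!r}")
        for key in ("source_lang_code", "target_lang_code", PAYLOAD_KEY[entry_type]):
            if not entry.get(key):
                raise ValueError(f"missing required field: {key}")
```

Calling `check_messages(messages)` on either example above returns silently; a second content entry or a missing `target_lang_code` raises `ValueError`.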

## Minimum invocation

```python
from transformers import pipeline
import torch

# Load the 4B instruction-tuned checkpoint on the GPU in bfloat16.
pipe = pipeline(
    "image-text-to-text",
    model="google/translategemma-4b-it",
    device="cuda",
    dtype=torch.bfloat16,
)

# One user turn with exactly one content entry, as the prompt format requires.
messages = [{
    "role": "user",
    "content": [{
        "type": "text",
        "source_lang_code": "pl",
        "target_lang_code": "en",
        "text": "Dziadek mieszkał w Warszawie przed wojną.",
    }],
}]

out = pipe(text=messages, max_new_tokens=200)
# The last message in generated_text is the assistant's translation.
print(out[0]["generated_text"][-1]["content"])
```
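
For repeated calls, the message construction and output unpacking can be wrapped in a small helper. This is a sketch only: `translate_text` is a hypothetical name, and it assumes `pipe` behaves like the pipeline above, meaning it accepts `text=` and `max_new_tokens=` and returns the conversation under `generated_text`.

```python
def translate_text(pipe, text, source_lang_code, target_lang_code, max_new_tokens=200):
    """Translate one string using a pipeline-like callable (hypothetical helper)."""
    messages = [{
        "role": "user",
        "content": [{
            "type": "text",
            "source_lang_code": source_lang_code,
            "target_lang_code": target_lang_code,
            "text": text,
        }],
    }]
    out = pipe(text=messages, max_new_tokens=max_new_tokens)
    # Same unpacking as the invocation above: the last message is the reply.
    return out[0]["generated_text"][-1]["content"]
```

Taking `pipe` as an argument keeps the helper testable with a stub and avoids loading the model at import time.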

## Performance

- **WMT24++ across 55 languages:** MetricX 5.32, COMET 81.6.
- Context window: 2K tokens (short; this is a translation model, not a long-doc summarizer).
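
With a 2K window, longer documents have to be split before translation. The sketch below packs sentences greedily under a token budget. The function name, the default budget of 1000, and the whitespace word count used as a token proxy are all assumptions; a real pipeline would pass the model tokenizer's count as `count_tokens` and choose a budget that leaves room for the chat template and the generated output.

```python
import re


def split_for_translation(text, max_tokens=1000, count_tokens=lambda s: len(s.split())):
    """Greedily pack sentences into chunks of at most `max_tokens` (hypothetical helper).

    A single sentence longer than the budget is emitted as its own oversized chunk.
    """
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            chunks.append(current)  # budget exceeded: close the current chunk
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be translated separately and the results joined in order.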

## When to choose it over base Gemma 4

- You want **translation quality above general-purpose Gemma 4** at equivalent size, with the strict prompt contract making it easy to drop into a pipeline.
- You need **image-text translation** (street signs, menus, old documents) as a first-class task.
- You care about the 55-language coverage and regionalized variants.

Base Gemma 4 31B *can* translate, which is fine for casual use. TranslateGemma wins for production pipelines and when you care about metric-validated quality.

## Homelab fit

**Strong fit for family history agent.** If source documents are in German, Polish, Hungarian, Yiddish, or any of the 55 supported languages, TranslateGemma 4B on pve197 (GPU-backed) becomes the translation leg of an ingest pipeline: OCR → TranslateGemma → Gemma 4 for reasoning. The 4B size fits alongside the other models on the V100.
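
The three-stage ingest flow can be sketched as plain function composition. Everything here is hypothetical: `ingest_document` and the `ocr`, `translate`, and `reason` callables stand in for whatever OCR tool, TranslateGemma wrapper, and Gemma 4 client the homelab actually uses.

```python
def ingest_document(image_path, source_lang_code, ocr, translate, reason,
                    target_lang_code="en"):
    """Run OCR -> translation -> reasoning and keep every intermediate result."""
    source_text = ocr(image_path)                      # stage 1: extract text
    translated = translate(source_text, source_lang_code, target_lang_code)  # stage 2
    answer = reason(translated)                        # stage 3: downstream model
    # Returning the intermediates lets a reviewer compare the original and the translation.
    return {"source_text": source_text, "translated": translated, "answer": answer}
```

Passing the stages in as callables keeps each leg swappable and lets the flow be exercised with stubs before any model is loaded.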

Also useful for SearXNG (if Seth ever wants to auto-translate non-English search results) and the news-summary print system (translate foreign-language feeds before summarization).