# Gemma 4 Fine-Tuning Tooling — Index

Research captured 2026-04-18. All downloads verified against upstream repos.

## TL;DR

| Tool | Gemma 4 coverage | GPU floor (LoRA) | GPU floor (full FT) | Best at |
|------|------------------|------------------|---------------------|---------|
| **Unsloth** | Full parity — all 4 sizes, text/vision/audio/GRPO/RL | E2B: 8 GB, E4B: 17 GB, 26B A4B: ~40 GB, 31B QLoRA: 22 GB | Not recommended locally | **Fastest path**, Google-blessed, free Colab |
| **TRL** | Partial — no `sft_gemma4.py` yet; `sft_gemma3.py` + `AutoModelForImageTextToText` works | Same as Unsloth w/ `load_in_4bit` | 2x H100 min for 31B | Research-grade control, DPO/GRPO/online RL, VLM GRPO on Gemma 4 (CARLA) |
| **Axolotl** | **Native Gemma 4 configs shipped** (`examples/gemma4/`) | Single 5090 (32 GB) for 26B A4B QLoRA validated | >80 GB, "not tested" per README | Declarative YAML, multi-GPU FSDP, MoE expert LoRA |
| **Google cookbook** | `docs/core/*` notebooks default to `google/gemma-4-E2B` | Depends on Colab tier | L4 (22 GB) for E4B QLoRA | Canonical baseline, paired with ai.google.dev docs |
| **HF gemma-recipes** | Inference + one GRPO VLM script (CARLA) | E2B on T4 | — | VLM GRPO with tool-calling environment |
| **Ollama** | Serves fine-tuned Gemma 4 via Modelfile `ADAPTER` | — | — | Final serving step |

**Recommendation for Seth: Unsloth.** See `recipe-recommendation.md`.

---

## 1. Unsloth (`unsloth/`)

**Upstream:** `unslothai/notebooks`, `unslothai/unsloth`
**License:** LGPL-3.0 (notebooks), Apache-2.0 (library)
**Published Gemma 4 Dynamic quants:**
- `unsloth/gemma-4-{E2B,E4B,31B,26B-A4B}-{,it}-unsloth-bnb-4bit` (dynamic 4-bit)
- `unsloth/gemma-4-{E2B,E4B,31B,26B-A4B}-it-GGUF` (GGUF for inference)
- Collection: https://huggingface.co/collections/unsloth/gemma-4

**Downloaded files (local paths under this directory):**
- `unsloth/notebooks/Gemma4_(E2B)-Text.ipynb` — **canonical SFT notebook, T4-compatible**
- `unsloth/notebooks/Gemma4_(E4B)-Text.ipynb` — 10 GB VRAM, higher accuracy
- `unsloth/notebooks/Gemma4_(26B_A4B)-Text.ipynb` — MoE SFT (needs A100+)
- `unsloth/notebooks/Gemma4_(31B)-Text.ipynb` — dense 31B SFT
- `unsloth/notebooks/Gemma4_(E2B|E4B|26B_A4B|31B)-Vision.ipynb` — vision SFT w/ `UnslothVisionDataCollator`
- `unsloth/notebooks/Gemma4_(E2B|E4B)-Audio.ipynb` — audio SFT (E2B/E4B only — 31B/26B have no audio encoder)
- `unsloth/notebooks/Gemma4_(E2B)_GRPO.ipynb` — GRPO RL w/ Python reward funcs
- `unsloth/notebooks/Gemma4_(E2B)_Reinforcement_Learning_{2048,Sudoku}_Game.ipynb` — game-playing RL
- `unsloth/python_scripts/*.py` — the same notebooks exported as `.py` scripts (easier to grep/modify)
- `unsloth/kaggle/Gemma4_(31B)-Text.ipynb`, `unsloth/kaggle/Gemma4_(E4B)-Text.ipynb` — Kaggle-flavored variants
- `unsloth/docs/unsloth-README.md` — top-level Unsloth README

**Upstream URLs (useful to share):**
- SFT E4B Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma4_(E4B)-Text.ipynb
- GRPO Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma4_(E2B)_GRPO.ipynb
- Unsloth Gemma 4 docs: https://unsloth.ai/docs/models/gemma-4/train

### Unsloth chat-template & masking detail (CRITICAL for Gemma 4)

Gemma 4 does **not** use Gemma 3's `<start_of_turn>` / `<end_of_turn>`. The new format is:

```
<bos><|turn>user
Hello<turn|>
<|turn>model
Hey there!<turn|>
```
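
For debugging, it can help to render a conversation by hand and diff it against what the tokenizer produces. The sketch below only reproduces the layout shown above; `render_gemma4_turns` is a hypothetical helper, and the newline after each `<turn|>` is an assumption. The tokenizer's built-in Jinja template remains the authority on exact whitespace.

```python
def render_gemma4_turns(messages, add_bos=True):
    """Render role/content messages with the turn markers shown above.

    Hypothetical helper for eyeballing the format; not a replacement for
    tokenizer.apply_chat_template.
    """
    parts = ["<bos>"] if add_bos else []
    for message in messages:
        # Each turn: opening marker + role, newline, content, closing marker.
        parts.append(f"<|turn>{message['role']}\n{message['content']}<turn|>\n")
    return "".join(parts)


example = render_gemma4_turns([
    {"role": "user", "content": "Hello"},
    {"role": "model", "content": "Hey there!"},
])
print(example)
```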

Unsloth's helper:
```python
from unsloth.chat_templates import get_chat_template
tokenizer = get_chat_template(tokenizer, chat_template = "gemma-4")  # literal "gemma-4", not "gemma4"
```

Response-only masking (matches Unsloth's convention; everything *before* `response_part` is loss-masked):
```python
from unsloth.chat_templates import train_on_responses_only
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|turn>user\n",
    response_part = "<|turn>model\n",
)
```
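
What response-only masking does to the labels can be sketched in plain Python. This is an illustration of the idea, not Unsloth's implementation: `find_all` and `mask_non_responses` are hypothetical names, and the toy token ids stand in for the tokenized `instruction_part` and `response_part` markers.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def find_all(seq, pattern):
    """Return every start index where `pattern` occurs in `seq`."""
    n = len(pattern)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]


def mask_non_responses(input_ids, instruction_ids, response_ids):
    """Build labels that keep only the tokens following each response marker."""
    labels = [IGNORE_INDEX] * len(input_ids)
    instruction_starts = find_all(input_ids, instruction_ids)
    for start in find_all(input_ids, response_ids):
        begin = start + len(response_ids)
        # A response runs until the next instruction marker, or to the end.
        end = next((i for i in instruction_starts if i > start), len(input_ids))
        labels[begin:end] = input_ids[begin:end]
    return labels


# Toy ids: 1 stands in for "<|turn>user\n", 2 for "<|turn>model\n".
ids = [1, 10, 11, 2, 20, 21, 1, 12, 2, 22]
labels = mask_non_responses(ids, [1], [2])
print(labels)
```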

`<bos>` gotcha: `apply_chat_template` prepends `<bos>`; Unsloth's `formatting_prompts_func` strips it with `.removeprefix('<bos>')` because the SFTTrainer's data collator adds its own — double `<bos>` silently degrades training.

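
A quick way to catch the double-`<bos>` case before training is to check the rendered text directly. The sketch below is a minimal illustration with hypothetical helper names; the "downstream step" is simulated by string concatenation rather than a real collator.

```python
BOS = "<bos>"


def strip_leading_bos(text: str) -> str:
    """Drop one leading <bos> so a later step can add its own."""
    return text.removeprefix(BOS)  # str.removeprefix needs Python 3.9+


def has_double_bos(text: str) -> bool:
    """True when the text starts with two consecutive <bos> markers."""
    return text.startswith(BOS * 2)


rendered = "<bos><|turn>user\nHello<turn|>\n"
stripped = strip_leading_bos(rendered)

# Simulate a downstream step that prepends its own <bos>.
without_fix = has_double_bos(BOS + rendered)
with_fix = has_double_bos(BOS + stripped)
print(without_fix, with_fix)
```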
**Tool tokens (`<|tool>`, `<|tool_call>`, `<|tool_response>`, `<|"|>`) are *not* masked** in Unsloth's default setup — they flow through as plain text inside user/assistant turns. If you're fine-tuning on tool-call data, include full `<|tool_call>...<tool_call|>` markup in the assistant `content` field; the template doesn't need a special `role=tool` branch.

### Unsloth MoE note

For 26B A4B (128 experts): Unsloth explicitly recommends **bf16/16-bit LoRA, NOT 4-bit QLoRA** ("MoE QLoRA not recommended, dense 31B is fine"). Their notebook uses `load_in_4bit = True` at >40 GB but the docs flag this as suboptimal.

---

## 2. TRL (`trl/`)

**Upstream:** `huggingface/trl`
**License:** Apache-2.0

**Gemma 4-specific scripts:** NONE in `examples/scripts/` as of 2026-04-18. The canonical Gemma 4 TRL example lives in `huggingface-gemma-recipes/scripts/carla_vlm_gemma.py` (see section 5).

**Closest-match Gemma 3 scripts downloaded (drop-in for Gemma 4 — change `model_id` to `google/gemma-4-*-it`, keep `AutoModelForImageTextToText`):**
- `trl/sft_gemma3.py` — **use this as the Gemma 4 SFT template**. Pure text SFT (Codeforces-COTS).
- `trl/sft_vlm_gemma3.py` — vision SFT template (uses `AutoModelForImageTextToText`, `all-linear` LoRA).
- `trl/sft.py`, `trl/trl_scripts_sft.py` — the generic SFTTrainer wrappers.
- `trl/sft_vlm.py` — model-agnostic VLM SFT.
- `trl/dpo.py` — DPO (1-liner using TrlParser).
- `trl/grpo_agent.py`, `trl/grpo_vlm.py` — GRPO with tool-calling environments.
- `trl/sft_tiny_aya_tool_calling.py` — tool-calling SFT pattern.

**Chat template / masking detail:** TRL's `SFTTrainer` uses `tokenizer.apply_chat_template` end-to-end and delegates to the tokenizer's built-in Jinja template. For `google/gemma-4-*-it`, that template already produces `<|turn>user…<turn|>`. TRL supports assistant-only loss via `SFTConfig(assistant_only_loss=True)` (TRL ≥ 0.22), which masks every token outside the assistant turns, so no manual `instruction_part` plumbing is needed. (`completion_only_loss` is a separate `SFTConfig` flag that applies to prompt-completion datasets.)

### Official HF blog says (verbatim):

> "Gemma 4 is fully supported for fine-tuning with TRL. … we have prepared an example on how to fine-tune Gemma 4 with TRL on Vertex AI using SFT, to showcase how to extend the function calling capabilities, **whilst freezing both the vision and audio towers**."

(see `huggingface-recipes/hf-blog-gemma4.md` §634-687)

---

## 3. Axolotl (`axolotl/`)

**Upstream:** `axolotl-ai-cloud/axolotl`, `examples/gemma4/`
**License:** Apache-2.0
**Gemma 4 status:** **Native support shipped**, day-one-class parity.

**Downloaded files:**
- `axolotl/README.md` — official Axolotl Gemma 4 guide
- `axolotl/31b-qlora.yaml` — 31B dense QLoRA, 1x80GB @ ~44 GB VRAM
- `axolotl/31b-qlora-flex.yaml` — 31B dense QLoRA + Flex Attention, 1x80GB @ ~26 GB (40% less VRAM, 50% throughput cost)
- `axolotl/26b-a4b-moe-qlora.yaml` — 26B MoE QLoRA + ScatterMoE expert-quantized + Expert-LoRA. Validated: 50 steps FineTome, loss 8.8→1.8, single RTX 5090 (32 GB), 21 GiB peak
- `axolotl/e2b-vision-lora.yaml` — E2B vision LoRA with `freeze_mm_modules: true`

**Run command (from Axolotl README):**
```bash
axolotl train examples/gemma4/26b-a4b-moe-qlora.yaml
axolotl train examples/gemma4/31b-qlora.yaml
axolotl train examples/gemma4/31b-qlora-flex.yaml
```

### Axolotl chat template & masking detail

```yaml
chat_template: gemma4
datasets:
  - path: mlabonne/FineTome-100k
    type: chat_template
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
```
`chat_template: gemma4` (no dash — Axolotl's key is different from Unsloth's `"gemma-4"`). The template applies Gemma 4 turn tokens (`<|turn>user … <turn|>`). Masking is handled automatically by `type: chat_template` — only the assistant turn counts toward loss.

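
The `message_property_mappings` block above tells Axolotl to read `role` from the `from` key and `content` from the `value` key of each entry under `conversations`. The same reshaping can be sketched in Python for use with tools that expect a `messages` list. `sharegpt_to_messages` is a hypothetical helper, and the `human`/`gpt` role names in `ROLE_MAP` are the usual ShareGPT convention, assumed here rather than taken from the Axolotl config.

```python
# Assumed ShareGPT role names; adjust if the dataset uses different ones.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_messages(record, field_messages="conversations"):
    """Convert one ShareGPT-style record into a role/content messages list."""
    return [
        {
            "role": ROLE_MAP.get(turn["from"], turn["from"]),  # role <- from
            "content": turn["value"],                          # content <- value
        }
        for turn in record[field_messages]
    ]


record = {
    "conversations": [
        {"from": "human", "value": "Hello"},
        {"from": "gpt", "value": "Hey there!"},
    ]
}
messages = sharegpt_to_messages(record)
print(messages)
```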
### Axolotl hard limitations for Gemma 4 (from their README)

- **Flash Attention OFF.** FA2 caps head_dim at 256; FA4 at 128; Gemma 4's `global_head_dim=512` exceeds both. **Use SDP or Flex Attention.** (`sdp_attention: true` in every yaml.)
- **LoRA kernels OFF.** Due to Gemma 4's shared-KV layers (last N layers reuse K/V tensors): `lora_mlp_kernel: false`, `lora_qkv_kernel: false`, `lora_o_kernel: false`.
- **`lora_target_linear` is incompatible** for multimodal. You MUST use `lora_target_modules` with the regex (see below) to restrict LoRA to the text decoder and NOT the vision/audio encoders.

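
The Flash Attention limitation in the list above is a plain head-dimension comparison. A minimal sketch, using only the caps quoted there; the backend labels and function names are hypothetical and are not Axolotl config keys.

```python
# Head-dim caps as quoted from the Axolotl README above.
HEAD_DIM_CAPS = {"flash_attention_2": 256, "flash_attention_4": 128}
GEMMA4_GLOBAL_HEAD_DIM = 512


def usable_flash_backends(head_dim):
    """Return the flash-attention variants whose cap covers `head_dim`."""
    return [name for name, cap in HEAD_DIM_CAPS.items() if head_dim <= cap]


def pick_attention(head_dim, fallback="sdp"):
    """Pick the first usable flash backend, else fall back (SDP or Flex)."""
    backends = usable_flash_backends(head_dim)
    return backends[0] if backends else fallback


gemma4_choice = pick_attention(GEMMA4_GLOBAL_HEAD_DIM)
print(gemma4_choice)
```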
Axolotl's canonical regex restricts LoRA to text layers only:
```regex
model.language_model.layers.[\d]+.(_checkpoint_wrapped_module.)?(mlp|self_attn).(up|down|gate|q|k|v|o)_proj
```
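
To see what the regex selects, it can be run against a few module names. The names below are illustrative (the vision-tower path in particular is hypothetical), and whole-string matching via `re.fullmatch` is an assumption about how the pattern is applied.

```python
import re

# The regex from above, split across two raw strings for readability.
PATTERN = re.compile(
    r"model.language_model.layers.[\d]+.(_checkpoint_wrapped_module.)?"
    r"(mlp|self_attn).(up|down|gate|q|k|v|o)_proj"
)

candidates = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.12.mlp.gate_proj",
    "model.language_model.layers.3._checkpoint_wrapped_module.mlp.down_proj",
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",  # hypothetical name
    "model.language_model.layers.3.experts.down_proj",       # MoE expert tensor
]

# Keep only names the pattern matches in full.
matched = [name for name in candidates if PATTERN.fullmatch(name)]
for name in matched:
    print(name)
```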

For 26B A4B MoE, additionally target expert 3D tensors:
```yaml
lora_target_parameters:
  - experts.gate_up_proj
  - experts.down_proj
```

---

## 4. Google Cookbook (`google-cookbook/`)

**Upstream:** `google-gemma/cookbook`, `docs/core/`
**License:** Apache-2.0
**Gemma 4 status:** The `docs/core/*.ipynb` fine-tuning notebooks default to `google/gemma-4-E2B` as `model_id` — they ARE the Gemma 4 path, despite generic filenames.

**Downloaded files:**
- `google-cookbook/huggingface_text_finetune_qlora.ipynb` — **text-to-SQL QLoRA tutorial** (gretel-synthetic-text-to-sql dataset, `philschmid/gretel-synthetic-text-to-sql`). This is the one ai.google.dev links to as the "official" fine-tune path.
- `google-cookbook/huggingface_text_full_finetune.ipynb` — full-weights fine-tune variant
- `google-cookbook/huggingface_vision_finetune_qlora.ipynb` — vision QLoRA on product descriptions
- `google-cookbook/lora_tuning.ipynb` — LoRA concepts tutorial
- `google-cookbook/function-calling-gemma4.ipynb` — official Google function-calling notebook (not a fine-tune, but the authoritative reference for tool-call tokens)
- `google-cookbook/Gemma_4_HDP_Agentic_Security.ipynb` + `Gemma_4_HDP_README.md` — full-app fine-tune example (agentic security)

**Upstream URLs:**
- https://ai.google.dev/gemma/docs/core/huggingface_text_finetune_qlora
- https://ai.google.dev/gemma/docs/core/huggingface_vision_finetune_qlora
- https://ai.google.dev/gemma/docs/capabilities/text/function-calling-gemma4

### Google cookbook chat template & masking detail (VERY IMPORTANT)

The cookbook notebooks use TRL's `SFTTrainer` with a standard `messages` list (`role`/`content`) — the chat template is applied automatically by the tokenizer's built-in Jinja. No manual `instruction_part`/`response_part`.

**The non-obvious detail** is the `LoraConfig`:
```python
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=16, lora_dropout=0.05, r=16, bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
    modules_to_save=["lm_head", "embed_tokens"],  # NOTE: keep the new special-token embeddings trainable
    ensure_weight_tying=True,                     # NOTE: keep lm_head and embed_tokens tied
)
```
`modules_to_save=["lm_head","embed_tokens"]` + `ensure_weight_tying=True` is required because **Gemma 4 introduced new special tokens (`<|turn>`, `<|tool>`, `<|tool_call>`, `<|tool_response>`, `<|"|>`) that need their embeddings to be trainable in a fine-tune.** PEFT 0.15+ added `ensure_weight_tying` specifically for this case. Skipping it causes the adapter to see frozen random embeddings for the new tokens and training silently underperforms.

For vision, Google's cookbook uses plain `target_modules="all-linear"` (NO `exclude_modules`) — meaning it *does* train LoRA adapters on the vision tower. This is a different tradeoff from Axolotl (`freeze_mm_modules: true`) and from TRL's CARLA recipe (`exclude_modules=["vision_tower", "multi_modal_projector"]`). Pick based on whether your task needs the vision encoder to adapt (e.g., new image domain) or just the text decoder (most cases).

---

## 5. HuggingFace gemma-recipes (`huggingface-recipes/`)

**Upstream:** `huggingface/huggingface-gemma-recipes`
**License:** Apache-2.0

**Downloaded files:**
- `huggingface-recipes/carla_vlm_gemma.py` — **The canonical TRL + Gemma 4 example.** GRPO VLM training in a CARLA driving environment with tool calls. Shows `exclude_modules=["vision_tower", "multi_modal_projector"]`, `chat_template_kwargs={"enable_thinking": False}`, `max_tool_calling_iterations=10`.
- `huggingface-recipes/Gemma4_(E2B)-Multimodal.ipynb` — **inference-only** multimodal demo (vision, video, audio, function calling, object detection). Not a fine-tune but necessary reference for the input format the training data must match.
- `huggingface-recipes/README.md` — HF's top-level recipes index
- `huggingface-recipes/hf-blog-gemma4.md` — the HF blog post's raw markdown (§630-707 is the fine-tuning section)

**Run command for the CARLA VLM RL example:**
```bash
pip install git+https://github.com/huggingface/trl.git
python examples/scripts/openenv/carla_vlm_gemma.py \
    --env-urls https://sergiopaniego-carla-env.hf.space https://sergiopaniego-carla-env-2.hf.space \
    --model google/gemma-4-E2B-it
```

**Known gap:** HF's gemma-recipes repo has *fine-tuning* notebooks for Gemma 3 and Gemma 3n (free T4 Colab) but **no pure-SFT Gemma 4 fine-tuning notebook yet** — the Gemma 4 Colab is inference only. Their blog points users to Unsloth Studio for the easy path.

---

## 6. Ollama / llama.cpp LoRA serving (`ollama-llamacpp/`)

**Downloaded:** `ollama-llamacpp/ollama-import-lora.md` — distilled from https://docs.ollama.com/import (2026-04-18 fetch).

**Short answer:** Yes, you can serve a Gemma 4 LoRA via Ollama. Two paths:

1. **Merge then serve (simpler, recommended):** `model.save_pretrained_merged("out", tokenizer, save_method="merged_16bit")` → `llama.cpp/convert_hf_to_gguf.py` → llama.cpp's `llama-quantize` (formerly `quantize`) to Q4_K_M → `ollama create mymodel -f Modelfile` with `FROM ./gemma4-mortdecai.gguf`.
2. **Adapter-only serve:** `llama.cpp/convert_lora_to_gguf.py` on the PEFT directory → Modelfile with `FROM gemma4:e4b-it-q8_0` + `ADAPTER ./adapter.gguf`.

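
The two Modelfiles differ only in whether an `ADAPTER` line follows `FROM`. A minimal sketch that generates both, using the paths and model tag quoted in the list above; the helper names are hypothetical, and a real Modelfile may also want `TEMPLATE` or `PARAMETER` lines that are out of scope here.

```python
def modelfile_merged(gguf_path):
    """Modelfile for path 1: a merged, quantized GGUF served on its own."""
    return f"FROM {gguf_path}\n"


def modelfile_adapter(base_model, adapter_path):
    """Modelfile for path 2: a stock base model plus a GGUF LoRA adapter."""
    return f"FROM {base_model}\nADAPTER {adapter_path}\n"


merged = modelfile_merged("./gemma4-mortdecai.gguf")
adapter = modelfile_adapter("gemma4:e4b-it-q8_0", "./adapter.gguf")
print(merged)
print(adapter)
```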
Ollama's docs list supported architectures as Llama/Mistral/Gemma 1-2 — Gemma 4 isn't *explicitly* listed, but llama.cpp has day-one Gemma 4 support and in practice the path works. (Vision-adapter serving via Ollama is still a grey area.)

---

## 7. Datasets the canonical tutorials pair with Gemma 4

| Tutorial | Dataset | Format | Notes |
|----------|---------|--------|-------|
| Unsloth Gemma4 E4B Text | `mlabonne/FineTome-100k` | ShareGPT-style `conversations` field | Also the Axolotl default |
| Unsloth Gemma4 GRPO | Synthetic kernel-optimization prompts in-notebook | Python reward funcs | RL w/ `function_works` / `check_only_stdlib_imports` |
| Unsloth Gemma4 Vision | `unsloth/LaTeX_OCR` | HF image-text pairs | Demonstrates `UnslothVisionDataCollator` |
| Google cookbook text QLoRA | `philschmid/gretel-synthetic-text-to-sql` | chat `messages` list | Google's "official" demo dataset for Gemma 4 |
| Google cookbook vision QLoRA | `philschmid/amazon-product-descriptions-vlm` | image + text pairs | Product-description generation |
| Axolotl Gemma 4 (all sizes) | `mlabonne/FineTome-100k` | `type: chat_template` | Validated in axolotl README |
| Axolotl E2B vision LoRA | `HuggingFaceH4/llava-instruct-mix-vsft` | vision-language SFT | Same as HF's VLM template |
| TRL sft_gemma3 (transfers) | `open-r1/codeforces-cots` | `messages` list | Chain-of-thought coding |
| TRL carla_vlm_gemma (Gemma 4 VLM GRPO) | CARLA simulator (live) | environment rollouts | Multimodal tool responses |

No one uses Alpaca or UltraChat as the canonical Gemma 4 pair. **FineTome-100k is the unofficial standard** — both Unsloth and Axolotl default to it.

---

## 8. Chat-template-and-masking matrix (the debugging cheat sheet)

| Framework | chat_template key | Turn tokens | Response masking API | BOS handling |
|-----------|-------------------|-------------|----------------------|--------------|
| Unsloth | `"gemma-4"` | `<\|turn>role\n...<turn\|>` | `train_on_responses_only(instruction_part="<\|turn>user\n", response_part="<\|turn>model\n")` | Strip `<bos>` manually with `.removeprefix('<bos>')` before passing to trainer |
| TRL | tokenizer's built-in Jinja (no key needed) | same | `SFTConfig(assistant_only_loss=True)` | Tokenizer handles automatically |
| Axolotl | `chat_template: gemma4` (no dash) | same | automatic via `type: chat_template` | Automatic |
| Google cookbook | tokenizer built-in Jinja | same | automatic via `SFTTrainer` + `messages` | Automatic |

Tool tokens (`<|tool>`, `<|tool_call>`, `<|tool_response>`, `<|"|>`) ride inside message content — none of the frameworks mask them specially, and none provide a `role="tool"` branch in the default template. If you're training tool-call data, put the complete `<|tool_call>call:{...}<tool_call|>` block in the assistant message `content`.

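
Putting the tool-call block in the assistant `content` can be sketched as below. `assistant_tool_call_message` is a hypothetical helper, and the payload is deliberately left elided as in the text above; take the real call syntax from the function-calling reference notebook, not from this sketch.

```python
TOOL_CALL_OPEN = "<|tool_call>"
TOOL_CALL_CLOSE = "<tool_call|>"


def assistant_tool_call_message(payload):
    """Wrap an already-formatted call payload in an assistant message."""
    return {
        "role": "assistant",
        "content": f"{TOOL_CALL_OPEN}{payload}{TOOL_CALL_CLOSE}",
    }


# "call:{...}" is a placeholder copied from the text, not real call syntax.
message = assistant_tool_call_message("call:{...}")
print(message["content"])
```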
Also: **all Gemma 4 fine-tunes should set `modules_to_save=["lm_head","embed_tokens"]` + `ensure_weight_tying=True`** in `LoraConfig` if you're using PEFT directly, because the new special-token embeddings need to be trainable. Unsloth and Axolotl handle this for you; naïve TRL + PEFT scripts do NOT by default.

---

## What's NOT here (and why)

- **Kaggle/Colab free-tier notebooks as a separate category** — the Unsloth notebooks *are* the free-tier notebooks. E2B Text runs on a free T4; 31B/26B-A4B need A100 Colab Pro. I pulled 2 Kaggle-flavored variants to `unsloth/kaggle/` for completeness.
- **Google's DeepMind JAX/Flax Gemma 4 fine-tune script** — Google's DeepMind-gemma repo ships inference/reference code, not an SFT script. Google's *canonical* fine-tune path is the HF+TRL notebook in `google-gemma/cookbook` (above), NOT JAX. If you want JAX, see the archived `.archive/Gemma/[Gemma_1]Finetune_distributed.ipynb` pattern — not ported to Gemma 4.
- **Full-weights 31B fine-tuning commands** — Axolotl's README says "heavy and has not been tested." Skip unless Seth rents an 8×H100 pod.
- **Prompt engineering / inference-only notebooks** — per scope.

## See also

- `recipe-recommendation.md` — which tool Seth should actually use for his homelab, with the exact command.
- `../../GOTCHAS.md` §"Fine-Tuning Ecosystem Issues" — day-one issues (required `mm_token_type_ids` field, Gemma4ClippableLinear PEFT issue, E2B/E4B training loss 13-15 being normal).
- `../../CORPUS_tool_calling_format.md` — the 6 tool-calling special tokens.