# Gemma 4 Native Tool Calling Format

> Source: Google AI for Developers - Function Calling docs
> https://ai.google.dev/gemma/docs/capabilities/text/function-calling-gemma4

## Special Tokens (6 total)

| Token | Purpose |
|-------|---------|
| `<\|tool>` / `…` | Tool definition block |
| `<\|tool_call>` / `…` | Model's tool request |
| `<\|tool_response>` / `…` | Tool execution result |

String delimiter: `<|"|>` (encloses all string values in native format)

## Native Format (raw model tokens)

### Tool definition in system prompt:

```
<|tool>declaration: get_current_temperature{
  location:{type:<|"|>string<|"|>,description:<|"|>The city<|"|>},
  unit:{type:<|"|>string<|"|>,enum:[<|"|>celsius<|"|>,<|"|>fahrenheit<|"|>]}
}
```

### Tool call from model:

```
<|tool_call>call:get_current_temperature{location:<|"|>London<|"|>}
```

### Tool response:

```
<|tool_response>response:get_current_weather{temperature:15,weather:<|"|>sunny<|"|>}
```

## JSON Chat Format (for Ollama / OpenAI-compatible APIs)

This is what you actually use in practice. Ollama translates to/from native tokens.
### Tool definition:

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {"type": "string", "description": "The city name"}
      },
      "required": ["city"]
    }
  }
}
```

### Model returns:

```json
{
  "role": "assistant",
  "tool_calls": [{
    "function": {
      "name": "get_weather",
      "arguments": {"city": "London"}
    }
  }]
}
```

### Tool result message:

```json
{
  "role": "tool",
  "content": "{\"temperature\": 15, \"weather\": \"sunny\"}"
}
```

## Thinking Mode + Tool Calls

- When thinking is enabled, preserve thoughts between tool calls
- For long agent chains, summarize thoughts as plain text to save context
- Recommended: **disable thinking for tool-heavy workflows** (Seth's finding)

## Framework Flags

| Framework | Required Flag |
|-----------|--------------|
| llama.cpp | `--jinja` |
| vLLM | `--enable-auto-tool-choice` |
| Ollama | Works via `/api/chat` endpoint with `tools` field |
| transformers | `apply_chat_template(tools=[...])` |

## Known Issues

- Ollama v0.20.0-0.20.1: tool call parser broken, streaming drops tool calls
- llama.cpp: format mismatches and continuous loops reported
- LM Studio: compatibility issues with tool calling
- **Workaround:** Use non-streaming mode for tool calls (proven in Simon)
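
The non-streaming workaround can be sketched as a small tool loop against Ollama's `/api/chat` endpoint, reusing the `get_weather` definition and message shapes from the JSON Chat Format section. This is a minimal sketch under assumptions, not a reference client: the model tag `gemma4` is hypothetical (use whatever `ollama list` shows), the `get_weather` function returns canned data, and the `max_rounds` cap is an added guard against the continuous-loop behaviour noted above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address
MODEL = "gemma4"  # hypothetical model tag; replace with your local one

WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "The city name"}},
            "required": ["city"],
        },
    },
}


def get_weather(city):
    """Stand-in tool: returns canned data instead of calling a weather API."""
    return {"temperature": 15, "weather": "sunny"}


TOOLS = {"get_weather": get_weather}


def build_payload(messages):
    """Request body with streaming disabled so tool calls arrive in one piece."""
    return {"model": MODEL, "messages": messages, "tools": [WEATHER_TOOL], "stream": False}


def post_chat(payload):
    """POST the payload to Ollama and return the decoded JSON response."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_tool_calls(assistant_message):
    """Execute each requested tool and return the resulting role=tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        args = fn["arguments"]
        if isinstance(args, str):  # OpenAI-compatible endpoints send a JSON string
            args = json.loads(args)
        output = TOOLS[fn["name"]](**args)
        results.append({"role": "tool", "content": json.dumps(output)})
    return results


def chat_with_tools(user_text, send=post_chat, max_rounds=5):
    """Loop until the model answers without requesting another tool."""
    messages = [{"role": "user", "content": user_text}]
    for _ in range(max_rounds):
        reply = send(build_payload(messages))["message"]
        messages.append(reply)
        tool_messages = run_tool_calls(reply)
        if not tool_messages:
            return reply.get("content", "")
        messages.extend(tool_messages)
    raise RuntimeError("tool loop did not finish within max_rounds")
```

With a local Ollama server running, `chat_with_tools("What's the weather in London?")` would drive the full round trip. The `send` parameter exists so the loop can be exercised with a stubbed transport and no server.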