gemma4-research/tooling/google-official/cookbook/docs_gemma_chat.ipynb

{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G3MMAcssHTML"
      },
      "source": [
        "<link rel=\"stylesheet\" href=\"/site-assets/css/gemma.css\">\n",
        "<link rel=\"stylesheet\" href=\"https://fonts.googleapis.com/css2?family=Google+Symbols:opsz,wght,FILL,GRAD@20..48,100..700,0..1,-50..200\" />"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Tce3stUlHN0L"
      },
      "source": [
        "##### Copyright 2024 Google LLC."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "tuOe1ymfHZPu"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4qxv4Sn9b8CE"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://ai.google.dev/gemma/docs/gemma_chat\"><img src=\"https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png\" height=\"32\" width=\"32\" />View on ai.google.dev</a>\n",
        "  </td>\n",
        "    <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/docs/gemma_chat.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://kaggle.com/kernels/welcome?src=https://github.com/google-gemini/gemma-cookbook/blob/main/docs/gemma_chat.ipynb\"><img src=\"https://www.kaggle.com/static/images/logos/kaggle-logo-transparent-300.png\" height=\"32\" width=\"70\"/>Run in Kaggle</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fgoogle-gemini%2Fgemma-cookbook%2Fmain%2Fdocs%2Fgemma_chat.ipynb\"><img src=\"https://ai.google.dev/images/cloud-icon.svg\" width=\"40\" />Open in Vertex AI</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/google-gemini/gemma-cookbook/blob/main/docs/gemma_chat.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "402c3d8a"
      },
      "source": [
        "# Building a chatbot with Gemma"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b686fd95"
      },
      "source": [
        "Large Language Models (LLMs) such as Gemma excel at generating informative responses, making them ideal for building virtual assistants and chatbots.\n",
        "\n",
        "Conventionally, LLMs operate in a stateless manner, meaning they lack an inherent memory to store past conversations. Each prompt or question is processed independently, disregarding prior interactions. However, a crucial aspect of natural conversation is the ability to retain context from prior interactions. To overcome this limitation and enable LLMs to maintain conversation context, they must be explicitly provided with relevant information such as the conversation history (or pertinent parts) into each new prompt presented to the LLM.\n",
        "\n",
        "This tutorial shows you how to develop a chatbot using the instruction-tuned model variant of Gemma."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "29732090"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QQ6W7NzRe1VM"
      },
      "source": [
        "### Gemma setup\n",
        "\n",
        "To complete this tutorial, you'll first need to complete the setup instructions at [Gemma setup](https://ai.google.dev/gemma/docs/setup). The Gemma setup instructions show you how to do the following:\n",
        "\n",
        "* Get access to Gemma on kaggle.com.\n",
        "* Select a Colab runtime with sufficient resources to run\n",
        "  the Gemma 2B model.\n",
        "* Generate and configure a Kaggle username and API key.\n",
        "\n",
        "After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_gN-IVRC3dQe"
      },
      "source": [
        "### Set environment variables\n",
        "\n",
        "Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DrBoa_Urw9Vx"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "from google.colab import userdata\n",
        "\n",
        "# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env\n",
        "# vars as appropriate for your system.\n",
        "os.environ[\"KAGGLE_USERNAME\"] = userdata.get('KAGGLE_USERNAME')\n",
        "os.environ[\"KAGGLE_KEY\"] = userdata.get('KAGGLE_KEY')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "z9oy3QUmXtSd"
      },
      "source": [
        "### Install dependencies\n",
        "\n",
        "Install Keras and KerasNLP."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a973dd7a"
      },
      "outputs": [],
      "source": [
        "# Install Keras 3 last. See https://keras.io/getting_started/ for more details.\n",
        "!pip install -q tensorflow-cpu\n",
        "!pip install -q -U keras-nlp tensorflow-hub\n",
        "!pip install -q -U \"keras>=3\"\n",
        "!pip install -q -U tensorflow-text"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Wme8666dUPVR"
      },
      "source": [
        "### Select a backend\n",
        "\n",
        "Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. [Keras 3](https://keras.io/keras_3){:.external} lets you choose the backend: TensorFlow, JAX, or PyTorch. All three will work for this tutorial."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "67d12d2d"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "\n",
        "# Select JAX as the backend\n",
        "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n",
        "\n",
        "# Pre-allocate 100% of TPU memory to minimize memory fragmentation\n",
        "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"] = \"1.0\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ajm_SGWTUjVd"
      },
      "source": [
        "### Import packages\n",
        "\n",
        "Import Keras and KerasNLP."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3lyn9FxPUok8"
      },
      "outputs": [],
      "source": [
        "import keras\n",
        "import keras_nlp\n",
        "\n",
        "# for reproducibility\n",
        "keras.utils.set_random_seed(42)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "39dc9d5b"
      },
      "source": [
        "### Instantiate the model\n",
        "\n",
        "KerasNLP provides implementations of many popular [model architectures](https://keras.io/api/keras_nlp/models/){:.external}. In this tutorial, you'll instantiate the model using `GemmaCausalLM`, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.\n",
        "\n",
        "Instantiate the model using the `from_preset` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c86dc8fe"
      },
      "outputs": [],
      "source": [
        "gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(\"gemma2_instruct_2b_en\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tcCv0BSdVFv9"
      },
      "source": [
        "The `GemmaCausalLM.from_preset()` function instantiates the model from a preset architecture and weights. In the code above, the string `\"gemma2_instruct_2b_en\"` specifies the preset the Gemma 2 2B model with 2 billion parameters. Gemma models with [7B, 9B, and 27B parameters](https://ai.google.com/gemma/docs/get_started#models-list) are also available. You can find the code strings for Gemma models in their **Model Variation** listings on [Kaggle](https://www.kaggle.com/models/google/gemma).\n",
        "\n",
        "Note: To run the larger models in Colab, you need access to the premium GPUs available in paid plans. Alternatively, you can perform inferences using Kaggle notebooks or Google Cloud projects."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bLNx8AoeVe-a"
      },
      "source": [
        "Use the `summary` method to get more info about the model:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3MorieIpVksu"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Preprocessor: \"gemma_causal_lm_preprocessor\"</span>\n",
              "</pre>\n"
            ],
            "text/plain": [
              "\u001b[1mPreprocessor: \"gemma_causal_lm_preprocessor\"\u001b[0m\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
              "┃<span style=\"font-weight: bold\"> Tokenizer (type)                                   </span>┃<span style=\"font-weight: bold\">                                             Vocab # </span>┃\n",
              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
              "│ gemma_tokenizer (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">GemmaTokenizer</span>)                   │                                             <span style=\"color: #00af00; text-decoration-color: #00af00\">256,000</span> │\n",
              "└────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘\n",
              "</pre>\n"
            ],
            "text/plain": [
              "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
              "┃\u001b[1m \u001b[0m\u001b[1mTokenizer (type)                                  \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m                                            Vocab #\u001b[0m\u001b[1m \u001b[0m┃\n",
              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
              "│ gemma_tokenizer (\u001b[38;5;33mGemmaTokenizer\u001b[0m)                   │                                             \u001b[38;5;34m256,000\u001b[0m │\n",
              "└────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"gemma_causal_lm\"</span>\n",
              "</pre>\n"
            ],
            "text/plain": [
              "\u001b[1mModel: \"gemma_causal_lm\"\u001b[0m\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
              "┃<span style=\"font-weight: bold\"> Layer (type)                  </span>┃<span style=\"font-weight: bold\"> Output Shape              </span>┃<span style=\"font-weight: bold\">         Param # </span>┃<span style=\"font-weight: bold\"> Connected to               </span>┃\n",
              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
              "│ padding_mask (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">InputLayer</span>)     │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>)              │               <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │ -                          │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ token_ids (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">InputLayer</span>)        │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>)              │               <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │ -                          │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ gemma_backbone                │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">2304</span>)        │   <span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> │ padding_mask[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>],        │\n",
              "│ (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">GemmaBackbone</span>)               │                           │                 │ token_ids[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>]            │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ token_embedding               │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">256000</span>)      │     <span style=\"color: #00af00; text-decoration-color: #00af00\">589,824,000</span> │ gemma_backbone[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>]       │\n",
              "│ (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">ReversibleEmbedding</span>)         │                           │                 │                            │\n",
              "└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘\n",
              "</pre>\n"
            ],
            "text/plain": [
              "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
              "┃\u001b[1m \u001b[0m\u001b[1mLayer (type)                 \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape             \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m        Param #\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mConnected to              \u001b[0m\u001b[1m \u001b[0m┃\n",
              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
              "│ padding_mask (\u001b[38;5;33mInputLayer\u001b[0m)     │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m)              │               \u001b[38;5;34m0\u001b[0m │ -                          │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ token_ids (\u001b[38;5;33mInputLayer\u001b[0m)        │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m)              │               \u001b[38;5;34m0\u001b[0m │ -                          │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ gemma_backbone                │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m2304\u001b[0m)        │   \u001b[38;5;34m2,614,341,888\u001b[0m │ padding_mask[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m],        │\n",
              "│ (\u001b[38;5;33mGemmaBackbone\u001b[0m)               │                           │                 │ token_ids[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m]            │\n",
              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
              "│ token_embedding               │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m256000\u001b[0m)      │     \u001b[38;5;34m589,824,000\u001b[0m │ gemma_backbone[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m]       │\n",
              "│ (\u001b[38;5;33mReversibleEmbedding\u001b[0m)         │                           │                 │                            │\n",
              "└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> (9.74 GB)\n",
              "</pre>\n"
            ],
            "text/plain": [
              "\u001b[1m Total params: \u001b[0m\u001b[38;5;34m2,614,341,888\u001b[0m (9.74 GB)\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> (9.74 GB)\n",
              "</pre>\n"
            ],
            "text/plain": [
              "\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m2,614,341,888\u001b[0m (9.74 GB)\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/html": [
              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
              "</pre>\n"
            ],
            "text/plain": [
              "\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "gemma_lm.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ArZPOzFpVp6S"
      },
      "source": [
        "As you can see from the summary, the model has 2.6 billion trainable parameters.\n",
        "\n",
        "Note: For purposes of naming the model (\"2B\"), the embedding layer is not counted against the number of parameters."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1WpS39TBYql9"
      },
      "source": [
        "### Define formatting helper functions"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3-obTC1jZGpZ"
      },
      "outputs": [],
      "source": [
        "from IPython.display import Markdown\n",
        "import textwrap\n",
        "\n",
        "def display_chat(prompt, text):\n",
        "  formatted_prompt = \"<font size='+1' color='brown'>🙋‍♂️<blockquote>\" + prompt + \"</blockquote></font>\"\n",
        "  text = text.replace('•', '  *')\n",
        "  text = textwrap.indent(text, '> ', predicate=lambda _: True)\n",
        "  formatted_text = \"<font size='+1' color='teal'>🤖\\n\\n\" + text + \"\\n</font>\"\n",
        "  return Markdown(formatted_prompt+formatted_text)\n",
        "\n",
        "def to_markdown(text):\n",
        "  text = text.replace('•', '  *')\n",
        "  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5ca54e8c"
      },
      "source": [
        "## Building the chatbot\n",
        "\n",
        "The Gemma instruction-tuned model `gemma2_instruct_2b_en` is fine-tuned to understand the following turn tokens:\n",
        "\n",
        "```\n",
        "<start_of_turn>user\\n  ... <end_of_turn>\\n\n",
        "<start_of_turn>model\\n ... <end_of_turn>\\n\n",
        "```\n",
        "\n",
        "This tutorial uses these tokens to build the chatbot. Refer to [Formatting and system instructions](https://ai.google.dev/gemma/docs/formatting) for more information on Gemma control tokens.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9583dfd1"
      },
      "source": [
        "### Create a chat helper to manage the conversation state"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e4e9a187"
      },
      "outputs": [],
      "source": [
        "class ChatState():\n",
        "  \"\"\"\n",
        "  Manages the conversation history for a turn-based chatbot\n",
        "  Follows the turn-based conversation guidelines for the Gemma family of models\n",
        "  documented at https://ai.google.dev/gemma/docs/formatting\n",
        "  \"\"\"\n",
        "\n",
        "  __START_TURN_USER__ = \"<start_of_turn>user\\n\"\n",
        "  __START_TURN_MODEL__ = \"<start_of_turn>model\\n\"\n",
        "  __END_TURN__ = \"<end_of_turn>\\n\"\n",
        "\n",
        "  def __init__(self, model, system=\"\"):\n",
        "    \"\"\"\n",
        "    Initializes the chat state.\n",
        "\n",
        "    Args:\n",
        "        model: The language model to use for generating responses.\n",
        "        system: (Optional) System instructions or bot description.\n",
        "    \"\"\"\n",
        "    self.model = model\n",
        "    self.system = system\n",
        "    self.history = []\n",
        "\n",
        "  def add_to_history_as_user(self, message):\n",
        "      \"\"\"\n",
        "      Adds a user message to the history with start/end turn markers.\n",
        "      \"\"\"\n",
        "      self.history.append(self.__START_TURN_USER__ + message + self.__END_TURN__)\n",
        "\n",
        "  def add_to_history_as_model(self, message):\n",
        "      \"\"\"\n",
        "      Adds a model response to the history with start/end turn markers.\n",
        "      \"\"\"\n",
        "      self.history.append(self.__START_TURN_MODEL__ + message)\n",
        "\n",
        "  def get_history(self):\n",
        "      \"\"\"\n",
        "      Returns the entire chat history as a single string.\n",
        "      \"\"\"\n",
        "      return \"\".join([*self.history])\n",
        "\n",
        "  def get_full_prompt(self):\n",
        "    \"\"\"\n",
        "    Builds the prompt for the language model, including history and system description.\n",
        "    \"\"\"\n",
        "    prompt = self.get_history() + self.__START_TURN_MODEL__\n",
        "    if len(self.system)>0:\n",
        "      prompt = self.system + \"\\n\" + prompt\n",
        "    return prompt\n",
        "\n",
        "  def send_message(self, message):\n",
        "    \"\"\"\n",
        "    Handles sending a user message and getting a model response.\n",
        "\n",
        "    Args:\n",
        "        message: The user's message.\n",
        "\n",
        "    Returns:\n",
        "        The model's response.\n",
        "    \"\"\"\n",
        "    self.add_to_history_as_user(message)\n",
        "    prompt = self.get_full_prompt()\n",
        "    response = self.model.generate(prompt, max_length=2048)\n",
        "    result = response.replace(prompt, \"\")  # Extract only the new response\n",
        "    self.add_to_history_as_model(result)\n",
        "    return result\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9hmJS4h4ZmiP"
      },
      "source": [
        "### Chat with the model\n",
        "\n",
        "Start chatting with the model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "b1913181"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": [
              "<font size='+1' color='brown'>🙋‍♂️<blockquote>Tell me, in a few words,  how to compute all prime numbers up to 1000?</blockquote></font><font size='+1' color='teal'>🤖\n",
              "\n",
              "> **Sieve of Eratosthenes.** \n",
              "> <end_of_turn>\n",
              "</font>"
            ],
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "chat = ChatState(gemma_lm)\n",
        "message = \"Tell me, in a few words,  how to compute all prime numbers up to 1000?\"\n",
        "display_chat(message, chat.send_message(message))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ODKxUPP2Zuqy"
      },
      "source": [
        "Continue the conversation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7448005b"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": [
              "<font size='+1' color='brown'>🙋‍♂️<blockquote>Now in Python! No numpy, please!</blockquote></font><font size='+1' color='teal'>🤖\n",
              "\n",
              "> ```python\n",
              "> def sieve_of_eratosthenes(n):\n",
              ">   \"\"\"Returns a list of prime numbers up to n.\"\"\"\n",
              ">   primes = [True] * (n + 1)\n",
              ">   primes[0] = primes[1] = False\n",
              ">   for i in range(2, int(n**0.5) + 1):\n",
              ">     if primes[i]:\n",
              ">       for j in range(i * i, n + 1, i):\n",
              ">         primes[j] = False\n",
              ">   return [i for i, is_prime in enumerate(primes) if is_prime]\n",
              "> \n",
              "> primes = sieve_of_eratosthenes(1000)\n",
              "> print(primes)\n",
              "> ```\n",
              "> \n",
              "> **Explanation:**\n",
              "> \n",
              "> 1. **Initialization:**\n",
              ">    - `primes = [True] * (n + 1)`: Creates a list `primes` of boolean values, initially assuming all numbers are prime.\n",
              ">    - `primes[0] = primes[1] = False`: Sets 0 and 1 as non-prime.\n",
              "> \n",
              "> 2. **Iteration:**\n",
              ">    - `for i in range(2, int(n**0.5) + 1):`: Iterates from 2 to the square root of `n`. We only need to check up to the square root because any composite number must have a prime factor less than or equal to its square root.\n",
              ">    - `if primes[i]:`: If `i` is marked as prime:\n",
              ">      - `for j in range(i * i, n + 1, i):`: Marks all multiples of `i` as non-prime.\n",
              "> \n",
              "> 3. **Result:**\n",
              ">    - `return [i for i, is_prime in enumerate(primes) if is_prime]`: Creates a list of indices where `primes[i]` is True, representing the prime numbers.\n",
              "> \n",
              "> \n",
              "> Let me know if you'd like a more detailed explanation of any part! \n",
              "> <end_of_turn>\n",
              "</font>"
            ],
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "message = \"Now in Python! No numpy, please!\"\n",
        "display_chat(message, chat.send_message(message))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0973ff54"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": [
              "<font size='+1' color='brown'>🙋‍♂️<blockquote>Thank you, it works! Can you explain the code in French?</blockquote></font><font size='+1' color='teal'>🤖\n",
              "\n",
              "> Bien sûr ! Voici une explication du code en français :\n",
              "> \n",
              "> ```python\n",
              "> def sieve_of_eratosthenes(n):\n",
              ">   \"\"\"Retourne une liste de nombres premiers jusqu'à n.\"\"\"\n",
              ">   primes = [True] * (n + 1)\n",
              ">   primes[0] = primes[1] = False\n",
              ">   for i in range(2, int(n**0.5) + 1):\n",
              ">     if primes[i]:\n",
              ">       for j in range(i * i, n + 1, i):\n",
              ">         primes[j] = False\n",
              ">   return [i for i, is_prime in enumerate(primes) if is_prime]\n",
              "> \n",
              "> primes = sieve_of_eratosthenes(1000)\n",
              "> print(primes)\n",
              "> ```\n",
              "> \n",
              "> **Explication:**\n",
              "> \n",
              "> 1. **Initialisation:**\n",
              ">    - `primes = [True] * (n + 1)`: Crée une liste `primes` de valeurs booléennes, initialement supposant que tous les nombres sont premiers.\n",
              ">    - `primes[0] = primes[1] = False`: Définit 0 et 1 comme non-premiers.\n",
              "> \n",
              "> 2. **Itération:**\n",
              ">    - `for i in range(2, int(n**0.5) + 1):`: Itère de 2 jusqu'à la racine carrée de `n`. Nous ne devons vérifier que jusqu'à la racine carrée car tout nombre composite doit avoir un facteur premier inférieur ou égal à sa racine carrée.\n",
              ">    - `if primes[i]:`: Si `i` est considéré comme premier:\n",
              ">      - `for j in range(i * i, n + 1, i):`:  Marquer tous les multiples de `i` comme non-premiers.\n",
              "> \n",
              "> 3. **Resultat:**\n",
              ">    - `return [i for i, is_prime in enumerate(primes) if is_prime]`: Crée une liste des indices où `primes[i]` est vrai, représentant les nombres premiers.\n",
              "> \n",
              "> \n",
              "> N'hésitez pas à me demander si vous avez besoin d'une explication plus détaillée de quelque chose! \n",
              "> <end_of_turn>\n",
              "</font>"
            ],
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "message = \"Thank you, it works! Can you explain the code in French?\"\n",
        "display_chat(message, chat.send_message(message))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a0c51f42"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": [
              "<font size='+1' color='brown'>🙋‍♂️<blockquote>Great! Now add those explanations as comments in the code.</blockquote></font><font size='+1' color='teal'>🤖\n",
              "\n",
              "> ```python\n",
              "> def sieve_of_eratosthenes(n):\n",
              ">   \"\"\"Retourne une liste de nombres premiers jusqu'à n.\"\"\"\n",
              ">   # Initialise une liste de boolean avec True pour tous les nombres de 0 à n\n",
              ">   primes = [True] * (n + 1)\n",
              ">   # Définit 0 et 1 comme non-premiers\n",
              ">   primes[0] = primes[1] = False\n",
              ">   # Itère de 2 à la racine carrée de n\n",
              ">   for i in range(2, int(n**0.5) + 1):\n",
              ">     # Si i est considéré comme premier\n",
              ">     if primes[i]:\n",
              ">       # Itère sur tous les multiples de i\n",
              ">       for j in range(i * i, n + 1, i):\n",
              ">         # Définit les multiples de i comme non-premiers\n",
              ">         primes[j] = False\n",
              ">   # Retourne la liste des indices des nombres premiers\n",
              ">   return [i for i, is_prime in enumerate(primes) if is_prime]\n",
              "> \n",
              "> primes = sieve_of_eratosthenes(1000)\n",
              "> print(primes)\n",
              "> ```\n",
              "> \n",
              "> **Explication:**\n",
              "> \n",
              "> * **Initialisation:**\n",
              ">     * `primes = [True] * (n + 1)`:  Crée une liste `primes` de valeurs booléennes, initialement supposant que tous les nombres sont premiers.\n",
              ">     * `primes[0] = primes[1] = False`: Définit 0 et 1 comme non-premiers.\n",
              "> * **Itération:**\n",
              ">     * `for i in range(2, int(n**0.5) + 1):`: Itère de 2 jusqu'à la racine carrée de `n`. Nous ne devons vérifier que jusqu'à la racine carrée car tout nombre composite doit avoir un facteur premier inférieur ou égal à sa racine carrée.\n",
              ">     * `if primes[i]:`: Si `i` est considéré comme premier:\n",
              ">         * `for j in range(i * i, n + 1, i):`:  Marquer tous les multiples de `i` comme non-premiers.\n",
              "> * **Resultat:**\n",
              ">     * `return [i for i, is_prime in enumerate(primes) if is_prime]`: Crée une liste des indices où `primes[i]` est vrai, représentant les nombres premiers. \n",
              "> \n",
              "> \n",
              "> \n",
              "> <end_of_turn>\n",
              "</font>"
            ],
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "message = \"Great! Now add those explanations as comments in the code.\"\n",
        "display_chat(message, chat.send_message(message))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "51a33627"
      },
      "source": [
        "Test the generated response by running the generated code:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "221c0817"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]\n"
          ]
        }
      ],
      "source": [
        "def sieve_of_eratosthenes(n):\n",
        "  \"\"\"Retourne une liste de nombres premiers jusqu'à n.\"\"\"\n",
        "  # Initialise une liste de boolean avec True pour tous les nombres de 0 à n\n",
        "  primes = [True] * (n + 1)\n",
        "  # Définit 0 et 1 comme non-premiers\n",
        "  primes[0] = primes[1] = False\n",
        "  # Itère de 2 à la racine carrée de n\n",
        "  for i in range(2, int(n**0.5) + 1):\n",
        "    # Si i est considéré comme premier\n",
        "    if primes[i]:\n",
        "      # Itère sur tous les multiples de i\n",
        "      for j in range(i * i, n + 1, i):\n",
        "        # Définit les multiples de i comme non-premiers\n",
        "        primes[j] = False\n",
        "  # Retourne la liste des indices des nombres premiers\n",
        "  return [i for i, is_prime in enumerate(primes) if is_prime]\n",
        "\n",
        "primes = sieve_of_eratosthenes(1000)\n",
        "print(primes)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1c8ece6c"
      },
      "source": [
        "Use the `get_history` method to see how all the context was retained by the `Chat` class."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e48f4ca1"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "<start_of_turn>user\n",
            "Tell me, in a few words,  how to compute all prime numbers up to 1000?<end_of_turn>\n",
            "<start_of_turn>model\n",
            "**Sieve of Eratosthenes.** \n",
            "<end_of_turn><start_of_turn>user\n",
            "Now in Python! No numpy, please!<end_of_turn>\n",
            "<start_of_turn>model\n",
            "```python\n",
            "def sieve_of_eratosthenes(n):\n",
            "  \"\"\"Returns a list of prime numbers up to n.\"\"\"\n",
            "  primes = [True] * (n + 1)\n",
            "  primes[0] = primes[1] = False\n",
            "  for i in range(2, int(n**0.5) + 1):\n",
            "    if primes[i]:\n",
            "      for j in range(i * i, n + 1, i):\n",
            "        primes[j] = False\n",
            "  return [i for i, is_prime in enumerate(primes) if is_prime]\n",
            "\n",
            "primes = sieve_of_eratosthenes(1000)\n",
            "print(primes)\n",
            "```\n",
            "\n",
            "**Explanation:**\n",
            "\n",
            "1. **Initialization:**\n",
            "   - `primes = [True] * (n + 1)`: Creates a list `primes` of boolean values, initially assuming all numbers are prime.\n",
            "   - `primes[0] = primes[1] = False`: Sets 0 and 1 as non-prime.\n",
            "\n",
            "2. **Iteration:**\n",
            "   - `for i in range(2, int(n**0.5) + 1):`: Iterates from 2 to the square root of `n`. We only need to check up to the square root because any composite number must have a prime factor less than or equal to its square root.\n",
            "   - `if primes[i]:`: If `i` is marked as prime:\n",
            "     - `for j in range(i * i, n + 1, i):`: Marks all multiples of `i` as non-prime.\n",
            "\n",
            "3. **Result:**\n",
            "   - `return [i for i, is_prime in enumerate(primes) if is_prime]`: Creates a list of indices where `primes[i]` is True, representing the prime numbers.\n",
            "\n",
            "\n",
            "Let me know if you'd like a more detailed explanation of any part! \n",
            "<end_of_turn><start_of_turn>user\n",
            "Thank you, it works! Can you explain the code in French?<end_of_turn>\n",
            "<start_of_turn>model\n",
            "Bien sûr ! Voici une explication du code en français :\n",
            "\n",
            "```python\n",
            "def sieve_of_eratosthenes(n):\n",
            "  \"\"\"Retourne une liste de nombres premiers jusqu'à n.\"\"\"\n",
            "  primes = [True] * (n + 1)\n",
            "  primes[0] = primes[1] = False\n",
            "  for i in range(2, int(n**0.5) + 1):\n",
            "    if primes[i]:\n",
            "      for j in range(i * i, n + 1, i):\n",
            "        primes[j] = False\n",
            "  return [i for i, is_prime in enumerate(primes) if is_prime]\n",
            "\n",
            "primes = sieve_of_eratosthenes(1000)\n",
            "print(primes)\n",
            "```\n",
            "\n",
            "**Explication:**\n",
            "\n",
            "1. **Initialisation:**\n",
            "   - `primes = [True] * (n + 1)`: Crée une liste `primes` de valeurs booléennes, initialement supposant que tous les nombres sont premiers.\n",
            "   - `primes[0] = primes[1] = False`: Définit 0 et 1 comme non-premiers.\n",
            "\n",
            "2. **Itération:**\n",
            "   - `for i in range(2, int(n**0.5) + 1):`: Itère de 2 jusqu'à la racine carrée de `n`. Nous ne devons vérifier que jusqu'à la racine carrée car tout nombre composite doit avoir un facteur premier inférieur ou égal à sa racine carrée.\n",
            "   - `if primes[i]:`: Si `i` est considéré comme premier:\n",
            "     - `for j in range(i * i, n + 1, i):`:  Marquer tous les multiples de `i` comme non-premiers.\n",
            "\n",
            "3. **Resultat:**\n",
            "   - `return [i for i, is_prime in enumerate(primes) if is_prime]`: Crée une liste des indices où `primes[i]` est vrai, représentant les nombres premiers.\n",
            "\n",
            "\n",
            "N'hésitez pas à me demander si vous avez besoin d'une explication plus détaillée de quelque chose! \n",
            "<end_of_turn><start_of_turn>user\n",
            "Great! Now add those explanations as comments in the code.<end_of_turn>\n",
            "<start_of_turn>model\n",
            "```python\n",
            "def sieve_of_eratosthenes(n):\n",
            "  \"\"\"Retourne une liste de nombres premiers jusqu'à n.\"\"\"\n",
            "  # Initialise une liste de boolean avec True pour tous les nombres de 0 à n\n",
            "  primes = [True] * (n + 1)\n",
            "  # Définit 0 et 1 comme non-premiers\n",
            "  primes[0] = primes[1] = False\n",
            "  # Itère de 2 à la racine carrée de n\n",
            "  for i in range(2, int(n**0.5) + 1):\n",
            "    # Si i est considéré comme premier\n",
            "    if primes[i]:\n",
            "      # Itère sur tous les multiples de i\n",
            "      for j in range(i * i, n + 1, i):\n",
            "        # Définit les multiples de i comme non-premiers\n",
            "        primes[j] = False\n",
            "  # Retourne la liste des indices des nombres premiers\n",
            "  return [i for i, is_prime in enumerate(primes) if is_prime]\n",
            "\n",
            "primes = sieve_of_eratosthenes(1000)\n",
            "print(primes)\n",
            "```\n",
            "\n",
            "**Explication:**\n",
            "\n",
            "* **Initialisation:**\n",
            "    * `primes = [True] * (n + 1)`:  Crée une liste `primes` de valeurs booléennes, initialement supposant que tous les nombres sont premiers.\n",
            "    * `primes[0] = primes[1] = False`: Définit 0 et 1 comme non-premiers.\n",
            "* **Itération:**\n",
            "    * `for i in range(2, int(n**0.5) + 1):`: Itère de 2 jusqu'à la racine carrée de `n`. Nous ne devons vérifier que jusqu'à la racine carrée car tout nombre composite doit avoir un facteur premier inférieur ou égal à sa racine carrée.\n",
            "    * `if primes[i]:`: Si `i` est considéré comme premier:\n",
            "        * `for j in range(i * i, n + 1, i):`:  Marquer tous les multiples de `i` comme non-premiers.\n",
            "* **Resultat:**\n",
            "    * `return [i for i, is_prime in enumerate(primes) if is_prime]`: Crée une liste des indices où `primes[i]` est vrai, représentant les nombres premiers. \n",
            "\n",
            "\n",
            "\n",
            "<end_of_turn>\n"
          ]
        }
      ],
      "source": [
        "print(chat.get_history())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9693c66f"
      },
      "source": [
        "## Summary and further reading\n",
        "\n",
        "In this tutorial, you learned how to chat with the Gemma 2B Instruction tuned model using Keras on JAX.\n",
        "\n",
        "Check out these guides and tutorials to learn more about Gemma:\n",
        "\n",
        "* [Get started with Keras Gemma](https://ai.google.dev/gemma/docs/get_started).\n",
        "* [Finetune the Gemma model on GPU](https://ai.google.dev/gemma/docs/lora_tuning).\n",
        "* Learn about [Gemma integration with Vertex AI](https://ai.google.dev/gemma/docs/integrations/vertex)\n",
        "* Learn how to [use Gemma models with Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/open-models/use-gemma){:.external}.\n"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "gemma_chat.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}