{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "Tce3stUlHN0L"
},
"source": [
"##### Copyright 2025 Google LLC."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "tuOe1ymfHZPu"
},
"outputs": [],
"source": [
"# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "agmT3hrjsffX"
},
"source": [
"# Gemini API: Embedding Quickstart with REST\n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/rest/Embeddings_REST.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=30/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JMNKdTpTGZET"
},
"source": [
"This notebook provides quick code examples that show you how to get started generating embeddings using `curl`.\n",
"\n",
"You can run this in Google Colab, or you can copy/paste the `curl` commands into your terminal.\n",
"\n",
"To run this notebook, your API key must be stored it in a Colab Secret named GOOGLE_API_KEY. If you are running in a different environment, you can store your key in an environment variable. See [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) to learn more."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "R-Vw_mOM_WD0"
},
"outputs": [],
"source": [
"import os\n",
"from google.colab import userdata"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "wCkLTpb3oTXE"
},
"outputs": [],
"source": [
"os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tjGqGBZ9yARd"
},
"source": [
"## Embed content\n",
"\n",
"Call the `embed_content` method with the `text-embedding-004` model to generate text embeddings:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "eA7I_Ww8IETn"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"embedding\": {\n",
" \"values\": [\n",
" 0.013168523,\n",
" -0.008711934,\n",
" -0.046782676,\n",
" 0.00069968984,\n",
" -0.009518873,\n",
" -0.008720178,\n",
" 0.060103577,\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"curl \"https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:embedContent?key=$GOOGLE_API_KEY\" \\\n",
"-H 'Content-Type: application/json' \\\n",
"-d '{\"model\": \"models/text-embedding-004\",\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"Hello world\"}]}, }' 2> /dev/null | head"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "x7ngWdZ7yDHp"
},
"source": [
"# Batch embed content\n",
"\n",
"You can embed a list of multiple prompts with one API call for efficiency.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "Z0b35xv5Ja_d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"embeddings\": [\n",
" {\n",
" \"values\": [\n",
" -0.010632277,\n",
" 0.019375855,\n",
" 0.0209652,\n",
" 0.0007706424,\n",
" -0.061464064,\n",
"--\n",
" -0.0071538696,\n",
" -0.028534694\n",
" ]\n",
" },\n",
" {\n",
" \"values\": [\n",
" 0.018467998,\n",
" 0.0054281196,\n",
" -0.017658804,\n",
" 0.013859266,\n",
" 0.053418662,\n",
"--\n",
" 0.026714385,\n",
" 0.0018762538\n",
" ]\n",
" },\n",
" {\n",
" \"values\": [\n",
" 0.05808907,\n",
" 0.020941721,\n",
" -0.108728774,\n",
" -0.04039259,\n",
" -0.04440443,\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"curl \"https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContents?key=$GOOGLE_API_KEY\" \\\n",
"-H 'Content-Type: application/json' \\\n",
"-d '{\"requests\": [{\n",
" \"model\": \"models/text-embedding-004\",\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"What is the meaning of life?\"}]}, },\n",
" {\n",
" \"model\": \"models/text-embedding-004\",\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"How much wood would a woodchuck chuck?\"}]}, },\n",
" {\n",
" \"model\": \"models/text-embedding-004\",\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"How does the brain work?\"}]}, }, ]}' 2> /dev/null | grep -C 5 values"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nPBk2k4xuql8"
},
"source": [
"## Set the output dimensionality\n",
"If you're using `text-embeddings-004`, you can set the `output_dimensionality` parameter to create smaller embeddings.\n",
"\n",
"* `output_dimensionality` truncates the embedding (e.g., `[1, 3, 5]` becomes `[1,3]` when `output_dimensionality=2`).\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "ny3bOQK1ut2_"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"embedding\": {\n",
" \"values\": [\n",
" 0.013168523,\n",
" -0.008711934,\n",
" -0.046782676,\n",
" 0.00069968984,\n",
" -0.009518873,\n",
" -0.008720178,\n",
" 0.060103577,\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"curl \"https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:embedContent?key=$GOOGLE_API_KEY\" \\\n",
"-H 'Content-Type: application/json' \\\n",
"-d '{\"model\": \"models/text-embedding-004\",\n",
" \"output_dimensionality\":256,\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"Hello world\"}]}, }' 2> /dev/null | head"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ObAdUvlk9x05"
},
"source": [
"## Use `task_type` to provide a hint to the model how you'll use the embeddings\n",
"\n",
"Let's look at all the parameters the embed_content method takes. There are four:\n",
"\n",
"* `model`: Required. Must be `models/embedding-001`.\n",
"* `content`: Required. The content that you would like to embed.\n",
"* `task_type`: Optional. The task type for which the embeddings will be used. See below for possible values.\n",
"* `title`: The given text is a document from a corpus being searched. Optionally, set the `title` parameter with the title of the document. Can only be set when `task_type` is `RETRIEVAL_DOCUMENT`.\n",
"\n",
"`task_type` is an optional parameter that provides a hint to the API about how you intend to use the embeddings in your application.\n",
"\n",
"The following task_type parameters are accepted:\n",
"\n",
"* `TASK_TYPE_UNSPECIFIED`: If you do not set the value, it will default to retrieval_query.\n",
"* `RETRIEVAL_QUERY` : The given text is a query in a search/retrieval setting.\n",
"* `RETRIEVAL_DOCUMENT`: The given text is a document from the corpus being searched.\n",
"* `SEMANTIC_SIMILARITY`: The given text will be used for Semantic Textual Similarity (STS).\n",
"* `CLASSIFICATION`: The given text will be classified.\n",
"* `CLUSTERING`: The embeddings will be used for clustering.\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "NwzsJmRrAo-t"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"embedding\": {\n",
" \"values\": [\n",
" 0.060187872,\n",
" -0.031515103,\n",
" -0.03244149,\n",
" -0.019341845,\n",
" 0.057285223,\n",
" 0.037159503,\n",
" 0.035636507,\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"curl \"https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=$GOOGLE_API_KEY\" \\\n",
"-H 'Content-Type: application/json' \\\n",
"-d '{\"model\": \"models/text-embedding-004\",\n",
" \"content\": {\n",
" \"parts\":[{\n",
" \"text\": \"Hello world\"}]},\n",
" \"task_type\": \"RETRIEVAL_DOCUMENT\",\n",
" \"title\": \"My title\"}' 2> /dev/null | head"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jXkRYBhbB_b2"
},
"source": [
"## Learning more\n",
"\n",
"* Learn more about text-embeddings-004 [here](https://developers.googleblog.com/2024/04/gemini-15-pro-in-public-preview-with-new-features.html).\n",
"* See the [REST API reference](https://ai.google.dev/api/rest) to learn more.\n",
"* Explore more examples in the cookbook.\n"
]
}
],
"metadata": {
"colab": {
"name": "Embeddings_REST.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}