gemini/global-endpoint/intro_global_endpoint.ipynb (375 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sqi5B7V_Rjim"
},
"outputs": [],
"source": [
"# Copyright 2025 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VyPmicX9RlZX"
},
"source": [
"# Intro to Vertex AI Global Endpoint\n",
"\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\">\n",
" <img width=\"32px\" src=\"https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fglobal-endpoint%2Fintro_global_endpoint.ipynb\">\n",
" <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/global-endpoint/intro_global_endpoint.ipynb\">\n",
" <img src=\"https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\">\n",
" <img width=\"32px\" src=\"https://www.svgrepo.com/download/217753/github.svg\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
"</table>\n",
"\n",
"<div style=\"clear: both;\"></div>\n",
"\n",
"<b>Share to:</b>\n",
"\n",
"<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/global-endpoint/intro_global_endpoint.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n",
"</a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8MqT58L6Rm_q"
},
"source": [
"| | |\n",
"|-|-|\n",
"| Author(s) | [Eric Dong](https://github.com/gericdong) |"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nVxnv1D5RoZw"
},
"source": [
"## Overview\n",
"\n",
"The Vertex AI global endpoint lets developers use Google's highly scalable global infrastructure for Gemini models. Routing requests through the global endpoint can improve overall service availability and throughput while reducing the likelihood of rate limiting (429 RESOURCE_EXHAUSTED errors).\n",
"\n",
"This tutorial demonstrates how to call the Vertex AI global endpoint using both the REST API and the Google Gen AI SDK.\n",
"\n",
"**Important**:\n",
"\n",
"- Use the global endpoint only if your workload has no strict regional restriction requirements.\n",
"\n",
"Learn more about [Deployments and endpoints](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gPiTOAHURvTM"
},
"source": [
"## Getting Started"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CHRZUpfWSEpp"
},
"source": [
"### Install Google Gen AI SDK\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sG3_LKsWSD3A"
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet google-genai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HlMVjiAWSMNX"
},
"source": [
"### Authenticate your notebook environment\n",
"\n",
"If you are running this notebook on Google Colab, run the cell below to authenticate your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "12fnq4V0SNV3"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "z6kSWg40PPzd"
},
"source": [
"### Set Google Cloud project ID\n",
"\n",
"To start using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
"\n",
"Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UCgUOv4nSWhc"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"PROJECT_ID = (\n",
" \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\"}\n",
")\n",
"\n",
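"if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
"    # Fall back to the environment; default to \"\" so a missing variable\n",
"    # does not become the literal string \"None\".\n",
"    PROJECT_ID = os.environ.get(\"GOOGLE_CLOUD_PROJECT\", \"\")\n",
"\n",
"if not PROJECT_ID:\n",
"    raise ValueError(\"Set PROJECT_ID or the GOOGLE_CLOUD_PROJECT environment variable.\")"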
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BhGIYTj0PgsD"
},
"source": [
"### Use a Gemini model\n",
"\n",
"Learn more about the [global endpoint supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#supported_models).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "R6rCOr8URsYU"
},
"outputs": [],
"source": [
"MODEL_ID = \"gemini-2.0-flash-001\" # @param {type: \"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IZxwWQocTXSx"
},
"source": [
"## Use the Global Endpoint\n",
"\n",
"Set the location to `global` to use the global endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cVN2UzZ8Wtap"
},
"outputs": [],
"source": [
"LOCATION = \"global\" # @param {type: \"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cjnw6Jt6Uqcm"
},
"source": [
"### **Option 1**: Use global endpoint with REST API\n",
"\n",
"Next, send a REST API request to the Gemini model with cURL through the global endpoint. When you form the request URL, make sure the hostname `API_HOST` does not include a region prefix (regional endpoints use one), and put `global` in the URL path where the location normally appears.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "262jPzPdUz_v"
},
"outputs": [],
"source": [
"API_HOST = \"aiplatform.googleapis.com\"\n",
"\n",
"os.environ[\"API_ENDPOINT\"] = (\n",
" f\"{API_HOST}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}\"\n",
")"
]
},
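{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the hostname difference concrete, the following is a minimal sketch that builds the endpoint string for either a global or a regional location. The helper name `build_endpoint` and the project ID `my-project` are hypothetical and used only for illustration. The sketch assumes regional hostnames take the form `{region}-aiplatform.googleapis.com`, while the global endpoint uses the bare hostname."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def build_endpoint(project_id: str, location: str, model_id: str) -> str:\n",
"    \"\"\"Return the host and resource path for a Gemini model on Vertex AI.\"\"\"\n",
"    # The global endpoint uses the bare hostname; regional endpoints\n",
"    # prefix the hostname with the region name.\n",
"    if location == \"global\":\n",
"        host = \"aiplatform.googleapis.com\"\n",
"    else:\n",
"        host = f\"{location}-aiplatform.googleapis.com\"\n",
"    return (\n",
"        f\"{host}/v1/projects/{project_id}/locations/{location}\"\n",
"        f\"/publishers/google/models/{model_id}\"\n",
"    )\n",
"\n",
"\n",
"# \"my-project\" is a placeholder project ID (assumption for illustration).\n",
"print(build_endpoint(\"my-project\", \"global\", \"gemini-2.0-flash-001\"))\n",
"print(build_endpoint(\"my-project\", \"us-central1\", \"gemini-2.0-flash-001\"))"
]
},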
{
"cell_type": "markdown",
"metadata": {
"id": "XX7zfX4leqVB"
},
"source": [
"The following example sends a text generation request to the global endpoint with cURL."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Zu1p_H_NVvei"
},
"outputs": [],
"source": [
"%%bash\n",
"\n",
"curl -X POST \\\n",
"  -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n",
"  -H \"Content-Type: application/json\" \\\n",
"  \"https://${API_ENDPOINT}:generateContent\" \\\n",
"  -d '{\n",
"    \"contents\": {\n",
"      \"role\": \"USER\",\n",
"      \"parts\": { \"text\": \"Why is the sky blue?\" }\n",
"    }\n",
"  }'"
]
},
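{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same request can also be assembled in Python. The sketch below only builds the request object with the standard library and does not send it, so it needs no credentials. The function name `build_generate_content_request`, the project ID `my-project`, and the token value `placeholder-token` are hypothetical; in practice you would pass a real access token, for example one obtained from `gcloud auth print-access-token`. Serializing the body with `json.dumps` keeps the payload strictly valid JSON."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import urllib.request\n",
"\n",
"\n",
"def build_generate_content_request(\n",
"    endpoint: str, access_token: str, prompt: str\n",
") -> urllib.request.Request:\n",
"    \"\"\"Build (but do not send) a generateContent POST request.\"\"\"\n",
"    # A single user turn containing one text part.\n",
"    body = {\"contents\": [{\"role\": \"user\", \"parts\": [{\"text\": prompt}]}]}\n",
"    return urllib.request.Request(\n",
"        url=f\"https://{endpoint}:generateContent\",\n",
"        data=json.dumps(body).encode(\"utf-8\"),\n",
"        headers={\n",
"            \"Authorization\": f\"Bearer {access_token}\",\n",
"            \"Content-Type\": \"application/json\",\n",
"        },\n",
"        method=\"POST\",\n",
"    )\n",
"\n",
"\n",
"# Placeholder endpoint and token (assumptions for illustration only).\n",
"request = build_generate_content_request(\n",
"    \"aiplatform.googleapis.com/v1/projects/my-project/locations/global\"\n",
"    \"/publishers/google/models/gemini-2.0-flash-001\",\n",
"    \"placeholder-token\",\n",
"    \"Why is the sky blue?\",\n",
")\n",
"print(request.get_method(), request.full_url)\n",
"print(request.data.decode(\"utf-8\"))"
]
},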
{
"cell_type": "markdown",
"metadata": {
"id": "zXgFIiDVag2j"
},
"source": [
"### **Option 2**: Use global endpoint with Gen AI SDK\n",
"\n",
"When you use the Gen AI SDK, set the location as part of the client options. Here you create a `client` for the Vertex AI service and set `location` to `global` so that requests go to the global endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "TpMdwQSpbEPt"
},
"outputs": [],
"source": [
"from google import genai\n",
"\n",
"client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6YModBSDcHXl"
},
"source": [
"The following example generates text through the global endpoint using the SDK client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "PENzMhWQak2w"
},
"outputs": [],
"source": [
"response = client.models.generate_content(\n",
" model=MODEL_ID, contents=\"Why is the sky blue?\"\n",
")\n",
"\n",
"print(response.text)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eQwiONFdVHw5"
},
"source": [
"## What's next\n",
"\n",
"- Learn more about [Generative AI on Vertex AI locations](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations).\n",
"- Learn more about [Model versions and lifecycle](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions).\n",
"- Explore other notebooks in the [Google Cloud Generative AI repository](https://github.com/GoogleCloudPlatform/generative-ai)."
]
}
],
"metadata": {
"colab": {
"name": "intro_global_endpoint.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}