llm-routing/llm_routing

{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "80JsJB4V93dw" }, "source": [ "# **LLM Routing with Apigee**\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/apigee-samples/blob/main/llm-routing/llm_routing_v1.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\\\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fapigee-samples%2Fmain%2Fllm-routing%2Fllm_routing_v1.ipynb\">\n", " <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/apigee-samples/main/llm-routing/llm_routing_v1.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/apigee-samples/blob/main/llm-routing/llm_routing_v1.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>\n", "<br />\n", "<br />\n", "<br />\n", "\n", "# Routing Sample\n", "\n", "- This is a sample Apigee proxy to demonstrate the routing capabilities of Apigee across different LLM providers. In this sample we will use Google VertexAI, Mistral and HuggingFace as the LLM providers\n", "- The framework will easily help onboarding other providers using configurations\n", "\n", "![architecture](https://github.com/GoogleCloudPlatform/apigee-samples/blob/main/llm-routing/images/arch.jpg?raw=1)\n", "\n", "# Benefits of Routing with Apigee:\n", "\n", "* **Configuration Driven Routing**: All the routing logic are driven through configuration which makes onboarding very easy\n", "* **Security**: Irrespective of the model and providers, Apigee will secure the endpoints\n", "* **Consistency**: Apigee can offer that layer of consistency to work with any LLM SDKs that are being used\n", "\n", "## Setup\n", "\n", "Use the following GCP CloudShell tutorial. Follow the instructions to deploy the sample.\n", "\n", "[![Open in Cloud Shell](https://gstatic.com/cloudssh/images/open-btn.png)](https://ssh.cloud.google.com/cloudshell/open?cloudshell_git_repo=https://github.com/GoogleCloudPlatform/apigee-samples&cloudshell_git_branch=main&cloudshell_workspace=.&cloudshell_tutorial=llm-routing/docs/cloudshell-tutorial.md)\n", "\n", "## Test Sample" ] }, { "cell_type": "markdown", "metadata": { "id": "BIdKcCXZQ6Jr" }, "source": [ "### Install Dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "id": "8Ka1d8c81VTH" }, "outputs": [], "source": [ "!pip install -Uq langchain==0.3.18\n", "!pip install -Uq langchain-openai\n", "!pip install -Uq google-cloud-aiplatform\n", "!pip install -Uq openai" ] }, { "cell_type": "markdown", "metadata": { "id": "KyPnkqS9Hm5I" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using Vertex AI Workbench or Colab Enterprise." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "q_-3uHjVHmA2" }, "outputs": [], "source": [ "import sys\n", "\n", "# Additional authentication is required for Google Colab\n", "if \"google.colab\" in sys.modules:\n", " # Authenticate user to Google Cloud\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "oeZIwvv-3NiM" }, "source": [ "### Set the Variables" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "eAu0gkLn3bZm" }, "outputs": [], "source": [ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", "APIGEE_HOST=\"[your-apigee-host-domain]\" # @param {type:\"string\"}\n", "APIKEY=\"[your-apikey]\" # @param {type:\"string\"}\n", "\n", "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", " raise ValueError(\"Please set your PROJECT_ID\")\n", "if not APIGEE_HOST or APIGEE_HOST == \"[your-apigee-host-domain]\":\n", " raise ValueError(\"Please set your APIGEE_HOST\")\n", "if not APIKEY or APIKEY == \"[your-apikey]\":\n", " raise ValueError(\"Please set your APIKEY\")\n", "\n", "API_ENDPOINT = \"https://\"+APIGEE_HOST+\"/v1/samples/llm-routing/\"\n", "PROMPT=\"Suggest name for a flower shop\"" ] }, { "cell_type": "markdown", "metadata": { "id": "ra7t8N6Iu0i3" }, "source": [ "### Select an LLM Provider\n", "\n", "Select a provider from the dropdown. This will automatically set the model name used by the SDKs\n", "\n", "Try picking different providers from the dropdown above. You will see that the same SDK is able to call the Apigee endpoint serving responses from different providers" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RoPe5F0Ot8Qm" }, "outputs": [], "source": [ "import sys\n", "from google.colab import auth\n", "\n", "llm_provider = \"select\" # @param [\"select\",\"google\", \"huggingface\", \"mistral\"]\n", "\n", "if llm_provider == \"google\":\n", " model = \"google/gemini-2.0-flash\"\n", "elif llm_provider == \"mistral\":\n", " model = \"open-mistral-nemo\"\n", "elif llm_provider == \"huggingface\":\n", " model = \"Meta-Llama-3.1-8B-Instruct\"\n", "else:\n", " raise ValueError(\"Invalid LLM provider\")" ] }, { "cell_type": "markdown", "metadata": { "id": "CrECV5DbRW1R" }, "source": [ "### Using OpenAI SDK" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NZHU7JGw-Oxr" }, "outputs": [], "source": [ "import openai\n", "\n", "openai.api_key = APIKEY\n", "openai.base_url = API_ENDPOINT\n", "openai.default_headers = {\"x-apikey\": APIKEY, \"x-llm-provider\": llm_provider}\n", "\n", "completion = openai.chat.completions.create(\n", " model=model,\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"type\": \"text\",\n", " \"text\": PROMPT\n", " }\n", " ]\n", " }\n", " ]\n", ")\n", "print(f\"Using the OpenAI SDK, fetching the response from \\\"{model}\\\" provided by \\\"{llm_provider}\\\"\")\n", "print(\"\\n\")\n", "print(completion.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "bE6VZrAsncK-" }, "source": [ "### Using LangChain SDK" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "vOU-rfX_IvE5" }, "outputs": [], "source": [ "from langchain_openai import ChatOpenAI\n", "\n", "llm = ChatOpenAI(\n", " model=model,\n", " api_key=APIKEY,\n", " base_url=API_ENDPOINT,\n", " default_headers = {\"x-apikey\": APIKEY, \"x-llm-provider\": llm_provider}\n", ")\n", "messages = [\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"type\": \"text\",\n", " \"text\": PROMPT\n", " }\n", " ]\n", " }\n", "]\n", "print(f\"Using the LangChain SDK, fetching the response from \\\"{model}\\\" provided by \\\"{llm_provider}\\\"\")\n", "print(\"\\n\")\n", "print(llm.invoke(messages).content)" ] } ], "metadata": { "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 0 }

llm-routing/llm_routing_v1.ipynb (287 lines of code) (raw):