sdk/python/endpoints/online/deploy-with-packages/registry-model/sdk-deploy-and-test.ipynb

{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Package models from registries for deployment\n", "\n", "In this examples, you will learn how to package models hosted in registries in Azure Machine Learning using the Model Packaging functionality. Packaging MLflow models allow you to deploy them on insfrastructure that doesn't have public access enabled to download dependencies and they don't perform dynamic installation of packages during deployment." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "Ensure you have the latest version of `azure-ai-ml`:\n", "\n", "```bash\n", "%pip install -U azure-ai-ml\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Connect to the registry and the workspace\n", "\n", "We need to connect to both resources in this example: to the registry where the model is hosted and to the workspace where we want to package and deploy it.\n", "\n", "### 1.1. Import the required libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372122610 } }, "outputs": [], "source": [ "from azure.identity import DefaultAzureCredential\n", "from azure.ai.ml import MLClient\n", "from azure.ai.ml.entities import (\n", " AzureMLOnlineInferencingServer,\n", " ModelPackage,\n", " ModelConfiguration,\n", ")\n", "from azure.ai.ml.entities import (\n", " ManagedOnlineEndpoint,\n", " ManagedOnlineDeployment,\n", " Environment,\n", " Model,\n", ")\n", "from azure.ai.ml.constants import AssetTypes" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 Configure workspace details and get a handle to the workspace\n", "\n", "To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the default [default azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372123363 } }, "outputs": [], "source": [ "subscription_id = \"<subscription>\"\n", "resource_group = \"<resource-group>\"\n", "workspace = \"<workspace>\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372123746 } }, "outputs": [], "source": [ "ml_client = MLClient(\n", " DefaultAzureCredential(), subscription_id, resource_group, workspace\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If you are running on AzureML compute, you can easily:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372124833 } }, "outputs": [], "source": [ "ml_client = MLClient.from_config(DefaultAzureCredential())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 Configure registry\n", "\n", "To connect to the registry where the model is hosted, we need to create another `MLClient` and indicate the name of the registry we want to consume. In this scenario we are connecting to the public registry `azureml` where fundational models are hosted:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "set_registry_name" }, "outputs": [], "source": [ "registry_name = \"azureml\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "configure_registry_client" }, "outputs": [], "source": [ "registry_client = MLClient(\n", " credential=DefaultAzureCredential(),\n", " subscription_id=ml_client.subscription_id,\n", " resource_group_name=ml_client.resource_group_name,\n", " workspace_reference=ml_client.workspace_name,\n", " registry_name=registry_name,\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Get the model\n", "\n", "Now, let's get the model we want to package. In this example, we will package the model named `t5-base`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "get_model" }, "outputs": [], "source": [ "model_name = \"t5-base\"\n", "model = registry_client.models.get(name=model_name, label=\"latest\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "## 3. Package the model\n", "\n", "Let's package this model for online deployment" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "### 3.1 The base environment\n", "\n", "Packages uses a base environment to construct the package. However, for MLflow models, we automatically select the best base image depending on the SKU of your target compute." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "### 3.2 Package the model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "configure_package" }, "outputs": [], "source": [ "import time\n", "\n", "model_package_name = f\"pkg-{model.name}-{model.version}\"\n", "model_package_version = str(int(time.time()))\n", "\n", "package_config = ModelPackage(\n", " target_environment=model_package_name,\n", " target_environment_version=model_package_version,\n", " inferencing_server=AzureMLOnlineInferencingServer(),\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "Let's start the package operation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "build_package" }, "outputs": [], "source": [ "model_package = registry_client.models.package(\n", " model.name, model.version, package_config\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "The package operation will start and it will take a couple of minutes to complete. You can get the details of this package with:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372601102 }, "jupyter": { "outputs_hidden": false, "source_hidden": false }, "nteract": { "transient": { "deleting": false } } }, "outputs": [], "source": [ "model_package" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "> Notice how the package operation results in a new environment version being created." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Deploy the package to Online Endpoints\n", "\n", "Now, we can deploy this package in an Online Endpoint." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "### 4.1 Create the endpoint\n", "\n", "Let's name the endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "name_endpoint" }, "outputs": [], "source": [ "endpoint_name = \"t5-base\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Endpoint names should be unique so we will append a random string at the end to ensure that:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "gather": { "logged": 1682372601613 } }, "outputs": [], "source": [ "import random\n", "import string\n", "\n", "# Creating a unique endpoint name by including a random suffix\n", "allowed_chars = string.ascii_lowercase + string.digits\n", "endpoint_suffix = \"\".join(random.choice(allowed_chars) for x in range(5))\n", "endpoint_name = f\"{endpoint_name}-{endpoint_suffix}\"\n", "\n", "print(f\"Endpoint name: {endpoint_name}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "Let's create the endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "create_endpoint" }, "outputs": [], "source": [ "endpoint = ManagedOnlineEndpoint(name=endpoint_name)\n", "endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "### 4.2 Deploy the package in a deployment" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "Now, we can deploy this package in an Online Endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "configure_deployment" }, "outputs": [], "source": [ "deployment_name = \"with-package\"\n", "deployment_package = ManagedOnlineDeployment(\n", " name=deployment_name,\n", " endpoint_name=endpoint_name,\n", " environment=model_package,\n", " instance_count=1,\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "> Tip: Notice how model or scoring script are not being indicated in this example. All of them are part of the package." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "Create the deployment:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "create_deployment" }, "outputs": [], "source": [ "ml_client.online_deployments.begin_create_or_update(deployment_package).result()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "### 4.3 Test the deployment\n", "\n", "We can test if the deployment is working as expected. Once the deployment is created, it is ready to receive requests." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "test_deployment" }, "outputs": [], "source": [ "ml_client.online_endpoints.invoke(\n", " endpoint_name=endpoint_name,\n", " deployment_name=deployment_name,\n", " request_file=\"sample-request.json\",\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "nteract": { "transient": { "deleting": false } } }, "source": [ "Now that we confirmed the deployment works, let's send all the traffic to it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "update_deployment_traffic" }, "outputs": [], "source": [ "endpoint.traffic = {deployment_name: 100}\n", "ml_client.online_endpoints.begin_create_or_update(endpoint).result()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Deploying with packages directly from the deployment\n", "\n", "If you don't need to configure how the package is performed, you can quickly take advantage of the packaging functionality by indicating Online Endpoints to package before performing the deployment.\n", "\n", "To do so, indicate the argument `with_package=True`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deployment_package = ManagedOnlineDeployment(\n", " name=\"with-package-inline\",\n", " endpoint_name=endpoint_name,\n", " model=model.id,\n", " instance_count=1,\n", " with_package=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create the deployment:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "crete_deployment_inline" }, "outputs": [], "source": [ "ml_client.online_deployments.begin_create_or_update(deployment_package).result()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Clean un resources\n", "\n", "Once done, delete the associated resources from the workspace:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "name": "delete_resources" }, "outputs": [], "source": [ "ml_client.online_endpoints.begin_delete(endpoint.name).result()" ] } ], "metadata": { "kernel_info": { "name": "previews" }, "kernelspec": { "display_name": "Python 3.10 - SDK v2", "language": "python", "name": "python310-sdkv2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" }, "microsoft": { "ms_spell_check": { "ms_spell_check_language": "en" } }, "nteract": { "version": "nteract-front-end@1.0.0" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "7bc894e00fe86b2e284ef20cdeac20aaafcfa957033ee2bd5bf4c264a294a7ee" } } }, "nbformat": 4, "nbformat_minor": 2 }

sdk/python/endpoints/online/deploy-with-packages/registry-model/sdk-deploy-and-test.ipynb (659 lines of code) (raw):