{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploy and score a machine learning model by using an online endpoint \n",
"\n",
"Learn how to use an online endpoint to deploy your model, so you don't have to create and manage the underlying infrastructure. You'll begin by deploying a model on your local machine to debug any errors, and then you'll deploy and test it in Azure.\n",
"\n",
"Managed online endpoints help to deploy your ML models in a turnkey manner. Managed online endpoints work with powerful CPU and GPU machines in Azure in a scalable, fully managed way. Managed online endpoints take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. \n",
"\n",
"For more information, see [What are Azure Machine Learning endpoints?](https://learn.microsoft.com/azure/machine-learning/concept-endpoints), and [Deploy an ML model with an online endpoint](https://learn.microsoft.com/azure/machine-learning/how-to-deploy-online-endpoints)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).\n",
"\n",
"* Install and configure the [Python SDK v2](sdk/setup.sh).\n",
"\n",
"* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.\n",
"\n",
"* You must have an Azure Machine Learning workspace. \n",
"\n",
"* To deploy locally, you must install Docker Engine on your local computer. We highly recommend this option, so it's easier to debug issues."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install docker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. Connect to Azure Machine Learning Workspace\n",
"\n",
"The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.\n",
"\n",
"## 1.1. Import the required libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# import required libraries\n",
"from azure.ai.ml import MLClient\n",
"from azure.ai.ml.entities import (\n",
" ManagedOnlineEndpoint,\n",
" ManagedOnlineDeployment,\n",
" Model,\n",
" Environment,\n",
" CodeConfiguration,\n",
")\n",
"from azure.identity import DefaultAzureCredential"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1.2. Configure workspace details and get a handle to the workspace\n",
"\n",
"To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the default [default azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# enter details of your AML workspace\n",
"subscription_id = \"<SUBSCRIPTION_ID>\"\n",
"resource_group = \"<RESOURCE_GROUP>\"\n",
"workspace = \"<AML_WORKSPACE_NAME>\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get a handle to the workspace\n",
"ml_client = MLClient(\n",
" DefaultAzureCredential(), subscription_id, resource_group, workspace\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy and debug locally by using local endpoints"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Note\n",
"* To deploy locally, [Docker Engine](https://docs.docker.com/engine/install/) must be installed.\n",
"* Docker Engine must be running. Docker Engine typically starts when the computer starts. If it doesn't, you can [troubleshoot Docker Engine](https://docs.docker.com/config/daemon/#start-the-daemon-manually)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Define endpoint and deployment\n",
"\n",
"## 2.1 Define the endpoint\n",
"\n",
"To define an endpoint, you need to specify:\n",
"\n",
"* Endpoint name: The name of the endpoint. It must be unique in the Azure region. For more information on the naming rules, see [managed online endpoint limits](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints).\n",
"* Authentication mode: The authentication method for the endpoint. Choose between key-based authentication and Azure Machine Learning token-based authentication. A key doesn't expire, but a token does expire. For more information on authenticating, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md).\n",
"* Optionally, you can add a description and tags to your endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define an endpoint name\n",
"endpoint_name = \"my-endpoint\"\n",
"\n",
"# Example way to define a random name\n",
"import datetime\n",
"\n",
"endpoint_name = \"endpt-\" + datetime.datetime.now().strftime(\"%m%d%H%M%f\")\n",
"\n",
"# create an online endpoint\n",
"endpoint = ManagedOnlineEndpoint(\n",
" name=endpoint_name,\n",
" description=\"this is a sample online endpoint\",\n",
" auth_mode=\"key\",\n",
" tags={\"foo\": \"bar\"},\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.2 Define the deployment\n",
"\n",
"A deployment is a set of resources required for hosting the model that does the actual inferencing. To deploy a model, you must have:\n",
"\n",
"- Model files (or the name and version of a model that's already registered in your workspace). In the example, we have a scikit-learn model that does regression.\n",
"- A scoring script, that is, code that executes the model on a given input request. The scoring script receives data submitted to a deployed web service and passes it to the model. The script then executes the model and returns its response to the client. The scoring script is specific to your model and must understand the data that the model expects as input and returns as output. In this example, we have a *score.py* file.\n",
"- An environment in which your model runs. The environment can be a Docker image with Conda dependencies or a Dockerfile.\n",
"- Settings to specify the instance type and scaling capacity.\n",
"\n",
"The following table describes the key attributes of a deployment:\n",
"\n",
"| Attribute | Description |\n",
"|-----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n",
"| Name | The name of the deployment. |\n",
"| Endpoint name | The name of the endpoint to create the deployment under. |\n",
"| Model | The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. |\n",
"| Code path | The path to the directory on the local development environment that contains all the Python source code for scoring the model. You can use nested directories and packages. |\n",
"| Scoring script | The relative path to the scoring file in the source code directory. This Python code must have an `init()` function and a `run()` function. The `init()` function will be called after the model is created or updated (you can use it to cache the model in memory, for example). The `run()` function is called at every invocation of the endpoint to do the actual scoring and prediction. |\n",
"| Environment | The environment to host the model and code. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. |\n",
"| Instance type | The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md). |\n",
"| Instance count | The number of instances to use for the deployment. Base the value on the workload you expect. For high availability, we recommend that you set the value to at least `3`. We reserve an extra 20% for performing upgrades. For more information, see [managed online endpoint quotas](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints). |"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = Model(path=\"../model-1/model/sklearn_regression_model.pkl\")\n",
"env = Environment(\n",
" conda_file=\"../model-1/environment/conda.yaml\",\n",
" image=\"mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest\",\n",
")\n",
"\n",
"blue_deployment = ManagedOnlineDeployment(\n",
" name=\"blue\",\n",
" endpoint_name=endpoint_name,\n",
" model=model,\n",
" environment=env,\n",
" code_configuration=CodeConfiguration(\n",
" code=\"../model-1/onlinescoring\", scoring_script=\"score.py\"\n",
" ),\n",
" instance_type=\"Standard_DS3_v2\",\n",
" instance_count=1,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. Create local endpoint and deployment\n",
"\n",
"## 3.1 Create local endpoint\n",
"\n",
"The goal of a local endpoint deployment is to validate and debug your code and configuration before you deploy to Azure. Local deployment has the following limitations:\n",
"* Local endpoints *do not support* traffic rules, authentication, or probe settings.\n",
"* Local endpoints support only one deployment per endpoint.\n",
"* They support local model files only. If you want to test registered models, first download them, then use `path` in the deployment definition to refer to the parent folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.2 Create local deployment\n",
"\n",
"Now, create a deployment named `blue` under the endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_deployments.begin_create_or_update(\n",
" deployment=blue_deployment, local=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `local=True` flag directs the SDK to deploy the endpoint in the Docker environment."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Verify the local deployment succeeded\n",
"\n",
"## 4.1 Check the status to see whether the model was deployed without error"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.get(name=endpoint_name, local=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4.2 Get logs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_deployments.get_logs(\n",
" name=\"blue\", endpoint_name=endpoint_name, local=True, lines=50\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4.3 Invoke the local endpoint\n",
"Invoke the endpoint to score the model by using the convenience command invoke and passing query parameters that are stored in a JSON file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.invoke(\n",
" endpoint_name=endpoint_name,\n",
" request_file=\"../model-1/sample-request.json\",\n",
" local=True,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Deploy your online endpoint to Azure\n",
"Next, deploy your online endpoint to Azure.\n",
"\n",
"## 5.1 Create the endpoint\n",
"Using the `endpoint` we defined earlier and the `MLClient` created earlier, we'll now create the endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_create_or_update(endpoint).result()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.2 Create the deployment\n",
"\n",
"Using the `blue_deployment` that we defined earlier and the `MLClient` we created earlier, we'll now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_deployments.begin_create_or_update(blue_deployment).result()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# blue deployment takes 100 traffic\n",
"endpoint.traffic = {\"blue\": 100}\n",
"ml_client.online_endpoints.begin_create_or_update(endpoint).result()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6. Test the endpoint with sample data\n",
"Using the `MLClient` created earlier, we will get a handle to the endpoint. The endpoint can be invoked using the `invoke` command with the following parameters:\n",
"- `endpoint_name` - Name of the endpoint\n",
"- `request_file` - File with request data\n",
"- `deployment_name` - Name of the specific deployment to test in an endpoint\n",
"\n",
"We will send a sample request using a [json](./model-1/sample-request.json) file. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# test the blue deployment with some sample data\n",
"ml_client.online_endpoints.invoke(\n",
" endpoint_name=endpoint_name,\n",
" deployment_name=\"blue\",\n",
" request_file=\"../model-1/sample-request.json\",\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 7. Managing endpoints and deployments\n",
"\n",
"## 7.1 Get details of the endpoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the details for online endpoint\n",
"endpoint = ml_client.online_endpoints.get(name=endpoint_name)\n",
"\n",
"# existing traffic details\n",
"print(endpoint.traffic)\n",
"\n",
"# Get the scoring URI\n",
"print(endpoint.scoring_uri)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7.2 Get the logs for the new deployment\n",
"Get the logs for the green deployment and verify as needed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_deployments.get_logs(\n",
" name=\"blue\", endpoint_name=endpoint_name, lines=50\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 8. Delete the endpoint\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_delete(name=endpoint_name)"
]
}
],
"metadata": {
"description": {
"description": "Use an online endpoint to deploy your model, so you don't have to create and manage the underlying infrastructure"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK V2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "78f9db5fbd37e8889e0f83cef79d3f22c09395ee4e0648cc45e2c02045ffa952"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}