# Progressive rollout of MLflow deployments

Online Endpoints have the concepts of __Endpoint__ and __Deployment__. An endpoint represents the API that customers use to consume the model, while a deployment is a specific implementation of that API. This distinction lets you decouple the API from the implementation and change the underlying implementation without affecting the consumer.

## 1. Connect to Azure Machine Learning Workspace

Import the namespaces:

In [None]:
from mlflow.tracking import MlflowClient
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

import json
import requests
import mlflow
import pandas as pd

### If you are working in a Compute Instance in Azure Machine Learning

If you are working in Azure Machine Learning Compute Instances, your MLflow installation is automatically connected to Azure Machine Learning, and you don't need to do anything.

### If you are working in your local machine or in a cloud outside Azure Machine Learning

You will need to connect MLflow to the Azure Machine Learning workspace you want to work on. MLflow uses the tracking URI to indicate the MLflow server you want to connect to. There are multiple ways to get the Azure Machine Learning MLflow Tracking URI. In this tutorial we will use the Azure ML SDK for Python, but you can check [Set up tracking environment - Azure Machine Learning Docs](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-mlflow-cli-runs#set-up-tracking-environment) for more alternatives.

In [None]:
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

You can use the workspace object to get the tracking URI:

In [None]:
azureml_tracking_uri = ml_client.workspaces.get(
    ml_client.workspace_name
).mlflow_tracking_uri
mlflow.set_tracking_uri(azureml_tracking_uri)

## 2. Registering the model in the registry

This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. The prediction is an integer: 0 (no presence) or 1 (presence).

The model has been trained using an XGBoost classifier and all the required preprocessing has been packaged as a scikit-learn pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.

Let's ensure the model is registered in the workspace:

In [None]:
model_name = "heart-classifier"
model_local_path = "model"

Let's check if the model is registered:

In [None]:
mlflow_client = MlflowClient()
model_versions = mlflow_client.search_model_versions(
    filter_string=f"name = '{model_name}'"
)

If not, let's create one:

In [None]:
if any(model_versions):
    version = model_versions[0].version
else:
    try:
        mlflow_client.create_registered_model(model_name)
    except Exception as e:
        print(e)
    registered_model = mlflow_client.create_model_version(
        name=model_name, source=f"file://{model_local_path}"
    )
    version = registered_model.version

In [None]:
print(f"We are going to deploy model {model_name} with version {version}")

## 3. Create an Online Endpoint

Online endpoints are endpoints that are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and can send responses back in real time.

We are going to exploit this functionality by deploying multiple versions of the same model under the same endpoint. However, the new deployment will receive 0% of the traffic at the beginning. Once we are sure that the new model works correctly, we are going to progressively move traffic from one deployment to the other.
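To make the idea of a traffic split concrete, the following is a minimal, service-independent sketch of weighted routing. The function `route_request` and the names `blue` and `green` are hypothetical and only illustrate the concept. The real routing happens inside the endpoint, not in client code.

```python
import random
from collections import Counter


def route_request(traffic, rng):
    """Pick a deployment name at random, weighted by its traffic percentage."""
    names = list(traffic)
    weights = [traffic[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Seeded generator so the simulation is reproducible.
rng = random.Random(0)

# The new deployment starts at 0%, so it never receives a request.
initial = Counter(route_request({"blue": 100, "green": 0}, rng) for _ in range(1000))

# After moving 10% of the traffic, roughly one request in ten reaches "green".
shifted = Counter(route_request({"blue": 90, "green": 10}, rng) for _ in range(1000))

print(initial)
print(shifted)
```

A deployment with a weight of 0 is still reachable when a request targets it explicitly, which is what makes it possible to test a new deployment before it serves any regular traffic.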

First, let's create an MLflow deployment client for Azure Machine Learning:

In [None]:
from mlflow.deployments import get_deploy_client

deployment_client = get_deploy_client(mlflow.get_tracking_uri())

### 3.1 Configure the endpoint

Let's create an endpoint explicitly now. We can configure the properties of this endpoint using a configuration file. In this case, we are configuring the authentication mode of the endpoint to be "key". The configuration file is optional.

In [None]:
endpoint_config = {"auth_mode": "key", "identity": {"type": "system_assigned"}}

Let's write this configuration into a `JSON` file:

In [None]:
endpoint_config_path = "endpoint_config.json"
with open(endpoint_config_path, "w") as outfile:
    outfile.write(json.dumps(endpoint_config))
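This tutorial writes several configuration files in the same way. A small helper could reduce the repetition. The sketch below is only a suggestion: `write_json_config` is a hypothetical name, not part of MLflow or the Azure ML SDK, and the example writes to a temporary folder to avoid touching your working directory.

```python
import json
import os
import tempfile


def write_json_config(config, path):
    """Serialize a configuration dictionary to a JSON file and return its path."""
    with open(path, "w") as outfile:
        json.dump(config, outfile, indent=2)
    return path


# Same endpoint settings as above, written to a temporary folder.
example_config = {"auth_mode": "key", "identity": {"type": "system_assigned"}}
example_path = write_json_config(
    example_config, os.path.join(tempfile.mkdtemp(), "endpoint_config.json")
)

# Read the file back to confirm that it contains the same settings.
with open(example_path) as infile:
    round_trip = json.load(infile)

print(round_trip)
```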

Endpoints require a name, which needs to be unique within the region. Let's generate one that is unlikely to exist already:

In [None]:
import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "heart-classifier-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")

### 3.2 Create an Online Endpoint


Create the endpoint:

In [None]:
endpoint = deployment_client.create_endpoint(
    name=endpoint_name,
    config={"endpoint-config-file": endpoint_config_path},
)

### 3.3 Get the scoring URI from the endpoint

In [None]:
scoring_uri = deployment_client.get_endpoint(endpoint=endpoint_name)["properties"][
    "scoringUri"
]
print(scoring_uri)

## 4. Create deployments

### 4.1 Create a blue deployment under the endpoint

So far, the endpoint is empty. There are no deployments on it. Let's create the first one by deploying the same model we were working on before. We will call this deployment "default" and this will represent our "blue deployment".

#### 4.1.1 Configure the deployment

In [None]:
blue_deployment_name = "default"

Configure the default deployment:

In [None]:
deploy_config = {
    "instance_type": "Standard_DS3_v2",
    "instance_count": 1,
}

Write the configuration to a file:

In [None]:
deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

Create the deployment:

In [None]:
blue_deployment = deployment_client.create_deployment(
    name=blue_deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

#### 4.1.2 Assign all the traffic to the created deployment

By default, new deployments receive none of the traffic from the endpoint. Let's assign all of it to the deployment:

In [None]:
traffic_config = {"traffic": {blue_deployment_name: 100}}

Let's write the configuration to a file:

In [None]:
traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))

We are going to use the key `endpoint-config-file` to update the configuration:

In [None]:
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)

Now all the traffic is on our blue deployment.
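Before sending a traffic configuration to the service, it can help to check it locally. The sketch below assumes that each percentage must be an integer between 0 and 100 and that the percentages must add up to either 0 or 100. Treat that rule as an assumption and confirm it against the service documentation. `validate_traffic` is a hypothetical helper, not an MLflow or Azure ML function.

```python
def validate_traffic(traffic):
    """Return the traffic mapping unchanged, or raise ValueError if it looks invalid."""
    for name, percent in traffic.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(
                f"Traffic for '{name}' must be an integer in [0, 100], got {percent!r}"
            )
    total = sum(traffic.values())
    # Assumed rule: the split must add up to 0 (no routing) or 100 (full routing).
    if total not in (0, 100):
        raise ValueError(f"Traffic percentages add up to {total}, expected 0 or 100")
    return traffic


# A valid split passes through unchanged.
accepted = validate_traffic({"default": 100})

# An invalid split is rejected before any request is made.
try:
    validate_traffic({"default": 90, "other": 20})
    rejected = None
except ValueError as error:
    rejected = str(error)

print(accepted)
print(rejected)
```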

### 4.2 Test the endpoint

The following code samples 5 observations from the training dataset, removes the `target` column (as the model will predict it), and creates a data frame we can use to test the deployment.

In [None]:
samples = (
    pd.read_csv("data/heart.csv")
    .sample(n=5)
    .drop(columns=["target"])
    .reset_index(drop=True)
)

#### 4.2.1 Invoke the endpoint using the deployment client

You can use the MLflow deployment client to invoke the endpoint and test it:

In [None]:
deployment_client.predict(endpoint=endpoint_name, df=samples)

#### 4.2.2 Making REST requests

Online Endpoints support both key-based authentication and Azure Active Directory. In this case we are going to use key-based authentication, which relies on a secret that the caller needs to include in the headers of the request. You can get this key using:

- Azure ML SDK for Python
- Azure ML CLI
- [Azure ML studio](https://ml.azure.com)

In our case, we are going to use the Azure ML SDK for Python. If you didn't create an `MLClient` before, create a client for the Azure Machine Learning workspace:

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

Let's get the secrets of the endpoint:

In [None]:
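# The endpoint uses auth_mode "key", so the returned object holds
# `primary_key` and `secondary_key` (`access_token` applies to token-based auth).
endpoint_secret_key = ml_client.online_endpoints.get_keys(
    name=endpoint_name
).primary_key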

Once you have the secret key, create the headers for the request:

In [None]:
headers = {
    "Content-Type": "application/json",
    "Authorization": ("Bearer " + endpoint_secret_key),
}

Send a POST request to the endpoint. Azure Machine Learning requires the input examples to be wrapped in the key `input_data`. Notice that this is not the case for the command `mlflow models serve`.

In [None]:
sample_request = {
    "input_data": json.loads(samples.to_json(orient="split", index=False))
}
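To make the shape of the request body concrete, the following sketch builds an equivalent payload by hand, without pandas. The column names and values are illustrative assumptions (three attributes of the UCI data set with made-up rows). Your `data/heart.csv` file determines the real columns.

```python
import json

# Illustrative values only: two made-up rows with three input columns.
columns = ["age", "sex", "cp"]
rows = [[63, 1, 1], [41, 0, 2]]

# Same structure as json.loads(samples.to_json(orient="split", index=False)):
# a "columns" list and a "data" list of rows, wrapped in "input_data".
example_request = {"input_data": {"columns": columns, "data": rows}}

body = json.dumps(example_request)
print(body)
```

Because `index=False` is passed to `to_json`, the payload carries no `index` entry, only `columns` and `data`.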

In [None]:
req = requests.post(scoring_uri, json=sample_request, headers=headers)
req.json()

### 4.3 Create a green deployment under the endpoint

Let's imagine that the development team has created a new version of the model and that it is ready for production. We can first try the new model out without exposing it to regular traffic and, once we are confident, update the endpoint to route traffic to it.

#### 4.3.1 Register a new model version

In [None]:
registered_model = mlflow_client.create_model_version(
    name=model_name, source=f"file://{model_local_path}"
)
version = registered_model.version

#### 4.3.2 Create a new deployment under the same endpoint

We will call this new deployment `xgboost-model-<version>`, and it corresponds to our "green deployment".

In [None]:
green_deployment_name = f"xgboost-model-{version}"

In [None]:
new_deployment = deployment_client.create_deployment(
    name=green_deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

> We are using the same hardware configuration indicated in the `deploy-config-file`. However, there is no requirement to use the same configuration. You can configure different hardware for different models depending on their requirements.

#### 4.3.3 Test the new deployment

Let's test the new deployment. By default, the endpoint is configured not to route any requests to the green deployment. However, we can bypass the router by naming a specific deployment in our request:

In [None]:
deployment_client.predict(
    endpoint=endpoint_name, deployment_name=green_deployment_name, df=samples
)

We can also use REST to invoke this specific deployment:

In [None]:
headers = {
    "Content-Type": "application/json",
    "Authorization": ("Bearer " + endpoint_secret_key),
    "azureml-model-deployment": green_deployment_name,
}

In [None]:
req = requests.post(scoring_uri, json=sample_request, headers=headers)
req.json()
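The two header dictionaries used in this tutorial differ only in the `azureml-model-deployment` entry. A small helper could build both variants. `build_headers` is a hypothetical name used for illustration, and the key shown is a placeholder rather than a real secret.

```python
def build_headers(secret_key, deployment_name=None):
    """Build request headers, optionally targeting one specific deployment."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + secret_key,
    }
    # Without this header, the endpoint routes the request according to its traffic rules.
    if deployment_name is not None:
        headers["azureml-model-deployment"] = deployment_name
    return headers


# Placeholder key and deployment name, for illustration only.
routed_headers = build_headers("<ENDPOINT_KEY>")
targeted_headers = build_headers("<ENDPOINT_KEY>", "xgboost-model-2")

print(routed_headers)
print(targeted_headers)
```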

## 5. Progressively update the traffic

### 5.1 Update traffic

Once we are confident in the new deployment, we can update the traffic to route some of it to the new deployment. Traffic is configured at the endpoint level:

In [None]:
traffic_config = {"traffic": {blue_deployment_name: 90, green_deployment_name: 10}}

Let's write the configuration to a file:

In [None]:
traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))

We are going to use the key `endpoint-config-file` to update the configuration:

In [None]:
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)
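A progressive rollout usually repeats this update with an increasing share for the green deployment. The sketch below generates one traffic configuration per step. The step percentages, the deployment names, and the helper `rollout_steps` are assumptions chosen for illustration, not values prescribed by the service.

```python
def rollout_steps(blue_name, green_name, green_percentages=(10, 25, 50, 100)):
    """Return one traffic configuration per rollout step, in order."""
    steps = []
    for percent in green_percentages:
        if not 0 <= percent <= 100:
            raise ValueError(f"Percentage must be in [0, 100], got {percent}")
        # The blue deployment keeps whatever share the green one does not take.
        steps.append({"traffic": {blue_name: 100 - percent, green_name: percent}})
    return steps


# Hypothetical deployment names, for illustration only.
steps = rollout_steps("default", "xgboost-model-2")

for step in steps:
    print(step)
```

In a real rollout, each configuration would be written to a file and passed to `update_endpoint` as shown above, ideally after checking that the green deployment behaves well at the current step.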

### 5.2 Update all the traffic

Let's see how we can transfer all the traffic to the new deployment:

In [None]:
traffic_config = {"traffic": {blue_deployment_name: 0, green_deployment_name: 100}}

In [None]:
traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))

In [None]:
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)

### 5.3 If you want, you can delete the old deployment now

In [None]:
deployment_client.delete_deployment(blue_deployment_name, endpoint=endpoint_name)

Notice that at this point, the former "blue deployment" has been deleted and the new "green deployment" has taken the place of the "blue deployment".
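It is prudent to confirm that a deployment no longer receives traffic before deleting it. The following sketch shows such a check on a traffic dictionary like the ones used in this tutorial. `is_safe_to_delete` and the deployment names are hypothetical and serve only as an illustration.

```python
def is_safe_to_delete(traffic, deployment_name):
    """Return True when the deployment receives no traffic or is not listed at all."""
    return traffic.get(deployment_name, 0) == 0


# Traffic after the full switch: the old deployment is at 0%.
current_traffic = {"default": 0, "xgboost-model-2": 100}

print(is_safe_to_delete(current_traffic, "default"))
print(is_safe_to_delete(current_traffic, "xgboost-model-2"))
```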

## 6. Delete resources

Once you are ready, delete the created resources:

In [None]:
deployment_client.delete_endpoint(endpoint_name)

> This operation deletes the endpoint along with all its deployments.