# Use batch deployments for image file processing

The following notebook demonstrates how to use batch endpoints to deploy models that work with images. In particular, we are going to deploy a TensorFlow model for the popular ImageNet classification problem.

This notebook requires:

- `tensorflow`
- `tensorflow_hub`
- `pillow`
- `azure-ai-ml`
- `azureml-mlflow`
- `pandas`
- `scipy`

## 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

### 1.1. Import the required libraries

In [None]:
from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import (
    BatchEndpoint,
    ModelBatchDeployment,
    ModelBatchDeploymentSettings,
    Model,
    AmlCompute,
    Data,
    BatchRetrySettings,
    CodeConfiguration,
    Environment,
)
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.identity import DefaultAzureCredential

### 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need identifier parameters: a subscription, a resource group and a workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [None]:
# enter details of your AML workspace
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)
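
If you would rather not hard-code these identifiers in the notebook, one option is to read them from environment variables. The sketch below assumes hypothetical variable names (`AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, `AZUREML_WORKSPACE_NAME`); they are illustrative choices for this example, not names the SDK reads on its own.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
REQUIRED_VARS = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "AZURE_RESOURCE_GROUP",
    "workspace_name": "AZUREML_WORKSPACE_NAME",
}


def workspace_details(env=None):
    """Collect the three workspace identifiers, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in REQUIRED_VARS.items()}
```

The returned dictionary uses the keyword names accepted by `MLClient`, so it could be passed as `MLClient(DefaultAzureCredential(), **workspace_details())`.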

If you are working in an Azure Machine Learning compute instance, you can simply create the client from the workspace configuration:

In [None]:
ml_client = MLClient.from_config(DefaultAzureCredential())

## 2. Registering the model

### 2.1 About the model

Let's review how the model is built. The model was built using TensorFlow along with the ResNet architecture ([Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027)). This model has the following constraints that are important to keep in mind for deployment:

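* It expects images of size 224x224 by default (tensors of `(224, 224, 3)`), although other input sizes are possible; the code in this notebook resizes images to 244x244.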
* It requires inputs to be scaled to the range `[0,1]`.

In [None]:
import tensorflow_hub as hub
import tensorflow as tf

model = tf.keras.Sequential(
    [
        hub.KerasLayer(
            "https://tfhub.dev/google/imagenet/resnet_v2_101/classification/5"
        ),
    ]
)
model.build([None, None, None, 3])

Testing if the model works:

In [None]:
import PIL.Image as Image
import numpy as np

image_file = tf.keras.utils.get_file(
    "image.jpeg",
    "https://azuremlexampledata.blob.core.windows.net/data/imagenet/goldfish.JPEG",
)
img = Image.open(image_file).resize((244, 244))
img = np.array(img) / 255.0
batch_img = tf.expand_dims(img, axis=0)

Run the model:

In [None]:
pred = model.predict(batch_img)
pred_class = pred.argmax(axis=-1)

Getting the labels for ImageNet:

In [None]:
labels_path = tf.keras.utils.get_file(
    "ImageNetLabels.txt",
    "https://azuremlexampledata.blob.core.windows.net/data/imagenet/ImageNetLabels.txt",
)
imagenet_labels = np.array(open(labels_path).read().splitlines())

In [None]:
predicted_class_name = [
    imagenet_labels[predicted_class] for predicted_class in pred_class
]
predicted_class_name

Let's save this model locally:

In [None]:
model_local_path = "model"
model.save(model_local_path)

### 2.2 Registering the model in the workspace

We need to register the model in order to use it with Azure Machine Learning:

In [None]:
model_name = "imagenet-classifier"

In [None]:
model = ml_client.models.create_or_update(
    Model(name=model_name, path=model_local_path, type=AssetTypes.CUSTOM_MODEL)
)

Let's get a reference to the model:

In [None]:
model = ml_client.models.get(name=model_name, label="latest")

## 3. Create batch endpoint

Batch endpoints are endpoints that are used for batch inferencing on large volumes of data over a period of time. Batch endpoints receive pointers to data and run jobs asynchronously to process the data in parallel on compute clusters. Batch endpoints store outputs to a data store for further analysis.

To create a batch endpoint we will use `BatchEndpoint`. This class allows users to configure the following key aspects:
- `name` - Name of the endpoint. Needs to be unique at the Azure region level.
- `auth_mode` - The authentication method for the endpoint. Currently only Azure Active Directory (Azure AD) token-based (`aad_token`) authentication is supported.
- `description` - Description of the endpoint.

### 3.1 Configure the endpoint

First, let's create the endpoint that is going to host the batch deployments. To ensure that our endpoint name is unique, let's create a random suffix to append to it. 

> In general, you won't need this technique because you will use more meaningful names. Skip the following cell if that is your case:

In [None]:
import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "imagenet-classifier-" + endpoint_suffix
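
The same idea can be wrapped in a small helper so that the suffix length and the random source are explicit. This is only a sketch: `make_endpoint_name` is a name made up for this example, and passing a seeded generator is an optional convenience for reproducible runs.

```python
import random
import string


def make_endpoint_name(prefix, suffix_length=5, rng=None):
    """Return `prefix` followed by a hyphen and a random lowercase/digit suffix."""
    rng = rng or random.Random()
    allowed_chars = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(allowed_chars) for _ in range(suffix_length))
    return f"{prefix}-{suffix}"


# A seeded generator makes the name reproducible, which can help in tests.
print(make_endpoint_name("imagenet-classifier", rng=random.Random(0)))
```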

Let's configure the endpoint:

In [None]:
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A batch service to perform ImageNet image classification",
)

### 3.2 Create the endpoint
Using the `MLClient` created earlier, we will now create the endpoint in the workspace. `begin_create_or_update` starts the endpoint creation and returns a poller; calling `.result()` on it waits until the creation completes.

In [None]:
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
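
The cell above calls `.result()` on the value returned by `begin_create_or_update`, which waits for the operation to finish. As a rough analogy only (this uses Python's standard `concurrent.futures`, not the Azure SDK), the same start-then-wait pattern looks like the sketch below; `slow_create` is a made-up stand-in for the work done on the service side.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_create(name):
    """Made-up stand-in for work that happens on the service side."""
    time.sleep(0.1)
    return f"{name}: Succeeded"


with ThreadPoolExecutor(max_workers=1) as pool:
    handle = pool.submit(slow_create, "my-endpoint")  # returns immediately
    # Other work could happen here while the operation is in progress.
    outcome = handle.result()  # blocks until the operation finishes

print(outcome)
```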

## 4. Create a batch deployment

A deployment is a set of resources required for hosting the model that does the actual inferencing. We will create a deployment for our endpoint using the `ModelBatchDeployment` class.

### 4.1 Creating a scoring script to work with the model

In [None]:
%%writefile code/score-by-file/batch_driver.py

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from os.path import basename
from PIL import Image
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244

def run(mini_batch):
    results = []

    for image in mini_batch:
        # Read the image, force 3 channels (some images may be grayscale or CMYK) and resize it
        data = Image.open(image).convert("RGB").resize((input_width, input_height))
        data = np.array(data)/255.0 # Normalize
        data_batch = tf.expand_dims(data, axis=0) # create a batch of size (1, 244, 244, 3)

        # perform inference
        pred = model.predict(data_batch)

        # Compute probabilities, classes and labels
        pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1)).numpy()
        pred_class = tf.math.argmax(pred, axis=-1).numpy()

        results.append([basename(image), pred_class[0], pred_prob])

    return pd.DataFrame(results)
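
To make the `init()`/`run(mini_batch)` contract more concrete, the following single-process sketch imitates it locally: `init()` is called once, then `run()` is called once per mini-batch of file paths. It is a simplification (the real service distributes mini-batches across workers), and the stub `model`, the file paths, and `simulate_batch_job` are all made up for illustration.

```python
model = None


def init():
    """Stub for the one-time setup (the real script loads the model here)."""
    global model
    model = len  # made-up stand-in for a model: "predicts" the path length


def run(mini_batch):
    """Stub scoring function: returns one row per input file."""
    return [(path, model(path)) for path in mini_batch]


def simulate_batch_job(files, mini_batch_size):
    """Call init() once, then run() once per mini-batch, collecting all rows."""
    init()
    rows = []
    for start in range(0, len(files), mini_batch_size):
        rows.extend(run(files[start : start + mini_batch_size]))
    return rows


files = [f"/tmp/imagenet-1000/img_{i}.JPEG" for i in range(25)]  # hypothetical paths
rows = simulate_batch_job(files, mini_batch_size=10)
print(len(rows))  # 25: one row per file, produced by 3 calls to run()
```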

### 4.2 Creating the compute

Batch deployments can run on any Azure ML compute that already exists in the workspace. That means that multiple batch deployments can share the same compute infrastructure. In this example, we are going to work on an AzureML compute cluster called `cpu-cluster`. Let's verify the compute exists on the workspace or create it otherwise.

In [None]:
compute_name = "cpu-cluster"
if not any(filter(lambda m: m.name == compute_name, ml_client.compute.list())):
    compute_cluster = AmlCompute(
        name=compute_name, description="amlcompute", min_instances=0, max_instances=5
    )
    ml_client.begin_create_or_update(compute_cluster).result()

### 4.3 Creating the environment

Let's create the environment. In our case, our model runs on `TensorFlow`. Azure Machine Learning already provides a curated image with TensorFlow installed, so we can reuse it as the base image and add the remaining dependencies with a conda file.

In [None]:
environment = Environment(
    name="tensorflow212-cuda11-gpu",
    conda_file="environment/conda.yaml",
    image="mcr.microsoft.com/azureml/curated/tensorflow-2.16-cuda12:latest",
)

### 4.4 Configuring the deployment

We will create a deployment for our endpoint using the `ModelBatchDeployment` class. This class allows users to configure the following key aspects:
- `name` - Name of the deployment.
- `endpoint_name` - Name of the endpoint to create the deployment under.
- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
- `environment` - The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
- `code_configuration` - The scoring code to use for the deployment.
   - `code` - Path to the source code directory for scoring the model.
   - `scoring_script` - Relative path to the scoring file in the source code directory.
- `compute` - Name of the compute target to execute the batch scoring jobs on.
- `instance_count` - The number of nodes to use for each batch scoring job.
- `max_concurrency_per_instance` - The maximum number of parallel `scoring_script` runs per instance.
- `mini_batch_size` - The number of files the `code_configuration.scoring_script` can process in one `run()` call.
- `retry_settings` - Retry settings for scoring each mini batch.
   - `max_retries` - The maximum number of retries for a failed or timed-out mini batch (default is 3).
   - `timeout` - The timeout in seconds for scoring a mini batch (default is 30).
- `output_action` - Indicates how the output should be organized in the output file. Allowed values are `append_row` or `summary_only`. Default is `append_row`.
- `output_file_name` - Name of the batch scoring output file. Default is `predictions.csv`.
- `environment_variables` - Dictionary of environment variable name-value pairs to set for each batch scoring job.
- `logging_level` - The log verbosity level. Allowed values are `warning`, `info`, `debug`. Default is `info`.
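
To get a feel for how these settings interact, the sketch below estimates how the work is split using the numbers from this notebook (1,000 files, mini-batches of 10, 2 instances, 1 process per instance). It is an idealized back-of-the-envelope calculation that ignores retries and scheduling details; `plan_batch_job` is a helper made up for this example.

```python
import math


def plan_batch_job(total_files, mini_batch_size, instance_count, max_concurrency_per_instance):
    """Estimate how the deployment settings split the work (ideal case, no retries)."""
    mini_batches = math.ceil(total_files / mini_batch_size)
    parallel_workers = instance_count * max_concurrency_per_instance
    rounds = math.ceil(mini_batches / parallel_workers)
    return {
        "mini_batches": mini_batches,
        "parallel_workers": parallel_workers,
        "rounds": rounds,
    }


print(plan_batch_job(1000, mini_batch_size=10, instance_count=2, max_concurrency_per_instance=1))
# {'mini_batches': 100, 'parallel_workers': 2, 'rounds': 50}
```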

In [None]:
deployment = ModelBatchDeployment(
    name="imagenet-classifier-resnetv2",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch",
    endpoint_name=endpoint.name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="code/score-by-file",
        scoring_script="batch_driver.py",
    ),
    compute=compute_name,
    settings=ModelBatchDeploymentSettings(
        instance_count=2,
        max_concurrency_per_instance=1,
        mini_batch_size=10,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
        logging_level="info",
    ),
)

### 4.5 Create the deployment
Using the `MLClient` created earlier, we will now create the deployment in the workspace. `begin_create_or_update` starts the deployment creation and returns a poller; calling `.result()` on it waits until the creation completes.

In [None]:
ml_client.batch_deployments.begin_create_or_update(deployment).result()

### 4.6 Testing the deployment

Once the deployment is created, it is ready to receive jobs.

#### 4.6.1 Creating a data asset

Let's first register a data asset so we can run the job against it. This data asset is a folder containing 1000 images from the original ImageNet dataset. We are going to download it first and then create the data asset:

In [None]:
!wget https://azuremlexampledata.blob.core.windows.net/data/imagenet/imagenet-1000.zip
!unzip imagenet-1000.zip -d /tmp/imagenet-1000

Registering a data asset:

In [None]:
data_path = "/tmp/imagenet-1000"
dataset_name = "imagenet-sample-unlabeled"

imagenet_sample = Data(
    path=data_path,
    type=AssetTypes.URI_FOLDER,
    description="A sample of 1000 images from the original ImageNet dataset",
    name=dataset_name,
)

In [None]:
ml_client.data.create_or_update(imagenet_sample)

Let's wait for the data asset:

In [None]:
from time import sleep

print(f"Waiting for data asset {dataset_name}", end="")
while not any(filter(lambda m: m.name == dataset_name, ml_client.data.list())):
    sleep(10)
    print(".", end="")

print(" [DONE]")
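
The loop above keeps polling until the asset shows up. A variant with an upper bound on the waiting time could look like the following sketch; `wait_until` and its default `timeout` and `interval` values are assumptions made for this example, not part of the SDK.

```python
import time


def wait_until(predicate, timeout=300.0, interval=10.0):
    """Poll `predicate` until it returns True; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

In this notebook it could be called as `wait_until(lambda: any(d.name == dataset_name for d in ml_client.data.list()))`, checking the returned flag before moving on.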

Let's get a reference to the new data asset:

In [None]:
imagenet_sample = ml_client.data.get(name=dataset_name, label="latest")

#### 4.6.2 Creating an input for the deployment

In [None]:
input = Input(type=AssetTypes.URI_FOLDER, path=imagenet_sample.id)

#### 4.6.3 Invoke the deployment

Using the `MLClient` created earlier, we can now invoke the endpoint. The `invoke` method takes the following parameters:
- `endpoint_name` - Name of the endpoint
- `input` - The input data for the job
- `deployment_name` - Name of the specific deployment to test in the endpoint

In [None]:
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name, deployment_name=deployment.name, input=input
)

#### 4.6.4 Get the details of the invoked job

Let us get details and logs of the invoked job:

In [None]:
ml_client.jobs.get(job.name)

We can wait for the job to finish using the following code:

In [None]:
ml_client.jobs.stream(job.name)

### 4.7 Exploring the results

The deployment creates a child job that executes the scoring. We can get the details of it using the following code:

In [None]:
scoring_job = list(ml_client.jobs.list(parent_job_name=job.name))[0]

In [None]:
print("Job name:", scoring_job.name)
print("Job status:", scoring_job.status)
print(
    "Job duration:",
    scoring_job.creation_context.last_modified_at
    - scoring_job.creation_context.created_at,
)

#### 4.7.1 Download the results

The outputs generated by the deployment job will be placed in an output named `score`:

In [None]:
ml_client.jobs.download(name=scoring_job.name, download_path=".", output_name="score")

We can read this data using the pandas library:

In [None]:
import pandas as pd

score = pd.read_csv(
    "named-outputs/score/predictions.csv",
    header=None,
    names=["file", "class", "probabilities"],
    sep=" ",
)
score["label"] = score["class"].apply(lambda pred: imagenet_labels[pred])
score
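
The `read_csv` call above reads the file as space-separated rows without a header: file name, class index, probability. As a dependency-free sketch of the same layout, the snippet below parses a few made-up rows with the standard `csv` module and counts how often each class appears; the sample values are invented for illustration and are not real job output.

```python
import csv
import io
from collections import Counter

# Made-up sample in the same layout: file name, class index, probability.
sample = io.StringIO(
    "img_0001.JPEG 1 0.97\n"
    "img_0002.JPEG 1 0.88\n"
    "img_0003.JPEG 285 0.61\n"
)

rows = [
    {"file": name, "class": int(cls), "probability": float(prob)}
    for name, cls, prob in csv.reader(sample, delimiter=" ")
]

class_counts = Counter(row["class"] for row in rows)
print(class_counts.most_common(1))  # [(1, 2)]
```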

## 5. Setting the default deployment

Once the deployment works correctly as we expect, we can update the default deployment to the new deployment so any invocation of the endpoint triggers the created deployment.

In [None]:
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

We can verify the default deployment of the endpoint as follows:

In [None]:
endpoint = ml_client.batch_endpoints.get(endpoint_name)
print(f"The default deployment is {endpoint.defaults.deployment_name}")

## 6. (Optional) High throughput deployments

We can achieve higher throughput with deployments that score batches of images all at once instead of iterating over the mini-batch one image at a time. Deployments of this kind can gain 5x in performance on CPU and 20x on GPU, depending on the hardware configuration and the batching/parallelization settings.

### 6.1 Creating a scoring script to work with the model in batch

In [None]:
%%writefile code/score-by-batch/batch_driver.py

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244

def decode_img(file_path):
    file = tf.io.read_file(file_path)
    img = tf.io.decode_jpeg(file, channels=3)
    img = tf.image.resize(img, [input_height, input_width])  # size is given as [height, width]
    return img/255.

def run(mini_batch):
    images_ds = tf.data.Dataset.from_tensor_slices(mini_batch)
    images_ds = images_ds.map(decode_img).batch(64)

    # perform inference
    pred = model.predict(images_ds)

    # Compute probabilities, classes and labels
    # Reduce along the class axis so that each image keeps its own top probability
    pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1), axis=-1).numpy()
    pred_class = tf.math.argmax(pred, axis=-1).numpy()

    return pd.DataFrame({ 
        "file": mini_batch, 
        "class": pred_class,
        "probability": pred_prob, 
    })
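
When a whole batch is scored at once, each row of the model output needs its own top class and probability. The dependency-free sketch below spells out that per-row computation in plain Python; in TensorFlow it corresponds to reducing along the last axis (`axis=-1`). The logits are made up, and `softmax` and `top1_per_row` are helper names chosen for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top1_per_row(batch_logits):
    """Return (class index, probability) for each row in the batch."""
    results = []
    for logits in batch_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results


# Made-up logits for two images and three classes.
print(top1_per_row([[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]))
```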

### 6.2 Configuring a new deployment for the high performance inference

In [None]:
ht_deployment = ModelBatchDeployment(
    name="imagenet-classifier-resnetv2-ht",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch (High throughput)",
    endpoint_name=endpoint.name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="code/score-by-batch",
        scoring_script="batch_driver.py",
    ),
    compute=compute_name,
    settings=ModelBatchDeploymentSettings(
        instance_count=2,
        max_concurrency_per_instance=1,
        mini_batch_size=10,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
        logging_level="info",
    ),
)

### 6.3 Create the deployment

In [None]:
ml_client.batch_deployments.begin_create_or_update(ht_deployment).result()

### 6.4 Invoke the new deployment

In [None]:
import time

# Let's sleep for 2 min to ensure all resources are ready. This is only for automation purposes.
time.sleep(120)

#### 6.4.1 Execute

Let's execute this specific deployment now:

In [None]:
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name, deployment_name=ht_deployment.name, input=input
)

#### 6.4.2 Get the details of the invoked job

Let us get details and logs of the invoked job:

In [None]:
ml_client.jobs.get(job.name)

We can wait for the job to finish using the following code:

In [None]:
ml_client.jobs.stream(job.name)

### 6.5 Exploring the results

The deployment creates a child job that executes the scoring. We can get the details of it using the following code:

In [None]:
scoring_job = list(ml_client.jobs.list(parent_job_name=job.name))[0]

In [None]:
print("Job name:", scoring_job.name)
print("Job status:", scoring_job.status)
print(
    "Job duration:",
    scoring_job.creation_context.last_modified_at
    - scoring_job.creation_context.created_at,
)

#### 6.5.1 Download the results

The outputs generated by the deployment job will be placed in an output named `score`:

In [None]:
ml_client.jobs.download(name=scoring_job.name, download_path=".", output_name="score")

We can read this data using the pandas library:

In [None]:
import pandas as pd

score = pd.read_csv(
    "named-outputs/score/predictions.csv",
    header=None,
    names=["file", "class", "probabilities"],
    sep=" ",
)
score["label"] = score["class"].apply(lambda pred: imagenet_labels[pred])
score

## 7. Clean up resources

Clean up the resources created.

In [None]:
ml_client.batch_endpoints.begin_delete(endpoint_name)