## Image and Text Embeddings Inference using Online Endpoints

This sample shows how to deploy `embeddings` type models to an online endpoint for image and text embeddings inference.

### Task
`embeddings` takes in images and/or text samples. For each image and text sample, feature embeddings are returned from the model.
 
### Model
Models that can perform the `embeddings` task are tagged with `embeddings`. We will use the `OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch32` model in this notebook. If you opened this notebook from a specific model card, remember to replace the model name with the name of that model. If you don't find a model that suits your scenario or domain, you can discover and [import models from HuggingFace hub](../../import/import_model_into_registry.ipynb) and then use them for inference.

### Inference data
We will use the [fridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-classification/fridgeObjects.zip) dataset.


### Outline
1. Set up prerequisites
2. Pick a model to deploy
3. Prepare data for inference
4. Deploy the model to an online endpoint
5. Test the endpoint
6. Clean up resources - delete the endpoint

### 1. Set up prerequisites
* Install dependencies
* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace `<AML_WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to `azureml` system registry

In [None]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

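try:
    credential = DefaultAzureCredential()
    # Check that the default credential can actually obtain a token
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to an interactive browser login
    credential = InteractiveBrowserCredential()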

try:
    workspace_ml_client = MLClient.from_config(credential)
    subscription_id = workspace_ml_client.subscription_id
    resource_group = workspace_ml_client.resource_group_name
    workspace_name = workspace_ml_client.workspace_name
except Exception as ex:
    print(ex)
    # Enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace_name = "<AML_WORKSPACE_NAME>"
workspace_ml_client = MLClient(
    credential, subscription_id, resource_group, workspace_name
)

# The models are available in the AzureML system registry, "azureml"
registry_ml_client = MLClient(
    credential,
    subscription_id,
    resource_group,
    registry_name="azureml",
)

### 2. Pick a model to deploy

Browse models in the Model Catalog in the AzureML Studio, filtering by the `embeddings` task. In this example, we use the `OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch32` model. If you have opened this notebook for a different model, replace the model name accordingly.

In [None]:
model_name = "OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch32"
foundation_model = registry_ml_client.models.get(name=model_name, label="latest")
print(
    f"\n\nUsing model name: {foundation_model.name}, version: {foundation_model.version}, id: {foundation_model.id} for inferencing"
)

### 3. Prepare data for inference

We will use the [fridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-classification/fridgeObjects.zip) dataset, which was built for multi-class image classification. The dataset is stored in a directory that contains four folders, one per class:
- /water_bottle
- /milk_bottle
- /carton
- /can

This folder-per-class layout is a common data format for multi-class image classification. Each folder name is the label of the images it contains.
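
A quick way to check the extracted layout is to count the images under each label folder. The sketch below assumes the folder-per-label structure described above; `count_images_per_label` and its `extensions` parameter are hypothetical names introduced for illustration, not part of the sample or any SDK.

```python
import os


def count_images_per_label(dataset_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {label_folder_name: image_count} for a folder-per-label dataset."""
    counts = {}
    for entry in sorted(os.listdir(dataset_dir)):
        label_dir = os.path.join(dataset_dir, entry)
        if not os.path.isdir(label_dir):
            continue  # skip stray files at the top level
        counts[entry] = sum(
            1 for name in os.listdir(label_dir) if name.lower().endswith(extensions)
        )
    return counts
```

Once the download cell below has run, calling `count_images_per_label(dataset_dir)` should list the four label folders with their image counts.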

In [None]:
import os
import urllib.request
from zipfile import ZipFile

# Change to a different location if you prefer
dataset_parent_dir = "./data"

# Create data folder if it doesn't exist.
os.makedirs(dataset_parent_dir, exist_ok=True)

# Download data
download_url = "https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-classification/fridgeObjects.zip"

# Extract current dataset name from dataset url
dataset_name = os.path.split(download_url)[-1].split(".")[0]
# Get dataset path for later use
dataset_dir = os.path.join(dataset_parent_dir, dataset_name)

# Get the data zip file path
data_file = os.path.join(dataset_parent_dir, f"{dataset_name}.zip")

# Download the dataset
urllib.request.urlretrieve(download_url, filename=data_file)

# Extract files
with ZipFile(data_file, "r") as zip_file:
    print("extracting files...")
    zip_file.extractall(path=dataset_parent_dir)
    print("done")
# Delete zip file
os.remove(data_file)

In [None]:
from IPython.display import Image

sample_image = os.path.join(dataset_dir, "milk_bottle", "99.jpg")
Image(filename=sample_image)

### 4. Deploy the model to an online endpoint for real time inference
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.
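
For applications that do not use the Azure ML SDK, the REST API can be called directly. The following is a minimal sketch, assuming key-based authentication as configured in this notebook. `build_scoring_request` is a hypothetical helper; it only builds the HTTP request so that it can be inspected without a live endpoint. The commented usage assumes the endpoint's `scoring_uri` and `primary_key` are read through the SDK, and that `request_json` is a payload like the ones built in section 5.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, payload, deployment_name=None):
    """Build (but do not send) a POST request for a key-authenticated online endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Route the call to one deployment instead of following the traffic split.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Hypothetical usage once the endpoint and deployment exist:
# endpoint_info = workspace_ml_client.online_endpoints.get(name=online_endpoint_name)
# keys = workspace_ml_client.online_endpoints.get_keys(name=online_endpoint_name)
# request = build_scoring_request(endpoint_info.scoring_uri, keys.primary_key, request_json)
# with urllib.request.urlopen(request) as http_response:
#     print(http_response.read().decode("utf-8"))
```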

In [None]:
import time
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
)

# Endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name
timestamp = int(time.time())
online_endpoint_name = "clip-embeddings-" + str(timestamp)
# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for "
    + foundation_model.name
    + ", for image-text-embeddings task",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

In [None]:
from azure.ai.ml.entities import OnlineRequestSettings, ProbeSettings

deployment_name = "embeddings-mlflow-deploy"

# Create a deployment
demo_deployment = ManagedOnlineDeployment(
    name=deployment_name,
    endpoint_name=online_endpoint_name,
    model=foundation_model.id,
    instance_type="Standard_DS3_V2",  # Use GPU instance type like Standard_NC6s_v3 for faster inference
    instance_count=1,
    request_settings=OnlineRequestSettings(
        max_concurrent_requests_per_instance=1,
        request_timeout_ms=90000,
        max_queue_wait_ms=500,
    ),
    liveness_probe=ProbeSettings(
        failure_threshold=49,
        success_threshold=1,
        timeout=299,
        period=180,
        initial_delay=180,
    ),
    readiness_probe=ProbeSettings(
        failure_threshold=10,
        success_threshold=1,
        timeout=10,
        period=10,
        initial_delay=10,
    ),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {deployment_name: 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

### 5.1 Test the endpoint - base64 images

We will fetch two sample images from the dataset, encode them as base64 strings, and submit them to the online endpoint for inference.
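
The request bodies in sections 5.1 to 5.3 all share the same shape: an `image` column and a `text` column, with an empty string wherever a value is absent. A small helper can assemble that body from a list of pairs. This is a sketch only; `build_embeddings_request` is a hypothetical name and not part of the SDK.

```python
import base64


def build_embeddings_request(rows):
    """Build the request body from (image_bytes_or_None, text_or_None) pairs."""
    data = []
    for image_bytes, text in rows:
        # Missing images or text are sent as empty strings, as in the cells below.
        image_b64 = base64.encodebytes(image_bytes).decode("utf-8") if image_bytes else ""
        data.append([image_b64, text or ""])
    return {
        "input_data": {
            "columns": ["image", "text"],
            "index": list(range(len(data))),
            "data": data,
        }
    }
```

For example, `build_embeddings_request([(None, "a photo of a milk bottle")])` produces a text-only request with a single row.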

In [None]:
import base64
import json

sample_image_1 = os.path.join(dataset_dir, "milk_bottle", "99.jpg")
sample_image_2 = os.path.join(dataset_dir, "can", "1.jpg")


def read_image(image_path):
    with open(image_path, "rb") as f:
        return f.read()


request_json = {
    "input_data": {
        "columns": ["image", "text"],
        "index": [0, 1],
        "data": [
            [
                base64.encodebytes(read_image(sample_image_1)).decode("utf-8"),
                "",
            ],  # the "text" column should contain empty string
            [base64.encodebytes(read_image(sample_image_2)).decode("utf-8"), ""],
        ],
    }
}

# Create request json
request_file_name = "sample_request_data.json"
with open(request_file_name, "w") as request_file:
    json.dump(request_json, request_file)

In [None]:
# Score the sample_request_data.json file using the online endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=demo_deployment.name,
    request_file=request_file_name,
)
print(f"raw response: {response}\n")

### 5.2 Test the endpoint - text samples

We will submit two text samples to the online endpoint for inference. The `image` column is left empty.

In [None]:
import json


request_json = {
    "input_data": {
        "columns": ["image", "text"],
        "index": [0, 1],
        "data": [
            [
                "",
                "a photo of a milk bottle",
            ],  # the "image" column should contain empty string
            ["", "a photo of a metal can"],
        ],
    }
}

# Create request json
request_file_name = "sample_request_data.json"
with open(request_file_name, "w") as request_file:
    json.dump(request_json, request_file)

In [None]:
# Score the sample_request_data.json file using the online endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=demo_deployment.name,
    request_file=request_file_name,
)
print(f"raw response: {response}\n")

### 5.3 Test the endpoint - base64 images and text samples

We will fetch two sample images from the dataset, pair each with a text sample, and submit them to the online endpoint for inference.

In [None]:
import base64
import json

sample_image_1 = os.path.join(dataset_dir, "milk_bottle", "99.jpg")
sample_image_2 = os.path.join(dataset_dir, "can", "1.jpg")


def read_image(image_path):
    with open(image_path, "rb") as f:
        return f.read()


request_json = {
    "input_data": {
        "columns": ["image", "text"],
        "index": [0, 1],
        "data": [
            [
                base64.encodebytes(read_image(sample_image_1)).decode("utf-8"),
                "a photo of a milk bottle",
            ],  # all rows should have both images and text
            [
                base64.encodebytes(read_image(sample_image_2)).decode("utf-8"),
                "a photo of a metal can",
            ],
        ],
    }
}

# Create request json
request_file_name = "sample_request_data.json"
with open(request_file_name, "w") as request_file:
    json.dump(request_json, request_file)

In [None]:
# Score the sample_request_data.json file using the online endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=demo_deployment.name,
    request_file=request_file_name,
)
print(f"raw response: {response}\n")
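
When a row contains both an image and a text sample, one common use of the returned embeddings is to compare them with cosine similarity. The sketch below assumes the response is a JSON list with one object per row, holding `image_features` and `text_features` vectors. That response format is an assumption, not confirmed by this notebook, so inspect the raw response printed above and adjust the keys if they differ. Both function names are hypothetical.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def image_text_similarities(raw_response):
    """Return one image-to-text similarity per row of the (assumed) response format."""
    rows = json.loads(raw_response)
    return [cosine_similarity(r["image_features"], r["text_features"]) for r in rows]
```

Under that assumption, `image_text_similarities(response)` returns one score per request row; a higher score suggests that the text describes the image more closely.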

### 6. Clean up resources - delete the online endpoint
Don't forget to delete the online endpoint. Otherwise, the compute used by the endpoint continues to accrue charges.

In [None]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()