sdk/python/foundation-models/system/inference/mask-generation/mask-generation-online-endpoint.ipynb

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Mask Generation Inference using Online Endpoints\n", "\n", "This sample shows how to deploy `mask-generation` type models to an online endpoint for inference.\n", "\n", "### Task\n", "`mask-generation` takes in images and prompts (input points, input_boxes, input_labels) and for each image, generates segmentation masks based on the prompts given.\n", "\n", "### Model\n", "Models that can perform the `mask-generation` task are tagged with `mask-generation`. We will use the `facebook-sam-vit-huge` model in this notebook. If you opened this notebook from a specific model card, remember to replace the specific model name.\n", "\n", "### Inference data\n", "We will use the [odFridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip) dataset.\n", "\n", "\n", "### Outline\n", "1. Setup pre-requisites\n", "2. Pick a model to deploy\n", "3. Prepare data for inference\n", "4. Deploy the model to an online endpoint for real time inference\n", "5. Test the endpoint\n", "6. Clean up resources - delete the online endpoint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Setup pre-requisites\n", "* Install dependencies\n", "* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace `<WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.\n", "* Connect to `azureml` system registry" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml import MLClient\n", "from azure.identity import (\n", " DefaultAzureCredential,\n", " InteractiveBrowserCredential,\n", ")\n", "import time\n", "\n", "try:\n", " credential = DefaultAzureCredential()\n", " credential.get_token(\"https://management.azure.com/.default\")\n", "except Exception as ex:\n", " credential = InteractiveBrowserCredential()\n", "\n", "try:\n", " workspace_ml_client = MLClient.from_config(credential)\n", " subscription_id = workspace_ml_client.subscription_id\n", " resource_group = workspace_ml_client.resource_group_name\n", " workspace_name = workspace_ml_client.workspace_name\n", "except Exception as ex:\n", " print(ex)\n", " # Enter details of your AML workspace\n", " subscription_id = \"<SUBSCRIPTION_ID>\"\n", " resource_group = \"<RESOURCE_GROUP>\"\n", " workspace_name = \"<WORKSPACE_NAME>\"\n", "workspace_ml_client = MLClient(\n", " credential, subscription_id, resource_group, workspace_name\n", ")\n", "\n", "# The models are available in the AzureML system registry, \"azureml\"\n", "registry_ml_client = MLClient(\n", " credential,\n", " subscription_id,\n", " resource_group,\n", " registry_name=\"azureml\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Pick a model to deploy\n", "\n", "Browse models in the Model Catalog in the AzureML Studio, filtering by the `mask-generation` task. In this example, we use the `facebook-sam-vit-huge` model. If you have opened this notebook for a different model, replace the model name accordingly." 
, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_name = \"facebook-sam-vit-huge\"\n", "\n", "foundation_model = registry_ml_client.models.get(name=model_name, label=\"latest\")\n", "print(\n", "    f\"\\n\\nUsing model name: {foundation_model.name}, version: {foundation_model.version}, id: {foundation_model.id} for inferencing\"\n", ")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Prepare data for inference\n", "\n", "We will use the [odFridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip) dataset for this mask-generation task." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import urllib.request\n", "from zipfile import ZipFile\n", "\n", "# Change to a different location if you prefer\n", "dataset_parent_dir = \"./data\"\n", "\n", "# Create the data folder if it doesn't exist.\n", "os.makedirs(dataset_parent_dir, exist_ok=True)\n", "\n", "# Download data\n", "download_url = \"https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip\"\n", "\n", "# Extract the dataset name from the dataset URL\n", "dataset_name = os.path.split(download_url)[-1].split(\".\")[0]\n", "# Get the dataset path for later use\n", "dataset_dir = os.path.join(dataset_parent_dir, dataset_name)\n", "\n", "# Get the data zip file path\n", "data_file = os.path.join(dataset_parent_dir, f\"{dataset_name}.zip\")\n", "\n", "# Download the dataset\n", "urllib.request.urlretrieve(download_url, filename=data_file)\n", "\n", "# Extract files\n", "with ZipFile(data_file, \"r\") as zip_file:\n", "    print(\"extracting files...\")\n", "    zip_file.extractall(path=dataset_parent_dir)\n", "    print(\"done\")\n", "# Delete the zip file\n", "os.remove(data_file)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "\n", "sample_image = os.path.join(dataset_dir, \"images\", \"99.jpg\")\n", "Image(filename=sample_image)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 4. Deploy the model to an online endpoint for real-time inference\n", "Online endpoints provide a durable REST API that can be used to integrate the model into applications that need to use it." ] }
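, { "cell_type": "markdown", "metadata": {}, "source": [ "To make the REST integration concrete, here is a minimal sketch that calls the endpoint directly with the `requests` library, assuming key-based authentication (the default auth mode). It is not part of the deployment flow: run it only after the endpoint and deployment in the next cells are up and `request_json` has been defined in section 5." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Minimal sketch of calling the endpoint's REST API from any application.\n", "# Assumptions: the endpoint/deployment created below already exist, the\n", "# endpoint uses key-based auth, and request_json is defined (section 5).\n", "import requests\n", "\n", "scoring_endpoint = workspace_ml_client.online_endpoints.get(name=online_endpoint_name)\n", "keys = workspace_ml_client.online_endpoints.get_keys(name=online_endpoint_name)\n", "headers = {\n", "    \"Authorization\": f\"Bearer {keys.primary_key}\",\n", "    \"Content-Type\": \"application/json\",\n", "}\n", "rest_response = requests.post(\n", "    scoring_endpoint.scoring_uri, headers=headers, json=request_json\n", ")\n", "print(rest_response.status_code)" ] }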
, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "from azure.ai.ml.entities import (\n", "    ManagedOnlineEndpoint,\n", "    ManagedOnlineDeployment,\n", ")\n", "\n", "# Endpoint names need to be unique in a region, hence using a timestamp to create a unique endpoint name\n", "timestamp = int(time.time())\n", "online_endpoint_name = \"mask-gen-\" + str(timestamp)\n", "# Create an online endpoint\n", "endpoint = ManagedOnlineEndpoint(\n", "    name=online_endpoint_name,\n", "    description=\"Online endpoint for \"\n", "    + foundation_model.name\n", "    + \", for mask-generation task\",\n", "    # auth_mode=\"key\",\n", ")\n", "workspace_ml_client.begin_create_or_update(endpoint).wait()" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.entities import OnlineRequestSettings, ProbeSettings\n", "\n", "deployment_name = \"mask-gen-mlflow-deploy\"\n", "\n", "# Create a deployment\n", "demo_deployment = ManagedOnlineDeployment(\n", "    name=deployment_name,\n", "    endpoint_name=online_endpoint_name,\n", "    model=foundation_model.id,\n", "    instance_type=\"Standard_DS5_V2\",  # Use a GPU instance type like Standard_NC6s_v3 for faster inference\n", "    instance_count=1,\n", "    request_settings=OnlineRequestSettings(\n", "        max_concurrent_requests_per_instance=1,\n", "        request_timeout_ms=90000,\n", "        max_queue_wait_ms=500,\n", "    ),\n", "    liveness_probe=ProbeSettings(\n", "        failure_threshold=49,\n", "        success_threshold=1,\n", "        timeout=299,\n", "        period=180,\n", "        initial_delay=180,\n", "    ),\n", "    readiness_probe=ProbeSettings(\n", "        failure_threshold=10,\n", "        success_threshold=1,\n", "        timeout=10,\n", "        period=10,\n", "        initial_delay=10,\n", "    ),\n", ")\n", "workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()\n", "endpoint.traffic = {deployment_name: 100}\n", "workspace_ml_client.begin_create_or_update(endpoint).result()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5. Test the endpoint\n", "\n", "We will use a sample image from the downloaded dataset and submit it to the online endpoint for inference." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### Input Data Parameters\n", "\n", "- **`image`**: \n", "  - Column in the DataFrame containing images as base64-encoded strings or URLs.\n", "\n", "- **`input_points`**: \n", "  - String representation of a numpy array with shape `(point_batch_size, num_points, 2)`.\n", "  - Represents 2D spatial input points for prompt encoding.\n", "  - Organized as nested lists that are converted into tensors with dimensions for the point batch (i.e., how many segmentation masks the model should predict per input point), the number of points, and the x, y coordinates.\n", "  - Padding points are represented as `(0, 0)` and are excluded from embedding computations.\n", "\n", "- **`input_boxes`**: \n", "  - String representation of a numpy array with shape `(num_boxes, 4)`.\n", "  - Encodes bounding box prompts used for mask generation.\n", "  - Structured as a nested list that translates to tensors indicating the number of boxes per image and the coordinates of each box's top-left (`x1`, `y1`) and bottom-right (`x2`, `y2`) corners.\n", "\n", "- **`input_labels`**: \n", "  - String representation of a numpy array with shape `(point_batch_size, num_points)`.\n", "  - Labels for the points used by the prompt encoder.\n", "  - Label types are `1` for object points, `0` for non-object points, `-1` for background points, and `-10` for padding points.\n", "\n", "- **`multimask_output`**: \n", "  - Boolean indicating whether multiple masks per input point (`True`) or a single mask (`False`) should be returned.\n", "  - Defaults to `True`, returning multiple masks per point unless otherwise specified.\n", "\n", "See the sketch after this list for one way to construct these prompt strings from numpy arrays." ] }
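, { "cell_type": "markdown", "metadata": {}, "source": [ "A minimal sketch, assuming the prompt columns accept the `str()` serialization of nested Python lists: build `input_points` and `input_boxes` from numpy arrays with the documented shapes. The hand-written strings in the request below are equivalent." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Illustrative sketch: build the prompt strings from numpy arrays with the\n", "# documented shapes. Serializing nested lists with str() is an assumption\n", "# about the accepted format.\n", "input_points = np.array([[[280, 320]], [[300, 350]]])  # (point_batch_size, num_points, 2)\n", "input_boxes = np.array([[125, 240, 375, 425]])  # (num_boxes, 4)\n", "print(str(input_points.tolist()))\n", "print(str(input_boxes.tolist()))" ] }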
, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import base64\n", "import json\n", "\n", "sample_image = os.path.join(dataset_dir, \"images\", \"99.jpg\")\n", "\n", "\n", "def read_image(image_path):\n", "    with open(image_path, \"rb\") as f:\n", "        return f.read()\n", "\n", "\n", "request_json = {\n", "    \"input_data\": {\n", "        \"columns\": [\n", "            \"image\",\n", "            \"input_points\",\n", "            \"input_boxes\",\n", "            \"input_labels\",\n", "            \"multimask_output\",\n", "        ],\n", "        \"index\": [0, 1, 2, 3, 4],\n", "        \"data\": [\n", "            # segmentation mask per input point\n", "            [\n", "                base64.encodebytes(read_image(sample_image)).decode(\"utf-8\"),\n", "                \"[[[280,320]], [[300,350]]]\",\n", "                \"\",\n", "                \"\",\n", "                True,\n", "            ],\n", "            # single segmentation mask for multiple input points\n", "            [\n", "                base64.encodebytes(read_image(sample_image)).decode(\"utf-8\"),\n", "                \"[[[280,320], [300,350]]]\",\n", "                \"\",\n", "                \"\",\n", "                True,\n", "            ],\n", "            # single segmentation mask per single bounding box\n", "            [\n", "                base64.encodebytes(read_image(sample_image)).decode(\"utf-8\"),\n", "                \"\",\n", "                \"[[125,240,375,425]]\",\n", "                \"\",\n", "                True,\n", "            ],\n", "            # segmentation mask using both bounding box and input points\n", "            [\n", "                base64.encodebytes(read_image(sample_image)).decode(\"utf-8\"),\n", "                \"[[[280,320]]]\",\n", "                \"[[125,240,375,425]]\",\n", "                \"\",\n", "                True,\n", "            ],\n", "            # segmentation mask using bounding box, input points and labels\n", "            [\n", "                base64.encodebytes(read_image(sample_image)).decode(\"utf-8\"),\n", "                \"[[[280,320]]]\",\n", "                \"[[125,240,375,425]]\",\n", "                \"[[0]]\",\n", "                True,\n", "            ],\n", "        ],\n", "    }\n", "}\n", "\n", "# Write the request json to a file\n", "request_file_name = \"sample_request_data.json\"\n", "with open(request_file_name, \"w\") as request_file:\n", "    json.dump(request_json, request_file)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Score the sample_request_data.json file using the online endpoint with the AzureML endpoint invoke method\n", "response = workspace_ml_client.online_endpoints.invoke(\n", "    endpoint_name=online_endpoint_name,\n", "    deployment_name=demo_deployment.name,\n", "    request_file=request_file_name,\n", ")\n", "\n", "print(f\"raw response: {response}\\n\")" ] }
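, { "cell_type": "markdown", "metadata": {}, "source": [ "Before visualizing, it can help to inspect the response structure. The sketch below assumes the field names used in the visualization cell that follows: each input row yields a `predictions` list, and each prediction carries one or more masks with an `iou_score`." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Sketch: summarize the response before plotting. Each row of the \"response\"\n", "# column corresponds to one input record; every mask comes with a\n", "# base64-encoded binary mask and an IoU score.\n", "response_df = pd.read_json(response)\n", "for row_idx, row in enumerate(response_df[\"response\"]):\n", "    for prediction in row[\"predictions\"]:\n", "        scores = [round(m[\"iou_score\"], 3) for m in prediction[\"masks_per_prediction\"]]\n", "        print(f\"input {row_idx}: {len(scores)} mask(s), IoU scores: {scores}\")" ] }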
{}, "source": [ "#### Visualize input bounding box and generated mask on the image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# visualize sample input bounding box as prompt and output mask\n", "import io\n", "import base64\n", "import pandas as pd\n", "from PIL import Image\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "\n", "def show_box(box, ax):\n", " x0, y0 = box[0], box[1]\n", " w, h = box[2] - box[0], box[3] - box[1]\n", " ax.add_patch(\n", " plt.Rectangle((x0, y0), w, h, edgecolor=\"green\", facecolor=(0, 0, 0, 0), lw=2)\n", " )\n", "\n", "\n", "def show_mask(mask, ax, random_color=False):\n", " if not isinstance(mask, np.ndarray):\n", " mask = np.array(mask)\n", " mask = mask > 0\n", " if random_color:\n", " color = np.concatenate([np.random.random(3), np.array([0.6])], axis=0)\n", " else:\n", " color = np.array([30 / 255, 144 / 255, 255 / 255, 0.6])\n", " h, w = mask.shape[-2:]\n", " mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)\n", " ax.imshow(mask_image)\n", "\n", "\n", "df_from_json = pd.read_json(response)\n", "encoded_mask = df_from_json[\"response\"][3][\"predictions\"][0][\"masks_per_prediction\"][0][\n", " \"encoded_binary_mask\"\n", "]\n", "mask_iou = df_from_json[\"response\"][3][\"predictions\"][0][\"masks_per_prediction\"][0][\n", " \"iou_score\"\n", "]\n", "\n", "# Load the sample image\n", "img = Image.open(io.BytesIO(base64.b64decode(encoded_mask)))\n", "raw_image = Image.open(sample_image).convert(\"RGB\")\n", "\n", "# Display the original image and bounding box\n", "fig, axes = plt.subplots(1, 2, figsize=(15, 15))\n", "axes[0].imshow(np.array(raw_image))\n", "show_box([125, 240, 375, 425], axes[0])\n", "axes[0].title.set_text(f\"Input image with bounding box as prompt.\")\n", "axes[0].axis(\"off\")\n", "\n", "axes[1].imshow(np.array(raw_image))\n", "show_mask(img, axes[1])\n", "axes[1].title.set_text(f\"Output mask with iou score: {mask_iou:.3f}\")\n", "axes[1].axis(\"off\")\n", "# Adjust the spacing between subplots\n", "fig.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6. Clean up resources - delete the online endpoint\n", "Don't forget to delete the online endpoint, else you will leave the billing meter running for the compute used by the endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "dri", "language": "python", "name": "dri" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.0" } }, "nbformat": 4, "nbformat_minor": 2 }