training/built-in-algorithms/Image-classification-fulltraining-elastic-inference.ipynb (1,136 lines of code) (raw):
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using SageMaker Image Classification with Amazon Elastic Inference\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"1. [Introduction](#Introduction)\n",
"2. [Prerequisites and Preprocessing](#Prequisites-and-Preprocessing)\n",
" 1. [Permissions and environment variables](#Permissions-and-environment-variables)\n",
"3. [Training the ResNet model](#Training-the-ResNet-model)\n",
" 1. [Training Parameters](#Training-parameters)\n",
" 2. [Training with SageMaker Training](#Training-with-sagemaker-training)\n",
" 3. [Training with SageMaker Automatic Model Tuning](#Tuning-with-sagemaker-automatic-model-tuning)\n",
"4. [Deploy The Model](#Deploy-the-model)\n",
" 1. [Create model](#Create-model)\n",
" 2. [Real-time inference](#Real-time-inference)\n",
" 1. [Create endpoint configuration](#Create-endpoint-configuration) \n",
" 2. [Create endpoint](#Create-endpoint) \n",
" 3. [Perform inference](#Perform-inference) \n",
" 4. [Clean up](#Clean-up)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to enable and use Amazon Elastic Inference (EI) for real-time inference with SageMaker Image Classification algorithm.\n",
"\n",
"Amazon Elastic Inference (EI) is a service that provides cost-efficient hardware acceleration meant for inferences in AWS. For more information please visit: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html\n",
"\n",
"This notebook is an adaption of the SageMaker Image Classification's [end-to-end notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/imageclassification_caltech/Image-classification-fulltraining-highlevel.ipynb), with modifications showing the changes needed to use EI for real-time inference with SageMaker Image Classification algorithm.\n",
"\n",
"In this demo, we will use the Amazon SageMaker image classification algorithm to train on the [Caltech-256 dataset](https://paperswithcode.com/dataset/caltech-256). \n",
"\n",
"To get started, we need to set up the environment with a few prerequisite steps, for permissions, configurations, and so on."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prequisites and Preprocessing\n",
"\n",
"### Permissions and environment variables\n",
"\n",
"Here we set up the linkage and authentication to AWS services. There are three parts to this:\n",
"\n",
"* The roles used to give learning and hosting access to your data. This will automatically be obtained from the role used to start the notebook\n",
"* The S3 bucket that you want to use for training and model data\n",
"* The Amazon SageMaker Image Classification docker image which need not be changed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install --upgrade sagemaker"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"%%time\n",
"import boto3\n",
"import re\n",
"import sagemaker\n",
"from sagemaker import get_execution_role\n",
"from sagemaker import image_uris\n",
"\n",
"role = get_execution_role()\n",
"\n",
"bucket = sagemaker.Session().default_bucket()\n",
"\n",
"training_image = image_uris.retrieve(\n",
" region=boto3.Session().region_name, framework=\"image-classification\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data preparation\n",
"\n",
"Download the data and transfer to S3 for use in training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"\n",
"s3_client = boto3.client(\"s3\")\n",
"\n",
"\n",
"def upload_to_s3(channel, file):\n",
" s3 = boto3.resource(\"s3\")\n",
" data = open(file, \"rb\")\n",
" key = channel + \"/\" + file\n",
" s3.Bucket(bucket).put_object(Key=key, Body=data)\n",
"\n",
"\n",
"# caltech-256\n",
"s3_train_key = \"image-classification-full-training/train\"\n",
"s3_validation_key = \"image-classification-full-training/validation\"\n",
"s3_train = \"s3://{}/{}/\".format(bucket, s3_train_key)\n",
"s3_validation = \"s3://{}/{}/\".format(bucket, s3_validation_key)\n",
"\n",
"s3_client.download_file(\n",
" \"sagemaker-sample-files\",\n",
" \"datasets/image/caltech-256/caltech-256-60-train.rec\",\n",
" \"caltech-256-60-train.rec\",\n",
")\n",
"upload_to_s3(s3_train_key, \"caltech-256-60-train.rec\")\n",
"s3_client.download_file(\n",
" \"sagemaker-sample-files\",\n",
" \"datasets/image/caltech-256/caltech-256-60-val.rec\",\n",
" \"caltech-256-60-val.rec\",\n",
")\n",
"upload_to_s3(s3_validation_key, \"caltech-256-60-val.rec\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Training the ResNet model\n",
"\n",
"In this demo, we are using Caltech-256 dataset, which contains 30608 images of 256 objects. For the training and validation data, we follow the splitting scheme in this MXNet [example](https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/data/caltech256.sh). In particular, it randomly selects 60 images per class for training, and uses the remaining data for validation. The algorithm takes `RecordIO` file as input. The user can also provide the image files as input, which will be converted into `RecordIO` format using MXNet's [im2rec](https://mxnet.incubator.apache.org/how_to/recordio.html?highlight=im2rec) tool. It takes around 50 seconds to converted the entire Caltech-256 dataset (~1.2GB) on a p2.xlarge instance. However, for this demo, we will use record io format. \n",
"\n",
"Once we have the data available in the correct format for training, the next step is to actually train the model using the data. \n",
"\n",
"Training can be done by either calling SageMaker Training with a set of hyperparameters values to train with, or by leveraging SageMaker Automatic Model Tuning ([AMT](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html)). AMT, also known as hyperparameter tuning (HPO), finds the best version of a model by running many training jobs on your dataset using the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose.\n",
"\n",
"In this notebook, both methods are used for demonstration purposes, but the model that the HPO job creates is the one that is eventually hosted. You can instead choose to deploy the model created by the standalone training job by changing the below variable `deploy_amt_model` to False."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"deploy_amt_model = True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training parameters\n",
"There are two kinds of parameters that need to be set for training. The first one are the parameters for the training job. These include:\n",
"\n",
"* **Input specification**: These are the training and validation channels that specify the path where training data is present. These are specified in the \"InputDataConfig\" section. The main parameters that need to be set is the \"ContentType\" which can be set to \"rec\" or \"lst\" based on the input data format and the S3Uri which specifies the bucket and the folder where the data is present. \n",
"* **Output specification**: This is specified in the \"OutputDataConfig\" section. We just need to specify the path where the output can be stored after training\n",
"* **Resource config**: This section specifies the type of instance on which to run the training and the number of hosts used for training. If \"InstanceCount\" is more than 1, then training can be run in a distributed manner. \n",
"\n",
"Apart from the above set of parameters, there are hyperparameters that are specific to the algorithm. These are:\n",
"\n",
"* **num_layers**: The number of layers (depth) for the network. We use 101 in this samples but other values such as 50, 152 can be used. \n",
"* **num_training_samples**: This is the total number of training samples. It is set to 15420 for caltech dataset with the current split\n",
"* **num_classes**: This is the number of output classes for the new dataset. Imagenet was trained with 1000 output classes but the number of output classes can be changed for fine-tuning. For caltech, we use 257 because it has 256 object categories + 1 clutter class\n",
"* **epochs**: Number of training epochs\n",
"* **learning_rate**: Learning rate for training\n",
"* **mini_batch_size**: The number of training samples used for each mini batch. In distributed training, the number of training samples used per batch will be N * mini_batch_size where N is the number of hosts on which training is run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The algorithm supports multiple network depth (number of layers). They are 18, 34, 50, 101, 152 and 200\n",
"# For this training, we will use 18 layers\n",
"num_layers = \"18\"\n",
"# we need to specify the input image shape for the training data\n",
"image_shape = \"3,224,224\"\n",
"# we also need to specify the number of training samples in the training set\n",
"# for caltech it is 15420\n",
"num_training_samples = \"15420\"\n",
"# specify the number of output classes\n",
"num_classes = \"257\"\n",
"# batch size for training\n",
"mini_batch_size = \"64\"\n",
"# number of epochs\n",
"epochs = \"2\"\n",
"# learning rate\n",
"learning_rate = \"0.01\""
]
},
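{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check on the values above, the sketch below re-derives `num_training_samples` from the split described earlier (60 training images for each of the 257 classes) and estimates the number of mini-batches per epoch. The helper `batches_per_epoch` is a hypothetical name introduced here for illustration and is not part of the SageMaker API. Counting a final partial batch as a full step is an assumption; the algorithm's internal batching may differ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"\n",
"def batches_per_epoch(num_samples, batch_size, num_hosts=1):\n",
"    \"\"\"Estimate mini-batches per epoch (hypothetical helper, illustration only).\"\"\"\n",
"    # in distributed training the effective batch is num_hosts * batch_size\n",
"    effective_batch = num_hosts * batch_size\n",
"    # assumption: a final partial batch still counts as one step\n",
"    return math.ceil(num_samples / effective_batch)\n",
"\n",
"\n",
"# 60 training images for each of the 257 classes (256 objects + 1 clutter class)\n",
"expected_samples = 60 * 257\n",
"print(expected_samples)  # 15420\n",
"print(batches_per_epoch(expected_samples, 64))  # 241"
]
},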
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training with SageMaker Training \n",
"\n",
"After setting training parameters, we kick off training, and poll for status until training is completed, which in this example, takes between 10 to 12 minutes per epoch on a p2.xlarge machine. The network typically converges after 10 epochs. However, to save the training time, we set the epochs to 2 but please keep in mind that it may not be sufficient to generate a good model. \n",
"\n",
"We run the training using Amazon SageMaker CreateTrainingJob API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%time\n",
"import time\n",
"import boto3\n",
"from time import gmtime, strftime\n",
"\n",
"s3 = boto3.client(\"s3\")\n",
"# create unique job name\n",
"job_name_prefix = \"DEMO-imageclassification\"\n",
"timestamp = time.strftime(\"-%Y-%m-%d-%H-%M-%S\", time.gmtime())\n",
"job_name = job_name_prefix + timestamp\n",
"training_params = {\n",
" # specify the training image\n",
" \"AlgorithmSpecification\": {\"TrainingImage\": training_image, \"TrainingInputMode\": \"File\"},\n",
" \"RoleArn\": role,\n",
" \"OutputDataConfig\": {\"S3OutputPath\": \"s3://{}/{}/output\".format(bucket, job_name_prefix)},\n",
" \"ResourceConfig\": {\"InstanceCount\": 1, \"InstanceType\": \"ml.p2.xlarge\", \"VolumeSizeInGB\": 50},\n",
" \"TrainingJobName\": job_name,\n",
" \"HyperParameters\": {\n",
" \"image_shape\": image_shape,\n",
" \"num_layers\": str(num_layers),\n",
" \"num_training_samples\": str(num_training_samples),\n",
" \"num_classes\": str(num_classes),\n",
" \"mini_batch_size\": str(mini_batch_size),\n",
" \"epochs\": str(epochs),\n",
" \"learning_rate\": str(learning_rate),\n",
" },\n",
" \"StoppingCondition\": {\"MaxRuntimeInSeconds\": 360000},\n",
" # Training data should be inside a subdirectory called \"train\"\n",
" # Validation data should be inside a subdirectory called \"validation\"\n",
" # The algorithm currently only supports fullyreplicated model (where data is copied onto each machine)\n",
" \"InputDataConfig\": [\n",
" {\n",
" \"ChannelName\": \"train\",\n",
" \"DataSource\": {\n",
" \"S3DataSource\": {\n",
" \"S3DataType\": \"S3Prefix\",\n",
" \"S3Uri\": s3_train,\n",
" \"S3DataDistributionType\": \"FullyReplicated\",\n",
" }\n",
" },\n",
" \"ContentType\": \"application/x-recordio\",\n",
" \"CompressionType\": \"None\",\n",
" },\n",
" {\n",
" \"ChannelName\": \"validation\",\n",
" \"DataSource\": {\n",
" \"S3DataSource\": {\n",
" \"S3DataType\": \"S3Prefix\",\n",
" \"S3Uri\": s3_validation,\n",
" \"S3DataDistributionType\": \"FullyReplicated\",\n",
" }\n",
" },\n",
" \"ContentType\": \"application/x-recordio\",\n",
" \"CompressionType\": \"None\",\n",
" },\n",
" ],\n",
"}\n",
"print(\"Training job name: {}\".format(job_name))\n",
"print(\n",
" \"\\nInput Data Location: {}\".format(\n",
" training_params[\"InputDataConfig\"][0][\"DataSource\"][\"S3DataSource\"]\n",
" )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create the Amazon SageMaker training job\n",
"sagemaker = boto3.client(service_name=\"sagemaker\")\n",
"sagemaker.create_training_job(**training_params)\n",
"\n",
"# confirm that the training job has started\n",
"status = sagemaker.describe_training_job(TrainingJobName=job_name)[\"TrainingJobStatus\"]\n",
"print(\"Training job current status: {}\".format(status))\n",
"\n",
"try:\n",
" # wait for the job to finish and report the ending status\n",
" sagemaker.get_waiter(\"training_job_completed_or_stopped\").wait(TrainingJobName=job_name)\n",
" training_info = sagemaker.describe_training_job(TrainingJobName=job_name)\n",
" status = training_info[\"TrainingJobStatus\"]\n",
" print(\"Training job ended with status: \" + status)\n",
"except:\n",
" print(\"Training failed to start\")\n",
" # if exception is raised, that means it has failed\n",
" message = sagemaker.describe_training_job(TrainingJobName=job_name)[\"FailureReason\"]\n",
" print(\"Training failed with the following error: {}\".format(message))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"training_info = sagemaker.describe_training_job(TrainingJobName=job_name)\n",
"status = training_info[\"TrainingJobStatus\"]\n",
"print(\"Training job ended with status: \" + status)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you see the message,\n",
"\n",
"> `Training job ended with status: Completed`\n",
"\n",
"then that means training successfully completed and the output model was stored in the output path specified by `training_params['OutputDataConfig']`.\n",
"\n",
"You can also view information about and the status of a training job using the Amazon SageMaker console. Just click on the \"Jobs\" tab."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training with SageMaker Automatic Model Tuning\n",
"\n",
"To create a tuning job using the AWS SageMaker Automatic Model Tuning API, you need to define 3 attributes. \n",
"\n",
"1. the tuning job name (string)\n",
"2. the tuning job config (to specify settings for the hyperparameter tuning job - JSON object)\n",
"3. training job definition (to configure the training jobs that the tuning job launches - JSON object).\n",
"\n",
"To learn more about that, refer to the [Configure and Launch a Hyperparameter Tuning Job](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-ex-tuning-job.html) documentation.\n",
"\n",
"Note that the tuning job will 12-17 minutes to complete."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from time import gmtime, strftime, sleep\n",
"\n",
"tuning_job_name = \"DEMO-hpo-ic-\" + strftime(\"%d-%H-%M-%S\", gmtime())\n",
"\n",
"tuning_job_config = {\n",
" # The full list of tunable hyper parameters for the Image Classification algorithm can be found here\n",
" # https://docs.aws.amazon.com/sagemaker/latest/dg/IC-tuning.html\n",
" \"ParameterRanges\": {\n",
" \"CategoricalParameterRanges\": [],\n",
" \"ContinuousParameterRanges\": [\n",
" {\n",
" \"MaxValue\": \"0.999\",\n",
" \"MinValue\": \"1e-6\",\n",
" \"Name\": \"beta_1\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"0.999\",\n",
" \"MinValue\": \"1e-6\",\n",
" \"Name\": \"beta_2\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"1.0\",\n",
" \"MinValue\": \"1e-8\",\n",
" \"Name\": \"eps\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"0.999\",\n",
" \"MinValue\": \"1e-8\",\n",
" \"Name\": \"gamma\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"0.5\",\n",
" \"MinValue\": \"1e-6\",\n",
" \"Name\": \"learning_rate\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"0.999\",\n",
" \"MinValue\": \"0.0\",\n",
" \"Name\": \"momentum\",\n",
" },\n",
" {\n",
" \"MaxValue\": \"0.999\",\n",
" \"MinValue\": \"0.0\",\n",
" \"Name\": \"weight_decay\",\n",
" },\n",
" ],\n",
" \"IntegerParameterRanges\": [\n",
" {\n",
" \"MaxValue\": \"64\",\n",
" \"MinValue\": \"8\",\n",
" \"Name\": \"mini_batch_size\",\n",
" }\n",
" ],\n",
" },\n",
" # SageMaker sets the following default limits for resources used by automatic model tuning:\n",
" # https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-limits.html\n",
" \"ResourceLimits\": {\n",
" # Increase the max number of training jobs for increased accuracy (and training time).\n",
" \"MaxNumberOfTrainingJobs\": 6,\n",
" # Change parallel training jobs run by AMT to reduce total training time. Constrained by your account limits.\n",
" # if max_jobs=max_parallel_jobs then Bayesian search turns to Random.\n",
" \"MaxParallelTrainingJobs\": 2,\n",
" },\n",
" \"Strategy\": \"Bayesian\",\n",
" \"HyperParameterTuningJobObjective\": {\"MetricName\": \"validation:accuracy\", \"Type\": \"Maximize\"},\n",
"}\n",
"\n",
"training_job_definition = {\n",
" \"AlgorithmSpecification\": {\"TrainingImage\": training_image, \"TrainingInputMode\": \"File\"},\n",
" \"InputDataConfig\": [\n",
" {\n",
" \"ChannelName\": \"train\",\n",
" \"DataSource\": {\n",
" \"S3DataSource\": {\n",
" \"S3DataType\": \"S3Prefix\",\n",
" \"S3Uri\": s3_train,\n",
" \"S3DataDistributionType\": \"FullyReplicated\",\n",
" }\n",
" },\n",
" \"ContentType\": \"application/x-recordio\",\n",
" \"CompressionType\": \"None\",\n",
" },\n",
" {\n",
" \"ChannelName\": \"validation\",\n",
" \"DataSource\": {\n",
" \"S3DataSource\": {\n",
" \"S3DataType\": \"S3Prefix\",\n",
" \"S3Uri\": s3_validation,\n",
" \"S3DataDistributionType\": \"FullyReplicated\",\n",
" }\n",
" },\n",
" \"ContentType\": \"application/x-recordio\",\n",
" \"CompressionType\": \"None\",\n",
" },\n",
" ],\n",
" \"OutputDataConfig\": {\"S3OutputPath\": \"s3://{}/{}/output\".format(bucket, job_name_prefix)},\n",
" \"ResourceConfig\": {\"InstanceCount\": 1, \"InstanceType\": \"ml.p2.xlarge\", \"VolumeSizeInGB\": 50},\n",
" \"RoleArn\": role,\n",
" \"StaticHyperParameters\": {\n",
" \"num_training_samples\": str(num_training_samples),\n",
" \"num_classes\": str(num_classes),\n",
" \"num_layers\": str(num_layers),\n",
" \"image_shape\": image_shape,\n",
" \"epochs\": \"2\",\n",
" },\n",
" \"StoppingCondition\": {\"MaxRuntimeInSeconds\": 43200},\n",
"}\n",
"\n",
"print(\n",
" f\"Creating a tuning job with name: {tuning_job_name}. It will take between 12 and 17 minutes to complete.\"\n",
")\n",
"sagemaker.create_hyper_parameter_tuning_job(\n",
" HyperParameterTuningJobName=tuning_job_name,\n",
" HyperParameterTuningJobConfig=tuning_job_config,\n",
" TrainingJobDefinition=training_job_definition,\n",
")\n",
"\n",
"status = sagemaker.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuning_job_name)[\n",
" \"HyperParameterTuningJobStatus\"\n",
"]\n",
"print(status)\n",
"while status != \"Completed\" and status != \"Failed\":\n",
" time.sleep(60)\n",
" status = sagemaker.describe_hyper_parameter_tuning_job(\n",
" HyperParameterTuningJobName=tuning_job_name\n",
" )[\"HyperParameterTuningJobStatus\"]\n",
" print(status)"
]
},
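{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the tuning job has finished, it can be useful to compare all of its training jobs rather than only the best one. The sketch below ranks job summaries by their final objective value. `rank_training_jobs` is a hypothetical helper, and the summaries in the example are made-up placeholders, not results from this notebook. The dictionary keys follow the shape of the `TrainingJobSummaries` entries returned by the boto3 call `list_training_jobs_for_hyper_parameter_tuning_job`, which is one way to obtain real summaries for `tuning_job_name`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def rank_training_jobs(summaries, maximize=True):\n",
"    \"\"\"Sort training-job summaries by final objective value (hypothetical helper).\"\"\"\n",
"    metric_key = \"FinalHyperParameterTuningJobObjectiveMetric\"\n",
"    # jobs that failed or were stopped early may have no final metric; skip them\n",
"    scored = [s for s in summaries if metric_key in s]\n",
"    return sorted(scored, key=lambda s: s[metric_key][\"Value\"], reverse=maximize)\n",
"\n",
"\n",
"# made-up placeholder summaries, NOT results from this notebook\n",
"example_summaries = [\n",
"    {\"TrainingJobName\": \"job-a\", \"FinalHyperParameterTuningJobObjectiveMetric\": {\"Value\": 0.10}},\n",
"    {\"TrainingJobName\": \"job-b\", \"FinalHyperParameterTuningJobObjectiveMetric\": {\"Value\": 0.25}},\n",
"    {\"TrainingJobName\": \"job-c\"},\n",
"]\n",
"for summary in rank_training_jobs(example_summaries):\n",
"    print(summary[\"TrainingJobName\"], summary[\"FinalHyperParameterTuningJobObjectiveMetric\"][\"Value\"])"
]
},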
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy The Model\n",
"\n",
"***\n",
"\n",
"A trained model does nothing on its own. We now want to use the model to perform inference. For this example, that means predicting the topic mixture representing a given document. \n",
"\n",
"This section involves several steps,\n",
"\n",
"1. [Create Model](#CreateModel) - Create model for the training output\n",
"1. [Host the model for real-time inference with EI](#HostTheModel) - Create an inference with EI and perform real-time inference using EI."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Model\n",
"\n",
"We now create a SageMaker Model from the training output. Using the model we will create an Endpoint Configuration to start an endpoint for real-time inference."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"import boto3\n",
"from time import gmtime, strftime\n",
"\n",
"model_name = \"DEMO-full-image-classification-model-\" + time.strftime(\n",
" \"-%Y-%m-%d-%H-%M-%S\", time.gmtime()\n",
")\n",
"print(model_name)\n",
"\n",
"if deploy_amt_model == True:\n",
" training_of_model_to_be_hosted = sagemaker.describe_hyper_parameter_tuning_job(\n",
" HyperParameterTuningJobName=tuning_job_name\n",
" )[\"BestTrainingJob\"][\"TrainingJobName\"]\n",
"else:\n",
" training_of_model_to_be_hosted = job_name\n",
"\n",
"info = sagemaker.describe_training_job(TrainingJobName=training_of_model_to_be_hosted)\n",
"model_data = info[\"ModelArtifacts\"][\"S3ModelArtifacts\"]\n",
"print(model_data)\n",
"\n",
"hosting_image = image_uris.retrieve(\n",
" region=boto3.Session().region_name, framework=\"image-classification\"\n",
")\n",
"\n",
"primary_container = {\n",
" \"Image\": hosting_image,\n",
" \"ModelDataUrl\": model_data,\n",
"}\n",
"\n",
"create_model_response = sagemaker.create_model(\n",
" ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container\n",
")\n",
"\n",
"print(create_model_response[\"ModelArn\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Real-time inference\n",
"\n",
"We now host the model with an endpoint and perform real-time inference.\n",
"\n",
"This section involves several steps,\n",
"1. [Create endpoint configuration](#CreateEndpointConfiguration) - Create a configuration defining an endpoint.\n",
"1. [Create endpoint](#CreateEndpoint) - Use the configuration to create an inference endpoint.\n",
"1. [Perform inference](#PerformInference) - Perform inference on some input data using the endpoint.\n",
"1. [Clean up](#CleanUp) - Delete the endpoint and model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create Endpoint Configuration with Amazon Elastic Inference \n",
"At launch, we will support configuring REST endpoints in hosting with multiple models, e.g. for A/B testing purposes. In order to support this, customers create an endpoint configuration, that describes the distribution of traffic across the models, whether split, shadowed, or sampled in some way.\n",
"\n",
"SageMaker Image Classification algorithm also supports running real-time inference with Amazon Elastic Inference (EI), a resource you can attach to your Amazon EC2 instances to accelerate your deep learning (DL) inference workloads. EI allows you to add inference acceleration to a hosted endpoint for a fraction of the cost of using a full GPU instance. Add an appropriate EI or accelerator type in addition to a CPU instance type and the model to the production variant when creating the endpoint configuration that you use to deploy a hosted endpoint.\n",
"\n",
"In this example, an `ml.eia1.large` EI is attached along with `ml.m4.xlarge` instance type to the production variant while creating the endpoint configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from time import gmtime, strftime\n",
"\n",
"timestamp = time.strftime(\"-%Y-%m-%d-%H-%M-%S\", time.gmtime())\n",
"endpoint_config_name = job_name_prefix + \"-epc-\" + timestamp\n",
"endpoint_config_response = sagemaker.create_endpoint_config(\n",
" EndpointConfigName=endpoint_config_name,\n",
" ProductionVariants=[\n",
" {\n",
" \"InstanceType\": \"ml.m4.xlarge\",\n",
" \"InitialInstanceCount\": 1,\n",
" \"ModelName\": model_name,\n",
" \"AcceleratorType\": \"ml.eia1.large\",\n",
" \"VariantName\": \"AllTraffic\",\n",
" }\n",
" ],\n",
")\n",
"\n",
"print(\"Endpoint configuration name: {}\".format(endpoint_config_name))\n",
"print(\"Endpoint configuration arn: {}\".format(endpoint_config_response[\"EndpointConfigArn\"]))"
]
},
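{
"cell_type": "markdown",
"metadata": {},
"source": [
"The only EI-specific part of the endpoint configuration above is the `AcceleratorType` key in the production variant. A minimal sketch of that difference, assuming a hypothetical helper `build_production_variant` (not part of boto3) that adds the key only when an accelerator is requested. The model name used below is a placeholder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def build_production_variant(model_name, instance_type, accelerator_type=None):\n",
"    \"\"\"Build one ProductionVariants entry (hypothetical helper, not part of boto3).\"\"\"\n",
"    variant = {\n",
"        \"VariantName\": \"AllTraffic\",\n",
"        \"ModelName\": model_name,\n",
"        \"InstanceType\": instance_type,\n",
"        \"InitialInstanceCount\": 1,\n",
"    }\n",
"    # EI is opt-in: the key is added only when an accelerator type is given\n",
"    if accelerator_type is not None:\n",
"        variant[\"AcceleratorType\"] = accelerator_type\n",
"    return variant\n",
"\n",
"\n",
"# \"example-model\" is a placeholder name used only for this illustration\n",
"print(build_production_variant(\"example-model\", \"ml.m4.xlarge\"))\n",
"print(build_production_variant(\"example-model\", \"ml.m4.xlarge\", \"ml.eia1.large\"))"
]
},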
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create Endpoint\n",
"Next, the customer creates the endpoint that serves up the model, through specifying the name and configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. This takes 9-11 minutes to complete."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"import time\n",
"\n",
"timestamp = time.strftime(\"-%Y-%m-%d-%H-%M-%S\", time.gmtime())\n",
"endpoint_name = job_name_prefix + \"-ep-\" + timestamp\n",
"print(\"Endpoint name: {}\".format(endpoint_name))\n",
"\n",
"endpoint_params = {\n",
" \"EndpointName\": endpoint_name,\n",
" \"EndpointConfigName\": endpoint_config_name,\n",
"}\n",
"endpoint_response = sagemaker.create_endpoint(**endpoint_params)\n",
"print(\"EndpointArn = {}\".format(endpoint_response[\"EndpointArn\"]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now the endpoint can be created. It may take a few minutes to create the endpoint..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get the status of the endpoint\n",
"response = sagemaker.describe_endpoint(EndpointName=endpoint_name)\n",
"status = response[\"EndpointStatus\"]\n",
"print(\"EndpointStatus = {}\".format(status))\n",
"\n",
"\n",
"# wait until the status has changed\n",
"sagemaker.get_waiter(\"endpoint_in_service\").wait(EndpointName=endpoint_name)\n",
"\n",
"\n",
"# print the status of the endpoint\n",
"endpoint_response = sagemaker.describe_endpoint(EndpointName=endpoint_name)\n",
"status = endpoint_response[\"EndpointStatus\"]\n",
"print(\"Endpoint creation ended with EndpointStatus = {}\".format(status))\n",
"\n",
"if status != \"InService\":\n",
" raise Exception(\"Endpoint creation failed.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you see the message,\n",
"\n",
"> `Endpoint creation ended with EndpointStatus = InService`\n",
"\n",
"then congratulations! You now have a functioning inference endpoint. You can confirm the endpoint configuration and status by navigating to the \"Endpoints\" tab in the Amazon SageMaker console.\n",
"\n",
"We will finally create a runtime object from which we can invoke the endpoint."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Perform Inference\n",
"Finally, the customer can now validate the model for use. They can obtain the endpoint from the client library using the result from previous operations and generate classifications from the trained model using that endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"\n",
"runtime = boto3.Session().client(service_name=\"runtime.sagemaker\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Download test image"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"file_name = \"/tmp/test.jpg\"\n",
"s3_client.download_file(\n",
" \"sagemaker-sample-files\",\n",
" \"datasets/image/caltech-256/256_ObjectCategories/008.bathtub/008_0007.jpg\",\n",
" file_name,\n",
")\n",
"\n",
"# test image\n",
"from IPython.display import Image\n",
"\n",
"Image(file_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Evaluation\n",
"\n",
"Evaluate the image through the network for inteference. The network outputs class probabilities and typically, one selects the class with the maximum probability as the final class output.\n",
"\n",
"**Note:** The output class detected by the network may not be accurate in this example. To limit the time taken and cost of training, we have trained the model only for a couple of epochs. If the network is trained for more epochs (say 20), then the output class will be more accurate.\n",
"\n",
"**Note:** The latency for the first inference invocation for endpoint with EI is higher than the consequent ones. Please run the cell below more than once for the first time invoking the inference for the endpoint."
]
},
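{
"cell_type": "markdown",
"metadata": {},
"source": [
"Selecting the most probable class, as described above, generalizes to reporting the top few candidates, which can be more informative for a model trained for only a couple of epochs. A minimal sketch, assuming a hypothetical helper `top_k_classes` and a toy probability vector (the numbers are placeholders, not model output). It could be applied to the parsed `result` list and `object_categories` from the following cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def top_k_classes(probabilities, categories, k=5):\n",
"    \"\"\"Return the k most probable (category, probability) pairs (hypothetical helper).\"\"\"\n",
"    # rank class indices from most to least probable\n",
"    ranked = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)\n",
"    return [(categories[i], probabilities[i]) for i in ranked[:k]]\n",
"\n",
"\n",
"# toy placeholder values, NOT output from the endpoint\n",
"toy_probabilities = [0.1, 0.6, 0.3]\n",
"toy_categories = [\"ak47\", \"american-flag\", \"backpack\"]\n",
"print(top_k_classes(toy_probabilities, toy_categories, k=2))"
]
},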
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"import json\n",
"import numpy as np\n",
"\n",
"with open(file_name, \"rb\") as f:\n",
" payload = f.read()\n",
" payload = bytearray(payload)\n",
"response = runtime.invoke_endpoint(\n",
" EndpointName=endpoint_name, ContentType=\"application/x-image\", Body=payload\n",
")\n",
"result = response[\"Body\"].read()\n",
"# result will be in json format and convert it to ndarray\n",
"result = json.loads(result)\n",
"# the result will output the probabilities for all classes\n",
"# find the class with maximum probability and print the class index\n",
"index = np.argmax(result)\n",
"object_categories = [\n",
" \"ak47\",\n",
" \"american-flag\",\n",
" \"backpack\",\n",
" \"baseball-bat\",\n",
" \"baseball-glove\",\n",
" \"basketball-hoop\",\n",
" \"bat\",\n",
" \"bathtub\",\n",
" \"bear\",\n",
" \"beer-mug\",\n",
" \"billiards\",\n",
" \"binoculars\",\n",
" \"birdbath\",\n",
" \"blimp\",\n",
" \"bonsai-101\",\n",
" \"boom-box\",\n",
" \"bowling-ball\",\n",
" \"bowling-pin\",\n",
" \"boxing-glove\",\n",
" \"brain-101\",\n",
" \"breadmaker\",\n",
" \"buddha-101\",\n",
" \"bulldozer\",\n",
" \"butterfly\",\n",
" \"cactus\",\n",
" \"cake\",\n",
" \"calculator\",\n",
" \"camel\",\n",
" \"cannon\",\n",
" \"canoe\",\n",
" \"car-tire\",\n",
" \"cartman\",\n",
" \"cd\",\n",
" \"centipede\",\n",
" \"cereal-box\",\n",
" \"chandelier-101\",\n",
" \"chess-board\",\n",
" \"chimp\",\n",
" \"chopsticks\",\n",
" \"cockroach\",\n",
" \"coffee-mug\",\n",
" \"coffin\",\n",
" \"coin\",\n",
" \"comet\",\n",
" \"computer-keyboard\",\n",
" \"computer-monitor\",\n",
" \"computer-mouse\",\n",
" \"conch\",\n",
" \"cormorant\",\n",
" \"covered-wagon\",\n",
" \"cowboy-hat\",\n",
" \"crab-101\",\n",
" \"desk-globe\",\n",
" \"diamond-ring\",\n",
" \"dice\",\n",
" \"dog\",\n",
" \"dolphin-101\",\n",
" \"doorknob\",\n",
" \"drinking-straw\",\n",
" \"duck\",\n",
" \"dumb-bell\",\n",
" \"eiffel-tower\",\n",
" \"electric-guitar-101\",\n",
" \"elephant-101\",\n",
" \"elk\",\n",
" \"ewer-101\",\n",
" \"eyeglasses\",\n",
" \"fern\",\n",
" \"fighter-jet\",\n",
" \"fire-extinguisher\",\n",
" \"fire-hydrant\",\n",
" \"fire-truck\",\n",
" \"fireworks\",\n",
" \"flashlight\",\n",
" \"floppy-disk\",\n",
" \"football-helmet\",\n",
" \"french-horn\",\n",
" \"fried-egg\",\n",
" \"frisbee\",\n",
" \"frog\",\n",
" \"frying-pan\",\n",
" \"galaxy\",\n",
" \"gas-pump\",\n",
" \"giraffe\",\n",
" \"goat\",\n",
" \"golden-gate-bridge\",\n",
" \"goldfish\",\n",
" \"golf-ball\",\n",
" \"goose\",\n",
" \"gorilla\",\n",
" \"grand-piano-101\",\n",
" \"grapes\",\n",
" \"grasshopper\",\n",
" \"guitar-pick\",\n",
" \"hamburger\",\n",
" \"hammock\",\n",
" \"harmonica\",\n",
" \"harp\",\n",
" \"harpsichord\",\n",
" \"hawksbill-101\",\n",
" \"head-phones\",\n",
" \"helicopter-101\",\n",
" \"hibiscus\",\n",
" \"homer-simpson\",\n",
" \"horse\",\n",
" \"horseshoe-crab\",\n",
" \"hot-air-balloon\",\n",
" \"hot-dog\",\n",
" \"hot-tub\",\n",
" \"hourglass\",\n",
" \"house-fly\",\n",
" \"human-skeleton\",\n",
" \"hummingbird\",\n",
" \"ibis-101\",\n",
" \"ice-cream-cone\",\n",
" \"iguana\",\n",
" \"ipod\",\n",
" \"iris\",\n",
" \"jesus-christ\",\n",
" \"joy-stick\",\n",
" \"kangaroo-101\",\n",
" \"kayak\",\n",
" \"ketch-101\",\n",
" \"killer-whale\",\n",
" \"knife\",\n",
" \"ladder\",\n",
" \"laptop-101\",\n",
" \"lathe\",\n",
" \"leopards-101\",\n",
" \"license-plate\",\n",
" \"lightbulb\",\n",
" \"light-house\",\n",
" \"lightning\",\n",
" \"llama-101\",\n",
" \"mailbox\",\n",
" \"mandolin\",\n",
" \"mars\",\n",
" \"mattress\",\n",
" \"megaphone\",\n",
" \"menorah-101\",\n",
" \"microscope\",\n",
" \"microwave\",\n",
" \"minaret\",\n",
" \"minotaur\",\n",
" \"motorbikes-101\",\n",
" \"mountain-bike\",\n",
" \"mushroom\",\n",
" \"mussels\",\n",
" \"necktie\",\n",
" \"octopus\",\n",
" \"ostrich\",\n",
" \"owl\",\n",
" \"palm-pilot\",\n",
" \"palm-tree\",\n",
" \"paperclip\",\n",
" \"paper-shredder\",\n",
" \"pci-card\",\n",
" \"penguin\",\n",
" \"people\",\n",
" \"pez-dispenser\",\n",
" \"photocopier\",\n",
" \"picnic-table\",\n",
" \"playing-card\",\n",
" \"porcupine\",\n",
" \"pram\",\n",
" \"praying-mantis\",\n",
" \"pyramid\",\n",
" \"raccoon\",\n",
" \"radio-telescope\",\n",
" \"rainbow\",\n",
" \"refrigerator\",\n",
" \"revolver-101\",\n",
" \"rifle\",\n",
" \"rotary-phone\",\n",
" \"roulette-wheel\",\n",
" \"saddle\",\n",
" \"saturn\",\n",
" \"school-bus\",\n",
" \"scorpion-101\",\n",
" \"screwdriver\",\n",
" \"segway\",\n",
" \"self-propelled-lawn-mower\",\n",
" \"sextant\",\n",
" \"sheet-music\",\n",
" \"skateboard\",\n",
" \"skunk\",\n",
" \"skyscraper\",\n",
" \"smokestack\",\n",
" \"snail\",\n",
" \"snake\",\n",
" \"sneaker\",\n",
" \"snowmobile\",\n",
" \"soccer-ball\",\n",
" \"socks\",\n",
" \"soda-can\",\n",
" \"spaghetti\",\n",
" \"speed-boat\",\n",
" \"spider\",\n",
" \"spoon\",\n",
" \"stained-glass\",\n",
" \"starfish-101\",\n",
" \"steering-wheel\",\n",
" \"stirrups\",\n",
" \"sunflower-101\",\n",
" \"superman\",\n",
" \"sushi\",\n",
" \"swan\",\n",
" \"swiss-army-knife\",\n",
" \"sword\",\n",
" \"syringe\",\n",
" \"tambourine\",\n",
" \"teapot\",\n",
" \"teddy-bear\",\n",
" \"teepee\",\n",
" \"telephone-box\",\n",
" \"tennis-ball\",\n",
" \"tennis-court\",\n",
" \"tennis-racket\",\n",
" \"theodolite\",\n",
" \"toaster\",\n",
" \"tomato\",\n",
" \"tombstone\",\n",
" \"top-hat\",\n",
" \"touring-bike\",\n",
" \"tower-pisa\",\n",
" \"traffic-light\",\n",
" \"treadmill\",\n",
" \"triceratops\",\n",
" \"tricycle\",\n",
" \"trilobite-101\",\n",
" \"tripod\",\n",
" \"t-shirt\",\n",
" \"tuning-fork\",\n",
" \"tweezer\",\n",
" \"umbrella-101\",\n",
" \"unicorn\",\n",
" \"vcr\",\n",
" \"video-projector\",\n",
" \"washing-machine\",\n",
" \"watch-101\",\n",
" \"waterfall\",\n",
" \"watermelon\",\n",
" \"welding-mask\",\n",
" \"wheelbarrow\",\n",
" \"windmill\",\n",
" \"wine-bottle\",\n",
" \"xylophone\",\n",
" \"yarmulke\",\n",
" \"yo-yo\",\n",
" \"zebra\",\n",
" \"airplanes-101\",\n",
" \"car-side-101\",\n",
" \"faces-easy-101\",\n",
" \"greyhound\",\n",
" \"tennis-shoes\",\n",
" \"toad\",\n",
" \"clutter\",\n",
"]\n",
"print(\"Result: label - \" + object_categories[index] + \", probability - \" + str(result[index]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Clean up\n",
"\n",
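"When we're done with the endpoint, we can delete it to release the backing instances and stop incurring charges for them. Run the following cell to delete the endpoint. Note that this call removes only the endpoint; the endpoint configuration and the model created earlier remain until they are deleted separately."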
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sagemaker.delete_endpoint(EndpointName=endpoint_name)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Notebook CI Test Results\n",
"\n",
"This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.\n",
"\n"
]
}
],
"metadata": {
"celltoolbar": "Tags",
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (Data Science)",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}