# Create Endpoint 


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

---


In this notebook, you will learn the basics of hosting your trained model on Amazon SageMaker for inference. There are two ways to use Amazon SageMaker for inference:
1. Set up a persistent endpoint for real-time online inference.
2. Gather the data to be predicted in a batch and use SageMaker batch transform for offline inference.

In this notebook, we focus on the first option; batch transform is covered in another notebook.

You are strongly encouraged to go through [the section on model deployment](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-deployment.html) in the official docs before moving on.

The pricing for setting up an endpoint can be found [here](https://aws.amazon.com/sagemaker/pricing/).

As with [`CreateTrainingJob`](https://github.com/hsl89/amazon-sagemaker-examples/blob/sagemaker-fundamentals/sagemaker-fundamentals/create-training-job/create-training-job.ipynb), Amazon SageMaker interacts with your inference logic via a containerized environment.

The following APIs are relevant:
* [`CreateModel`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model)
* [`CreateEndpointConfig`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config)
* [`CreateEndpoint`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint)

You are strongly encouraged to read through them. It's okay if you don't understand everything; we will go through them in detail in this notebook.

The outline of this notebook is:
* Create an IAM role for SageMaker
* Build an inference image
* Test the inference image / container locally and push it to ECR
* Use the ECR address of the inference container to define a model by calling `CreateModel`
* Specify configuration of an endpoint by calling `CreateEndpointConfig`
* Use the model definition and the endpoint configuration from the previous two steps to create an endpoint by calling `CreateEndpoint`
* Invoke the endpoint by using the SageMaker runtime client

In [2]:
# setups
import boto3
import datetime
import pprint
import os
import time
import re

pp = pprint.PrettyPrinter(indent=1)

## Set up a service role for SageMaker

Review [notebook on execution role](https://github.com/hsl89/amazon-sagemaker-examples/blob/execution-role/sagemaker-fundamentals/execution-role/execution-role.ipynb) for step-by-step instructions on how to create an IAM Role.

The service role is intended to be assumed by the SageMaker service to procure resources in your AWS account on your behalf.

1. If you are running this notebook on SageMaker infrastructure such as Notebook Instances or Studio, then we will use the role you used to spin up those resources.

2. If you are running this notebook on an EC2 instance, then we will create a service role and attach `AmazonSageMakerFullAccess` to it. If you already have a SageMaker service role, you can paste its role_arn here.


First, copy over the helper functions we created in that notebook to help us create an execution role.

In [None]:
%%bash
cp ../execution-role/iam_helpers.py .

In [None]:
# set up service role for SageMaker
from iam_helpers import create_execution_role

sts = boto3.client("sts")
caller = sts.get_caller_identity()

if ":user/" in caller["Arn"]:  # as IAM user
    # either paste in an existing role_arn or create a new role and attach
    # AmazonSageMakerFullAccess to it
    iam = boto3.client("iam")
    role_name = "sm"
    role_arn = create_execution_role(role_name=role_name)["Role"]["Arn"]

    # attach the permission to the role
    # skip it if you want to use a SageMaker service role that
    # already has AmazonSageMakerFullAccess
    iam.attach_role_policy(
        RoleName=role_name, PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
    )
elif "assumed-role" in caller["Arn"]:  # on SageMaker infra
    assumed_role = caller["Arn"]
    role_arn = re.sub(r"^(.+)sts::(\d+):assumed-role/(.+?)/.*$", r"\1iam::\2:role/\3", assumed_role)
else:
    print("I assume you are on an EC2 instance launched with an IAM role")
    role_arn = caller["Arn"]
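The `re.sub` call in the `assumed-role` branch is easier to follow on a concrete input. The sketch below reuses the same pattern on a made-up ARN; the account ID, role name, and session name are hypothetical. One caveat, stated as an assumption rather than a guarantee: an assumed-role ARN does not carry the IAM path of the role, so for roles created under a path such as `service-role/` the reconstructed ARN may need that path added back.

```python
import re

# Same pattern as in the cell above: capture the partition prefix, the account
# ID, and the role name, and drop the trailing session name.
ASSUMED_ROLE_PATTERN = r"^(.+)sts::(\d+):assumed-role/(.+?)/.*$"


def assumed_role_to_role_arn(assumed_role_arn: str) -> str:
    """Convert an STS assumed-role ARN into the ARN of the underlying IAM role."""
    return re.sub(ASSUMED_ROLE_PATTERN, r"\1iam::\2:role/\3", assumed_role_arn)


# Hypothetical ARN: the account ID, role name, and session name are made up.
example = "arn:aws:sts::123456789012:assumed-role/MySageMakerRole/SageMaker"
print(assumed_role_to_role_arn(example))
# prints arn:aws:iam::123456789012:role/MySageMakerRole
```

A string that does not match the pattern is returned unchanged, which is why the notebook checks for `assumed-role` before applying the substitution.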

## Build an inference image

Your inference image must be a self-contained web server. When you run your inference container locally, it should listen on port 8080 and accept POST requests to the `/invocations` endpoint. The payload of a POST request is the data that you want your model to predict on. Since the inference container is essentially a web server, you should expect it to look different from the container we used for [`CreateTrainingJob`](https://github.com/hsl89/amazon-sagemaker-examples/blob/sagemaker-fundamentals/sagemaker-fundamentals/create-training-job/create-training-job.ipynb).

In this notebook, we use a minimal python stack to build our web server:
![Request serving stack](stack.png)

### Further readings on the serving stack

* [Overview of the stack](https://flask.palletsprojects.com/en/1.1.x/deploying/uwsgi/)
* [Nginx homepage](https://www.nginx.com/resources/wiki/start/)
* [Gunicorn (WSGI server) homepage](https://gunicorn.org/)
* [Flask homepage](https://flask.palletsprojects.com/en/1.1.x/)

### How SageMaker runs your container

SageMaker runs your container like

```sh
docker run <image> serve
```

This means you need to have an executable called `serve` in the `PATH`. In this notebook, we will create a python script as an **executable** and put it in the working directory of the docker image. 
        
The folder `container/src` contains the configs and entry point of the web server

In [77]:
!ls  container/src

nginx.conf  predictor.py  serve  wsgi.py


#### Entrypoint for the Nginx server

`serve` is a python executable that is intended to be used as the entrypoint for the inference image.

In [78]:
!cat container/src/serve

#!/usr/bin/env python

# This file implements the scoring service shell. You don't necessarily need to modify it for various
# algorithms. It starts nginx and gunicorn with the correct configurations and then simply waits until
# gunicorn exits.
#
# The flask server is specified to be the app object in wsgi.py
#
# We set the following parameters:
#
# Parameter                Environment Variable              Default Value
# ---------                --------------------              -------------
# number of workers        MODEL_SERVER_WORKERS              the number of CPU cores
# timeout                  MODEL_SERVER_TIMEOUT              60 seconds

import multiprocessing
import os
import signal
import subprocess
import sys

cpu_count = multiprocessing.cpu_count()

model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))

def sigterm_handler(nginx_pid, gunicorn_pid):
    try:
        os.kill(nginx_p

#### Config file for the Nginx server
`nginx.conf` is the config file for the nginx server.

In [79]:
!cat container/src/nginx.conf

worker_processes 1;
daemon off; # Prevent forking


pid /tmp/nginx.pid;
error_log /var/log/nginx/error.log;

events {
  # defaults
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /var/log/nginx/access.log combined;
  
  upstream gunicorn {
    server unix:/tmp/gunicorn.sock;
  }

  server {
    listen 8080 deferred;
    client_max_body_size 5m;

    keepalive_timeout 5;
    proxy_read_timeout 1200s;

    location ~ ^/(ping|invocations) {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn;
    }

    location / {
      return 404 "{}";
    }
  }
}
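To see which requests the `location ~ ^/(ping|invocations)` block forwards to gunicorn and which fall through to the `404` branch, the same regular expression can be tried in Python. This is only an illustration: it assumes Python's `re` module treats this simple pattern the same way nginx's regex engine does, and `is_proxied` is a hypothetical helper name.

```python
import re

# Pattern copied from the `location ~ ^/(ping|invocations)` directive.
PROXIED = re.compile(r"^/(ping|invocations)")


def is_proxied(path: str) -> bool:
    """Return True if the path matches the proxied location, False for the 404 branch."""
    return PROXIED.search(path) is not None


for path in ["/ping", "/invocations", "/", "/health", "/pingpong"]:
    print(path, is_proxied(path))
```

Because the pattern is anchored only at the start, a path such as `/pingpong` also matches; whether it then returns anything useful depends on the routes the application behind gunicorn defines.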


#### WSGI config

In [80]:
!cat container/src/wsgi.py

import predictor as myapp

# This is just a simple wrapper for gunicorn to find your app.
# If you want to change the algorithm file, simply change "predictor" above to the
# new file.

app = myapp.app


#### Inference logic

The most important file in `container/src` is `predictor.py`. It contains the inference logic. The other files in `container/src` can be used **as is**, but you will need to customize `predictor.py` to implement your own inference logic.

In [None]:
!pygmentize container/src/predictor.py
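To make the request contract concrete, here is a minimal sketch of a server that answers `GET /ping` and `POST /invocations`. It is not the notebook's `predictor.py`: it uses only the Python standard library instead of Flask, and the function names and response strings are made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, content_type, payload):
    """Return (status_code, response_text) for a single request."""
    if method == "GET" and path == "/ping":
        return 200, "ok"  # health check
    if method == "POST" and path == "/invocations":
        if content_type.startswith("application/json"):
            try:
                json.loads(payload)
            except ValueError:
                return 400, "payload is not valid JSON"
            return 200, "received a JSON payload"
        return 200, "received a non-JSON payload"
    return 404, "{}"


class InferenceHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route()."""

    def _handle(self, method):
        length = int(self.headers.get("Content-Length") or 0)
        payload = self.rfile.read(length) if length else b""
        content_type = self.headers.get("Content-Type") or ""
        status, text = route(method, self.path, content_type, payload)
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        self._handle("GET")

    def do_POST(self):
        self._handle("POST")


def make_server(port=8080):
    """Create (but do not start) a server on port 8080, the port used in this notebook."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)


# To try it locally, run: make_server().serve_forever()
```

Keeping the decision logic in a plain function such as `route` makes it testable without starting a server, which is handy when you replace the placeholder responses with real model predictions.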

## Build the container

We build the container from `container/Dockerfile`. Let's call this image `example-serve`.

In [81]:
!cat container/Dockerfile

# Build an image that can do training and inference in SageMaker
# This is a Python 3 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.

FROM ubuntu:18.04

MAINTAINER Amazon AI <sage-learner@amazon.com>


RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3-pip \
         python3-setuptools \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN ln -s /usr/bin/python3 /usr/bin/python
RUN ln -s /usr/bin/pip3 /usr/bin/pip

# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN pip --no-cache-dir install numpy==1.16.2 scipy==1.2.1 scikit-learn==0.20.2 pandas flask gunicorn

# Set some environment variabl

In [None]:
%%sh
# build the image
cd container/

# tag it as example-serve:latest
docker build -t example-serve:latest .

## Test your image

As in the [notebook for CreateTrainingJob](https://github.com/hsl89/amazon-sagemaker-examples/blob/sagemaker-fundamentals/sagemaker-fundamentals/create-training-job/create-training-job.ipynb), we replicate the Amazon SageMaker hosting environment and test the image locally before serving in production. You are encouraged to read through the section on [Use Your Own Inference Code with Hosting Services](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html) and think about how you would replicate the SageMaker hosting environment before moving on.

As for `CreateTrainingJob`, SageMaker reserves the `/opt/ml` directory in your image to inject ML-related info for `CreateEndpoint`. In particular, it downloads your trained model artifact and injects it into the directory `/opt/ml/model`. When calling `CreateEndpoint`, you will need to tell SageMaker the S3 URI of your model artifact. SageMaker will then pull the artifact and inject it into `/opt/ml/model`. This means that when defining your own inference logic, you should load your trained model from `/opt/ml/model`.
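As an illustration of loading from that directory, here is a minimal sketch. The file name `model.pkl` and the pickle format are assumptions made for this example; your training code decides what actually ends up in the model artifact.

```python
import pickle
from pathlib import Path

# Directory SageMaker populates with the model artifact.
DEFAULT_MODEL_DIR = Path("/opt/ml/model")


def load_model(model_dir=DEFAULT_MODEL_DIR, filename="model.pkl"):
    """Load a pickled model from the model directory.

    `filename` is hypothetical: use whatever name your training job wrote.
    """
    model_path = Path(model_dir) / filename
    if not model_path.is_file():
        raise FileNotFoundError(f"no model artifact at {model_path}")
    with model_path.open("rb") as f:
        return pickle.load(f)
```

For local tests, pass the host directory that you mount to `/opt/ml/model` as `model_dir`, so the same function works inside and outside the container.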

We will run your image with Docker and mount `container/local_test/ml` to `/opt/ml` as a Docker volume.

In [None]:
# look at what's inside `container/local_test/ml`
!ls container/local_test/ml

The inference logic we implemented in `container/src/predictor.py` under `def inference():` does not require a real ML model. Therefore we do not need to inject anything for the purpose of local test. We will discuss how to load a real model in a more advanced notebook. 


#### Run the container

To run the container `example-serve`, open a terminal in the current directory and go to `container/local_test`

```sh
cd container/local_test
```

Then run the following command

```sh
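docker run -v "$(pwd)/ml":/opt/ml -p 8080:8080 --rm example-serve:latest serve
```

`-v "$(pwd)/ml":/opt/ml` bind-mounts the directory `ml` (in `container/local_test`) to `/opt/ml` in the container. Use an absolute host path here: Docker treats a bare name such as `ml` as a named volume rather than as the local directory.

`-p 8080:8080` publishes port 8080 inside the container as port 8080 on the host.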

`--rm` removes the container from daemon when it is stopped. 

We suggest that you run the image from a shell instead of from within the notebook: when you are debugging your own container, it is easier to see the container's stdout when a shell process is running it.

#### Ping your container
Once your container is up, you can ping it at `http://localhost:8080`. 

To trigger the logic under `def ping():` in `container/src/predictor.py`, do

```sh
curl localhost:8080/ping
```

To trigger the logic under `def inference():` in `container/src/predictor.py` with a json string, do

```sh
curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"key":"value"}' \
  http://localhost:8080/invocations
```

To trigger the logic under `def inference():` in `container/src/predictor.py` with a non-JSON payload, do

```sh
curl --header "Content-Type: text/csv" \
  --request POST \
  http://localhost:8080/invocations
```
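If you prefer to stay in Python, the same requests can be built with the standard library. The sketch below mirrors the `curl` commands above; `build_invocation` and `send` are hypothetical helper names, and the lines that actually contact the container are commented out so the snippet runs even when no container is up.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # the port published by `docker run -p 8080:8080`


def build_invocation(payload: bytes, content_type: str, base_url: str = BASE_URL):
    """Build (but do not send) a POST request to /invocations."""
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


def send(request, timeout: float = 5.0):
    """Send a request and return (status_code, decoded_body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


json_request = build_invocation(json.dumps({"key": "value"}).encode("utf-8"), "application/json")
csv_request = build_invocation(b"", "text/csv")  # empty body, like the curl example

# With the container running, uncomment to send the requests:
# print(send(urllib.request.Request(f"{BASE_URL}/ping")))
# print(send(json_request))
# print(send(csv_request))
```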

To stop the container, go to the terminal that runs your container and press `Control + C`. Alternatively, you can find its container ID by grepping for a Docker process that binds port 8080 on the host and remove it manually.

```sh
docker rm -f $(docker ps | grep -e "0.0.0.0:8080->8080/tcp" | awk '{print $1}')
```

## Push the image to ECR
Now that you have tested your image, the next step is to push it to your ECR so that SageMaker can download it. We discussed this in the [previous notebook on `CreateTrainingJob`](https://github.com/hsl89/amazon-sagemaker-examples/blob/sagemaker-fundamentals/sagemaker-fundamentals/create-training-job/create-training-job.ipynb), in the section where we push the training image to ECR.

### Create a repo

In [3]:
ecr = boto3.client("ecr")

try:
    # The repository might already exist
    # in your ECR
    cr_res = ecr.create_repository(repositoryName="example-serve")
    pp.pprint(cr_res)
except Exception as e:
    print(e)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '393',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Tue, 23 Mar 2021 00:24:00 GMT',
                                      'x-amzn-requestid': '547615ed-af77-4f7a-ac0e-bff9aa37d34b'},
                      'HTTPStatusCode': 200,
                      'RequestId': '547615ed-af77-4f7a-ac0e-bff9aa37d34b',
                      'RetryAttempts': 0},
 'repository': {'createdAt': datetime.datetime(2021, 3, 23, 0, 24, 1, tzinfo=tzlocal()),
                'encryptionConfiguration': {'encryptionType': 'AES256'},
                'imageScanningConfiguration': {'scanOnPush': False},
                'imageTagMutability': 'MUTABLE',
                'registryId': '688520471316',
                'repositoryArn': 'arn:aws:ecr:us-west-2:688520471316:repository/example-serve',
                'repositoryName': 'example-serve',
                'repositoryUri': '68852

### Push the image to ECR

In [4]:
%%bash
account=$(aws sts get-caller-identity --query Account | sed -e 's/^"//' -e 's/"$//')
region=$(aws configure get region)
ecr_account=${account}.dkr.ecr.${region}.amazonaws.com

# Give docker your ECR login password
aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $ecr_account

# Fullname of the repo
fullname=$ecr_account/example-serve:latest

#echo $fullname
# Tag the image with the fullname
docker tag example-serve:latest $fullname

# Push to ECR
docker push $fullname

Login Succeeded
The push refers to repository [688520471316.dkr.ecr.us-west-2.amazonaws.com/example-serve]
4d7e5149d4e3: Preparing
b520e3bd5eba: Preparing
3a3b4090fe28: Preparing
5b850ff3c508: Preparing
408c63ea099b: Preparing
9f10818f1f96: Preparing
27502392e386: Preparing
c95d2191d777: Preparing
9f10818f1f96: Waiting
c95d2191d777: Waiting
3a3b4090fe28: Pushed
5b850ff3c508: Pushed
4d7e5149d4e3: Pushed
9f10818f1f96: Pushed
27502392e386: Pushed
c95d2191d777: Pushed
408c63ea099b: Pushed
b520e3bd5eba: Pushed
latest: digest: sha256:24bb29a270095e8a3491c89288cdffe45fe03bc46f728bbd1d0a54acea31f711 size: 1989





## Create model
Now we use the image you just pushed to ECR to create a model in Amazon SageMaker. This is done by calling the `CreateModel` API. Once a model is created, we will be able to host it on an Amazon SageMaker endpoint by creating an endpoint configuration and calling the `CreateEndpoint` API.

In [54]:
sm_boto3 = boto3.client("sagemaker")

region = boto3.Session().region_name
account_id = boto3.client("sts").get_caller_identity()["Account"]

image_uri = "{}.dkr.ecr.{}.amazonaws.com/example-serve:latest".format(account_id, region)

cm_res = sm_boto3.create_model(
    ModelName="example-serve",  # the model name does not need to match the image repo name
    Containers=[
        {
            "Image": image_uri,
        },
    ],
    ExecutionRoleArn=role_arn,  # role ARN set up at the top of the notebook
    EnableNetworkIsolation=False,
)

pp.pprint(cm_res)

{'ModelArn': 'arn:aws:sagemaker:us-west-2:688520471316:model/example-serve',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '75',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 10 Mar 2021 23:01:20 GMT',
                                      'x-amzn-requestid': '58071bde-0339-4400-9e05-17dca25ca4bc'},
                      'HTTPStatusCode': 200,
                      'RequestId': '58071bde-0339-4400-9e05-17dca25ca4bc',
                      'RetryAttempts': 0}}
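The image URI passed to `CreateModel` follows a fixed layout, which a small helper can make explicit. This is a sketch: `ecr_image_uri` is a hypothetical name, the account ID below is made up, and the `amazonaws.com` suffix is assumed to apply to the region you use.

```python
def ecr_image_uri(account_id, region, repository, tag="latest", digest=None):
    """Build an ECR image URI, addressed by tag or, if given, by digest."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    if digest is not None:
        # Digest form, as reported under `ResolvedImage` by DescribeEndpoint.
        return f"{registry}/{repository}@{digest}"
    return f"{registry}/{repository}:{tag}"


# Hypothetical account ID, for illustration only.
print(ecr_image_uri("123456789012", "us-west-2", "example-serve"))
# prints 123456789012.dkr.ecr.us-west-2.amazonaws.com/example-serve:latest
```

Referencing an image by digest pins the exact image content, whereas a tag such as `latest` can be moved to a different image by a later push.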


## Create endpoint configuration

Next, we configure the resources we need to deploy this model by creating an endpoint configuration. This is done by calling the `CreateEndpointConfig` API. For more info about this API, read its [API reference](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config).

In [55]:
model_name = "example-serve"  # model defined above
initial_instance_count = 1
instance_type = "ml.t2.medium"

variant_name = "AMeaningfulProdVarName"  # ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}

production_variants = [
    {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InitialInstanceCount": initial_instance_count,
        "InstanceType": instance_type,
    }
]

endpoint_config_name = "ExampleServeConfig"  # ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}

endpoint_config = {
    "EndpointConfigName": endpoint_config_name,
    "ProductionVariants": production_variants,
}

ep_conf_res = sm_boto3.create_endpoint_config(**endpoint_config)

In [56]:
pp.pprint(ep_conf_res)

{'EndpointConfigArn': 'arn:aws:sagemaker:us-west-2:688520471316:endpoint-config/exampleserveconfig',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '99',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 10 Mar 2021 23:42:10 GMT',
                                      'x-amzn-requestid': 'a9b17d8e-0ac9-472e-a43b-e26878925854'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a9b17d8e-0ac9-472e-a43b-e26878925854',
                      'RetryAttempts': 0}}
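The comments in the cell above quote the pattern `^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}` for the variant and endpoint config names. A quick local check against that pattern can catch a bad name before the API call does. This sketch assumes the whole name must match the pattern; `is_valid_name` is a hypothetical helper.

```python
import re

# Pattern quoted in the notebook comments; fullmatch requires the whole name to match.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def is_valid_name(name: str) -> bool:
    """Return True if the name consists of 1 to 63 alphanumerics with optional inner hyphens."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["ExampleServeConfig", "example-config", "example_config", "-bad", "bad-"]:
    print(candidate, is_valid_name(candidate))
```

Underscores, leading hyphens, and trailing hyphens are rejected by this pattern.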


## Create Endpoint
Putting everything together, we are ready to create an endpoint using the model and the endpoint configuration. We will create the endpoint by calling the `CreateEndpoint` API. The API reference is [here](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint).

In [59]:
endpoint_name = "example-endpoint"
ep_res = sm_boto3.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
pp.pprint(ep_res)

{'EndpointArn': 'arn:aws:sagemaker:us-west-2:688520471316:endpoint/example-endpoint',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '84',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 10 Mar 2021 23:47:37 GMT',
                                      'x-amzn-requestid': '473836e2-9010-4895-a13f-3ba86bf9e187'},
                      'HTTPStatusCode': 200,
                      'RequestId': '473836e2-9010-4895-a13f-3ba86bf9e187',
                      'RetryAttempts': 0}}


### Inspect endpoint status
It takes a little while for the endpoint to be fully ready, because SageMaker needs to provision the EC2 instance hosting it. To get an update on the endpoint status, we can call `DescribeEndpoint`.

In [61]:
ep_des_res = sm_boto3.describe_endpoint(EndpointName=endpoint_name)


pp.pprint(ep_des_res)

{'CreationTime': datetime.datetime(2021, 3, 10, 23, 47, 38, 119000, tzinfo=tzlocal()),
 'EndpointArn': 'arn:aws:sagemaker:us-west-2:688520471316:endpoint/example-endpoint',
 'EndpointConfigName': 'ExampleServeConfig',
 'EndpointName': 'example-endpoint',
 'EndpointStatus': 'Creating',
 'LastModifiedTime': datetime.datetime(2021, 3, 10, 23, 47, 38, 119000, tzinfo=tzlocal()),
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '256',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 10 Mar 2021 23:51:09 GMT',
                                      'x-amzn-requestid': '4adc90e4-75cf-4700-99db-e10c09727b67'},
                      'HTTPStatusCode': 200,
                      'RequestId': '4adc90e4-75cf-4700-99db-e10c09727b67',
                      'RetryAttempts': 0}}


---
The `EndpointStatus` field of `ep_des_res` takes one of the following values (see the [AWS API Documentation](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeEndpoint.html#sagemaker-DescribeEndpoint-response-EndpointStatus)):

`OutOfService`: Endpoint is not available to take incoming requests.

`Creating`: CreateEndpoint is executing.

`Updating`: UpdateEndpoint or UpdateEndpointWeightsAndCapacities is executing.

`SystemUpdating`: Endpoint is undergoing maintenance and cannot be updated or deleted or re-scaled until it has completed. This maintenance operation does not change any customer-specified values such as VPC config, KMS encryption, model, instance type, or instance count.

`RollingBack`: Endpoint fails to scale up or down or change its variant weight and is in the process of rolling back to its previous configuration. Once the rollback completes, endpoint returns to an InService status. This transitional status only applies to an endpoint that has autoscaling enabled and is undergoing variant weight or capacity changes as part of an UpdateEndpointWeightsAndCapacities call or when the UpdateEndpointWeightsAndCapacities operation is called explicitly.

`InService`: Endpoint is available to process incoming requests.

`Deleting`: DeleteEndpoint is executing.

`Failed`: Endpoint could not be created, updated, or re-scaled. Use DescribeEndpoint:FailureReason for information about the failure. DeleteEndpoint is the only operation that can be performed on a failed endpoint.

---

To track the endpoint status in near real time, we can call `DescribeEndpoint` once every few seconds until the status becomes `InService` or `Failed`.
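One way to make such polling more robust is to bound it with a timeout and to stop on any status that is not transitional. The sketch below is an assumption-laden variant rather than the notebook's own loop: `wait_for_endpoint` is a hypothetical helper, the default timeout is arbitrary, and the `describe` argument is expected to behave like `sm_boto3.describe_endpoint`.

```python
import time

# Statuses during which the endpoint is still changing (taken from the list above).
TRANSITIONAL = {"Creating", "Updating", "SystemUpdating", "RollingBack", "Deleting"}


def wait_for_endpoint(describe, endpoint_name, poll_seconds=15,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll until the endpoint leaves its transitional states or the timeout expires.

    `describe` is any callable that accepts EndpointName=... and returns a dict
    with an "EndpointStatus" key, for example sm_boto3.describe_endpoint.
    """
    waited = 0
    while True:
        status = describe(EndpointName=endpoint_name)["EndpointStatus"]
        if status not in TRANSITIONAL:
            return status  # e.g. "InService" or "Failed"
        if waited >= timeout_seconds:
            raise TimeoutError(f"{endpoint_name} is still {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with a real client (not executed here):
# final_status = wait_for_endpoint(sm_boto3.describe_endpoint, endpoint_name)
```

Passing `describe` and `sleep` in as arguments keeps the function testable without AWS credentials or real waiting.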

In [63]:
import time

creating = True

while creating:
    ep_des_res = sm_boto3.describe_endpoint(EndpointName=endpoint_name)
    pp.pprint(ep_des_res)
    time.sleep(15)
    if ep_des_res["EndpointStatus"] != "Creating":
        creating = False

{'CreationTime': datetime.datetime(2021, 3, 10, 23, 47, 38, 119000, tzinfo=tzlocal()),
 'EndpointArn': 'arn:aws:sagemaker:us-west-2:688520471316:endpoint/example-endpoint',
 'EndpointConfigName': 'ExampleServeConfig',
 'EndpointName': 'example-endpoint',
 'EndpointStatus': 'InService',
 'LastModifiedTime': datetime.datetime(2021, 3, 10, 23, 56, 2, 741000, tzinfo=tzlocal()),
 'ProductionVariants': [{'CurrentInstanceCount': 1,
                         'CurrentWeight': 1.0,
                         'DeployedImages': [{'ResolutionTime': datetime.datetime(2021, 3, 10, 23, 47, 41, 524000, tzinfo=tzlocal()),
                                             'ResolvedImage': '688520471316.dkr.ecr.us-west-2.amazonaws.com/example-serve@sha256:24bb29a270095e8a3491c89288cdffe45fe03bc46f728bbd1d0a54acea31f711',
                                             'SpecifiedImage': '688520471316.dkr.ecr.us-west-2.amazonaws.com/example-serve:latest'}],
                         'DesiredInstanceCount': 1,
         

## Test the endpoint
Now that the endpoint is in service, let's invoke it with the [SageMaker runtime client](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-runtime.html).

In [69]:
# invoke endpoint
import json

sm_runtime = boto3.client("sagemaker-runtime")

body = json.dumps("a json string")
content_type = "application/json"

# response type
accept = "text/plain"

res = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=body,  # encoded input data
    ContentType=content_type,  # tells the endpoint how the request body is encoded
    Accept=accept,  # tells the endpoint which format we want for the response
)

pp.pprint(res)

{'Body': <botocore.response.StreamingBody object at 0x7f50a501aac8>,
 'ContentType': 'text/plain; charset=utf-8',
 'InvokedProductionVariant': 'AMeaningfulProdVarName',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '41',
                                      'content-type': 'text/plain; '
                                                      'charset=utf-8',
                                      'date': 'Thu, 11 Mar 2021 00:16:24 GMT',
                                      'x-amzn-invoked-production-variant': 'AMeaningfulProdVarName',
                                      'x-amzn-requestid': '51a58074-4e0d-4886-93b4-888e6835ba14'},
                      'HTTPStatusCode': 200,
                      'RequestId': '51a58074-4e0d-4886-93b4-888e6835ba14',
                      'RetryAttempts': 0}}


In [68]:
# decode the response body
res_body = res["Body"]
res_body.read().decode("utf-8")

'I am fed with json. Therefore, I am happy'
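The response body is a stream, so it can be read only once; a second `read()` returns nothing. A small helper that reads it once and decodes it according to the returned content type keeps that in one place. This is a sketch: `decode_body` is a hypothetical name, UTF-8 is assumed for every response, and `io.BytesIO` stands in for the streaming body object.

```python
import io
import json


def decode_body(body, content_type: str):
    """Read a streaming body once and decode it according to its content type."""
    raw = body.read()
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    text = raw.decode("utf-8")
    if media_type == "application/json":
        return json.loads(text)
    return text


# io.BytesIO is a stand-in for res["Body"], which also exposes read().
fake_body = io.BytesIO(b"I am fed with json. Therefore, I am happy")
print(decode_body(fake_body, "text/plain; charset=utf-8"))
# prints I am fed with json. Therefore, I am happy
```

With a real response, the call would look like `decode_body(res["Body"], res["ContentType"])`.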

## Clean up
Congratulations! You now understand the basics of creating an endpoint on Amazon SageMaker. The endpoint you just created does not do much ML, so feel free to delete all the relevant resources.

In [72]:
# delete the ECR repo
ecr = boto3.client("ecr")
del_repo_res = ecr.delete_repository(repositoryName="example-serve", force=True)

pp.pprint(del_repo_res)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '288',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Thu, 11 Mar 2021 00:29:08 GMT',
                                      'x-amzn-requestid': '17914b8b-edc1-4f2c-acdf-81ecef87084f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '17914b8b-edc1-4f2c-acdf-81ecef87084f',
                      'RetryAttempts': 0},
 'repository': {'createdAt': datetime.datetime(2021, 3, 10, 21, 49, tzinfo=tzlocal()),
                'imageTagMutability': 'MUTABLE',
                'registryId': '688520471316',
                'repositoryArn': 'arn:aws:ecr:us-west-2:688520471316:repository/example-serve',
                'repositoryName': 'example-serve',
                'repositoryUri': '688520471316.dkr.ecr.us-west-2.amazonaws.com/example-serve'}}


In [73]:
# delete the model
del_model_res = sm_boto3.delete_model(ModelName=model_name)

pp.pprint(del_model_res)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Thu, 11 Mar 2021 00:29:22 GMT',
                                      'x-amzn-requestid': 'c24e0777-f7fd-4481-a005-2a61810279f9'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'c24e0777-f7fd-4481-a005-2a61810279f9',
                      'RetryAttempts': 0}}


In [75]:
# delete endpoint config
del_ep_config_res = sm_boto3.delete_endpoint_config(EndpointConfigName=endpoint_config_name)

pp.pprint(del_ep_config_res)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Thu, 11 Mar 2021 00:30:34 GMT',
                                      'x-amzn-requestid': '5f4fe112-a192-4dbb-ae63-586fbce5d265'},
                      'HTTPStatusCode': 200,
                      'RequestId': '5f4fe112-a192-4dbb-ae63-586fbce5d265',
                      'RetryAttempts': 0}}


In [76]:
# delete the endpoint
del_ep_res = sm_boto3.delete_endpoint(EndpointName=endpoint_name)

pp.pprint(del_ep_res)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Thu, 11 Mar 2021 00:32:13 GMT',
                                      'x-amzn-requestid': 'fb77806c-7aeb-471b-8de0-190a9b592c6d'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'fb77806c-7aeb-471b-8de0-190a9b592c6d',
                      'RetryAttempts': 0}}
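The SageMaker deletions above can also be grouped into one helper that removes the endpoint first, since the endpoint is the resource that keeps instances running, and that continues even if one step fails. This is a sketch under assumptions: `delete_endpoint_resources` is a hypothetical name, and `sm_client` is expected to expose the same `delete_endpoint`, `delete_endpoint_config`, and `delete_model` calls used in the cells above.

```python
def delete_endpoint_resources(sm_client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, its config, and the model; return a dict of failures by step."""
    steps = [
        ("endpoint", sm_client.delete_endpoint, {"EndpointName": endpoint_name}),
        ("endpoint config", sm_client.delete_endpoint_config,
         {"EndpointConfigName": endpoint_config_name}),
        ("model", sm_client.delete_model, {"ModelName": model_name}),
    ]
    failures = {}
    for label, call, kwargs in steps:
        try:
            call(**kwargs)
        except Exception as error:  # keep going so one failure does not strand the rest
            failures[label] = error
    return failures


# Usage with a real client (not executed here):
# failures = delete_endpoint_resources(sm_boto3, endpoint_name, endpoint_config_name, model_name)
```

An empty dictionary means every deletion call returned without raising.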


## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-fundamentals|create-endpoint|create_endpoint.ipynb)
