In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Predictive maintenance using Vertex AI

<table align="left">
  <td style="text-align: center">
        <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/workbench/predictive_maintainance/predictive_maintenance_usecase.ipynb">
        <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
        </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fworkbench%2Fpredictive_maintainance%2Fpredictive_maintenance_usecase.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
<a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/workbench/predictive_maintainance/predictive_maintenance_usecase.ipynb" target='_blank'>
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/workbench/predictive_maintainance/predictive_maintenance_usecase.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>
<br/><br/><br/>


## Table of contents
* [Overview](#section-1)
* [Objective](#section-2)
* [Dataset](#section-3)
* [Costs](#section-4)
* [Data analysis](#section-5)
* [Fit a regression model](#section-6)
* [Evaluate the trained model](#section-7)
* [Save the model](#section-8)
* [Running a notebook end-to-end using the executor](#section-9)
* [Hosting the model on Vertex AI](#section-10)
  * [Create an endpoint](#section-11)
  * [Deploy the model to the created endpoint](#section-12)
  * [Test calling the endpoint](#section-13)
* [Clean up](#section-14)

## Overview
<a name="section-1"></a>

In this notebook, you work through a predictive maintenance use case on industrial data using machine learning techniques, deploy the machine learning model on Vertex AI, and automate the workflow using the executor feature of Vertex AI Workbench.

*Note: This notebook file is developed to run in a [Vertex AI Workbench managed notebooks](https://console.cloud.google.com/vertex-ai/workbench/list/managed) instance using the XGBoost (Local) kernel. Some components of this notebook may not work in other notebook environments.*

Learn more about [Vertex AI Workbench](https://cloud.google.com/vertex-ai/docs/workbench/introduction) and [Vertex AI training](https://cloud.google.com/vertex-ai/docs/training/overview).

### Objective
<a name="section-2"></a>

In this tutorial, you learn how to use the executor feature of Vertex AI Workbench to automate a workflow to train and deploy a model.

This tutorial uses the following Google Cloud ML services:

- Vertex AI training
- Vertex AI model evaluation

The steps performed are:

- Loading the required dataset from a Cloud Storage bucket.
- Analyzing the fields present in the dataset.
- Selecting the required data for the predictive maintenance model.
- Training an XGBoost regression model for predicting the remaining useful life.
- Evaluating the model.
- Running the notebook end-to-end as a training job using the executor.
- Deploying the model on Vertex AI.
- Cleaning up the resources.

### Dataset
<a name="section-3"></a>

The dataset used in this notebook is part of the [NASA Turbofan Engine Degradation Simulation dataset](https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/), which consists of simulated time-series data for four sets of fleet engines under different combinations of operational conditions and fault modes. This notebook uses a version of the dataset that is saved to a public Cloud Storage bucket. The simulated data for one of the engine sets (FD001) is used to analyze and train a model that can predict an engine's remaining useful life.

### Costs
<a name="section-4"></a>

This tutorial uses the following billable components of Google Cloud:

- Vertex AI
- Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Kernel selection
Select the <b>XGBoost</b> kernel when running this notebook on a Vertex AI Workbench managed notebooks instance. Otherwise, ensure that the following libraries are installed in the environment where this notebook runs.
- XGBoost
- Pandas
- Seaborn
- Sklearn

Along with the above libraries, the following Google Cloud libraries are also used in this notebook.

- google.cloud.aiplatform
- google.cloud.storage

## Get started

### Install Vertex AI SDK for Python and other required packages


In [None]:
! pip3 install --quiet --upgrade   google-cloud-aiplatform \
                                    google-cloud-storage \
                                    xgboost==1.7.1 \
                                    seaborn \
                                    scikit-learn \
                                    fsspec \
                                    gcsfs \
                                    pandas -q

### Restart runtime (Colab only)

To use the newly installed packages, you must restart the runtime on Google Colab.

In [None]:
import sys

if "google.colab" in sys.modules:

    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

Authenticate your environment on Google Colab.


In [None]:
import sys

if "google.colab" in sys.modules:

    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK for Python

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

# set the project id
! gcloud config set project $PROJECT_ID

LOCATION = "us-central1"  # @param {type: "string"}

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://your-bucket-name-{PROJECT_ID}-unique"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l {LOCATION} {BUCKET_URI}

#### UUID

To avoid name collisions between users on created resources, create a UUID for each session instance. Append these UUIDs to the respective names of the resources created in this tutorial.

In [None]:
import random
import string


# Generate a uuid of a specified length (default=8)
def generate_uuid(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


UUID = generate_uuid()
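
The helper above builds a short random suffix, not an RFC 4122 UUID. If a standards-based identifier is preferred, one possible alternative is to truncate `uuid.uuid4()`. This is a minimal sketch; the helper name `generate_uuid_suffix` is hypothetical and not part of the notebook.

```python
import uuid


def generate_uuid_suffix(length: int = 8) -> str:
    """Return the first `length` hex characters of a random UUID4."""
    return uuid.uuid4().hex[:length]


suffix = generate_uuid_suffix()
print(suffix)
```

Either approach yields a short suffix; truncating reduces uniqueness, so collisions remain possible, just unlikely within a single project.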

### Import the required libraries

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

%matplotlib inline
import os

import numpy as np
import seaborn as sns
import xgboost as xgb
from google.cloud import aiplatform, storage
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

## Initialize Vertex AI SDK for Python


In [None]:
aiplatform.init(project=PROJECT_ID, location=LOCATION)

Load the data and check the data shape.

In [None]:
# load the data from the source
INPUT_PATH = "gs://cloud-samples-data/ai-platform-unified/datasets/tabular/predictive_maintenance.csv"  # data source
raw_data = pd.read_csv(INPUT_PATH, sep=" ", header=None)
# check the data
print(raw_data.shape)
raw_data.head()
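
A later cell keeps only the first 26 columns, which suggests that reading the file with `sep=" "` produces extra, empty columns. A likely cause, assumed here rather than confirmed from the file, is trailing spaces at the end of each line. The sketch below reproduces that effect on two made-up rows.

```python
import io

import pandas as pd

# Two made-up rows; each line ends with two trailing spaces.
sample = "1 1 0.5  \n1 2 0.7  \n"
df = pd.read_csv(io.StringIO(sample), sep=" ", header=None)
print(df.shape)

# Drop the columns that contain no data at all.
trimmed = df.dropna(axis=1, how="all")
print(trimmed.shape)
```

Selecting columns by position, as the notebook does, and dropping all-empty columns are both reasonable ways to handle this.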

The data itself doesn't contain any feature names, so its columns need to be renamed. The data source provides a description of the fields. The <b>ID</b> column represents the unit number of the fleet engine, and <b>Cycle</b> represents the time in cycles. <b>OpSet1</b>, <b>OpSet2</b> and <b>OpSet3</b> represent the three operational settings that are described in the original data source and have a substantial effect on engine performance. The remaining fields are readings collected from 21 different sensors.

In [None]:
# keep the first 26 columns and name them (based on the original data source page)
raw_data = raw_data[list(range(26))]
raw_data.columns = [
    "ID",
    "Cycle",
    "OpSet1",
    "OpSet2",
    "OpSet3",
    "SensorMeasure1",
    "SensorMeasure2",
    "SensorMeasure3",
    "SensorMeasure4",
    "SensorMeasure5",
    "SensorMeasure6",
    "SensorMeasure7",
    "SensorMeasure8",
    "SensorMeasure9",
    "SensorMeasure10",
    "SensorMeasure11",
    "SensorMeasure12",
    "SensorMeasure13",
    "SensorMeasure14",
    "SensorMeasure15",
    "SensorMeasure16",
    "SensorMeasure17",
    "SensorMeasure18",
    "SensorMeasure19",
    "SensorMeasure20",
    "SensorMeasure21",
]
raw_data.head()
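
Because the names follow a regular pattern, the list above can also be generated instead of typed out. A minimal sketch, where the variable `column_names` is introduced only for illustration:

```python
# 2 index columns, 3 operational settings, and 21 sensor readings.
column_names = (
    ["ID", "Cycle"]
    + [f"OpSet{i}" for i in range(1, 4)]
    + [f"SensorMeasure{i}" for i in range(1, 22)]
)
print(len(column_names))
```

Assigning `column_names` to `raw_data.columns` would have the same effect as the explicit list.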

## Data Analysis
<a name="section-5"></a>
The current dataset consists of time-series data for various unit IDs. The data is represented in terms of cycles. First, let's see the distribution of the number of cycles across the units.

In [None]:
# plot the cycle count for each ID
raw_data[["ID", "Cycle"]].groupby(by=["ID"]).count().plot(kind="bar", figsize=(12, 5))
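
The bar chart can be backed by summary numbers. The sketch below uses a made-up frame (`toy` is hypothetical); on the notebook's data, the same `groupby` would be applied to `raw_data`.

```python
import pandas as pd

# Toy data: engine 1 has 3 recorded cycles, engine 2 has 2.
toy = pd.DataFrame({"ID": [1, 1, 1, 2, 2], "Cycle": [1, 2, 3, 1, 2]})

# Number of recorded cycles per engine, then summary statistics.
cycles_per_id = toy.groupby("ID")["Cycle"].count()
print(cycles_per_id.describe())
```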

On average, there seem to be around 225 cycles per ID in the dataset. Next, let's check the data types of the fields and the number of null records in the data.

In [None]:
# check the data-types
raw_data.info()

The data doesn't have any null records or any categorical fields. Next, let's check the numerical distribution of the fields.

In [None]:
# check the numerical characteristics of the data
raw_data.describe().T

Features **OpSet3**, **SensorMeasure1**, **SensorMeasure10**, **SensorMeasure18** & **SensorMeasure19** seem to be constant throughout the dataset and can therefore be eliminated. Apart from constant fields, highly correlated fields can also be considered for dropping. Highly correlated fields often lead to multicollinearity, which unnecessarily increases the size of the feature space even if it doesn't affect the accuracy much. Such fields can be identified through correlation matrices and heatmaps.

In [None]:
# plot the correlation matrix
plt.figure(figsize=(15, 10))
cols = [
    i
    for i in raw_data.columns
    if i
    not in [
        "ID",
        "Cycle",
        "OpSet3",
        "SensorMeasure1",
        "SensorMeasure10",
        "SensorMeasure18",
        "SensorMeasure19",
    ]
]
corr_mat = raw_data[cols].corr()
matrix = np.triu(corr_mat)

sns.heatmap(corr_mat, annot=True, mask=matrix, fmt=".1g")
plt.show()
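
Reading the heatmap by eye can be complemented with a programmatic check. The sketch below runs on synthetic data, not the notebook's dataset, and the 0.9 cut-off is an assumed value chosen for illustration. It lists constant columns and column pairs whose absolute correlation exceeds the cut-off.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: "b" tracks "a" closely, "c" is independent, "d" is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame(
    {
        "a": a,
        "b": 2 * a + rng.normal(scale=0.01, size=200),
        "c": rng.normal(size=200),
        "d": 1.0,
    }
)

# Columns with a single distinct value carry no information.
constant_cols = [c for c in demo.columns if demo[c].nunique() <= 1]

# Absolute correlations among the remaining columns (lower triangle only).
corr = demo.drop(columns=constant_cols).corr().abs()
THRESHOLD = 0.9  # assumed cut-off, not taken from the notebook
high_pairs = [
    (corr.index[i], corr.columns[j])
    for i in range(len(corr))
    for j in range(i)
    if corr.iloc[i, j] > THRESHOLD
]
print(constant_cols)
print(high_pairs)
```

Which column of a correlated pair to keep remains a judgment call, as in the notebook's own selection.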

Fields **SensorMeasure7**, **SensorMeasure12**, **SensorMeasure20** & **SensorMeasure21** correlate highly with many other fields, so they can be omitted. Further, **SensorMeasure8**, **SensorMeasure11** and **SensorMeasure4** seem highly correlated with each other, so any one of them (for example, **SensorMeasure4**) can be kept and the rest omitted.

In [None]:
cols = [
    i
    for i in cols
    if i
    not in [
        "SensorMeasure7",
        "SensorMeasure12",
        "SensorMeasure20",
        "SensorMeasure21",
        "SensorMeasure8",
        "SensorMeasure11",
    ]
]
corr_mat = raw_data[cols].corr()
matrix = np.triu(corr_mat)
plt.figure(figsize=(9, 5))
sns.heatmap(corr_mat, annot=True, mask=matrix, fmt=".1g")
plt.show()

As the current objective is to predict the remaining useful life (RUL) of each unit (ID), the target variable needs to be identified. Since you're dealing with time-series data that represents the lifetime of a unit, the remaining useful life of a unit can be calculated by subtracting the current cycle from the maximum cycle of that unit.

    RUL = Max. Cycle - Current Cycle

## RUL calculation and Feature selection

In [None]:
# get max-cycle of the ids
cols = ["ID", "Cycle"] + cols
max_cycles_df = (
    raw_data.groupby(["ID"], sort=False)["Cycle"]
    .max()
    .reset_index()
    .rename(columns={"Cycle": "MaxCycleID"})
)
# merge back to original dataset
FD001_df = pd.merge(raw_data, max_cycles_df, how="inner", on="ID")
# calculate rul from max-cycle and current-cycle
FD001_df["RUL"] = FD001_df["MaxCycleID"] - FD001_df["Cycle"]
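
The merge above can also be expressed with `groupby(...).transform`, which broadcasts each engine's maximum cycle back to all of its rows. A minimal sketch on made-up data (the `toy` frame is hypothetical):

```python
import pandas as pd

# Toy data: engine 1 runs 3 cycles, engine 2 runs 2 cycles.
toy = pd.DataFrame({"ID": [1, 1, 1, 2, 2], "Cycle": [1, 2, 3, 1, 2]})

# RUL = max cycle of the engine - current cycle.
toy["RUL"] = toy.groupby("ID")["Cycle"].transform("max") - toy["Cycle"]
print(toy["RUL"].tolist())
```

Each engine's RUL counts down to zero on its last recorded cycle, which matches the definition used in this notebook.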

To ensure that the target field is generated properly, the RUL field can be plotted.

In [None]:
# plot the RUL vs Cycles
one_engine = []
for i, r in FD001_df.iterrows():
    rul = r["RUL"]
    one_engine.append(rul)
    if rul == 0:
        plt.plot(one_engine)
        one_engine = []

plt.grid()

The above plot shows that the RUL, in other words the number of remaining cycles, decreases as the current cycle increases, which is expected. Next, let's see how the other fields relate to RUL in the current dataset.

In [None]:
# plot feature vs the RUL
def plot_feature(feature):
    plt.figure(figsize=(10, 5))
    for i in FD001_df["ID"].unique():
        if i % 10 == 0:  # only plot every 10th ID
            plt.plot("RUL", feature, data=FD001_df[FD001_df["ID"] == i])
    plt.xlim(250, 0)  # reverse the x-axis so RUL counts down to zero
    plt.xticks(np.arange(0, 275, 25))
    plt.ylabel(feature)
    plt.xlabel("RUL")
    plt.show()


for i in cols:
    if i not in ["ID", "Cycle"]:
        plot_feature(i)

The following observations can be made from the output of the above cell:
- Fields **SensorMeasure5** and **SensorMeasure16** don't show much variance with the RUL and seem constant all the time. Hence, they can be removed.
- Fields **SensorMeasure2**, **SensorMeasure3**, **SensorMeasure4**, **SensorMeasure13**, **SensorMeasure15** & **SensorMeasure17** show a similar rising trend.
- **SensorMeasure9** and **SensorMeasure14** show a similar trend.
- **SensorMeasure6** stays flat except in a few places and can therefore be ignored.

In [None]:
# remove the unnecessary fields
cols = [
    i
    for i in cols
    if i not in ["ID", "SensorMeasure5", "SensorMeasure6", "SensorMeasure16"]
]
cols
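
To see which features survive the successive filters, the selection can be traced with plain lists. This sketch copies the dropped names from the cells above; the variable names (`constant`, `correlated`, `flat_vs_rul`, `selected`) are introduced only for illustration.

```python
# Start from the 26 column names used in this notebook.
all_columns = (
    ["ID", "Cycle"]
    + [f"OpSet{i}" for i in range(1, 4)]
    + [f"SensorMeasure{i}" for i in range(1, 22)]
)

# Columns dropped at each stage of the analysis.
constant = ["OpSet3"] + [f"SensorMeasure{i}" for i in (1, 10, 18, 19)]
correlated = [f"SensorMeasure{i}" for i in (7, 8, 11, 12, 20, 21)]
flat_vs_rul = [f"SensorMeasure{i}" for i in (5, 6, 16)]

# "ID" is an identifier, not a feature, so it is dropped as well.
dropped = set(["ID"] + constant + correlated + flat_vs_rul)
selected = [c for c in all_columns if c not in dropped]
print(selected)
```

The result contains `Cycle`, two operational settings, and eight sensor readings, for 11 features in total.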

## Split the data into train and test

Divide the dataset with the selected features into train and test sets.

In [None]:
# split data into train and test
X = FD001_df[cols].copy()
y = FD001_df["RUL"].copy()

# split the data into 70-30 ratio of train-test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=36
)
X_train.shape, y_train.shape, X_test.shape, y_test.shape

## Fit a regression model
<a name="section-6"></a>

Initialize and train a regression model using the XGBoost library with the calculated RUL as the target feature.

In [None]:
model = xgb.XGBRegressor()
model.fit(X_train, y_train)

## Evaluate the trained model
<a name="section-7"></a>

Check the R2 scores of the model on train and test sets.

In [None]:
# compute and print the train and test R2 scores
y_train_pred = model.predict(X_train)
train_score = r2_score(y_train, y_train_pred)
y_test_pred = model.predict(X_test)
test_score = r2_score(y_test, y_test_pred)
print("Train score:", train_score)
print("Test score:", test_score)

Check the RMSE on the train and test sets.

In [None]:
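# print train and test RMSEs (square root of the mean squared error);
# the `squared` argument is not available in recent scikit-learn releases
train_error = np.sqrt(mean_squared_error(y_train, y_train_pred))
test_error = np.sqrt(mean_squared_error(y_test, y_test_pred))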
print("Train error:", train_error)
print("Test error:", test_error)
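
For reference, both metrics can be computed by hand. The sketch below uses three made-up target and prediction values, not results from this model, to show how R2 and RMSE follow from the residual and total sums of squares.

```python
import math

y_true = [3.0, 5.0, 7.0]  # made-up targets
y_pred = [2.0, 5.0, 9.0]  # made-up predictions

n = len(y_true)
# Residual sum of squares: squared prediction errors.
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
# Total sum of squares: squared deviations from the mean target.
mean_true = sum(y_true) / n
ss_tot = sum((t - mean_true) ** 2 for t in y_true)

rmse = math.sqrt(ss_res / n)
r2 = 1 - ss_res / ss_tot
print(rmse, r2)
```

RMSE is expressed in the units of the target (cycles here), while R2 is unitless, so the two are usually read together.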

Plot the predicted values against the target values. The closer the plot is to a straight line passing through the origin with unit slope, the better the model.

In [None]:
# plot the train and test predictions
plt.scatter(y_train, y_train_pred)
plt.xlabel("Target")
plt.ylabel("Prediction")
plt.title("Train")
plt.show()
plt.scatter(y_test, y_test_pred)
plt.xlabel("Target")
plt.ylabel("Prediction")
plt.title("Test")
plt.show()

## Save the model
<a name="section-8"></a>

Save the model to a booster file.

In [None]:
# save the trained model to a local file "model.bst"
FILE_NAME = "model.bst"
model.save_model(FILE_NAME)

Copy the model to the Cloud Storage bucket.

In [None]:
# Upload the saved model file to Cloud Storage
BLOB_PATH = "mfg_predictive_maintenance/"
BLOB_NAME = os.path.join(BLOB_PATH, FILE_NAME)
bucket = storage.Client().bucket(BUCKET_URI[5:])
blob = bucket.blob(BLOB_NAME)
blob.upload_from_filename(FILE_NAME)
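
The slice `BUCKET_URI[5:]` removes the `gs://` prefix so that only the bucket name is passed to the client. That works as long as `BUCKET_URI` contains no path after the bucket name. A slightly more defensive sketch follows; `split_gcs_uri` is a hypothetical helper, not part of any Google Cloud library.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/path/to/blob' into ('bucket', 'path/to/blob')."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a Cloud Storage URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the blob path.
    bucket_name, _, blob_name = uri[len(prefix):].partition("/")
    return bucket_name, blob_name


print(split_gcs_uri("gs://example-bucket/mfg_predictive_maintenance/model.bst"))
print(split_gcs_uri("gs://example-bucket"))
```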

## Running a notebook end-to-end using executor
<a name="section-9"></a>

**Note:** This section applies only when you run this notebook on a Vertex AI Workbench managed notebooks instance.
### Automating the notebook execution
All the steps so far can be run as a training job, without any additional code, by using the Vertex AI Workbench executor. The executor can help you run a notebook file from start to end, with your choice of the environment, machine type, input parameters, and other characteristics. After you set up an execution, the notebook is executed as a job in Vertex AI custom training. Your jobs can be monitored from the Executor pane in the left sidebar.

<img src="images/executor.PNG">

The executor also lets you choose the environment and machine type while automating the runs similar to Vertex AI training jobs without switching to the training jobs UI. Apart from the custom container that replicates the existing kernel by default, pre-built environments like TensorFlow Enterprise, PyTorch, and others can also be selected to run the notebook. The required compute power can be specified by choosing from the list of machine types available, including GPUs.

### Scheduled runs on executor

Notebook runs can also be scheduled to recur with the executor. To do so, select Schedule-based recurring executions as the run type instead of One-time execution. You provide the frequency of the job and the time when it executes when you create the execution.

<img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/7_Vertex_AI_Workbench.max-1100x1100.jpg">

### Parameterizing the variables

The executor lets you run a notebook with different sets of input parameters. If you’ve added parameter tags to any of your notebook cells, you can pass in your parameter values to the executor. To learn more about this feature, see this [blog post](https://cloud.google.com/blog/products/ai-machine-learning/schedule-and-execute-notebooks-with-vertex-ai-workbench).

<img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/6_Vertex_AI_Workbench.max-700x700.jpg">
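
As an illustration, a parameter cell might look like the sketch below. It assumes the cell carries a `parameters` tag and that an execution can override the defaults. The variable names are hypothetical, and the values are copied from elsewhere in this notebook.

```python
# Hypothetical contents of a notebook cell tagged "parameters".
# These defaults apply when an execution supplies no override.
TRAIN_SIZE = 0.7  # fraction of rows used for training
RANDOM_STATE = 36  # seed for the train-test split
MACHINE_TYPE = "n1-standard-2"  # machine type used for deployment
```

Later cells would then refer to these names instead of hard-coded values, so that each execution can use its own settings.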


## Hosting the model on Vertex AI
<a name="section-10"></a>

### Create a model resource

The model saved in Cloud Storage can be deployed easily using the Vertex AI SDK. To do so, first create a model resource.

In [None]:
ARTIFACT_GCS_PATH = f"{BUCKET_URI}/{BLOB_PATH}"

Give a display name to the Vertex AI model resource.

In [None]:
# Set the model display name
MODEL_DISPLAY_NAME = "[your-model-display-name]"  # @param {type:"string"}

# Otherwise, use the default name
if (
    MODEL_DISPLAY_NAME == "[your-model-display-name]"
    or MODEL_DISPLAY_NAME is None
    or MODEL_DISPLAY_NAME == ""
):
    MODEL_DISPLAY_NAME = "pred_maint_model_" + UUID

print(MODEL_DISPLAY_NAME)

In [None]:
# Create a Vertex AI model resource

model = aiplatform.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=ARTIFACT_GCS_PATH,
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-7:latest",
)

model.wait()

print(model.display_name)
print(model.resource_name)

### Create an Endpoint
<a name="section-11"></a>


Next, create an endpoint resource for deploying the model.

In [None]:
# Set the endpoint display name
ENDPOINT_DISPLAY_NAME = "[your-endpoint-display-name]"  # @param {type:"string"}

# Otherwise, use the default name
if (
    ENDPOINT_DISPLAY_NAME == "[your-endpoint-display-name]"
    or ENDPOINT_DISPLAY_NAME is None
    or ENDPOINT_DISPLAY_NAME == ""
):
    ENDPOINT_DISPLAY_NAME = "pred_maint_endpoint_" + UUID

print(ENDPOINT_DISPLAY_NAME)

In [None]:
# Create the Endpoint resource
endpoint = aiplatform.Endpoint.create(display_name=ENDPOINT_DISPLAY_NAME)

print(endpoint.display_name)
print(endpoint.resource_name)

### Deploy the model to the created Endpoint
<a name="section-12"></a>


Configure the following parameters and deploy the model to the created endpoint.

- `endpoint`: The `Endpoint` object created using Vertex AI SDK.
- `deployed_model_display_name`: A display-name for the deployment.
- `machine_type`: Type of the machine required for the deployment environment. See the [compute configuration documentation](https://cloud.google.com/vertex-ai/docs/predictions/configure-compute) for the available options.

In [None]:
# deploy the model to the endpoint
model.deploy(
    endpoint=endpoint,
    deployed_model_display_name=MODEL_DISPLAY_NAME + "_deployment",
    machine_type="n1-standard-2",
)

model.wait()

print(model.display_name)
print(model.resource_name)

### Test calling the endpoint
<a name="section-13"></a>

Send some sample data to the deployed model on the endpoint to get predictions.

In [None]:
# get predictions on sample data
instances = X_test.iloc[0:2].to_numpy().tolist()
print(endpoint.predict(instances=instances).predictions)
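
Each instance sent above is a plain list of feature values in the same order as the training columns. When a reading arrives as a dictionary instead, it needs to be put in that order first. The sketch below assumes the 11 features selected earlier in this notebook; `to_instance` is a hypothetical helper, and the reading values are placeholders.

```python
# Feature order assumed to match the columns used for training.
FEATURE_ORDER = [
    "Cycle",
    "OpSet1",
    "OpSet2",
    "SensorMeasure2",
    "SensorMeasure3",
    "SensorMeasure4",
    "SensorMeasure9",
    "SensorMeasure13",
    "SensorMeasure14",
    "SensorMeasure15",
    "SensorMeasure17",
]


def to_instance(record: dict, feature_order: list) -> list:
    """Order a record's values by feature name and fail on missing features."""
    missing = [name for name in feature_order if name not in record]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return [float(record[name]) for name in feature_order]


# Placeholder reading for one engine cycle (values are made up).
reading = {name: 0.0 for name in FEATURE_ORDER}
reading["Cycle"] = 10

instances = [to_instance(reading, FEATURE_ORDER)]
print(instances)
```

A list built this way has the same shape as the one produced by `X_test.iloc[0:2].to_numpy().tolist()` and could be passed to `endpoint.predict` in the same manner.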

## Clean up
<a name="section-14"></a>

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:
* Vertex AI Model
* Vertex AI Endpoint
* Cloud Storage bucket

Set `delete_bucket` to **True** to delete the Cloud Storage bucket.

In [None]:
# Undeploy all the models from the endpoint
endpoint.undeploy_all()

# Delete the endpoint resource
endpoint.delete()

# Delete the model resource
model.delete()

# Delete the Cloud Storage bucket
delete_bucket = False
if delete_bucket:
    ! gsutil -m rm -r $BUCKET_URI

!rm -rf model.bst