projects/vision-ai-edge-platform/pipelines/anomaly/vi-anomaly-detection.ipynb (323 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"# Copyright 2024 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# http://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"\n",
"# cloud-solutions/vision-ai-edge-platform-v0.0.1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Anomaly Detection ML Model using AutoML Vision\n",
"\n",
"## Overview\n",
"This notebook is an example Vertex AI Pipeline for bike pedal anomaly detection using AutoML Vision Classification model. With this notebook you can create a dataset and train a new AutoML Vision model for anomaly detection at the edge location."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configuring Google Cloud Project\n",
"\n",
"Configure Google Cloud Project for preparing required services for training."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"PROJECT_ID = \"YOUR-PROJECT-ID\"\n",
"LOCATION = \"us-central1\"\n",
"BUCKET_URI= f\"gs://vision-ai-edge-{PROJECT_ID}\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Enable Google Cloud Service APIs required."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! gcloud services enable compute.googleapis.com\n",
"! gcloud services enable storage.googleapis.com\n",
"! gcloud services enable aiplatform.googleapis.com"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a new Cloud Storage Bucket for dataset and training job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! gsutil mb -l {LOCATION} {BUCKET_URI}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a service account for Vertex AI Pipeline."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SERVICE_ACCOUNT_NAME = \"vision-ai-edge\"\n",
"SERVICE_ACCOUNT_EMAIL = SERVICE_ACCOUNT_NAME + \"@\" + PROJECT_ID + \".iam.gserviceaccount.com\"\n",
"\n",
"! gcloud iam service-accounts create {SERVICE_ACCOUNT_NAME} --display-name=\"Vision AI Edge Service Account\"\n",
"\n",
"! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT_EMAIL}:roles/storage.objectCreator $BUCKET_URI\n",
"\n",
"! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT_EMAIL}:roles/storage.objectViewer $BUCKET_URI\n",
"\n",
"! gcloud projects add-iam-policy-binding {PROJECT_ID} --member=\"serviceAccount:{SERVICE_ACCOUNT_EMAIL}\" --role=\"roles/aiplatform.user\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preparing example dataset\n",
"\n",
"Download the bike-pedal.zip file that contains example images and annotation.\n",
"\n",
"Create a bike-pedals folder and decompress the bike-pedals.zip file inside of the folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"DATASET_URL=\"https://storage.googleapis.com/solutions-public-assets/vision-ai-edge-platform/bike-pedals.zip\"\n",
"DATASET_NAME=\"bike-pedals\"\n",
"\n",
"! wget -qN {DATASET_URL}\n",
"! unzip -qn {DATASET_NAME}.zip -d {DATASET_NAME}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Generate input-files.jsonl from template file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"BUCKET_URI_SED=BUCKET_URI.replace('/', '\\\\/')\n",
"\n",
"! sed 's/<<BUCKET_URI>>/{BUCKET_URI_SED}/g' \\\n",
" ./{DATASET_NAME}/input-files-template.jsonl > \\\n",
" ./{DATASET_NAME}/input-files.jsonl\n",
"\n",
"! head ./{DATASET_NAME}/input-files.jsonl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Upload the bike-pedal folder to the GCS Bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! gsutil -m cp -r {DATASET_NAME} {BUCKET_URI}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a Vertex AI Dataset"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"from typing import Any, Dict, List\n",
"\n",
"import google.cloud.aiplatform as aip\n",
"import kfp\n",
"from kfp import compiler\n",
"\n",
"PIPELINE_ROOT = f\"{BUCKET_URI}/pipeline_root/flowers\"\n",
"aip.init(project=PROJECT_ID, \n",
" staging_bucket=BUCKET_URI,\n",
" service_account=SERVICE_ACCOUNT_EMAIL)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define Vertex AI Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"@kfp.dsl.pipeline(name=\"anomaly-detection-bike-pedals-v1\")\n",
"def pipeline(project: str = PROJECT_ID, region: str = LOCATION):\n",
" from google_cloud_pipeline_components.v1.automl.training_job import \\\n",
" AutoMLImageTrainingJobRunOp\n",
" from google_cloud_pipeline_components.v1.dataset import \\\n",
" ImageDatasetCreateOp\n",
"\n",
" ds_op = ImageDatasetCreateOp(\n",
" project=project,\n",
" display_name=\"bike-pedals\",\n",
" gcs_source=BUCKET_URI + \"/bike-pedals/input-files.jsonl\",\n",
" import_schema_uri=aip.schema.dataset.ioformat.image.single_label_classification,\n",
" )\n",
"\n",
" training_job_run_op = AutoMLImageTrainingJobRunOp(\n",
" project=project,\n",
" display_name=\"vi-anomaly-pedal\",\n",
" prediction_type=\"classification\",\n",
" model_type=\"MOBILE_TF_HIGH_ACCURACY_1\",\n",
" dataset=ds_op.outputs[\"dataset\"],\n",
" model_display_name=\"vi-anomaly-pedal\",\n",
" training_fraction_split=0.6,\n",
" validation_fraction_split=0.2,\n",
" test_fraction_split=0.2,\n",
" budget_milli_node_hours=8000,\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compile the pipeline"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"compiler.Compiler().compile(\n",
" pipeline_func=pipeline, package_path=\"vi_anomaly_pedal_pipeline.yaml\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run the pipeline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"import string\n",
"\n",
"random_suffix = ''.join(random.choice(string.ascii_lowercase + string.digits) for _ in range(16))\n",
"\n",
"DISPLAY_NAME = \"vi_anomaly_pedal_\" + random_suffix\n",
"\n",
"job = aip.PipelineJob(\n",
" display_name=DISPLAY_NAME,\n",
" template_path=\"vi_anomaly_pedal_pipeline.yaml\",\n",
" pipeline_root=PIPELINE_ROOT,\n",
" enable_caching=False,\n",
")\n",
"\n",
"job.run()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cleaning up\n",
"\n",
"To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n",
"project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.20"
}
},
"nbformat": 4,
"nbformat_minor": 2
}