projects/vision-ai-edge-platform/pipelines/anomaly/vi-anomaly-detection.ipynb (323 lines of code) (raw):

{ "cells": [ { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "#     http://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License.\n", "\n", "# cloud-solutions/vision-ai-edge-platform-v0.0.1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Anomaly Detection ML Model using AutoML Vision\n", "\n", "## Overview\n", "This notebook defines an example Vertex AI pipeline for bike pedal anomaly detection using an AutoML Vision classification model. With this notebook, you can create a dataset and train a new AutoML Vision model for anomaly detection at an edge location." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuring Google Cloud Project\n", "\n", "Set the Google Cloud project ID, the region, and the Cloud Storage bucket that the training services use." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Replace YOUR-PROJECT-ID with your Google Cloud project ID.\n", "PROJECT_ID = \"YOUR-PROJECT-ID\"\n", "LOCATION = \"us-central1\"\n", "BUCKET_URI = f\"gs://vision-ai-edge-{PROJECT_ID}\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Enable the required Google Cloud service APIs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! gcloud services enable compute.googleapis.com\n", "! gcloud services enable storage.googleapis.com\n", "! 
gcloud services enable aiplatform.googleapis.com" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new Cloud Storage bucket for the dataset and the training job." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! gsutil mb -l {LOCATION} {BUCKET_URI}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a service account for the Vertex AI pipeline, then grant it access to the bucket and to Vertex AI." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SERVICE_ACCOUNT_NAME = \"vision-ai-edge\"\n", "SERVICE_ACCOUNT_EMAIL = SERVICE_ACCOUNT_NAME + \"@\" + PROJECT_ID + \".iam.gserviceaccount.com\"\n", "\n", "! gcloud iam service-accounts create {SERVICE_ACCOUNT_NAME} --display-name=\"Vision AI Edge Service Account\"\n", "\n", "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT_EMAIL}:roles/storage.objectCreator {BUCKET_URI}\n", "\n", "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT_EMAIL}:roles/storage.objectViewer {BUCKET_URI}\n", "\n", "! gcloud projects add-iam-policy-binding {PROJECT_ID} --member=\"serviceAccount:{SERVICE_ACCOUNT_EMAIL}\" --role=\"roles/aiplatform.user\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing example dataset\n", "\n", "Download the bike-pedals.zip file, which contains example images and annotations.\n", "\n", "Create a bike-pedals folder and decompress the bike-pedals.zip file into it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DATASET_URL = \"https://storage.googleapis.com/solutions-public-assets/vision-ai-edge-platform/bike-pedals.zip\"\n", "DATASET_NAME = \"bike-pedals\"\n", "\n", "! wget -qN {DATASET_URL}\n", "! unzip -qn {DATASET_NAME}.zip -d {DATASET_NAME}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate input-files.jsonl from the template file by replacing the bucket URI placeholder with your bucket URI." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Escape the slashes in the bucket URI so that it is safe to use in the sed expression.\n", "BUCKET_URI_SED = BUCKET_URI.replace('/', '\\\\/')\n", "\n", "! sed 's/<<BUCKET_URI>>/{BUCKET_URI_SED}/g' \\\n", "    ./{DATASET_NAME}/input-files-template.jsonl > \\\n", "    ./{DATASET_NAME}/input-files.jsonl\n", "\n", "! head ./{DATASET_NAME}/input-files.jsonl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Upload the bike-pedals folder to the Cloud Storage bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! gsutil -m cp -r {DATASET_NAME} {BUCKET_URI}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a Vertex AI Dataset\n", "\n", "Import the Vertex AI SDK and Kubeflow Pipelines, then initialize the SDK. The dataset itself is created by the first step of the pipeline defined in the next section." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "import google.cloud.aiplatform as aip\n", "import kfp\n", "from kfp import compiler\n", "\n", "PIPELINE_ROOT = f\"{BUCKET_URI}/pipeline_root/bike-pedals\"\n", "aip.init(\n", "    project=PROJECT_ID,\n", "    location=LOCATION,\n", "    staging_bucket=BUCKET_URI,\n", "    service_account=SERVICE_ACCOUNT_EMAIL,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Vertex AI Pipeline" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "@kfp.dsl.pipeline(name=\"anomaly-detection-bike-pedals-v1\")\n", "def pipeline(project: str = PROJECT_ID, region: str = LOCATION):\n", "    from google_cloud_pipeline_components.v1.automl.training_job import \\\n", "        AutoMLImageTrainingJobRunOp\n", "    from google_cloud_pipeline_components.v1.dataset import \\\n", "        ImageDatasetCreateOp\n", "\n", "    # Create an image dataset from the import file uploaded to Cloud Storage.\n", "    ds_op = ImageDatasetCreateOp(\n", "        project=project,\n", "        location=region,\n", "        display_name=\"bike-pedals\",\n", "        gcs_source=BUCKET_URI + \"/bike-pedals/input-files.jsonl\",\n", "        import_schema_uri=aip.schema.dataset.ioformat.image.single_label_classification,\n", "    )\n", "\n", "    # Train an AutoML image classification model on the dataset.\n", "    training_job_run_op = AutoMLImageTrainingJobRunOp(\n", "        project=project,\n", "        
display_name=\"vi-anomaly-pedal\",\n", "        location=region,\n", "        prediction_type=\"classification\",\n", "        model_type=\"MOBILE_TF_HIGH_ACCURACY_1\",\n", "        dataset=ds_op.outputs[\"dataset\"],\n", "        model_display_name=\"vi-anomaly-pedal\",\n", "        training_fraction_split=0.6,\n", "        validation_fraction_split=0.2,\n", "        test_fraction_split=0.2,\n", "        budget_milli_node_hours=8000,  # 8 node hours\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile the pipeline" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "compiler.Compiler().compile(\n", "    pipeline_func=pipeline, package_path=\"vi_anomaly_pedal_pipeline.yaml\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the pipeline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "import string\n", "\n", "# Append a random suffix so that each run gets a unique display name.\n", "random_suffix = ''.join(random.choice(string.ascii_lowercase + string.digits) for _ in range(16))\n", "\n", "DISPLAY_NAME = \"vi_anomaly_pedal_\" + random_suffix\n", "\n", "job = aip.PipelineJob(\n", "    display_name=DISPLAY_NAME,\n", "    template_path=\"vi_anomaly_pedal_pipeline.yaml\",\n", "    pipeline_root=PIPELINE_ROOT,\n", "    enable_caching=False,\n", ")\n", "\n", "job.run()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cleaning up\n", "\n", "To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n", "project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.20" } }, "nbformat": 4, "nbformat_minor": 2 }