people-and-planet-ai/weather-forecasting/notebooks/1-overview.ipynb (611 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "g4jtzXwEvW2-"
},
"outputs": [],
"source": [
"# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
"\n",
"# Licensed to the Apache Software Foundation (ASF) under one\n",
"# or more contributor license agreements. See the NOTICE file\n",
"# distributed with this work for additional information\n",
"# regarding copyright ownership. The ASF licenses this file\n",
"# to you under the Apache License, Version 2.0 (the\n",
"# \"License\"); you may not use this file except in compliance\n",
"# with the License. You may obtain a copy of the License at\n",
"#\n",
"# http://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing,\n",
"# software distributed under the License is distributed on an\n",
"# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
"# KIND, either express or implied. See the License for the\n",
"# specific language governing permissions and limitations\n",
"# under the License."
],
"id": "g4jtzXwEvW2-"
},
{
"cell_type": "markdown",
"metadata": {
"id": "HtysPAVSvcMg"
},
"source": [
"# 🌦️ Weather forecasting -- _Overview_\n",
"\n",
"[](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/1-overview.ipynb)\n",
"\n",
"This sample is broken into the following notebooks:\n",
"\n",
"*  **🧭 Overview**:\n",
" Go through what we want to achieve, and explore the data we want to use as _inputs and outputs_ for our model.\n",
"\n",
"* [ **🗄️ Create the dataset**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/2-dataset.ipynb):\n",
" Use [Apache Beam](https://beam.apache.org/)\n",
" to fetch data from [Earth Engine](https://earthengine.google.com/) in parallel, and create a dataset for our model in [Dataflow](https://cloud.google.com/dataflow).\n",
"\n",
"* [ **🧠 Train the model**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/3-training.ipynb):\n",
" Build a simple _Fully Convolutional Network_ in [PyTorch](https://pytorch.org/) and train it in [Vertex AI](https://cloud.google.com/vertex-ai/docs/training/custom-training) with the dataset we created.\n",
"\n",
"* [ **🔮 Model predictions**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/4-predictions.ipynb):\n",
" Get predictions from the model with data it has never seen before.\n",
"\n",
"This sample leverages geospatial satellite and precipitation data from [Google Earth Engine](https://earthengine.google.com/).\n",
"Using satellite imagery, you'll build and train a model for rain \"nowcasting\" i.e. predicting the amount of rainfall for a given geospatial region and time in the immediate future.\n",
"\n",
"* ⏲️ **Time estimate**: ~5 minutes\n",
"* 💰 **Cost estimate**: _free_\n",
"\n",
"💚 This is one of many **machine learning how-to samples** inspired from **real climate solutions** aired on the [People and Planet AI 🎥 series](https://www.youtube.com/playlist?list=PLIivdWyY5sqI-llB35Dcb187ZG155Rs_7)."
],
"id": "HtysPAVSvcMg"
},
{
"cell_type": "markdown",
"metadata": {
"id": "AENPVmeYwqml"
},
"source": [
"## 📒 Using this interactive notebook\n",
"\n",
"Click the **run** icons ▶️ of each section within this notebook.\n",
"\n",
"\n",
"\n",
"> 💡 Alternatively, you can run the currently selected cell with `Ctrl + Enter` (or `⌘ + Enter` in a Mac).\n",
"\n",
"This **notebook code lets you train and deploy an ML model** from end-to-end. When you run a code cell, the code runs in the notebook's runtime, so you're not making any changes to your personal computer.\n",
"\n",
"> ⚠️ **To avoid any errors**, wait for each section to finish in their order before clicking the next “run” icon.\n",
"\n",
"This sample must be connected to a **Google Cloud project**, but nothing else is needed other than your Google Cloud project.\n",
"\n",
"You can use an _existing project_. Alternatively, you can create a new Cloud project [with cloud credits for free.](https://cloud.google.com/free/docs/gcp-free-tier)"
],
"id": "AENPVmeYwqml"
},
{
"cell_type": "markdown",
"metadata": {
"id": "aCDdVrxbw8je"
},
"source": [
"# 🎬 Before you begin\n",
"\n",
"Let's start by cloning the GitHub repository, and installing some dependencies."
],
"id": "aCDdVrxbw8je"
},
{
"cell_type": "code",
"source": [
"# Now let's get the code from GitHub and navigate to the sample.\n",
"!git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git\n",
"%cd python-docs-samples/people-and-planet-ai/weather-forecasting"
],
"metadata": {
"id": "W-fPxkYD9FaP"
},
"id": "W-fPxkYD9FaP",
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Upgrade `setuptools` to install packages from pyproject.toml files.\n",
"!pip install --quiet --upgrade --no-warn-conflicts pip setuptools\n",
"\n",
"# Install the `weather-data` local package.\n",
"!pip install serving/weather-data"
],
"metadata": {
"id": "AlcsK6pd-x0I"
},
"id": "AlcsK6pd-x0I",
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "mHvEEW6oyFGV"
},
"source": [
"## ☁️ My Google Cloud resources\n",
"\n",
"Make sure you have followed these steps to configure your Google Cloud project:\n",
"\n",
"1. Enable the APIs: _Earth Engine_\n",
"\n",
" <button>\n",
"\n",
" [Click here to enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=earthengine.googleapis.com)\n",
" </button>\n",
"\n",
"1. Register your\n",
" [Compute Engine default service account](https://console.cloud.google.com/iam-admin/iam)\n",
" on Earth Engine.\n",
"\n",
" <button>\n",
"\n",
" [Click here to register your service account on Earth Engine](https://signup.earthengine.google.com/#!/service_accounts)\n",
" </button>\n",
"\n",
"Once you have everything ready, you can go ahead and fill in your Google Cloud resources in the following code cell.\n",
"Make sure you run it!"
],
"id": "mHvEEW6oyFGV"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "YMPNUR0pyRvy"
},
"outputs": [],
"source": [
"from __future__ import annotations\n",
"\n",
"import os\n",
"from google.colab import auth\n",
"\n",
"# Please fill in these values.\n",
"project = \"\" # @param {type:\"string\"}\n",
"\n",
"# Quick input validations.\n",
"assert project, \"⚠️ Please provide a Google Cloud project ID\"\n",
"\n",
"# Authenticate to Colab.\n",
"auth.authenticate_user()\n",
"\n",
"# Set GOOGLE_CLOUD_PROJECT for google.auth.default().\n",
"os.environ[\"GOOGLE_CLOUD_PROJECT\"] = project\n",
"\n",
"# Set the gcloud project for other gcloud commands.\n",
"!gcloud config set project {project}"
],
"id": "YMPNUR0pyRvy"
},
{
"cell_type": "markdown",
"metadata": {
"id": "bp05WbBM596J"
},
"source": [
"# 🧭 Overview\n",
"\n",
"The goal of our model is using satellite images to do _weather forecasting_.\n",
"Specifically, we want to predict the amount of rainfall, measured in millimeters per hour, for the next two to six hours in the future.\n",
"This kind of short term forecasting is called [weather _nowcasting_](https://en.wikipedia.org/wiki/Nowcasting_(meteorology)).\n",
"\n",
"When working with satellite data, each image has the shape `(width, height, bands)`.\n",
"**Bands** contain _numeric values_ for each pixel in the image, like the measurements from specific satellite instruments for different ranges of the electromagnetic spectrum, or the probabilities of different classifications.\n",
"If you're familiar with image classification problems, you can think of the bands as similar to an image's RGB channels."
],
"id": "bp05WbBM596J"
},
{
"cell_type": "markdown",
"metadata": {
"id": "n6X3DTeYYAXM"
},
"source": [
"## ☔️ Precipitation\n",
"\n",
"We use [NASA's Global Precipitation Measurement (GPM)](https://developers.google.com/earth-engine/datasets/catalog/NASA_GPM_L3_IMERG_V06) to get the amount of _precipitation_ of rain and snow, measured as millimeters per hour.\n",
"We're interested in the `precipitationCal` band, which gives us the _calibrated_ precipitation amount.\n",
"\n",
"This is what we want to predict, so we'll use them for our _labels_.\n",
"But it's also useful for the model to look at the precipitation from the _past_, so we'll also use it as _inputs_.\n",
"\n",
"In the [`serving/data.py`](../serving/data.py) file, we defined a function called `get_gpm_sequence` which returns us an `ee.Image` with the precipitation values for the time sequence we give it.\n",
"Each time step is stored in a different band with the index as a prefix.\n",
"For example, the band corresponding to the first time step in the sequence would be `0_precipitationCal`, and the second time step would be `1_precipitationCal`."
],
"id": "n6X3DTeYYAXM"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "57gC1gfC9_Gc"
},
"outputs": [],
"source": [
"from datetime import datetime\n",
"import folium\n",
"import ee\n",
"\n",
"from weather.data import get_gpm_sequence\n",
"\n",
"\n",
"def gpm_layer(image: ee.Image, label: str, i: int) -> folium.TileLayer:\n",
" vis_params = {\n",
" \"bands\": [f\"{i}_precipitationCal\"],\n",
" \"min\": 0.0,\n",
" \"max\": 20.0,\n",
" \"palette\": [\n",
" \"000096\",\n",
" \"0064ff\",\n",
" \"00b4ff\",\n",
" \"33db80\",\n",
" \"9beb4a\",\n",
" \"ffeb00\",\n",
" \"ffb300\",\n",
" \"ff6400\",\n",
" \"eb1e00\",\n",
" \"af0000\",\n",
" ],\n",
" }\n",
" # Mask (hide) pixels with no precipitation to see the map below.\n",
" image = image.mask(image.gt(0.1))\n",
" return folium.TileLayer(\n",
" name=f\"[{label}] Precipitation\",\n",
" tiles=image.getMapId(vis_params)[\"tile_fetcher\"].url_format,\n",
" attr='Map Data © <a href=\"https://earthengine.google.com/\">Google Earth Engine</a>',\n",
" overlay=True,\n",
" )\n",
"\n",
"\n",
"# Get the Earth Engine images.\n",
"dates = [datetime(2019, 9, 2, 18)]\n",
"image = get_gpm_sequence(dates)\n",
"\n",
"# Show map.\n",
"map = folium.Map([25, -90], zoom_start=5)\n",
"for i, date in enumerate(dates):\n",
" gpm_layer(image, str(date), i).add_to(map)\n",
"folium.LayerControl().add_to(map)\n",
"map"
],
"id": "57gC1gfC9_Gc"
},
{
"cell_type": "markdown",
"metadata": {
"id": "ztJlKSjlMGAc"
},
"source": [
"\n",
"\n",
"> 💡 This is [Hurricane Dorian](https://en.wikipedia.org/wiki/Hurricane_Dorian), the strongest Category 5 hurricane on record in the Bahamas."
],
"id": "ztJlKSjlMGAc"
},
{
"cell_type": "markdown",
"metadata": {
"id": "y3NRvQndX66i"
},
"source": [
"## 🌨 Cloud and moisture\n",
"\n",
"To predict precipitation, it's also useful to take a look at the _cloud_ and _moisture_.\n",
"We use data from [GOES-16 Cloud and Moisture Imagery](https://developers.google.com/earth-engine/datasets/catalog/NOAA_GOES_16_MCMIPF), which was the first satellite from the [Geostationary Operational Environmental Satellites (GOES)](https://en.wikipedia.org/wiki/Geostationary_Operational_Environmental_Satellite) mission, operated by [NASA](https://en.wikipedia.org/wiki/NASA) and [NOAA](https://en.wikipedia.org/wiki/National_Oceanic_and_Atmospheric_Administration).\n",
"It includes measurements from the _visible_, _near-infrared_, and _infrared_ spectrum.\n",
"It is a [geostationary](https://en.wikipedia.org/wiki/Geostationary_orbit) satellite, so its orbit is synchronized with the Earth's rotation, and it provides a view centered in the Americas.\n",
"\n",
"In the [`serving/data.py`](../serving/data.py) file, we defined a function called `get_goes16_sequence` which returns us an `ee.Image` with the cloud and moisture data for the time sequence we give it."
],
"id": "y3NRvQndX66i"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-T5BGzzZH57f"
},
"outputs": [],
"source": [
"from datetime import datetime\n",
"import folium\n",
"import ee\n",
"\n",
"from weather.data import get_goes16_sequence\n",
"\n",
"\n",
"def goes16_layer(image: ee.Image, label: str, i: int) -> folium.TileLayer:\n",
" vis_params = {\n",
" \"bands\": [f\"{i}_CMI_C02\", f\"{i}_CMI_C03\", f\"{i}_CMI_C01\"],\n",
" \"min\": 0.0,\n",
" \"max\": 3000.0,\n",
" }\n",
" return folium.TileLayer(\n",
" name=f\"[{label}] Cloud and moisture\",\n",
" tiles=image.getMapId(vis_params)[\"tile_fetcher\"].url_format,\n",
" attr='Map Data © <a href=\"https://earthengine.google.com/\">Google Earth Engine</a>',\n",
" overlay=True,\n",
" )\n",
"\n",
"\n",
"# Get the Earth Engine image.\n",
"dates = [datetime(2019, 9, 2, 18)]\n",
"image = get_goes16_sequence(dates)\n",
"\n",
"# Show map.\n",
"map = folium.Map([25, -90], zoom_start=5)\n",
"for i, date in enumerate(dates):\n",
" goes16_layer(image, str(date), i).add_to(map)\n",
"folium.LayerControl().add_to(map)\n",
"map"
],
"id": "-T5BGzzZH57f"
},
{
"cell_type": "markdown",
"source": [
""
],
"metadata": {
"id": "WdpLS9eYaltD"
},
"id": "WdpLS9eYaltD"
},
{
"cell_type": "markdown",
"metadata": {
"id": "gqUhsl1UE2Xs"
},
"source": [
"## 🏔 Elevation\n",
"\n",
"Elevation could also give the model useful information.\n",
"We use the [MERIT Terrain DEM](https://developers.google.com/earth-engine/datasets/catalog/MERIT_DEM_v1_0_3) dataset to get the elvation.\n",
"\n",
"In the [`serving/data.py`](../serving/data.py) file, we defined a function called `get_elevation` which returns us an `ee.Image` with the elevation measured in meters."
],
"id": "gqUhsl1UE2Xs"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vv_HM9f-KExw"
},
"outputs": [],
"source": [
"import folium\n",
"\n",
"from weather.data import get_elevation\n",
"\n",
"\n",
"def elevation_layer() -> folium.TileLayer:\n",
" image = get_elevation()\n",
" vis_params = {\n",
" \"bands\": [\"elevation\"],\n",
" \"min\": 0.0,\n",
" \"max\": 3000.0,\n",
" \"palette\": [\n",
" \"000000\",\n",
" \"478FCD\",\n",
" \"86C58E\",\n",
" \"AFC35E\",\n",
" \"8F7131\",\n",
" \"B78D4F\",\n",
" \"E2B8A6\",\n",
" \"FFFFFF\",\n",
" ],\n",
" }\n",
" return folium.TileLayer(\n",
" name=\"Elevation\",\n",
" tiles=image.getMapId(vis_params)[\"tile_fetcher\"].url_format,\n",
" attr='Map Data © <a href=\"https://earthengine.google.com/\">Google Earth Engine</a>',\n",
" overlay=True,\n",
" )\n",
"\n",
"\n",
"# Show map.\n",
"map = folium.Map([25, -90], zoom_start=5)\n",
"elevation_layer().add_to(map)\n",
"folium.LayerControl().add_to(map)\n",
"map"
],
"id": "vv_HM9f-KExw"
},
{
"cell_type": "markdown",
"source": [
""
],
"metadata": {
"id": "KfueoVPBapZp"
},
"id": "KfueoVPBapZp"
},
{
"cell_type": "markdown",
"metadata": {
"id": "AO4CyGgjYOME"
},
"source": [
"## 🛰 Inputs\n",
"\n",
"In this example, we also consider multiple images across time, since weather forecasting is more accurate when we look at how the cloud cover changes over a period of time.\n",
"Particularly, we consider 3 data points -- 4 hours prior, 2 hours prior, and current.\n",
"\n",
"> 💡 To give the model a better picture, we chose to feed it with _at least three_ data points from the past.\n",
"> With only a single point, the model wouldn't know if the rain is increasing or decreasing.\n",
"> Two points would give it an idea of the trend.\n",
"> Three or more points would give it an idea of how fast it's changing.\n",
"> The more points, the more it can see.\n",
"\n",
"In the [`serving/data.py`](serving/data.py) file, we defined a function called `get_inputs_image` which returns us an `ee.Image` with bands for all the time steps for cloud and moisture, and for precipitation, alongside with the elevation."
],
"id": "AO4CyGgjYOME"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5SbSErwSUCtZ"
},
"outputs": [],
"source": [
"from datetime import datetime, timedelta\n",
"import folium\n",
"\n",
"from weather.data import get_inputs_image\n",
"\n",
"# Get the Earth Engine image.\n",
"date = datetime(2019, 9, 2, 18)\n",
"image = get_inputs_image(date)\n",
"\n",
"# Get 4 hours prior, 2 hours prior, and current time.\n",
"input_hour_deltas = [-4, -2, 0]\n",
"\n",
"# Show map.\n",
"map = folium.Map([25, -90], zoom_start=5)\n",
"elevation_layer().add_to(map)\n",
"for i, h in enumerate(input_hour_deltas):\n",
" label = str(date + timedelta(hours=h))\n",
" goes16_layer(image, label, i).add_to(map)\n",
" gpm_layer(image, label, i).add_to(map)\n",
"folium.LayerControl().add_to(map)\n",
"map"
],
"id": "5SbSErwSUCtZ"
},
{
"cell_type": "markdown",
"metadata": {
"id": "IMOd2hVAp2h9"
},
"source": [
"\n",
"\n",
"> 💡 You can hide and show layers from the top-right corner widget to see all the inputs for the model."
],
"id": "IMOd2hVAp2h9"
},
{
"cell_type": "markdown",
"metadata": {
"id": "kRZlrlaXYRA0"
},
"source": [
"## ✅ Labels\n",
"\n",
"We chose to predict precipitation for 2 and 6 hours in the future, but it could be anything as long as we have the right _labels_.\n",
"\n",
"In the [`serving/data.py`](../serving/data.py) file, we defined a function called `get_labels_image` which returns us an `ee.Image` with bands for each time step of precipitation."
],
"id": "kRZlrlaXYRA0"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ppYBbnWyWGCR"
},
"outputs": [],
"source": [
"from datetime import datetime, timedelta\n",
"import folium\n",
"\n",
"from weather.data import get_labels_image, OUTPUT_HOUR_DELTAS\n",
"\n",
"# Get the Earth Engine image.\n",
"date = datetime(2019, 9, 3, 18)\n",
"image = get_labels_image(date)\n",
"\n",
"# Show map.\n",
"map = folium.Map([25, -90], zoom_start=5)\n",
"for i, h in enumerate(OUTPUT_HOUR_DELTAS):\n",
" label = str(date + timedelta(hours=h))\n",
" gpm_layer(image, label, i).add_to(map)\n",
"folium.LayerControl().add_to(map)\n",
"map"
],
"id": "ppYBbnWyWGCR"
},
{
"cell_type": "markdown",
"source": [
""
],
"metadata": {
"id": "XdVso6ela0Hz"
},
"id": "XdVso6ela0Hz"
},
{
"cell_type": "markdown",
"source": [
"# ⛳️ What's next?\n",
"\n",
"* [ **🗄️ Create the dataset**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/2-dataset.ipynb):\n",
" Use [Apache Beam](https://beam.apache.org/)\n",
" to fetch data from [Earth Engine](https://earthengine.google.com/) in parallel, and create a dataset for our model in [Dataflow](https://cloud.google.com/dataflow).\n",
"\n",
"* [ **🧠 Train the model**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/3-training.ipynb):\n",
" Build a simple _Fully Convolutional Network_ in [PyTorch](https://pytorch.org/) and train it in [Vertex AI](https://cloud.google.com/vertex-ai/docs/training/custom-training) with the dataset we created.\n",
"\n",
"* [ **🔮 Model predictions**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/weather-forecasting/notebooks/4-predictions.ipynb):\n",
" Get predictions from the model with data it has never seen before."
],
"metadata": {
"id": "jzGYXELPApGN"
},
"id": "jzGYXELPApGN"
}
],
"metadata": {
"colab": {
"provenance": [],
"toc_visible": true
},
"environment": {
"kernel": "python3",
"name": "tf2-gpu.2-6.m82",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-6:m82"
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}