### <font color='#4285f4'>Overview</font>

**Overview**: This notebook walks you step by step through creating Entry Groups, Aspect Types, Entry Types, and Entries. You will then set up governance on a series of entries, both custom and system (BigQuery), by specifying the values for your aspects.

You can then search for tables, Pub/Sub, Analytics Hub, Vertex models, and more by their aspect types.

**Process Flow**:
1.  **Create helper methods:**
    *   A method to check for the existence of an item (to avoid recreation on re-runs).
    *   A method to create each artifact.

2.  **Create custom artifacts:**
    *   Entry Group
    *   Aspect Type
    *   Entry Type (containing Aspect Type(s))
    *   Entry (containing an Entry Type, placed within an Entry Group)

3.  **Associate an Entry Type** with a BigQuery table.

4.  **Update the table overview and contacts (roles)** on a BigQuery table.

5.  **Create Aspect Types and Entry Types** to associate with each table and column in our BigQuery tables.

6.  **Apply to all zones:**
    *   Raw
    *   Enriched
    *   Curated

7.  **Update the BigQuery overview and contacts.**
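
The check-then-create pattern from step 1 can be sketched with an in-memory stand-in for the catalog. `FakeCatalog` and `create_if_missing` are hypothetical names used only for illustration; the helpers in this notebook call the Dataplex REST API instead.

```python
class FakeCatalog:
    """In-memory stand-in for a metadata catalog (illustration only)."""

    def __init__(self):
        self.items = {}

    def exists(self, item_id):
        return item_id in self.items

    def create(self, item_id, body):
        self.items[item_id] = body


def create_if_missing(catalog, item_id, body):
    """Creates the item only when it is absent, so re-running a cell is harmless."""
    if catalog.exists(item_id):
        return False  # Nothing was created on this call
    catalog.create(item_id, body)
    return True  # The item was created on this call
```

Calling `create_if_missing` twice with the same id creates the item once and leaves the first definition untouched on the second call.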

Notes:
* This notebook uses REST API calls to create the Entry Groups, Aspect Types, Entry Types, and Entries. You can also do this in Terraform; see the sample Terraform provided in the notebook.
* If the PATCH command (updateDataplexSystemEntry_BigQueryTable) fails with the 403 error shown below, you might need to wait several hours before the cell will work in a new project.
    ```
    {
    "error": {
        "code": 403,
        "message": "Write access to project 'xxx' was denied: If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.",
        "status": "PERMISSION_DENIED"
    }
    }
    ```
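
Because this error can clear up on its own, one option is to retry the failing call with exponential backoff. The sketch below is a minimal illustration using only the standard library; `retry_on_permission_denied` is a hypothetical helper, and it assumes the wrapped call raises a `RuntimeError` whose message contains the status code, as the `restAPIHelper` function in this notebook does. The notebook also imports `tenacity`, which offers similar retry decorators. If the delay really is several hours, re-running the cell later remains the simpler choice.

```python
import time


def retry_on_permission_denied(func, max_attempts=5, base_delay_seconds=1.0, sleep=time.sleep):
    """Calls func(), retrying with exponential backoff while it raises a 403-style RuntimeError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RuntimeError as error:
            # Assumption: the status code appears in the error message text
            is_permission_error = "403" in str(error)
            if not is_permission_error or attempt == max_attempts:
                raise
            # Delays grow as base, 2x base, 4x base, ...
            sleep(base_delay_seconds * (2 ** (attempt - 1)))
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies.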

Cost:
* Approximate cost: Less than a dollar

Author:
* Adam Paternostro

In [None]:
# Architecture Diagram
from IPython.display import Image
Image(url='https://storage.googleapis.com/data-analytics-golden-demo/colab-diagrams/BigQuery-Data-Governance-Data-Governance.png', width=1200)

In [None]:
# Architecture Diagram
from IPython.display import Image
Image(url='https://storage.googleapis.com/data-analytics-golden-demo/colab-diagrams/BigQuery-Data-Governance-Data-Governance-Arch.png', width=1200)

### <font color='#4285f4'>Video Walkthrough</font>

[Video](https://storage.googleapis.com/data-analytics-golden-demo/colab-videos/Data-Governance.mp4)


In [None]:
from IPython.display import HTML

HTML("""
<video width="800" height="600" controls>
  <source src="https://storage.googleapis.com/data-analytics-golden-demo/colab-videos/Data-Governance.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>
""")

### <font color='#4285f4'>License</font>

```
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```

### <font color='#4285f4'>Pip installs</font>

In [None]:
# PIP Installs (if necessary)
import sys

# !{sys.executable} -m pip install REPLACE-ME

### <font color='#4285f4'>Initialize</font>

In [None]:
from PIL import Image
from IPython.display import HTML
import IPython.display
import google.auth
import requests
import json
import uuid
import base64
import os
import cv2
import random
import time
import datetime

import logging
from tenacity import retry, wait_exponential, stop_after_attempt, before_sleep_log, retry_if_exception

In [None]:
# Set these (run this cell to verify the output)

bigquery_location = "${bigquery_location}"
region = "${dataplex_region}"
location = "${location}"
random_extension="${random_extension}"

# Get the current date and time
now = datetime.datetime.now()

# Format the date and time as desired
formatted_date = now.strftime("%Y-%m-%d-%H-%M")

# Get some values using gcloud
project_id = os.environ["GOOGLE_CLOUD_PROJECT"]
user = !(gcloud auth list --filter=status:ACTIVE --format="value(account)")

if len(user) != 1:
  raise RuntimeError(f"user is not set: {user}")
user = user[0]

print(f"project_id = {project_id}")
print(f"user = {user}")
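
The `${...}` values in the cell above look like template placeholders that are substituted before the notebook runs (an assumption based on their syntax). A small check like the following could flag any that were left unresolved; `find_unresolved_placeholders` is a hypothetical helper, not part of the notebook.

```python
import re

# Matches text such as ${dataplex_region}
PLACEHOLDER_PATTERN = re.compile(r"\$\{[^}]+\}")


def find_unresolved_placeholders(settings):
    """Returns the names of settings whose values still contain a ${...} placeholder."""
    return [name for name, value in settings.items() if PLACEHOLDER_PATTERN.search(str(value))]
```

For example, passing `{"region": region, "location": location}` returns an empty list once every value has been substituted.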

### <font color='#4285f4'>Helper Methods</font>

#### restAPIHelper
Calls the Google Cloud REST API using the current user's credentials.

In [None]:
def restAPIHelper(url: str, http_verb: str, request_body: dict) -> dict:
  """Calls the Google Cloud REST API passing in the current user's credentials"""

  import requests
  import google.auth
  import google.auth.transport.requests  # Submodule must be imported explicitly before use
  import json

  # Get an access token based upon the current user
  creds, project = google.auth.default()
  auth_req = google.auth.transport.requests.Request()
  creds.refresh(auth_req)
  access_token = creds.token

  headers = {
    "Content-Type" : "application/json",
    "Authorization" : "Bearer " + access_token
  }

  if http_verb == "GET":
    response = requests.get(url, headers=headers)
  elif http_verb == "POST":
    response = requests.post(url, json=request_body, headers=headers)
  elif http_verb == "PUT":
    response = requests.put(url, json=request_body, headers=headers)
  elif http_verb == "PATCH":
    response = requests.patch(url, json=request_body, headers=headers)
  elif http_verb == "DELETE":
    response = requests.delete(url, headers=headers)
  else:
    raise RuntimeError(f"Unknown HTTP verb: {http_verb}")

  if response.status_code == 200:
    return json.loads(response.content)
  else:
    error = f"Error restAPIHelper -> Status: '{response.status_code}' Text: '{response.text}'"
    raise RuntimeError(error)

#### RunQuery
Runs a query in BigQuery

In [None]:
def RunQuery(sql):
  import time
  from google.cloud import bigquery
  client = bigquery.Client()

  if (sql.startswith("SELECT") or sql.startswith("WITH")):
      df_result = client.query(sql).to_dataframe()
      return df_result
  else:
    job_config = bigquery.QueryJobConfig(priority=bigquery.QueryPriority.INTERACTIVE)
    query_job = client.query(sql, job_config=job_config)

    # Check on the progress by getting the job's updated state.
    query_job = client.get_job(
        query_job.job_id, location=query_job.location
    )
    print("Job {} is currently in state {} with error result of {}".format(query_job.job_id, query_job.state, query_job.error_result))

    while query_job.state != "DONE":
      time.sleep(2)
      query_job = client.get_job(
          query_job.job_id, location=query_job.location
          )
      print("Job {} is currently in state {} with error result of {}".format(query_job.job_id, query_job.state, query_job.error_result))

    if query_job.error_result is None:
      return True
    else:
      raise Exception(query_job.error_result)
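
`RunQuery` decides whether to return a DataFrame by checking for an uppercase `SELECT` or `WITH` prefix, so a statement such as `select 1` or one with leading whitespace takes the job-polling branch instead. A case-insensitive check could look like the sketch below; `returns_rows` is a hypothetical helper, and it does not handle statements that begin with a SQL comment.

```python
def returns_rows(sql):
    """Returns True when the statement starts with SELECT or WITH, ignoring case and leading whitespace."""
    return sql.lstrip().upper().startswith(("SELECT", "WITH"))
```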

### <font color='#4285f4'>Entry Group - Helper Methods</font>

#### existsEntryGroup
- Tests to see if an Entry Group exists
- Returns True/False

In [None]:
def existsEntryGroup(project_id, entryGroupId, entryGroupLocation):
  """Checks to see if an Entry Group already exists"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups/list
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups"

  json_result = restAPIHelper(url, "GET", None)
  print(f"existsEntryGroup (GET) json_result: {json_result}")

  # Test to see if exists, if so return
  if "entryGroups" in json_result:
    for item in json_result["entryGroups"]:
      # print(f"Name: {item['name']}")
      if item["name"] == f"projects/{project_id}/locations/{entryGroupLocation}/entryGroups/{entryGroupId}":
        # print(f"Entry Group {entryGroupId} already exists")
        return True

  return False
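
List responses from Google Cloud REST APIs can be paginated, in which case a single GET returns only the first page plus a `nextPageToken`. The sketch below shows one way to collect every page before checking for a name. `list_all_items` is a hypothetical helper; `fetch_page` is an assumed callable that takes a page token (or `None`) and returns the parsed JSON response.

```python
def list_all_items(fetch_page, items_key):
    """Collects items across all pages of a REST list response."""
    items = []
    page_token = None
    while True:
        page = fetch_page(page_token)
        items.extend(page.get(items_key, []))
        page_token = page.get("nextPageToken")
        if not page_token:
            return items
```

With the helpers in this notebook, `fetch_page` could be a lambda that appends `?pageToken=...` to the list URL when a token is present and calls `restAPIHelper(url, "GET", None)`.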

#### createEntryGroup
- Creates an Entry Group if it does not exist

##### Sample Terraform

```
resource "google_dataplex_entry_group" "my_entry_group" {
  project = var.project_id
  entry_group_id = "my-entry-group"
  location = "global"

  labels = { "tag": "test-tf" }
  display_name = "My Entry Group (entry group)"
  description = "Entry group used for testing"
}
```

##### Sample REST API Code

In [None]:
def createEntryGroup(project_id, entryGroupId, entryGroupLocation, entryGroupName, entryGroupDescription):
  """Creates an Entry Group if it does not exist"""

  if existsEntryGroup(project_id, entryGroupId, entryGroupLocation) == False:
    # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups/create
    url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups?entryGroupId={entryGroupId}"

    data = {
        "displayName": entryGroupName,
        "description": entryGroupDescription
    }

    json_result = restAPIHelper(url, "POST", data)
    print(f"createEntryGroup (POST) json_result: {json_result}")
  else:
    print(f"createEntryGroup (POST) Entry Group {entryGroupId} already exists")


### <font color='#4285f4'>Aspect Type - Helper Methods</font>

#### existsAspectType
- Tests to see if an Aspect Type exists
- Returns True/False

In [None]:
def existsAspectType(project_id, aspectTypeId, aspectTypeLocation):
  """Checks to see if an Aspect Type already exists"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.aspectTypes/list
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{aspectTypeLocation}/aspectTypes"

  json_result = restAPIHelper(url, "GET", None)
  print(f"existsAspectType (GET) json_result: {json_result}")

  # Test to see if exists, if so return
  if "aspectTypes" in json_result:
    for item in json_result["aspectTypes"]:
      print(f"Name: {item['name']}")
      if item["name"] == f"projects/{project_id}/locations/{aspectTypeLocation}/aspectTypes/{aspectTypeId}":
        # print(f"Aspect Type {aspectTypeId} already exists")
        return True

  return False

#### createAspectType
- Creates an Aspect Type if it does not exist

##### Sample Terraform

```
resource "google_dataplex_aspect_type" "my_aspect_type" {
  project = var.project_id  
  aspect_type_id = "my-aspect-type"
  location = "us"

  labels = { "tag": "test-tf" }
  display_name = "My Aspect Type (aspect type)"
  description = "PII data aspect type"
  metadata_template = <<EOF
{
  "name": "tf-test-template",
  "type": "record",
  "recordFields": [
    {
      "name": "type",
      "type": "enum",
      "annotations": {
        "displayName": "Type",
        "description": "Specifies the type of view represented by the entry."
      },
      "index": 1,
      "constraints": {
        "required": true
      },
      "enumValues": [
        {
          "name": "FILE",
          "index": 1
        }
      ]
    }
  ]
}
EOF
}

```

##### REST API Code

In [None]:
def createAspectType(project_id, aspectTypeId, aspectTypeLocation, aspectTypeName, aspectTypeDescription, metadataTemplate):
  """Creates an Aspect Type if it does not exist"""

  if existsAspectType(project_id, aspectTypeId, aspectTypeLocation) == False:
    # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.aspectTypes/create
    url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{aspectTypeLocation}/aspectTypes?aspectTypeId={aspectTypeId}"

    data = {
        "displayName": aspectTypeName,
        "description": aspectTypeDescription,
        "metadataTemplate": metadataTemplate
    }

    json_result = restAPIHelper(url, "POST", data)
    print(f"createAspectType (POST) json_result: {json_result}")
  else:
    print(f"createAspectType (POST) Aspect Type {aspectTypeId} already exists")


### <font color='#4285f4'>Entry Type - Helper Methods</font>

#### existsEntryType
- Tests to see if an Entry Type exists
- Returns True/False

In [None]:
def existsEntryType(project_id, entryTypeId, entryTypeLocation):
  """Checks to see if an Entry Type already exists"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryTypes/list
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryTypeLocation}/entryTypes"

  json_result = restAPIHelper(url, "GET", None)
  print(f"existsEntryType (GET) json_result: {json_result}")

  # Test to see if exists, if so return
  if "entryTypes" in json_result:
    for item in json_result["entryTypes"]:
      print(f"Name: {item['name']}")
      if item["name"] == f"projects/{project_id}/locations/{entryTypeLocation}/entryTypes/{entryTypeId}":
        # print(f"Entry Type {entryTypeId} already exists")
        return True

  return False
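
The three `exists...` helpers so far share one shape: build the full resource name, then scan the list response for it. That shared logic can be sketched as two small functions. `build_resource_name` and `resource_in_list_response` are hypothetical names; the name format is the one the helpers above already compare against.

```python
def build_resource_name(project_id, location, collection, resource_id):
    """Builds a full resource name such as projects/p/locations/global/entryTypes/my-entry-type."""
    return f"projects/{project_id}/locations/{location}/{collection}/{resource_id}"


def resource_in_list_response(list_response, collection, full_name):
    """Returns True when the list response contains an item with the given full name."""
    return any(item.get("name") == full_name for item in list_response.get(collection, []))
```

Here `collection` doubles as the URL segment and the JSON key (`entryGroups`, `aspectTypes`, `entryTypes`), which matches the responses handled above.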

#### createEntryType
- Creates an Entry Type if it does not exist

##### Sample Terraform

```
resource "google_dataplex_entry_type" "my_entry_type" {
  project = var.project_id
  entry_type_id = "my-entry-type"
  location = "us"

  labels = { "tag": "test-tf" }
  display_name = "My Entry Type (entry type)"
  description = "My Entry Type entry type"

  type_aliases = ["TABLE", "DATABASE"]
  platform = "GCS"
  system = "BigQuery"

  required_aspects {
    type = google_dataplex_aspect_type.my_aspect_type.name
  }

  depends_on = [google_dataplex_aspect_type.my_aspect_type]
}
```

##### REST API Code

In [None]:
def createEntryType(project_id, entryTypeId, entryTypeLocation, entryTypeName, entryTypeDescription, type_aliases, platform, system, requiredAspects):
  """Creates an Entry Type if it does not exist"""

  if existsEntryType(project_id, entryTypeId, entryTypeLocation) == False:
    # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryTypes/create
    url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryTypeLocation}/entryTypes?entryTypeId={entryTypeId}"

    data = {
        "displayName": entryTypeName,
        "description": entryTypeDescription,
        "typeAliases": type_aliases,
        "platform": platform,
        "system": system,
        "requiredAspects": requiredAspects,
    }

    json_result = restAPIHelper(url, "POST", data)
    print(f"createEntryType (POST) json_result: {json_result}")
  else:
    print(f"createEntryType (POST) Entry Type {entryTypeId} already exists")


### <font color='#4285f4'>Entry (**Custom**) - Helper Methods</font>

#### existsEntry
- Tests to see if an Entry exists
- Returns True/False

In [None]:
def existsEntry(project_id, entryGroupId, entryGroupLocation, entryId):
  """Checks to see if an Entry already exists"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups.entries/list
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/{entryGroupId}/entries"

  json_result = restAPIHelper(url, "GET", None)
  print(f"existsEntry (GET) json_result: {json_result}")

  # Test to see if exists, if so return
  if "entries" in json_result:
    for item in json_result["entries"]:
      print(f"Name: {item['name']}")
      if item["name"] == f"projects/{project_id}/locations/{entryGroupLocation}/entryGroups/{entryGroupId}/entries/{entryId}":
        print(f"Entry {entryId} already exists in Entry Group {entryGroupId}")
        return True

  return False

#### createEntry
- Creates an Entry if it does not exist

##### Sample Terraform

```
Terraform currently not available
```

##### REST API Code

In [None]:
def createEntry(project_id,
                entryGroupId, entryGroupLocation,
                entryTypeId, entryTypeLocation,
                entryId, entryName, entryDescription, system, aspects):
  """Creates an Entry (custom) if it does not exist"""

  if existsEntry(project_id, entryGroupId, entryGroupLocation, entryId) == False:
    # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups.entries/create
    url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/{entryGroupId}/entries?entryId={entryId}"

    data = {
        "entrySource": {
            "displayName": entryName,
            "description": entryDescription,
            "system": system  # This allow us to then search for "Custom" Entries
            },
        "entryType": f"projects/{project_id}/locations/{entryTypeLocation}/entryTypes/{entryTypeId}",
        "aspects": aspects,
    }

    json_result = restAPIHelper(url, "POST", data)
    print(f"createEntry (POST) json_result: {json_result}")
  else:
    print(f"createEntry (POST) Entry {entryId} already exists in Entry Group {entryGroupId}")

### <font color='#4285f4'>Entry - Update Dataplex Metadata on BigQuery Table / Column (**System Entry**) - Helper Methods</font>

#### updateDataplexSystemEntry_BigQueryTable
- Adds aspects to a built-in system entry (e.g. a BigQuery table) without replacing its system Entry Type

##### Sample Terraform

```
Terraform currently not available
```

##### REST API Code

In [None]:
def updateDataplexSystemEntry_BigQueryTable(project_id,
                                           entryGroupLocation,
                                           bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                           entryTypeId, entryTypeLocation,
                                           aspects):
  """Adds aspects to a BigQuery table entry in the Dataplex system Entry Group (@bigquery)"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups.entries/patch
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/" + \
          f"@bigquery/entries/bigquery.googleapis.com/projects/{bigqueryProjectId}/datasets/{bigqueryDataset}/tables/{bigqueryTable}?update_mask=aspects"

  # IMPORTANT NOTE:
  # If you uncomment the entryType below, you will replace the System Entry Type of "BigQuery Table", which is not a recommended pattern.
  # Instead, add the aspects as "optional" to avoid replacing the default System Entry Type.
  data = {
      # "entryType": f"projects/{project_id}/locations/{entryTypeLocation}/entryTypes/{entryTypeId}",
      "aspects": aspects
  }

  json_result = restAPIHelper(url, "PATCH", data)
  print(f"updateDataplexSystemEntry_BigQueryTable (PATCH) json_result: {json_result}")
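
The `aspects` argument is a dictionary keyed by `{project}.{location}.{aspectTypeId}`, which is the format the example cells in this notebook use. A minimal sketch of building that payload follows; `build_aspects_payload` is a hypothetical helper.

```python
def build_aspects_payload(project_id, aspect_location, aspect_type_id, data):
    """Builds the aspects dictionary for a single aspect type."""
    aspect_key = f"{project_id}.{aspect_location}.{aspect_type_id}"
    return {aspect_key: {"data": data}}
```

The keys of `data` need to match the field names declared in the aspect type's metadata template.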

#### updateDataplexMetatdata_BigQueryTable
- Updates the overview and roles Dataplex metadata on a BigQuery table

##### Sample Terraform

```
Terraform currently not available
```

##### REST API Code

In [None]:
def updateDataplexMetatdata_BigQueryTable(project_id,
                                              entryGroupLocation,
                                              bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                              entryTypeId, entryTypeLocation,
                                              overviewText, roleList):
  """Updates the Overview text and the Roles (replaces them)"""

  # https://cloud.google.com/dataplex/docs/reference/rest/v1/projects.locations.entryGroups.entries/patch
  url = f"https://dataplex.googleapis.com/v1/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/" + \
        f"@bigquery/entries/bigquery.googleapis.com/projects/{bigqueryProjectId}/datasets/{bigqueryDataset}/tables/{bigqueryTable}?update_mask=aspects"

  data = {
      "aspects": {
          "dataplex-types.global.overview": {
              "data": {
                  "content": overviewText
                  }
              },
          "dataplex-types.global.contacts": {
              "data": {
                  "identities": []
                  }
              }
          }
      }

  for item in roleList:
    data["aspects"]["dataplex-types.global.contacts"]["data"]["identities"].append( {"role": item["role"], "name": item["name"] } )

  json_result = restAPIHelper(url, "PATCH", data)
  print(f"updateDataplexMetatdata_BigQueryTable (PATCH) json_result: {json_result}")
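
The contacts aspect is built by copying each role/name pair into an `identities` list. The same structure can be produced by a small pure function, which makes it easy to inspect the payload before sending it. `build_contacts_aspect` is a hypothetical helper that mirrors the loop above.

```python
def build_contacts_aspect(role_list):
    """Builds the contacts aspect from a list of {"role": ..., "name": ...} dictionaries."""
    identities = [{"role": item["role"], "name": item["name"]} for item in role_list]
    return {"dataplex-types.global.contacts": {"data": {"identities": identities}}}
```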

### <font color='#4285f4'>Example: Create Data Governance Structure</font>

##### Entry Group

In [None]:
myEntryGroupId = "my-entry-group"
myEntryGroupLocation = "global"
myEntryGroupName = "My Entry Group" + f" ({random_extension})"
myEntryGroupDescription = "A test entry group"

existsEntryGroup(project_id, myEntryGroupId, myEntryGroupLocation)

createEntryGroup(project_id, myEntryGroupId, myEntryGroupLocation, myEntryGroupName, myEntryGroupDescription)

print(f"To view Entry Groups: https://console.cloud.google.com/dataplex/catalog/entry-groups?project={project_id}")

##### Aspect Type

In [None]:
myAspectTypeId = "my-aspect-type"
myAspectTypeName = "My Aspect Type" + f" ({random_extension})"
myAspectTypeDescription = "Test aspect type"
myAspectTypeLocation = "global"
myAspectTypeTemplateName = "my-metadataTemplate-template"

existsAspectType(project_id, myAspectTypeId, myAspectTypeLocation)

metadataTemplate = {
  "name": myAspectTypeTemplateName,
  "type": "record",
  "recordFields": [
    {
      "name": "type",
      "type": "enum",
      "annotations": {
        "displayName": "Type",
        "description": "Specifies the type of view represented by the entry."
      },
      "index": 1,
      "constraints": {
        "required": True
      },
      "enumValues": [
        {
          "name": "FILE",
          "index": 1
        }
      ]
    }
  ]
}
createAspectType(project_id, myAspectTypeId, myAspectTypeLocation, myAspectTypeName, myAspectTypeDescription, metadataTemplate)
print(f"To view Aspect Types: https://console.cloud.google.com/dataplex/catalog/aspect-types?project={project_id}")

##### Entry Type

In [None]:
myEntryGroupId = "my-entry-group"
myEntryGroupLocation = "global"

myEntryTypeId = "my-entry-type"
myEntryTypeName = "My Entry Type" + f" ({random_extension})"
myEntryTypeDescription = "Test entry type"
myEntryTypeLocation = "global"
myEntryTypeTypeAliases = ["LISTING"]
myEntryTypePlatform = "GCS"
myEntryTypeSystem = "Custom"

myAspectId = "my-aspect-type"
myAspectLocation = "global"

existsEntryType(project_id, myEntryTypeId, myEntryTypeLocation)

requiredAspects = [
    {
        "type": f"projects/{project_id}/locations/{myAspectLocation}/aspectTypes/{myAspectId}"
    }
]
createEntryType(project_id, myEntryTypeId, myEntryTypeLocation, myEntryTypeName, myEntryTypeDescription, myEntryTypeTypeAliases, myEntryTypePlatform, myEntryTypeSystem, requiredAspects)
print(f"To view Entry Types: https://console.cloud.google.com/dataplex/catalog/entry-types?project={project_id}")

##### Entry (Custom Entry)

In [None]:
myEntryGroupId = "my-entry-group"
myEntryGroupLocation = "global"

myEntryTypeId = "my-entry-type"
myEntryTypeLocation = "global"

myAspectId = "my-aspect-type"
myAspectLocation = "global"

myEntryId = "my-entry"
myEntryName = "My Entry" + f" ({random_extension})"
myEntryDescription = "Test entry (custom)"
myEntrySystem = "Custom" # This will show as a custom object

existsEntry(project_id, myEntryGroupId, myEntryGroupLocation, myEntryId)

aspects = {
    f"{project_id}.{myAspectLocation}.{myAspectId}": {
        "data": {"type": "FILE"}
        }
    }
createEntry(project_id,
            myEntryGroupId, myEntryGroupLocation,
            myEntryTypeId, myEntryTypeLocation,
            myEntryId, myEntryName, myEntryDescription, myEntrySystem, aspects)
print(f"To view Entry: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{myEntryGroupLocation}/entryGroups/{myEntryGroupId}/entries/{myEntryId}?&project={project_id}")

##### Entry - Update Dataplex Metadata on BigQuery Table (System Entry)

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_raw_dataset}"
bigqueryTable = "customer"

myEntryTypeId = "my-entry-type"
myEntryTypeLocation = "global"

myAspectId = "my-aspect-type"
myAspectLocation = "global"

aspects = {
    f"{project_id}.{myAspectLocation}.{myAspectId}": {
        "data": {"type": "FILE"}
        }
    }

updateDataplexSystemEntry_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       myEntryTypeId, myEntryTypeLocation,
                                       aspects)

overviewText = "This is a test description for a BigQuery table."
roleList = [
    {
      "role" : "Project Manager",
      "name" : "Bugs Bunny"
    },
    {
      "role" : "Owner",
      "name" : "Google"
    }
    ]
updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       myEntryTypeId, myEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")
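
The console link above contains the BigQuery entry id with each `/` written as `%2F`. Rather than typing the encoded form by hand, the id can be encoded with `urllib.parse.quote`. This is a sketch only: `build_bigquery_entry_console_url` is a hypothetical helper, and the URL layout is copied from the `print` statement above.

```python
from urllib.parse import quote


def build_bigquery_entry_console_url(project_id, entry_group_location, bq_project, bq_dataset, bq_table):
    """Builds the Dataplex console URL for a BigQuery table entry."""
    entry_id = f"bigquery.googleapis.com/projects/{bq_project}/datasets/{bq_dataset}/tables/{bq_table}"
    # safe="" makes quote() encode "/" as %2F as well
    encoded_entry_id = quote(entry_id, safe="")
    return (
        "https://console.cloud.google.com/dataplex/dp-entries/"
        f"projects/{project_id}/locations/{entry_group_location}/entryGroups/@bigquery/entries/"
        f"{encoded_entry_id}?&project={project_id}"
    )
```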


### <font color='#4285f4'>BigQuery Data Governance Use Case: Create Data Governance Structure</font>

##### **Create Aspect Types**
- We will decorate our BigQuery tables with Aspect Types. Because BigQuery tables are system Entries, they already belong to an Entry Group.

###### Aspect Type: Data Domain

In [None]:
dataDomainAspectId = "data-domain-aspect-type"
dataDomainAspectTypeName = "Data Domain" + f" ({random_extension})"
dataDomainAspectTypeDescription = "This classification shows how much the data has been processed and prepared for use. Think of it as the 'maturity' level of the data. Raw data is just as it was initially collected. Enriched data has been cleaned and may have additional information added to it. Curated data is the most refined and is ready for business analysis and reporting."
dataDomainAspectLocation = "global"
dataDomainAspectTemplateName = f"{dataDomainAspectId}-metadataTemplate"

metadataTemplate = {
  "name": dataDomainAspectTemplateName,
  "type": "record",
  "recordFields": [
    {
      "name": "zone",
      "type": "enum",
      "annotations": {
        "displayName": "Zone",
        "description": "Indicates the level of processing the data has undergone (Raw, Enriched, or Curated)."
      },
      "index": 1,
      "constraints": {
        "required": True
      },
      "enumValues": [
        {
          "name": "Raw",
          "index": 1
        },
        {
          "name": "Enriched",
          "index": 2
        },
        {
          "name": "Curated",
          "index": 3
        },
      ]
    }
  ]
}
createAspectType(project_id, dataDomainAspectId, dataDomainAspectLocation, dataDomainAspectTypeName, dataDomainAspectTypeDescription, metadataTemplate)
print(f"To view Aspect Types: https://console.cloud.google.com/dataplex/catalog/aspect-types?project={project_id}")
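
Enum fields in a metadata template repeat the same structure, so a small builder can reduce the boilerplate. The sketch below is one possible shape: `build_enum_field` is a hypothetical helper, and it numbers the enum values 1, 2, 3, ... in the order given, which matches the template above.

```python
def build_enum_field(name, display_name, description, index, value_names, required=True):
    """Builds one enum entry for the recordFields list of a metadata template."""
    return {
        "name": name,
        "type": "enum",
        "annotations": {"displayName": display_name, "description": description},
        "index": index,
        "constraints": {"required": required},
        "enumValues": [
            {"name": value_name, "index": position}
            for position, value_name in enumerate(value_names, start=1)
        ],
    }
```

For instance, `build_enum_field("zone", "Zone", "...", 1, ["Raw", "Enriched", "Curated"])` yields the same field structure as the `zone` entry above.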

###### Aspect Type: Data Retention

In [None]:
dataRetentionAspectId = "data-retention-aspect-type"
dataRetentionAspectTypeName = "Data Retention" + f" ({random_extension})"
dataRetentionAspectTypeDescription = "This aspect type defines how long a data asset should be retained, along with the relevant policies."
dataRetentionAspectLocation = "global"
dataRetentionAspectTemplateName = f"{dataRetentionAspectId}-metadataTemplate"

metadataTemplate = {
    "name": dataRetentionAspectTemplateName,
    "type": "record",
    "recordFields": [
        {
            "name": "retention-days",
            "type": "int",
            "annotations": {
                "displayName": "Retention Days",
                "description": "The number of days this data asset should be retained before deletion or archiving."
            },
            "index": 1,
            "constraints": {
                "required": True
            }
        },
        {
           "name": "retention-policy",
           "type": "string",
           "annotations": {
               "displayName": "Retention Policy",
               "description": "A URL or reference to the policy that governs the data retention for this asset."
           },
            "index": 2,
            "constraints": {
                "required": False
            }
        }
    ]
}

createAspectType(project_id, dataRetentionAspectId, dataRetentionAspectLocation, dataRetentionAspectTypeName, dataRetentionAspectTypeDescription, metadataTemplate)
print(f"To view Aspect Types: https://console.cloud.google.com/dataplex/catalog/aspect-types?project={project_id}")
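
As templates grow, a quick local check before calling the API can catch copy-paste slips. The rules in this sketch are assumptions chosen for illustration (unique field names, unique field indexes, and at least one value per enum field); they are not a statement of what the Dataplex API enforces. `find_template_problems` is a hypothetical helper.

```python
def find_template_problems(template):
    """Returns a list of human-readable problems found in a metadata template dict."""
    problems = []
    fields = template.get("recordFields", [])

    names = [field["name"] for field in fields]
    if len(names) != len(set(names)):
        problems.append("duplicate field names")

    indexes = [field["index"] for field in fields]
    if len(indexes) != len(set(indexes)):
        problems.append("duplicate field indexes")

    for field in fields:
        if field["type"] == "enum" and not field.get("enumValues"):
            problems.append(f"enum field '{field['name']}' has no enumValues")

    return problems
```

An empty list means none of the assumed rules were violated.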

###### Aspect Type: Data Governance

In [None]:
dataGovernanceAspectId = "data-governance-aspect-type"
dataGovernanceAspectTypeName = "Data Governance" + f" ({random_extension})"
dataGovernanceAspectTypeDescription = "This aspect type defines if a table contains Personally Identifiable Information (PII) and provides table level governance information."
dataGovernanceAspectLocation = "global"
dataGovernanceAspectTemplateName = f"{dataGovernanceAspectId}-metadataTemplate"


metadataTemplate = {
    "name": dataGovernanceAspectTemplateName,
    "type": "record",
    "recordFields": [
         {
            "name": "data-steward",
            "type": "string",
            "annotations": {
                "displayName": "Data Steward",
                "description": "The name or ID of the data steward responsible for this table."
            },
            "index": 1,
             "constraints": {
                "required": True
            }
        },
        {
            "name": "owner-group",
            "type": "string",
            "annotations": {
                "displayName": "Owner Group",
                "description": "The IAM group or team responsible for this table."
            },
            "index": 2,
              "constraints": {
                "required": True
            }
        },
        {
            "name": "business-owner",
            "type": "string",
            "annotations": {
                "displayName": "Business Owner",
                "description": "Name of the owner or contact for the data asset"
            },
            "index": 3,
              "constraints": {
                "required": True
            }
        },
        {
            "name": "documentation-url",
            "type": "string",
            "annotations": {
                "displayName": "Documentation URL",
                 "description": "URL to documentation about the table, including access, usage, etc."
            },
            "index": 4,
            "constraints": {
                "required": False
            }
        },
       {
            "name": "data-lifecycle",
            "type": "enum",
             "annotations": {
                "displayName": "Data Lifecycle",
                "description": "The lifecycle stage of the asset (Dev, Test, QA, Production, Deprecated)"
            },
            "index": 5,
            "constraints": {
                "required": True
            },
            "enumValues": [
                {
                  "name": "Dev",
                  "index": 1
                },
                {
                  "name": "Test",
                  "index": 2
                },
               {
                  "name": "QA",
                  "index": 3
                },
                {
                  "name": "Production",
                   "index": 4
                },
                {
                  "name": "Deprecated",
                  "index": 5
                },
            ]
        },
        {
            "name": "classification-level",
            "type": "enum",
            "annotations": {
                "displayName": "Classification Level",
                "description": "Indicates the sensitivity and access restrictions for this data asset (Public, Internal, Confidential, Restricted)."
            },
            "index": 6,
            "constraints": {
                "required": True
            },
            "enumValues": [
                {
                  "name": "Public",
                  "index": 1
                },
                {
                  "name": "Internal",
                  "index": 2
                },
                {
                  "name": "Confidential",
                  "index": 3
                },
                {
                  "name": "Restricted",
                  "index": 4
                },
            ]
        },
        {
            "name": "data-sensitivity-level",
            "type": "enum",
            "annotations": {
                "displayName": "Data Sensitivity Level",
                "description": "The general sensitivity classification of the table. (Low, Medium, High, Critical)"
            },
            "index": 7,
            "constraints": {
                "required": True
            },
            "enumValues": [
                {
                  "name": "Low",
                  "index": 1
                },
                {
                  "name": "Medium",
                  "index": 2
                },
                {
                  "name": "High",
                  "index": 3
                },
                {
                  "name": "Critical",
                  "index": 4
                },
            ]
        },
        {
            "name": "contains-pii",
            "type": "bool",
            "annotations": {
                "displayName": "Contains PII",
                "description": "Indicates if this table contains any Personally Identifiable Information (PII)."
            },
            "index": 8,
            "constraints": {
                "required": True
            }
        }
    ]
}

createAspectType(project_id, dataGovernanceAspectId, dataGovernanceAspectLocation, dataGovernanceAspectTypeName, dataGovernanceAspectTypeDescription, metadataTemplate)
print(f"To view Aspect Types: https://console.cloud.google.com/dataplex/catalog/aspect-types?project={project_id}")

###### Aspect Type: Data Sensitivity

In [None]:
dataSensitivityAspectId = "data-sensitivity-aspect-type"
dataSensitivityAspectTypeName = "Column Data Sensitivity" + f" ({random_extension})"
dataSensitivityAspectTypeDescription = "This aspect type defines if a column contains Personally Identifiable Information (PII) and provides column level governance information."
dataSensitivityAspectLocation = "global"
dataSensitivityAspectTemplateName = f"{dataSensitivityAspectId}-metadataTemplate"

metadataTemplate = {
    "name": dataSensitivityAspectTemplateName,
    "type": "record",
    "recordFields": [
        {
            "name": "contains-pii",
            "type": "bool",
            "annotations": {
                "displayName": "Contains PII",
                "description": "Indicates if this column contains any Personally Identifiable Information (PII)."
            },
            "index": 1,
            "constraints": {
                "required": True
            }
        },
        {
            "name": "pii-type",
            "type": "string",
            "annotations": {
                "displayName": "PII Type",
                "description": "The type of PII contained within this column (e.g., Name, Email, Phone Number, etc.)."
            },
            "index": 2,
            "constraints": {
                "required": False
            }
        },
        {
            "name": "data-sensitivity-level",
            "type": "enum",
            "annotations": {
                "displayName": "Data Sensitivity Level",
                "description": "The sensitivity level of the data for data masking or other protection needs. (Low, Medium, High, Critical)"
            },
            "index": 3,
            "constraints": {
                "required": True
            },
            "enumValues": [
                {
                  "name": "Low",
                  "index": 1
                },
                {
                  "name": "Medium",
                  "index": 2
                },
                {
                  "name": "High",
                  "index": 3
                },
                {
                  "name": "Critical",
                  "index": 4
                },
            ]
        },
        {
            "name": "compliance-requirements",
            "type": "array",
            "arrayItems": {
                "name": "compliance-requirements-metadata-template",
                "type": "string"
            },
            "annotations": {
                "displayName": "Compliance Requirements",
                "description": "List of regulations that are relevant to this column (e.g., GDPR, CCPA, HIPAA)."
            },
            "index": 4,
            "constraints": {
                "required": False
            }
        }
    ]
}

createAspectType(project_id, dataSensitivityAspectId, dataSensitivityAspectLocation, dataSensitivityAspectTypeName, dataSensitivityAspectTypeDescription, metadataTemplate)
print(f"To view Aspect Types: https://console.cloud.google.com/dataplex/catalog/aspect-types?project={project_id}")

##### **Create Entry Types**
- Create the Entry Types, the templates that bundle the Aspect Types created above.
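
Each Entry Type in the cells below lists its required Aspect Types by full resource name, following the pattern `projects/{project_id}/locations/{location}/aspectTypes/{aspect_type_id}`. The sketch below shows one way that list could be built. The helper name `build_required_aspects`, the project `my-project`, and the ID `example-aspect-type` are hypothetical and are not part of this notebook.

```python
def build_required_aspects(project_id, location, aspect_type_ids):
    """Return one {"type": <aspect type resource name>} dict per Aspect Type ID."""
    return [
        {"type": f"projects/{project_id}/locations/{location}/aspectTypes/{aspect_type_id}"}
        for aspect_type_id in aspect_type_ids
    ]


# Hypothetical values, for illustration only
example_required_aspects = build_required_aspects(
    "my-project",
    "global",
    ["data-sensitivity-aspect-type", "example-aspect-type"],
)

for required_aspect in example_required_aspects:
    print(required_aspect["type"])
```

The returned list has the same shape as `governedTableRequiredAspects` and `governedColumnRequiredAspects` in the cells that follow.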

###### Entry Type: Governed Table

In [None]:
# Governed Table Entry Type
governedTableEntryTypeId = "governed-table"
governedTableEntryTypeName = "Governed Table" + f" ({random_extension})"
governedTableEntryTypeLocation = "global"
governedTableDescription = "A table, either physical or logical, that is actively managed under data governance principles."
governedTableLocation = "global"

governedTableRequiredAspects = [
    {
        "type": f"projects/{project_id}/locations/{governedTableLocation}/aspectTypes/{dataDomainAspectId}"
    },
    {
        "type": f"projects/{project_id}/locations/{governedTableLocation}/aspectTypes/{dataRetentionAspectId}"
    },
    {
        "type": f"projects/{project_id}/locations/{governedTableLocation}/aspectTypes/{dataGovernanceAspectId}"
    }
]

createEntryType(project_id, governedTableEntryTypeId, governedTableEntryTypeLocation, governedTableEntryTypeName, governedTableDescription, ["TABLE"], "GCS", "BigQuery", governedTableRequiredAspects)
print(f"To view Entry Types: https://console.cloud.google.com/dataplex/catalog/entry-types?project={project_id}")


###### Entry Type: Governed Column

In [None]:
# Governed Column Entry Type
governedColumnEntryTypeId = "governed-column"
governedColumnEntryTypeName = "Governed Column" + f" ({random_extension})"
governedColumnEntryTypeLocation = "global"
governedColumnDescription = "A column, either physical or logical, that is actively managed under data governance principles."
governedColumnLocation = "global"

governedColumnRequiredAspects = [
    {
        "type": f"projects/{project_id}/locations/{governedColumnLocation}/aspectTypes/{dataSensitivityAspectId}"
    }
]

# The allowed type aliases are: [BUCKET, CLUSTER, CODE_ASSET, CONNECTION, DASHBOARD, DASHBOARD_ELEMENT, DATABASE, DATABASE_SCHEMA,
#                                DATASET, DATA_EXCHANGE, DATA_SOURCE_CONNECTION, DATA_STREAM, EXPLORE, FEATURE_GROUP,
#                                FEATURE_ONLINE_STORE, FEATURE_VIEW, FILESET, FOLDER, FUNCTION, GLOSSARY, GLOSSARY_CATEGORY,
#                                GLOSSARY_TERM, LISTING, LOOK, MODEL, POLICY, REPOSITORY, RESOURCE, ROUTINE, SERVICE, TABLE,
#                                TOPIC, VIEW]

createEntryType(project_id, governedColumnEntryTypeId, governedColumnEntryTypeLocation, governedColumnEntryTypeName, governedColumnDescription, ["DATABASE_SCHEMA"], "GCS", "BigQuery", governedColumnRequiredAspects)
print(f"To view Entry Types: https://console.cloud.google.com/dataplex/catalog/entry-types?project={project_id}")

##### **Assign Aspect Types to BigQuery Tables (and columns)**
- Attach the Aspect Types to each BigQuery table (and its columns) and set their values.
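
The cells below key each table-level aspect as `{project_id}.{location}.{aspect_type_id}` and each column-level aspect with an added `@Schema.{column_name}` suffix. Because every table cell repeats the same two loops, the sketch below shows how the column-level entries could be built by a single helper. The helper name `build_column_aspects` and the project `my-project` are hypothetical; the field values mirror the ones used in this notebook.

```python
def build_column_aspects(project_id, location, aspect_type_id, pii_columns, non_pii_columns):
    """Build column-level aspects keyed by '<project>.<location>.<aspectTypeId>@Schema.<column>'."""
    key_prefix = f"{project_id}.{location}.{aspect_type_id}@Schema."
    column_aspects = {}

    # Columns holding PII: mark them High sensitivity and record the PII type
    for column_name, pii_info in pii_columns.items():
        column_aspects[key_prefix + column_name] = {
            "data": {
                "contains-pii": True,
                "pii-type": pii_info["pii_type"],
                "data-sensitivity-level": "High",
                "compliance-requirements": ["GDPR", "CCPA"],
            }
        }

    # Remaining columns: only the required fields are set
    for column_name in non_pii_columns:
        column_aspects[key_prefix + column_name] = {
            "data": {
                "contains-pii": False,
                "data-sensitivity-level": "Low",
            }
        }

    return column_aspects


# Hypothetical values, for illustration only
example_column_aspects = build_column_aspects(
    "my-project",
    "global",
    "data-sensitivity-aspect-type",
    {"email": {"pii_type": "Email"}},
    ["customer_id"],
)

for aspect_key in example_column_aspects:
    print(aspect_key)
```

In a table cell, the result could be merged into the table-level dictionary with `aspects.update(...)` before calling `updateDataplexSystemEntry_BigQueryTable`.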

###### Raw Zone: Customer Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_raw_dataset}"
bigqueryTable = "customer"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Raw"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/customer-table-documentation",
            "data-lifecycle": "Dev",
            "classification-level": "Restricted",
            "data-sensitivity-level": "High",
            "contains-pii": True
        }}
}


# PII Columns
pii_columns = {
    "ssn": {"pii_type": "SSN"},
    "first_name": {"pii_type": "Name"},
    "last_name": {"pii_type": "Name"},
    "email": {"pii_type": "Email"},
    "phone": {"pii_type": "Phone Number"},
    "ip_address": {"pii_type": "IP Address"},
    "address": {"pii_type": "Street Address"}
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "gender", "city", "state", "zip"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is customer master data and contains PII."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Raw Zone: Customer Transaction Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_raw_dataset}"
bigqueryTable = "customer_transaction"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Raw"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/customer-transaction-table-documentation",
            "data-lifecycle": "Dev",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["transaction_id", "customer_id", "order_date", "order_time", "transaction_type", "region", "quantity", "product", "product_category", "price"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the customer transaction table and contains order details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Raw Zone: Product Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_raw_dataset}"
bigqueryTable = "product"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Raw"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/product-table-documentation",
            "data-lifecycle": "Dev",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["description", "product"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product table and contains product details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Raw Zone: Product Category Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_raw_dataset}"
bigqueryTable = "product_category"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Raw"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "John Smith",
            "owner-group": "data-governance-team",
            "business-owner": "John Smith",
            "documentation-url": "http://yourcompany.com/product-category-table-documentation",
            "data-lifecycle": "Dev",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["product_category", "description"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product category table and contains product category details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "John Smith"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Enriched Zone: Customer Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_enriched_dataset}"
bigqueryTable = "customer"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Enriched"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/customer-table-documentation",
            "data-lifecycle": "QA",
            "classification-level": "Restricted",
            "data-sensitivity-level": "High",
            "contains-pii": True
        }}
}


# PII Columns
pii_columns = {
    "ssn": {"pii_type": "SSN"},
    "first_name": {"pii_type": "Name"},
    "last_name": {"pii_type": "Name"},
    "email": {"pii_type": "Email"},
    "phone": {"pii_type": "Phone Number"},
    "ip_address": {"pii_type": "IP Address"},
    "address": {"pii_type": "Street Address"},
    "credit_card_number": {"pii_type": "Credit Card Number"}
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "gender", "city", "state", "zip"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is customer master data and contains PII."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Enriched Zone: Order Detail Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_enriched_dataset}"
bigqueryTable = "order_detail"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Enriched"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "John Smith",
            "owner-group": "data-governance-team",
            "business-owner": "John Smith",
            "documentation-url": "http://yourcompany.com/order-detail-table-documentation",
            "data-lifecycle": "QA",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["order_id", "product_id", "quantity", "price"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the order detail table and contains individual order item details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "John Smith"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Enriched Zone: Order Header Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_enriched_dataset}"
bigqueryTable = "order_header"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Enriched"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/order-header-table-documentation",
            "data-lifecycle": "QA",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "order_id", "region", "order_datetime"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the order header table and contains the overall order details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Enriched Zone: Product Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_enriched_dataset}"
bigqueryTable = "product"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Enriched"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "John Smith",
            "owner-group": "data-governance-team",
            "business-owner": "John Smith",
            "documentation-url": "http://yourcompany.com/product-table-documentation",
            "data-lifecycle": "QA",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["product_id", "product_name", "product_description", "product_category_id"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product table and contains enriched product details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "John Smith"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")
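
The two column loops in the cell above follow the same pattern for every table. As a minimal sketch, they could be folded into one helper. `build_column_aspects` and its parameter names are hypothetical (they are not defined elsewhere in this notebook), and the key format simply mirrors the f-strings used above.

```python
def build_column_aspects(project_id, location, aspect_id, pii_columns, non_pii_columns):
    """Build column-level aspect entries.

    Keys follow the '{project}.{location}.{aspect}@Schema.{column}' format
    used in the cells above. This helper is a hypothetical illustration.
    """
    aspects = {}

    # Columns flagged as PII carry the PII type and compliance requirements.
    for column_name, pii_info in pii_columns.items():
        aspects[f"{project_id}.{location}.{aspect_id}@Schema.{column_name}"] = {
            "data": {
                "contains-pii": True,
                "pii-type": pii_info["pii_type"],
                "data-sensitivity-level": "High",
                "compliance-requirements": ["GDPR", "CCPA"],
            }
        }

    # All remaining columns are marked as low sensitivity.
    for column_name in non_pii_columns:
        aspects[f"{project_id}.{location}.{aspect_id}@Schema.{column_name}"] = {
            "data": {
                "contains-pii": False,
                "data-sensitivity-level": "Low",
            }
        }

    return aspects


# Example with placeholder values (the project and aspect IDs are made up).
column_aspects = build_column_aspects(
    "my-project", "global", "data-sensitivity-aspect",
    {"email": {"pii_type": "Email"}},
    ["product_id"],
)
print(sorted(column_aspects))
```

In a table cell, the result could be merged into the table-level dictionary with `aspects.update(...)` before calling `updateDataplexSystemEntry_BigQueryTable`.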

###### Enriched Zone: Product Category Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_enriched_dataset}"
bigqueryTable = "product_category"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Enriched"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/product-category-table-documentation",
            "data-lifecycle": "QA",
            "classification-level": "Restricted",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["product_category_id", "product_category_name", "product_category_description"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product category table and contains enriched product category details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Customer Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "customer"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/customer-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Restricted",
            "data-sensitivity-level": "High",
            "contains-pii": True
        }}
}


# PII Columns
pii_columns = {
    "ssn": {"pii_type": "ssn"},
    "first_name": {"pii_type": "Name"},
    "last_name": {"pii_type": "Name"},
    "email": {"pii_type": "Email"},
    "phone": {"pii_type": "Phone Number"},
    "ip_address": {"pii_type": "IP Address"},
    "address": {"pii_type": "Street Address"},
    "credit_card_number": {"pii_type": "Credit Card Number"}
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "gender", "city", "state", "zip"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is customer master data and contains PII."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")
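
The customer cell above sets the table-level `contains-pii` flag by hand and lists the PII and non-PII columns separately, so the three can drift apart when a cell is copied and edited. The following is a small sanity check that could be run before patching the entry. The function name `check_pii_consistency` and its rules are assumptions made for illustration, not part of the notebook.

```python
def check_pii_consistency(table_contains_pii, pii_columns, non_pii_columns):
    """Return a list of human-readable problems; an empty list means consistent.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    problems = []

    # A column should not appear in both lists.
    overlap = sorted(set(pii_columns) & set(non_pii_columns))
    if overlap:
        problems.append(f"Columns listed as both PII and non-PII: {overlap}")

    # The table-level flag should agree with the column-level lists.
    if pii_columns and not table_contains_pii:
        problems.append("Table-level contains-pii is False but PII columns are listed")
    if table_contains_pii and not pii_columns:
        problems.append("Table-level contains-pii is True but no PII columns are listed")

    return problems


# Example mirroring the shape of the customer table settings.
example_problems = check_pii_consistency(
    True,
    {"ssn": {"pii_type": "ssn"}, "email": {"pii_type": "Email"}},
    ["customer_id", "gender"],
)
print(example_problems)
```

An empty list indicates that the flags and column lists agree; any returned messages can be reviewed before calling `updateDataplexSystemEntry_BigQueryTable`.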

###### Curated Zone: Customer Training Data Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "customer_training_data"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "ml-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/customer-training-data-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Internal",
            "data-sensitivity-level": "Medium",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "total_spent"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This table is for training ML models and contains customer spending information. This data is private."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "ML Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Order Detail Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "order_detail"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "John Smith",
            "owner-group": "data-governance-team",
            "business-owner": "John Smith",
            "documentation-url": "http://yourcompany.com/order-detail-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Internal",
            "data-sensitivity-level": "Medium",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["order_id", "product_id", "quantity", "price"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the order detail table and contains individual order item details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "John Smith"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Order Header Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "order_header"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/order-header-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Internal",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["customer_id", "order_id", "region", "order_datetime"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the order header table and contains the overall order details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Product Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "product"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "John Smith",
            "owner-group": "data-governance-team",
            "business-owner": "John Smith",
            "documentation-url": "http://yourcompany.com/product-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Public",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["product_id", "product_name", "product_description", "product_category_id"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product table and contains product details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "John Smith"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Product Category Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "product_category"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 365,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Jane Doe",
            "owner-group": "data-governance-team",
            "business-owner": "Jane Doe",
            "documentation-url": "http://yourcompany.com/product-category-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Public",
            "data-sensitivity-level": "Low",
            "contains-pii": False
        }}
}


# PII Columns
pii_columns = {
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = ["product_category_id", "product_category_name", "product_category_description"]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the product category table and contains product category details."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")

###### Curated Zone: Sales Table

In [None]:
entryGroupLocation = "us"  # This has to be "us" since our tables are US multi-region

bigqueryProjectId = project_id
bigqueryDataset = "${bigquery_governed_data_curated_dataset}"
bigqueryTable = "sales"

aspects = {
    f"{project_id}.{governedTableEntryTypeLocation}.{dataDomainAspectId}": {
        "data": {
            "zone": "Curated"
            }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataRetentionAspectId}": {
        "data": {
            "retention-days": 180,
            "retention-policy": "http://yourcompany.com/retention-policy"
        }},
    f"{project_id}.{governedTableEntryTypeLocation}.{dataGovernanceAspectId}": {
        "data": {
            "data-steward": "Sales Team",
            "owner-group": "data-governance-team",
            "business-owner": "Sales Team",
            "documentation-url": "http://yourcompany.com/sales-table-documentation",
            "data-lifecycle": "Production",
            "classification-level": "Restricted",
            "data-sensitivity-level": "High",
            "contains-pii": True
        }}
}

# PII Columns
pii_columns = {
    "ssn": {"pii_type": "ssn"},
    "first_name": {"pii_type": "Name"},
    "last_name": {"pii_type": "Name"},
    "email": {"pii_type": "Email"},
    "phone": {"pii_type": "Phone Number"},
    "ip_address": {"pii_type": "IP Address"},
    "address": {"pii_type": "Street Address"},
    "credit_card_number": {"pii_type": "Credit Card Number"}
}

for column_name, pii_info in pii_columns.items():
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": True,
            "pii-type": pii_info["pii_type"],
            "data-sensitivity-level": "High",
            "compliance-requirements": ["GDPR", "CCPA"]
        }}

non_pii_columns = [
    "product_name",
    "product_description",
    "product_category_name",
    "product_category_description",
    "region",
    "order_datetime",
    "price",
    "quantity",
    "customer_id",
    "gender",
    "city",
    "state",
    "zip"
]

for column_name in non_pii_columns:
    aspects[f"{project_id}.{governedColumnEntryTypeLocation}.{dataSensitivityAspectId}@Schema.{column_name}"] = {
        "data": {
            "contains-pii": False,
            "data-sensitivity-level": "Low"
        }}

updateDataplexSystemEntry_BigQueryTable(project_id,
                                        entryGroupLocation,
                                        bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                        governedTableEntryTypeId, governedTableEntryTypeLocation,
                                        aspects)

overviewText = "This is the sales table and contains denormalized order, product and customer details, including PII."
roleList = [
    {
      "role" : "Data Steward",
      "name" : "Jane Doe"
    },
    {
      "role" : "Owner",
      "name" : "Data Team"
    }
    ]

updateDataplexMetatdata_BigQueryTable(project_id,
                                       entryGroupLocation,
                                       bigqueryProjectId, bigqueryDataset, bigqueryTable,
                                       governedTableEntryTypeId, governedTableEntryTypeLocation,
                                       overviewText, roleList)

print(f"To view Table: https://console.cloud.google.com/dataplex/dp-entries/projects/{project_id}/locations/{entryGroupLocation}/entryGroups/@bigquery/entries/bigquery.googleapis.com%2Fprojects%2F{bigqueryProjectId}%2Fdatasets%2F{bigqueryDataset}%2Ftables%2F{bigqueryTable}?&project={project_id}")
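
Each cell above ends by printing a console link in which the BigQuery entry name is URL-encoded by hand (`%2F` in place of `/`). As a hedged sketch, the same link could be built with `urllib.parse.quote` so the encoding is not typed out manually. `dataplex_entry_console_url` is a hypothetical helper name; the URL layout is copied from the `print` statements above, except that the stray `&` after `?` is dropped.

```python
from urllib.parse import quote


def dataplex_entry_console_url(project_id, entry_group_location,
                               bq_project, bq_dataset, bq_table):
    """Build the console link for a BigQuery table entry (hypothetical helper)."""
    # Entry name as used by the @bigquery system entry group in the cells above.
    entry_id = (
        f"bigquery.googleapis.com/projects/{bq_project}"
        f"/datasets/{bq_dataset}/tables/{bq_table}"
    )
    # safe="" makes quote() percent-encode "/" as %2F.
    encoded_entry_id = quote(entry_id, safe="")
    return (
        "https://console.cloud.google.com/dataplex/dp-entries/"
        f"projects/{project_id}/locations/{entry_group_location}/"
        f"entryGroups/@bigquery/entries/{encoded_entry_id}"
        f"?project={project_id}"
    )


# Example with placeholder values (project and dataset names are made up).
example_url = dataplex_entry_console_url("my-project", "us", "my-project", "sales_ds", "sales")
print(f"To view Table: {example_url}")
```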

### <font color='#4285f4'>Clean Up</font>

In [None]:
# Placeholder
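
The clean-up cell is left as a placeholder. As a rough sketch only, the custom artifacts created earlier could be removed with Dataplex REST `DELETE` calls. The helper below only builds the list of resource URLs in a plausible order (entries, then the entry group, then entry types, then aspect types). The function name, the parameter names, and the ordering are assumptions; it sends no requests and does not touch the aspects attached to the BigQuery system entries.

```python
DATAPLEX_API = "https://dataplex.googleapis.com/v1"


def plan_cleanup(project_id, group_location, type_location,
                 entry_group_id, entry_ids, entry_type_ids, aspect_type_ids):
    """Return resource URLs to DELETE, in an assumed safe order.

    Hypothetical helper: all IDs are supplied by the caller; nothing is deleted here.
    """
    group_base = f"{DATAPLEX_API}/projects/{project_id}/locations/{group_location}"
    type_base = f"{DATAPLEX_API}/projects/{project_id}/locations/{type_location}"

    # 1. Custom entries inside the custom entry group.
    urls = [f"{group_base}/entryGroups/{entry_group_id}/entries/{entry_id}"
            for entry_id in entry_ids]
    # 2. The entry group itself, once it is empty.
    urls.append(f"{group_base}/entryGroups/{entry_group_id}")
    # 3. Entry types, which reference aspect types.
    urls += [f"{type_base}/entryTypes/{entry_type_id}" for entry_type_id in entry_type_ids]
    # 4. Aspect types last.
    urls += [f"{type_base}/aspectTypes/{aspect_type_id}" for aspect_type_id in aspect_type_ids]
    return urls


# Example with placeholder IDs (all values are made up).
cleanup_urls = plan_cleanup("my-project", "us", "global",
                            "my-entry-group", ["my-entry"],
                            ["my-entry-type"], ["my-aspect-type"])
for url in cleanup_urls:
    print("DELETE", url)
```

Reviewing the printed list before issuing any authenticated `DELETE` requests keeps the clean-up step explicit and reversible up to that point.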

### <font color='#4285f4'>Reference Links</font>


- [REPLACE-ME](https://REPLACE-ME)