{
"cells": [
{
"cell_type": "code",
"source": [
"# Copyright 2024 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"metadata": {
"id": "ZjNqTQeHDq1P"
},
"id": "ZjNqTQeHDq1P",
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Install Vertex AI SDK and other required packages"
],
"metadata": {
"id": "kxyOUgGuD1cB"
},
"id": "kxyOUgGuD1cB"
},
{
"cell_type": "code",
"source": [
"%pip install --upgrade --user --quiet google-cloud-aiplatform"
],
"metadata": {
"id": "fV6kVGI6D3Qx"
},
"id": "fV6kVGI6D3Qx",
"execution_count": 87,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Restart runtime\n",
"\n",
"To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
"\n",
"The restart might take a minute or longer. After it's restarted, continue to the next step."
],
"metadata": {
"id": "ohM3m2TBD4-A"
},
"id": "ohM3m2TBD4-A"
},
{
"cell_type": "code",
"source": [
"import IPython\n",
"\n",
"app = IPython.Application.instance()\n",
"app.kernel.do_shutdown(True)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "_VFkGMCxD7IS",
"outputId": "e64a809b-3c3a-4a6f-abc4-d48e408974db"
},
"id": "_VFkGMCxD7IS",
"execution_count": 88,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'status': 'ok', 'restart': True}"
]
},
"metadata": {},
"execution_count": 88
}
]
},
{
"cell_type": "markdown",
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"\n",
"If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
],
"metadata": {
"id": "7wBEATTEEII5"
},
"id": "7wBEATTEEII5"
},
{
"cell_type": "code",
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
],
"metadata": {
"id": "NHvtRAsQEJUE"
},
"id": "NHvtRAsQEJUE",
"execution_count": 1,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Set Google Cloud project information and initialize Vertex AI SDK\n",
"\n",
"To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
"\n",
"Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
],
"metadata": {
"id": "wcjLp7syENBX"
},
"id": "wcjLp7syENBX"
},
{
"cell_type": "code",
"source": [
"PROJECT_ID = \"dlai-test\" # @param {type:\"string\"}\n",
"LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
"\n",
"\n",
"import vertexai\n",
"\n",
"vertexai.init(project=PROJECT_ID, location=LOCATION)"
],
"metadata": {
"id": "1WDlDM8uENqR"
},
"id": "1WDlDM8uENqR",
"execution_count": 2,
"outputs": []
},
{
"cell_type": "markdown",
"id": "679851f7-54ed-437d-ae67-1d173913bb28",
"metadata": {
"id": "679851f7-54ed-437d-ae67-1d173913bb28"
},
"source": [
"# Lesson 5: Developing Use Cases with Videos\n",
"\n",
"In this lesson, you'll go through Gemini's Multimodality capabilities, by passing Videos and Texts as input."
]
},
{
"cell_type": "markdown",
"id": "a8196ba0-f9a0-419f-b1b9-6e4f11819386",
"metadata": {
"id": "a8196ba0-f9a0-419f-b1b9-6e4f11819386"
},
"source": [
"- Import the [Vertex AI](https://cloud.google.com/vertex-ai?hl=en) SDK."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "400f430e-c96f-48b4-ad6e-13b80f7be8e8",
"metadata": {
"id": "400f430e-c96f-48b4-ad6e-13b80f7be8e8"
},
"outputs": [],
"source": [
"import vertexai"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "54bfc337-3411-4787-b378-bc0b88a48cec",
"metadata": {
"id": "54bfc337-3411-4787-b378-bc0b88a48cec"
},
"outputs": [],
"source": [
"vertexai.init(project = PROJECT_ID,\n",
" location = LOCATION)"
]
},
{
"cell_type": "markdown",
"id": "7a14c80d-e9e2-4ba2-9362-388736225365",
"metadata": {
"id": "7a14c80d-e9e2-4ba2-9362-388736225365"
},
"source": [
"**Note:** In the latest version, `from vertexai.preview.generative_models` has been changed to `from vertexai.generative_models`.\n",
"\n",
"`from vertexai.preview.generative_models` can still be used."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "27015df2-de75-4a34-9756-ef5ffc58a647",
"metadata": {
"tags": [],
"id": "27015df2-de75-4a34-9756-ef5ffc58a647"
},
"outputs": [],
"source": [
"from vertexai.generative_models import GenerativeModel"
]
},
{
"cell_type": "markdown",
"id": "507beaa1-662b-4a78-90f0-6218c4c91d00",
"metadata": {
"id": "507beaa1-662b-4a78-90f0-6218c4c91d00"
},
"source": [
"- Load the `gemini-pro-vision` model.\n",
"- When specifying `gemini-pro-vision`, the [gemini-1.0-pro-vision](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-vision) model is used."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "d49b1dae-ecdd-4c31-bd23-770890273382",
"metadata": {
"tags": [],
"id": "d49b1dae-ecdd-4c31-bd23-770890273382"
},
"outputs": [],
"source": [
"multimodal_model = GenerativeModel(\"gemini-1.5-flash-001\")"
]
},
{
"cell_type": "markdown",
"id": "99cb7797-709f-450f-bc35-6a8df4eab2fe",
"metadata": {
"id": "99cb7797-709f-450f-bc35-6a8df4eab2fe"
},
"source": [
"## Digital Marketer"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4f0970ab-f175-47c1-95f0-aa815b0cbe7f",
"metadata": {
"tags": [],
"id": "4f0970ab-f175-47c1-95f0-aa815b0cbe7f"
},
"outputs": [],
"source": [
"file_path_1 = \"github-repo/img/gemini/multimodality_usecases_overview/vertex-ai-langchain.mp4\"\n",
"video_uri_1 = f\"gs://{file_path_1}\"\n",
"video_url_1 = f\"https://storage.googleapis.com/{file_path_1}\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d49ab4a5-66ee-492e-abc8-01c3de7b8bf0",
"metadata": {
"tags": [],
"id": "d49ab4a5-66ee-492e-abc8-01c3de7b8bf0"
},
"outputs": [],
"source": [
"import IPython"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "bbcb3eda-4c1b-4d9d-abcb-445de4001d0e",
"metadata": {
"tags": [],
"colab": {
"base_uri": "https://localhost:8080/",
"height": 274
},
"id": "bbcb3eda-4c1b-4d9d-abcb-445de4001d0e",
"outputId": "bd04fc63-7c03-4cf8-fc4a-ab26e04ab99e"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<IPython.core.display.Video object>"
],
"text/html": [
"<video src=\"https://storage.googleapis.com/github-repo/img/gemini/multimodality_usecases_overview/vertex-ai-langchain.mp4\" controls width=\"450\" >\n",
" Your browser does not support the <code>video</code> element.\n",
" </video>"
]
},
"metadata": {},
"execution_count": 9
}
],
"source": [
"IPython.display.Video(video_url_1, width=450)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "c3884d16-1ad1-4b51-9df4-af2d01ee5d48",
"metadata": {
"tags": [],
"id": "c3884d16-1ad1-4b51-9df4-af2d01ee5d48"
},
"outputs": [],
"source": [
"from vertexai.generative_models import (\n",
" GenerationConfig,\n",
" GenerativeModel,\n",
" Part,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "78f3ed51-3275-4ecd-979c-7e8b7a4576ad",
"metadata": {
"tags": [],
"id": "78f3ed51-3275-4ecd-979c-7e8b7a4576ad"
},
"outputs": [],
"source": [
"video_1 = Part.from_uri(video_uri_1, mime_type=\"video/mp4\")"
]
},
{
"cell_type": "markdown",
"id": "3a7fe8a7-9a36-4507-ba6d-eeb65eaf4de3",
"metadata": {
"id": "3a7fe8a7-9a36-4507-ba6d-eeb65eaf4de3"
},
"source": [
"- Structure your prompt(s).\n",
"- Be specific with what you want the model to do for you.\n",
"- You can even specify the output format of the response from the model.\n",
"- In this case, you are asking for the response to be in JSON format."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "945f492e-1abf-402f-b449-361620ee10b9",
"metadata": {
"tags": [],
"id": "945f492e-1abf-402f-b449-361620ee10b9"
},
"outputs": [],
"source": [
"role = \"\"\"\n",
"You are a great digital marketer working on a new video.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "6a759702-d13c-47cf-a858-61c60f89b95a",
"metadata": {
"tags": [],
"id": "6a759702-d13c-47cf-a858-61c60f89b95a"
},
"outputs": [],
"source": [
"tasks = \"\"\"\n",
"You will add the video to your website and to do this you\n",
"need to complete some tasks. Please make sure your answer\n",
"is structured.\n",
"\n",
"Tasks:\n",
"- What is the title of the video?\n",
"- Write a summary of what is in the video.\n",
"- Generate metadata for the video in JSON that includes:\\\n",
"Title, short description, language, and company.\n",
"\"\"\"\n",
"\n",
"# tasks = \"\"\"\n",
"# You will add the video to your website and to do this you\n",
"# need to complete some tasks. Please make sure your answer\n",
"# is structured.\n",
"\n",
"# Tasks:\n",
"# - What is the title of the video?\n",
"# - Write a summary of what is in the video.\n",
"# - Generate metadata for the video that includes:\\\n",
"# Title, short description, language, and company.\n",
"# \"\"\""
]
},
{
"cell_type": "markdown",
"id": "fb8bdbed-890c-435d-90ba-6328722dac39",
"metadata": {
"id": "fb8bdbed-890c-435d-90ba-6328722dac39"
},
"source": [
"- You can choose the number of variables you want for your prompt.\n",
"- More variables means you have more flexibility in making specific changes to your prompts while keeping everyhting else the same."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9791c1d6-39af-44f6-b633-212df6315138",
"metadata": {
"tags": [],
"id": "9791c1d6-39af-44f6-b633-212df6315138"
},
"outputs": [],
"source": [
"# format_json = \"Please output the metadata in JSON\""
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "1185aa10-ece7-473d-aaa5-243185e84d42",
"metadata": {
"tags": [],
"id": "1185aa10-ece7-473d-aaa5-243185e84d42"
},
"outputs": [],
"source": [
"contents_1 = [video_1, role, tasks]\n",
"\n",
"# contents_1 = [video_1, role, tasks, format_json]"
]
},
{
"cell_type": "markdown",
"id": "171592df-1777-4b32-99ac-ff0ac42eadda",
"metadata": {
"id": "171592df-1777-4b32-99ac-ff0ac42eadda"
},
"source": [
"- Feel free to change the `temperature`"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d8527c49-8a9f-4f8f-988f-5ccc8dc216d3",
"metadata": {
"tags": [],
"id": "d8527c49-8a9f-4f8f-988f-5ccc8dc216d3"
},
"outputs": [],
"source": [
"generation_config_1 = GenerationConfig(\n",
" temperature=0.1,\n",
")"
]
},
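{
"cell_type": "markdown",
"id": "added-genconfig-note",
"metadata": {},
"source": [
"- Besides `temperature`, `GenerationConfig` accepts other sampling controls. A minimal sketch (the values here are illustrative, not from the lecture):\n",
"```Python\n",
"generation_config_1 = GenerationConfig(\n",
"    temperature=0.1,          # lower = more deterministic output\n",
"    top_p=0.95,               # nucleus sampling cutoff\n",
"    top_k=40,                 # sample only from the top-k tokens\n",
"    max_output_tokens=1024,   # cap the response length\n",
")\n",
"```"
]
},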
{
"cell_type": "code",
"execution_count": 24,
"id": "40e4e8d7-8028-46d4-9eac-6e2dc803db17",
"metadata": {
"tags": [],
"id": "40e4e8d7-8028-46d4-9eac-6e2dc803db17"
},
"outputs": [],
"source": [
"responses = multimodal_model.generate_content(\n",
" contents_1,\n",
" generation_config=generation_config_1,\n",
" stream=False\n",
")"
]
},
{
"cell_type": "markdown",
"id": "16350384-1495-4ebc-9ddf-0b0746525a0c",
"metadata": {
"id": "16350384-1495-4ebc-9ddf-0b0746525a0c"
},
"source": [
"**Note**: If you set `stream=True`, you'll print your responses as:\n",
"```Python\n",
"for response in responses:\n",
" print(response.text, end=\"\")\n",
"```"
]
},
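{
"cell_type": "markdown",
"id": "added-streaming-note",
"metadata": {},
"source": [
"- Putting both pieces together, a minimal streamed version of the same request would look like this (a sketch; it re-sends the request, so it incurs token costs again):\n",
"```Python\n",
"responses = multimodal_model.generate_content(\n",
"    contents_1,\n",
"    generation_config=generation_config_1,\n",
"    stream=True\n",
")\n",
"for response in responses:\n",
"    print(response.text, end=\"\")\n",
"```"
]
},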
{
"cell_type": "markdown",
"id": "3a6c4d91-1251-4b83-a4e8-89075e33c49c",
"metadata": {
"id": "3a6c4d91-1251-4b83-a4e8-89075e33c49c"
},
"source": [
"**Note**: LLM's do not always produce the same results, especially because they are frequently updated. So the output you see in the video might be different than what you may get."
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "956a46c4-e4e4-4979-9c1b-bebec2f05fb1",
"metadata": {
"tags": [],
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "956a46c4-e4e4-4979-9c1b-bebec2f05fb1",
"outputId": "b15a7acd-7d59-4607-c118-909d77f9ac29"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Here are the tasks you requested:\n",
"\n",
"- **Title of the video:** Build AI-powered apps on Vertex AI with LangChain\n",
"- **Summary of the video:** This video is about how to use Vertex AI and LangChain to build AI-powered applications. The video starts by explaining the challenges of using large language models (LLMs) and how LangChain can help to overcome these challenges. The video then goes on to show how to use LangChain to build a basic application that summarizes large documents. Finally, the video discusses some of the use cases for Vertex AI and LangChain.\n",
"- **Metadata for the video in JSON:**\n",
"```json\n",
"{\n",
" \"Title\": \"Build AI-powered apps on Vertex AI with LangChain\",\n",
" \"short description\": \"Learn how to use Vertex AI and LangChain to build AI-powered applications. This video covers the challenges of using LLMs, how LangChain can help, and how to build a basic application that summarizes large documents.\",\n",
" \"language\": \"English\",\n",
" \"company\": \"Google Cloud\"\n",
"}\n",
"``` \n"
]
}
],
"source": [
"print(responses.text, end=\"\")"
]
},
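{
"cell_type": "markdown",
"id": "added-parse-json-note",
"metadata": {},
"source": [
"- Since you asked for the metadata in JSON, you can also parse it out of the response programmatically. The cell below is a minimal sketch; it assumes the response contains exactly one JSON object, as in the output above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-parse-json-sketch",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Slice out the first {...} object from the response text and parse it\n",
"text = responses.text\n",
"json_str = text[text.find(\"{\") : text.rfind(\"}\") + 1]\n",
"metadata = json.loads(json_str)\n",
"print(metadata[\"Title\"])"
]
},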
{
"cell_type": "markdown",
"id": "4a9b1931-8e9a-4ac2-84cc-445df7c9743f",
"metadata": {
"id": "4a9b1931-8e9a-4ac2-84cc-445df7c9743f"
},
"source": [
"# Explaining the Educational Concepts"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "054120fd-cf97-4e5b-92b8-11fd7e2b3771",
"metadata": {
"tags": [],
"id": "054120fd-cf97-4e5b-92b8-11fd7e2b3771",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 274
},
"outputId": "18541f56-0b7f-4828-a136-1b5c8d2a860c"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<IPython.core.display.Video object>"
],
"text/html": [
"<video src=\"https://storage.googleapis.com/github-repo/img/gemini/multimodality_usecases_overview/descending-into-ml.mp4\" controls width=\"450\" >\n",
" Your browser does not support the <code>video</code> element.\n",
" </video>"
]
},
"metadata": {},
"execution_count": 32
}
],
"source": [
"file_path_2 = \"github-repo/img/gemini/multimodality_usecases_overview/descending-into-ml.mp4\"\n",
"video_uri_2 = f\"gs://{file_path_2}\"\n",
"video_url_2 = f\"https://storage.googleapis.com/{file_path_2}\"\n",
"\n",
"IPython.display.Video(video_url_2, width=450)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "3e9be4b6-f508-4cd0-b885-cdb60bc6280b",
"metadata": {
"tags": [],
"id": "3e9be4b6-f508-4cd0-b885-cdb60bc6280b"
},
"outputs": [],
"source": [
"video_2 = Part.from_uri(video_uri_2, mime_type=\"video/mp4\")"
]
},
{
"cell_type": "markdown",
"id": "7a5bb786-74bf-40c7-ab55-f8b6f031ee1c",
"metadata": {
"id": "7a5bb786-74bf-40c7-ab55-f8b6f031ee1c"
},
"source": [
"- You can even ask the model to answer based on answers of previous questions.\n",
"- And to generate programming code based on previous answers."
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "3fa902fe-91f6-48f7-ae0d-4792f35398a5",
"metadata": {
"tags": [],
"id": "3fa902fe-91f6-48f7-ae0d-4792f35398a5"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Please have a look at the video and answer the following\n",
"questions.\n",
"\n",
"Questions:\n",
"- Question 1: Which concept is explained in the video?\n",
"- Question 2: Based on your answer to Question 1,\n",
"can you explain the basic math of this concept?\n",
"- Question 3: Can you provide a simple scikit code example\n",
"explaining the concept?\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "d24e5781-4f61-4d44-a7b6-4ce497fc54da",
"metadata": {
"tags": [],
"id": "d24e5781-4f61-4d44-a7b6-4ce497fc54da"
},
"outputs": [],
"source": [
"contents_2 = [video_2, prompt]"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "29a15d96-8d6f-4a71-b5c2-3b00fe54eeb2",
"metadata": {
"tags": [],
"id": "29a15d96-8d6f-4a71-b5c2-3b00fe54eeb2"
},
"outputs": [],
"source": [
"responses = multimodal_model.generate_content(\n",
" contents_2,\n",
" stream=False\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d093c488-d7bd-4210-8ac5-550daa6ce614",
"metadata": {
"id": "d093c488-d7bd-4210-8ac5-550daa6ce614"
},
"source": [
"**Note**: LLM's do not always produce the same results, especially because they are frequently updated. So the output you see in the video might be different than what you may get."
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "0c8901fc-8ee7-4055-93d5-bca892c16d04",
"metadata": {
"tags": [],
"id": "0c8901fc-8ee7-4055-93d5-bca892c16d04",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "0c793a07-fcf7-4476-def0-dac90c6b0e3e"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Of course! I can help with that. \n",
"\n",
"Here are the answers to your questions based on the video: \n",
"\n",
"**Question 1: Which concept is explained in the video?**\n",
"The concept explained in the video is **Linear Regression** with a focus on how to minimize loss in linear regression models. \n",
"\n",
"**Question 2: Based on your answer to Question 1, can you explain the basic math of this concept?**\n",
"\n",
"Linear regression is a statistical method that aims to establish a linear relationship between a dependent variable (y) and one or more independent variables (x). The goal is to find the best-fitting line through the data points that minimizes the difference between the predicted and actual values. \n",
"\n",
"The fundamental equation for a linear regression model is:\n",
"\n",
"* **y = wX + b**\n",
"\n",
"Where:\n",
"\n",
"* **y** is the predicted value of the dependent variable.\n",
"* **w** is the weight vector (slope of the line) which determines the relationship between the independent variable (X) and the dependent variable (y).\n",
"* **X** is the independent variable.\n",
"* **b** is the bias term (intercept of the line) representing the predicted value of y when x is zero.\n",
"\n",
"**Question 3: Can you provide a simple scikit code example explaining the concept?**\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import mean_squared_error\n",
"\n",
"# Sample Data \n",
"data = {'square_footage': [1000, 1200, 1500, 1800, 2000],\n",
" 'price': [200000, 220000, 250000, 300000, 350000]}\n",
"df = pd.DataFrame(data)\n",
"\n",
"# Splitting the data into training and testing sets\n",
"X = df[['square_footage']]\n",
"y = df['price']\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n",
"\n",
"# Create a linear regression model\n",
"model = LinearRegression()\n",
"\n",
"# Train the model\n",
"model.fit(X_train, y_train)\n",
"\n",
"# Make predictions on the test data\n",
"y_pred = model.predict(X_test)\n",
"\n",
"# Evaluate the model\n",
"mse = mean_squared_error(y_test, y_pred)\n",
"print(f'Mean Squared Error: {mse}')\n",
"\n",
"# Print the model's coefficients\n",
"print(f'Slope (w): {model.coef_[0]}')\n",
"print(f'Intercept (b): {model.intercept_}')\n",
"```\n",
"\n",
"This code: \n",
"\n",
"1. **Loads data:** Creates a pandas DataFrame with square footage and price data.\n",
"2. **Splits data:** Divides the data into training and testing sets.\n",
"3. **Creates model:** Initializes a LinearRegression model.\n",
"4. **Trains model:** Fits the model on the training data to find the best-fit line.\n",
"5. **Makes predictions:** Uses the trained model to predict prices for the test data.\n",
"6. **Evaluates model:** Calculates the mean squared error (MSE) to assess the model's performance.\n",
"7. **Prints results:** Displays the model's coefficients (slope and intercept). \n",
"\n",
"Let me know if you have any other questions or want to explore other aspects of linear regression or other machine learning concepts. \n",
"\n"
]
}
],
"source": [
"print(responses.text)"
]
},
{
"cell_type": "markdown",
"id": "09da9100-165b-4968-a14d-b264c86f574b",
"metadata": {
"id": "09da9100-165b-4968-a14d-b264c86f574b"
},
"source": [
"- You can copy/paste and run your generated code in the cell below.\n",
"\n",
"**Note:** LLM's are known to generate code which is incomplete or has bugs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8568ea9-8c05-48b4-b3c5-84e0f83a5a8d",
"metadata": {
"id": "d8568ea9-8c05-48b4-b3c5-84e0f83a5a8d"
},
"outputs": [],
"source": [
"### you can copy/paste your generated code here:\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "23746f67-e028-40e7-879e-790aef78dffa",
"metadata": {
"id": "23746f67-e028-40e7-879e-790aef78dffa"
},
"source": [
"- Below cell includes the code which was generated in the lecture video"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64e156d4-9e2d-4c99-9250-cb853e5ff397",
"metadata": {
"tags": [],
"id": "64e156d4-9e2d-4c99-9250-cb853e5ff397"
},
"outputs": [],
"source": [
"# # Import the necessary libraries\n",
"# import numpy as np\n",
"# import matplotlib.pyplot as plt\n",
"# from sklearn.linear_model import LinearRegression\n",
"\n",
"# # Create some data\n",
"# X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])\n",
"# y = np.dot(X, np.array([1, 2])) + 3\n",
"\n",
"# # Fit the linear regression model\n",
"# model = LinearRegression()\n",
"# model.fit(X, y)\n",
"\n",
"# # Make predictions\n",
"# y_pred = model.predict(X)\n",
"\n",
"# # Plot the data and the fitted line\n",
"# plt.scatter(X[:, 1], y)\n",
"# plt.plot(X[:, 1], y_pred, color='red')\n",
"# plt.show()"
]
},
{
"cell_type": "markdown",
"id": "bc9d06e1-a537-46dd-bd7b-5ad3cedce500",
"metadata": {
"id": "bc9d06e1-a537-46dd-bd7b-5ad3cedce500"
},
"source": [
"## Extracting Information"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "537892cf-d107-4542-98b8-f381ede35a07",
"metadata": {
"tags": [],
"id": "537892cf-d107-4542-98b8-f381ede35a07",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 274
},
"outputId": "99be6a9a-2fa1-4234-fa7b-1f16387a378b"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<IPython.core.display.Video object>"
],
"text/html": [
"<video src=\"https://storage.googleapis.com/github-repo/img/gemini/multimodality_usecases_overview/google-search.mp4\" controls width=\"450\" >\n",
" Your browser does not support the <code>video</code> element.\n",
" </video>"
]
},
"metadata": {},
"execution_count": 38
}
],
"source": [
"file_path_4 = \"github-repo/img/gemini/multimodality_usecases_overview/google-search.mp4\"\n",
"video_uri_4 = f\"gs://{file_path_4}\"\n",
"video_url_4 = f\"https://storage.googleapis.com/{file_path_4}\"\n",
"\n",
"IPython.display.Video(video_url_4, width=450)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "6d2a8301-2ff1-4e42-a93f-0e79ac467730",
"metadata": {
"tags": [],
"id": "6d2a8301-2ff1-4e42-a93f-0e79ac467730"
},
"outputs": [],
"source": [
"video_4 = Part.from_uri(video_uri_4, mime_type=\"video/mp4\")"
]
},
{
"cell_type": "markdown",
"id": "d1adb0e3-44b5-4701-8a37-2acb7a6d3d10",
"metadata": {
"id": "d1adb0e3-44b5-4701-8a37-2acb7a6d3d10"
},
"source": [
"**Note:** In the lecture video, everything was put in a single prompt (`prompt_4`):\n",
"\n",
"```Python\n",
"prompt_4 = \"\"\"\n",
"Answer the following questions using the video only.\n",
"Present the results in a table with a row for each question\n",
"and its answer.\n",
"Make sure the table is in markdown format.\n",
"\n",
"Questions:\n",
"- What is the most searched sport?\n",
"- Who is the most searched scientist?\n",
"\n",
"\"\"\"\n",
"\n",
"contents_4 = [video_4, prompt_4]\n",
"```\n",
"But as also mentioned in the lecture, you can break it into seperate variables (`questions` and `format_html`), as done in the notebook below. Feel free to pause the video and compare your notebook with the video to see the differences."
]
},
{
"cell_type": "markdown",
"id": "e2a11497-5a76-4d00-93b6-2d2f701d4474",
"metadata": {
"id": "e2a11497-5a76-4d00-93b6-2d2f701d4474"
},
"source": [
"- Here, you have your questions."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "7061e11d-b17b-4f7a-96fb-ec1515a6d2a4",
"metadata": {
"tags": [],
"id": "7061e11d-b17b-4f7a-96fb-ec1515a6d2a4"
},
"outputs": [],
"source": [
"questions = \"\"\"\n",
"Answer the following questions using the video only.\n",
"\n",
"Questions:\n",
"- What is the most searched sport?\n",
"- Who is the most searched scientist?\n",
"\"\"\"\n",
"\n",
"# questions = \"\"\"\n",
"# Answer the following questions using the video only.\n",
"# If the answer is not found in the video,\n",
"# say \"Not found in video\".\n",
"\n",
"# Questions:\n",
"# - What is the most searched sport?\n",
"# - Who is the most searched scientist?\n",
"# \"\"\""
]
},
{
"cell_type": "markdown",
"id": "3b134862-4be8-4373-96aa-15f90b985396",
"metadata": {
"id": "3b134862-4be8-4373-96aa-15f90b985396"
},
"source": [
"- Here, you specify the output format.\n",
"- In this case, it is table format."
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "a0c633fa-b953-474b-a8e8-7bbfe7d919b0",
"metadata": {
"tags": [],
"id": "a0c633fa-b953-474b-a8e8-7bbfe7d919b0"
},
"outputs": [],
"source": [
"format_html = \"\"\"\n",
"Format:\n",
"Present the results in a table with a row for each question\n",
"and its answer.\n",
"Make sure the table is in markdown format.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "cf756730-718f-47c4-82e9-2b598fc35b44",
"metadata": {
"tags": [],
"id": "cf756730-718f-47c4-82e9-2b598fc35b44"
},
"outputs": [],
"source": [
"contents_4 = [video_4, questions, format_html]"
]
},
{
"cell_type": "markdown",
"id": "14840809-732a-45f2-9222-c7c5faef0cb5",
"metadata": {
"id": "14840809-732a-45f2-9222-c7c5faef0cb5"
},
"source": [
"- Set the `temperature`. For now, it is `temperature=0.9`"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "9e9b0ab0-7955-49a7-8f0c-5d106ae469db",
"metadata": {
"tags": [],
"id": "9e9b0ab0-7955-49a7-8f0c-5d106ae469db"
},
"outputs": [],
"source": [
"generation_config_1 = GenerationConfig(\n",
" temperature=0.9,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "a6a09877-2d8a-46f6-b1eb-5389151f927a",
"metadata": {
"tags": [],
"id": "a6a09877-2d8a-46f6-b1eb-5389151f927a"
},
"outputs": [],
"source": [
"responses = multimodal_model.generate_content(contents_4,\n",
" generation_config=generation_config_1,\n",
" stream=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1ac63287-d25b-4b13-84c1-852c6f5b708e",
"metadata": {
"id": "1ac63287-d25b-4b13-84c1-852c6f5b708e"
},
"source": [
"**Note**: LLM's do not always produce the same results, especially because they are frequently updated. So the output you see in the video might be different than what you may get."
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "afd17035-a22d-48f8-9850-bd2a8d8e1568",
"metadata": {
"tags": [],
"id": "afd17035-a22d-48f8-9850-bd2a8d8e1568",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "fdf1db42-a3a4-481d-f801-7c7edfd8a3de"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"| Question | Answer |\n",
"|---|---|\n",
"| What is the most searched sport? | Soccer |\n",
"| Who is the most searched scientist? | Albert Einstein |"
]
}
],
"source": [
"for response in responses:\n",
" print(response.text, end=\"\")"
]
},
{
"cell_type": "markdown",
"id": "d527184e-4b9e-404f-ac58-7510110fb006",
"metadata": {
"id": "d527184e-4b9e-404f-ac58-7510110fb006"
},
"source": [
"```\n",
"You can copy/paste your generation in this Markdown cell (double click here)\n",
"```\n",
"(Paste here)"
]
},
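{
"cell_type": "markdown",
"id": "added-render-md-note",
"metadata": {},
"source": [
"- Instead of pasting, you can also render the table directly. A small sketch using a non-streamed call, so the full markdown text is available at once:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-render-md-sketch",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import Markdown, display\n",
"\n",
"# Re-send the request without streaming and render the markdown table\n",
"response = multimodal_model.generate_content(\n",
"    contents_4,\n",
"    generation_config=generation_config_1,\n",
")\n",
"display(Markdown(response.text))"
]
},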
{
"cell_type": "markdown",
"id": "39889919-f32c-4fe6-a5d2-3ebcccacc3c4",
"metadata": {
"id": "39889919-f32c-4fe6-a5d2-3ebcccacc3c4"
},
"source": [
"## Finding a Needle in a Haystack"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f356dfc-35bf-4e02-9838-fe7f07ea0cc0",
"metadata": {
"id": "1f356dfc-35bf-4e02-9838-fe7f07ea0cc0"
},
"outputs": [],
"source": [
"from utils import gemini_vision"
]
},
{
"cell_type": "markdown",
"id": "e9e7f61b-2f04-4226-b905-9c714573ba63",
"metadata": {
"id": "e9e7f61b-2f04-4226-b905-9c714573ba63"
},
"source": [
"- Load the [gemini-1.5-pro-001](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-preview-0409) model."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1eca11af-1cd6-4b70-96e7-1b20898de566",
"metadata": {
"tags": [],
"id": "1eca11af-1cd6-4b70-96e7-1b20898de566"
},
"outputs": [],
"source": [
"multimodal_model = GenerativeModel(\"gemini-1.5-pro-001\")"
]
},
{
"cell_type": "markdown",
"id": "5e54edcd-4c29-411e-aabf-c5b26e48b8f7",
"metadata": {
"id": "5e54edcd-4c29-411e-aabf-c5b26e48b8f7"
},
"source": [
"- Just like with images, you can send more than 1 video to the model.\n",
"- The following videos are from the **[LLMOps](https://learn.deeplearning.ai/courses/llmops/lesson/1/introduction)** short course, which you can enroll in on **[DeepLearning.AI's Short Courses Platform](https://learn.deeplearning.ai)**."
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "c28074ef-2c8c-4008-a83e-c8b3140eeb9a",
"metadata": {
"tags": [],
"id": "c28074ef-2c8c-4008-a83e-c8b3140eeb9a"
},
"outputs": [],
"source": [
"video_1 = Part.from_uri(\"gs://github-repo/img/gemini/multimodality_usecases_overview/sc-gc-c3-LLMOps_L1_v3.mp4\", mime_type=\"video/mp4\")\n",
"video_2 = Part.from_uri(\"gs://github-repo/img/gemini/multimodality_usecases_overview/sc-gc-c3-LLMOps_L2_v4.mp4\", mime_type=\"video/mp4\")\n",
"video_3 = Part.from_uri(\"gs://github-repo/img/gemini/multimodality_usecases_overview/sc-gc-c3-LLMOps_L3_v4.mp4\", mime_type=\"video/mp4\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "555ad4b4-9918-4fbc-8ecb-d957a16d8b4a",
"metadata": {
"tags": [],
"id": "555ad4b4-9918-4fbc-8ecb-d957a16d8b4a"
},
"outputs": [],
"source": [
"from IPython.display import IFrame"
]
},
{
"cell_type": "markdown",
"id": "3c384cb3-c0d5-4ac8-a4f3-8340f92a4d96",
"metadata": {
"id": "3c384cb3-c0d5-4ac8-a4f3-8340f92a4d96"
},
"source": [
"- This displays only one of the three videos.\n",
"- To view others, feel free to change the `file_path`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17f6bdb1-8c6d-4e5c-83fb-294507a6f37d",
"metadata": {
"tags": [],
"id": "17f6bdb1-8c6d-4e5c-83fb-294507a6f37d"
},
"outputs": [],
"source": [
"file_path = \"tuning-demo-erwinh/video/mlops-dlai-videos/sc-gc-c3-LLMOps_L2_v4.mp4\"\n",
"video_url = f\"https://storage.googleapis.com/{file_path}\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e26e417-56e2-4d0c-a955-374c675f868f",
"metadata": {
"tags": [],
"id": "7e26e417-56e2-4d0c-a955-374c675f868f"
},
"outputs": [],
"source": [
"IFrame(video_url, width=560, height=315) # Adjust width and height as needed"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a70ed794-959c-4777-8fad-00e7b20a541d",
"metadata": {
"tags": [],
"id": "a70ed794-959c-4777-8fad-00e7b20a541d"
},
"outputs": [],
"source": [
"role = \"\"\"\n",
"You are specialized in analyzing videos and finding \\\n",
"a needle in a haystack.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4df3088-f7ed-43b6-b148-1fb7549797d6",
"metadata": {
"tags": [],
"id": "f4df3088-f7ed-43b6-b148-1fb7549797d6"
},
"outputs": [],
"source": [
"instruction = \"\"\"\n",
"Here are three videos. Each is a lesson from the \\\n",
"LLMOps course from Deep Learning AI.\n",
"Your answers are only based on the videos.\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "c04e5717-1495-4d69-93a2-93e0e3fc1959",
"metadata": {
"id": "c04e5717-1495-4d69-93a2-93e0e3fc1959"
},
"source": [
"- You are asking the model (question 2) to find something very specific from across these 3 videos."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4dbcc5af-aca5-4f1b-82cd-36966fb690e2",
"metadata": {
"tags": [],
"id": "4dbcc5af-aca5-4f1b-82cd-36966fb690e2"
},
"outputs": [],
"source": [
"questions = \"\"\"\n",
"Answer the following questions:\n",
"1. Create a summary of each video and what is discussed in \\\n",
"the video.\\\n",
"Limit the summary to a max of 100 words.\n",
"2. In which of the three videos does the instructor run \\\n",
"and explains this Python code: bq_client.query(). \\\n",
"Where do you see this code in the video?\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cda6a861-271a-4acb-ab84-8cac8ce0a9b3",
"metadata": {
"tags": [],
"id": "cda6a861-271a-4acb-ab84-8cac8ce0a9b3"
},
"outputs": [],
"source": [
"contents_5 = [\n",
" role,\n",
" instruction,\n",
" video_1,\n",
" video_2,\n",
" video_3,\n",
" questions\n",
"]\n",
"\n",
"# contents_5 = [\n",
"# instruction,\n",
"# video_1,\n",
"# video_2,\n",
"# video_3,\n",
"# questions,\n",
"# role,\n",
"# ]"
]
},
{
"cell_type": "markdown",
"id": "fab0a1be-1035-4648-9aa2-2b35e0766423",
"metadata": {
"id": "fab0a1be-1035-4648-9aa2-2b35e0766423"
},
"source": [
"<span style=\"color:red; font-weight:bold;\">IMPORTANT ⚠️ : PROMPTING THIS NEEDLE IN A HAYSTACK EXAMPLE COSTS ABOUT $4 PER EXECUTION</span>\n",
"\n",
"```Python\n",
"responses = multimodal_model.generate_content(\n",
" contents_5,\n",
" stream=True\n",
")\n",
"```\n",
"\n",
"**Note**: LLM's do not always produce the same results, especially because they are frequently updated. So the output you see in the video might be different than what you may get.\n",
"\n",
"```Python\n",
"### this will take some time to run\n",
"\n",
"for response in responses:\n",
" print(response.text, end=\"\")\n",
"```"
]
}
],
"metadata": {
"environment": {
"kernel": "python3",
"name": "tf2-cpu.2-11.m114",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/tf2-cpu.2-11:m114"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"colab": {
"provenance": []
}
},
"nbformat": 4,
"nbformat_minor": 5
}