3_optimization-design-ptn/03_prompt-optimization/01_prompt_optimization.ipynb (467 lines of code) (raw):

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Prompt Optimization Baseline\n", "---\n", "\n", "This notebook provides a baseline for the prompt optimization task.\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import Dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "import os\n", "import yaml\n", "\n", "sys.path.insert(0, \"./\")\n", "\n", "import promptwizard\n", "from promptwizard.glue.promptopt.instantiate import GluePromptOpt\n", "from promptwizard.glue.promptopt.techniques.common_logic import (\n", " DatasetSpecificProcessing,\n", ")\n", "from promptwizard.glue.common.utils.file import save_jsonlist\n", "from typing import Any\n", "from tqdm import tqdm\n", "from re import compile, findall\n", "from datasets import load_dataset\n", "from dotenv import load_dotenv\n", "\n", "load_dotenv(override=True)\n", "\n", "\n", "def update_yaml_file(file_path, config_dict):\n", "\n", " with open(file_path, \"r\") as file:\n", " data = yaml.safe_load(file)\n", "\n", " for field, value in config_dict.items():\n", " data[field] = value\n", "\n", " with open(file_path, \"w\") as file:\n", " yaml.dump(data, file, default_flow_style=False, allow_unicode=True)\n", "\n", " print(\"YAML file updated successfully!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<br>\n", "\n", "## 🧪 Case 1 : No training data. Don't want in-context examples in final prompt\n", "---\n", "\n", "The assumptions for this scenario are:\n", "\n", "- No training data\n", "- No in-context examples included in the final prompt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path_to_config = \"configs\"\n", "promptopt_config_path = os.path.join(path_to_config, \"baseline_promptopt_config.yaml\")\n", "setup_config_path = os.path.join(path_to_config, \"setup_config.yaml\")\n", "\n", "# Set the following based on the use case\n", "config_dict = {\n", " \"task_description\": \"You are a mathematics expert. You will be given a mathematics problem which you need to solve\",\n", " \"base_instruction\": \"Lets think step by step.\",\n", " \"mutation_rounds\": 5,\n", "}\n", "update_yaml_file(promptopt_config_path, config_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an object for calling prompt optimization and inference functionalities" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gp = GluePromptOpt(\n", " promptopt_config_path, setup_config_path, dataset_jsonl=None, data_processor=None\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Call prompt optmization function\n", "1. ```use_examples``` can be used when there are training samples and a mixture of real and synthetic in-context examples are required in the final prompt. When set to ```False``` all the in-context examples will be real\n", "2. ```generate_synthetic_examples``` can be used when there are no training samples and we want to generate synthetic examples \n", "3. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Create an object for calling prompt optimization and inference functionalities" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gp = GluePromptOpt(\n", "    promptopt_config_path, setup_config_path, dataset_jsonl=None, data_processor=None\n", ")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Call the prompt optimization function\n", "1. ```use_examples``` can be used when there are training samples and a mixture of real and synthetic in-context examples is required in the final prompt. When set to ```False```, all in-context examples will be real.\n", "2. ```generate_synthetic_examples``` can be used when there are no training samples and we want to generate synthetic examples.\n", "3. ```run_without_train_examples``` can be used when there are no training samples and in-context examples are not required in the final prompt." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "best_prompt, expert_profile = gp.get_best_prompt(\n", "    use_examples=False,\n", "    run_without_train_examples=True,\n", "    generate_synthetic_examples=False,\n", ")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "<br>\n", "\n", "## 🧪 Case 2: Have training data. Want in-context examples in the final prompt\n", "---\n", "\n", "The assumptions for this scenario are:\n", "\n", "- Have training data\n", "- Have in-context examples, and want them in the final prompt" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### AOAI client" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "from openai import AzureOpenAI\n", "from dotenv import load_dotenv\n", "\n", "load_dotenv()\n", "\n", "aoai_api_endpoint = os.getenv(\"AZURE_OPENAI_ENDPOINT\")\n", "aoai_api_key = os.getenv(\"AZURE_OPENAI_API_KEY\")\n", "aoai_api_version = os.getenv(\"AZURE_OPENAI_API_VERSION\")\n", "aoai_deployment_name = os.getenv(\"AZURE_OPENAI_DEPLOYMENT_NAME\")\n", "\n", "# Fall back to alternative environment variable names if the primary ones are unset.\n", "if not aoai_api_version:\n", "    aoai_api_version = os.getenv(\"OPENAI_API_VERSION\")\n", "if not aoai_deployment_name:\n", "    aoai_deployment_name = os.getenv(\"DEPLOYMENT_NAME\")\n", "\n", "try:\n", "    client = AzureOpenAI(\n", "        azure_endpoint=aoai_api_endpoint,\n", "        api_key=aoai_api_key,\n", "        api_version=aoai_api_version,\n", "    )\n", "    deployment_name = aoai_deployment_name\n", "    print(\"=== Initialized AzureOpenAI client ===\")\n", "    print(f\"AZURE_OPENAI_ENDPOINT={aoai_api_endpoint}\")\n", "    print(f\"AZURE_OPENAI_API_VERSION={aoai_api_version}\")\n", "    print(f\"AZURE_OPENAI_DEPLOYMENT_NAME={aoai_deployment_name}\")\n", "except (ValueError, TypeError) as e:\n", "    print(e)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Transform dataset" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "from collections import OrderedDict\n", "\n", "\n", "def summarize_answer(\n", "    answer_text, deployment_name=\"gpt-4o-mini\", temperature=0.1, max_tokens=500\n", "):\n", "    # Summarize a long answer into a concise final answer via the AOAI client.\n", "    system_message = \"You are a helpful assistant that summarizes answers in a concise and clear way.\"\n", "    try:\n", "        response = client.chat.completions.create(\n", "            model=deployment_name,\n", "            messages=[\n", "                {\"role\": \"system\", \"content\": system_message},\n", "                {\n", "                    \"role\": \"user\",\n", "                    \"content\": f\"Summarize the following answer:\\n\\n{answer_text}\",\n", "                },\n", "            ],\n", "            temperature=temperature,\n", "            max_tokens=max_tokens,\n", "        )\n", "        summary = response.choices[0].message.content\n", "        return summary\n", "    except Exception as e:\n", "        print(f\"Error summarizing: {e}\")\n", "        return \"\"\n", "\n", "\n", "def transform_jsonl(input_path, output_path, summarize=False):\n", "    # Convert raw records into the (question, answer, final_answer) schema\n", "    # expected by the dataset processor below.\n", "    with open(input_path, \"r\", encoding=\"utf-8\") as infile, open(\n", "        output_path, \"w\", encoding=\"utf-8\"\n", "    ) as outfile:\n", "        for line in infile:\n", "            item = json.loads(line)\n", "            new_item = OrderedDict()\n", "            query = item.get(\"query\", \"\")\n", "            doc = item.get(\"document_content\", \"\")\n", "            ans = item.get(\"answer\", \"\")\n", "\n", "            if doc == \"\":\n", "                doc_with_ans = ans\n", "            else:\n", "                doc_with_ans = (\n", "                    f\"<DOCUMENT_CONTENT>{doc}</DOCUMENT_CONTENT><ANSWER>{ans}</ANSWER>\"\n", "                )\n", "\n", "            new_item[\"question\"] = query\n", "            new_item[\"answer\"] = doc_with_ans\n", "            if summarize:\n", "                print(f\"Summarizing answer: {ans[:30]}...\")\n", "                new_item[\"final_answer\"] = summarize_answer(ans)\n", "            else:\n", "                new_item[\"final_answer\"] = ans\n", "            json.dump(new_item, outfile, ensure_ascii=False)\n", "            outfile.write(\"\\n\")" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "transform_jsonl(\n", "    \"dataset/mobile.jsonl\", \"dataset/mobile_transformed.jsonl\", summarize=True\n", ")" ] },
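{ "cell_type": "markdown", "metadata": {}, "source": [ "### (Optional) Preview the transformed dataset\n", "\n", "A minimal spot check, assuming the cell above completed and wrote ```dataset/mobile_transformed.jsonl```: print the first record's fields, truncated for readability." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Spot-check the first transformed record before running optimization.\n", "with open(\"dataset/mobile_transformed.jsonl\", \"r\", encoding=\"utf-8\") as f:\n", "    first_record = json.loads(f.readline())\n", "\n", "for key, value in first_record.items():\n", "    print(f\"{key}: {str(value)[:100]}\")" ] },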
f\"<DOCUMENT_CONTENT>{doc}</DOCUMENT_CONTENT><ANSWER>{ans}</ANSWER>\"\n", " )\n", "\n", " new_item[\"question\"] = query\n", " new_item[\"answer\"] = doc_with_ans\n", " if summarize:\n", " print(f\"Summarizing answer: {ans[:30]}...\")\n", " new_item[\"final_answer\"] = summarize_answer(ans)\n", " else:\n", " new_item[\"final_answer\"] = ans\n", " json.dump(new_item, outfile, ensure_ascii=False)\n", " outfile.write(\"\\n\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "transform_jsonl(\n", " \"dataset/mobile.jsonl\", \"dataset/mobile_transformed.jsonl\", summarize=True\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a dataset specific class and define the required functions " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class MyDataset(DatasetSpecificProcessing):\n", "\n", " def dataset_to_jsonl(self, dataset_jsonl: str, **kwargs: Any) -> None:\n", " def extract_answer_from_output(completion):\n", "\n", " return completion\n", "\n", " examples_set = []\n", "\n", " for _, sample in tqdm(enumerate(kwargs[\"dataset\"]), desc=\"Evaluating samples\"):\n", " example = {\n", " DatasetSpecificProcessing.QUESTION_LITERAL: sample[\n", " \"question\"\n", " ], # question\n", " DatasetSpecificProcessing.ANSWER_WITH_REASON_LITERAL: sample[\n", " \"answer\"\n", " ], # answer\n", " DatasetSpecificProcessing.FINAL_ANSWER_LITERAL: extract_answer_from_output(\n", " sample[\"final_answer\"]\n", " ), # final_answer\n", " }\n", " examples_set.append(example)\n", "\n", " save_jsonlist(dataset_jsonl, examples_set, \"w\")\n", "\n", " def extract_final_answer(self, llm_output):\n", "\n", " return llm_output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "path_to_config = \"configs\"\n", "promptopt_config_path = os.path.join(path_to_config, \"mobile_promptopt_config.yaml\")\n", "setup_config_path = os.path.join(path_to_config, \"setup_config.yaml\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "answer_format = \"\"\"Your response should be structured as follows:\n", "- Product Specifications\n", "- Product Design\n", "- Product Summary\n", "\"\"\"\n", "\n", "config_dict = {\"answer_format\": answer_format, \"unique_model_id\": \"gpt-4o\"}\n", "update_yaml_file(promptopt_config_path, config_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an object for calling prompt optimization and inference functionalities" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_processor = MyDataset()\n", "\n", "gp = GluePromptOpt(\n", " promptopt_config_path,\n", " setup_config_path,\n", " dataset_jsonl=\"dataset/mobile_transformed.jsonl\",\n", " data_processor=my_processor,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Call prompt optmization function\n", "1. ```use_examples``` can be used when there are training samples and a mixture of real and synthetic in-context examples are required in the final prompt. When set to ```False``` all the in-context examples will be real\n", "2. ```generate_synthetic_examples``` can be used when there are no training samples and we want to generate synthetic examples \n", "3. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Save the best prompt and expert profile to a file" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pickle\n", "\n", "save_dir = \"results\"\n", "os.makedirs(save_dir, exist_ok=True)  # portable; avoids shelling out to mkdir\n", "\n", "with open(f\"{save_dir}/mobile_best_prompt.pkl\", \"wb\") as f:\n", "    pickle.dump(best_prompt, f)\n", "with open(f\"{save_dir}/mobile_expert_profile.pkl\", \"wb\") as f:\n", "    pickle.dump(expert_profile, f)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open(f\"{save_dir}/mobile_best_prompt.pkl\", \"rb\") as f:\n", "    loaded_best_prompt = pickle.load(f)\n", "with open(f\"{save_dir}/mobile_expert_profile.pkl\", \"rb\") as f:\n", "    loaded_expert_profile = pickle.load(f)" ] } ], "metadata": { "kernelspec": { "display_name": "py312-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 2 }