# Prompt Optimization for Summarization Tasks
---

This notebook applies PromptWizard to optimize a prompt for summarizing news articles. The goal is a prompt that reliably instructs the model to produce a concise summary of an article's content.

In [None]:
import promptwizard
from promptwizard.glue.promptopt.instantiate import GluePromptOpt
from promptwizard.glue.promptopt.techniques.common_logic import (
    DatasetSpecificProcessing,
)
from promptwizard.glue.common.utils.file import save_jsonlist
from typing import Any
from tqdm import tqdm
from re import compile, findall
import os
from datasets import load_dataset
import yaml

<br>

## 🧪 1. Prompt Optimization Using Your Own Dataset

----

### Load the dataset for summarization
This is a custom dataset that the author built by crawling Naver News (https://news.naver.com) for a Korean NLP model hands-on lab.

- Period: July 1, 2022 - July 10, 2022
- Subject: IT, economics

In [None]:
from datasets import load_dataset

num_debug_samples = 10
dataset = load_dataset("daekeun-ml/naver-news-summarization-ko")
dataset["train"] = dataset["train"].shuffle().select(range(num_debug_samples))
dataset["test"] = dataset["test"].shuffle().select(range(num_debug_samples))

In [None]:
class SummarizationDataset(DatasetSpecificProcessing):

    def dataset_to_jsonl(self, dataset_jsonl: str, **kwargs: Any) -> None:
        def extract_answer_from_output(completion):

            return completion

        examples_set = []

        # Convert each sample into the record layout that PromptWizard reads
        for sample in tqdm(kwargs["dataset"], desc="Converting samples"):
            example = {
                DatasetSpecificProcessing.QUESTION_LITERAL: sample[
                    "question"
                ],  # question
                DatasetSpecificProcessing.ANSWER_WITH_REASON_LITERAL: sample[
                    "answer"
                ],  # answer
                DatasetSpecificProcessing.FINAL_ANSWER_LITERAL: extract_answer_from_output(
                    sample["answer"]
                ),  # final_answer
            }
            examples_set.append(example)

        save_jsonlist(dataset_jsonl, examples_set, "w")

    def extract_final_answer(self, llm_output):

        return llm_output

In [None]:
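summarization_processor = SummarizationDataset()

# Create the output directory up front in case it does not exist yet
os.makedirs("dataset", exist_ok=True)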

for dataset_type in ["train", "test"]:
    data_list = []
    for data in dataset[dataset_type]:
        data_list.append({"question": data["document"], "answer": data["summary"]})
    summarization_processor.dataset_to_jsonl(
        f"dataset/news_summarization_{dataset_type}.jsonl", dataset=data_list
    )
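
Each line of the resulting file is one JSON object. The block below is a minimal sketch of that record layout, assuming the three `DatasetSpecificProcessing` literals resolve to the keys `question`, `answer`, and `final_answer` (as the inline comments in the class above suggest). The sample text and the temporary path are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical record; the key names are assumed from the comments in SummarizationDataset.
records = [
    {
        "question": "Full text of a news article ...",
        "answer": "Reference summary of the article.",
        "final_answer": "Reference summary of the article.",
    }
]

# Write one JSON object per line (the JSONL convention).
path = os.path.join(tempfile.mkdtemp(), "news_summarization_train.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read the file back to confirm that every line parses on its own.
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

Because summarization has no separate reasoning step, `answer` and `final_answer` carry the same text here, which matches the pass-through `extract_answer_from_output` above.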

### Update configs

In [None]:
import os
import yaml


def update_yaml_file(file_path, config_dict):

    with open(file_path, "r") as file:
        data = yaml.safe_load(file)

    for field, value in config_dict.items():
        data[field] = value

    with open(file_path, "w") as file:
        yaml.dump(data, file, default_flow_style=False, allow_unicode=True)

    print("YAML file updated successfully!")

In [None]:
path_to_config = "configs"
promptopt_config_path = os.path.join(
    path_to_config, "summarization_promptopt_config.yaml"
)
setup_config_path = os.path.join(path_to_config, "setup_config.yaml")

# Update promptopt_config.yaml. This file is used to configure the prompt optimization process.
config_dict = {
    "task_description": "You are a summarizer. Your task is to summarize the content of the document.",
    "base_instruction": "Write a concise summary of the following. Think step by step to ensure you cover all important points.",
    "answer_format": "Provide a bullet-point summary of the following document, listing the main arguments and supporting evidence in 3-5 concise bullet points. Avoid unnecessary detail and focus on the most important takeaways.",
    "mutation_rounds": 2,
    "few_shot_count": 3,
    "generate_reasoning": True,
    "mutate_refine_iterations": 1,
}
update_yaml_file(promptopt_config_path, config_dict)

# Update setup_config.yaml. This file is used to track the log of the experiment.
config_dict = {
    "experiment_name": "summarization",
}
update_yaml_file(setup_config_path, config_dict)
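
Note that `update_yaml_file` overwrites top-level keys one by one and does not merge nested mappings. The following sketch reproduces that loop on plain dictionaries standing in for the parsed YAML; the `nested` key is a hypothetical example and is not taken from the actual config files.

```python
# Plain dicts stand in for the parsed YAML; the keys below are illustrative only.
existing = {
    "task_description": "old description",
    "few_shot_count": 5,
    "nested": {"a": 1, "b": 2},
}
updates = {
    "few_shot_count": 3,
    "nested": {"a": 10},
}

# Same loop as in update_yaml_file: each top-level key is overwritten as a whole.
merged = dict(existing)
for field, value in updates.items():
    merged[field] = value

print(merged)
```

Keys that are absent from `updates` keep their old value, while a nested mapping is replaced wholesale, so `nested["b"]` is gone after the update.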

### Initialize LLM

In [None]:
import dotenv
from langchain.chat_models import init_chat_model

dotenv.load_dotenv()

llm = init_chat_model("gpt-4o-mini", model_provider="azure_openai")

### Create an object for calling prompt optimization and inference functionalities

In [None]:
import sys

sys.path.insert(0, "./")

import promptwizard
from promptwizard.glue.promptopt.instantiate import GluePromptOpt
from promptwizard.glue.promptopt.techniques.common_logic import (
    DatasetSpecificProcessing,
)

# gp = GluePromptOpt(
#     promptopt_config_path, setup_config_path, dataset_jsonl=None, data_processor=None
# )
gp = GluePromptOpt(
    promptopt_config_path,
    setup_config_path,
    dataset_jsonl="dataset/news_summarization_train.jsonl",
    data_processor=summarization_processor,
)

### Call the prompt optimization function
1. ```use_examples``` is for when training samples are available and the final prompt should contain a mixture of real and synthetic in-context examples. When set to ```False```, all in-context examples are real.
2. ```generate_synthetic_examples``` is for when there are no training samples and synthetic examples should be generated.
3. ```run_without_train_examples``` is for when there are no training samples and the final prompt does not need in-context examples.
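
The three situations above can be written down as keyword-argument sets. This is a sketch under stated assumptions: the scenario labels are hypothetical names chosen for this example, and the flag combinations are inferred from the descriptions above, with exactly one flag enabled per scenario.

```python
# Hypothetical scenario labels mapped to get_best_prompt keyword arguments.
SCENARIOS = {
    # Training samples exist; mix real and synthetic in-context examples.
    "train_data_mixed_examples": {
        "use_examples": True,
        "run_without_train_examples": False,
        "generate_synthetic_examples": False,
    },
    # No training samples; generate synthetic examples.
    "no_train_data_synthetic_examples": {
        "use_examples": False,
        "run_without_train_examples": False,
        "generate_synthetic_examples": True,
    },
    # No training samples; the final prompt needs no in-context examples.
    "no_train_data_no_examples": {
        "use_examples": False,
        "run_without_train_examples": True,
        "generate_synthetic_examples": False,
    },
}

for name, flags in SCENARIOS.items():
    enabled = [flag for flag, value in flags.items() if value]
    print(f"{name}: {enabled}")
```

With a `GluePromptOpt` object named `gp`, a scenario would then be selected with `gp.get_best_prompt(**SCENARIOS["train_data_mixed_examples"])`, which is the combination this notebook uses.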

**Note: It will take a while to run (about 3-4 minutes)**

In [None]:
best_prompt, expert_profile = gp.get_best_prompt(
    use_examples=True,
    run_without_train_examples=False,
    generate_synthetic_examples=False,
)

### Set up the default prompt. The optimized prompt comes from the step above.

In [None]:
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    """You are a summarizer. Your task is to summarize the document."""
)

human_prompt = """
The following is a set of documents:
{docs}
Based on this list of docs, please summarize the content of the document.
Answer:
"""

optimized_system_prompt = expert_profile
optimized_human_prompt = best_prompt + "\n" + human_prompt

# Heuristic prompt
map_prompt = ChatPromptTemplate.from_messages(
    [("system", system_prompt), ("human", human_prompt)],
)

# Optimized prompt using PromptWizard
optimized_map_prompt = ChatPromptTemplate.from_messages(
    [("system", optimized_system_prompt), ("human", optimized_human_prompt)],
)
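
`ChatPromptTemplate` treats `{name}` as a placeholder by default, so literal braces inside `best_prompt` or `expert_profile` (for example a JSON snippet in a few-shot example) would be read as template variables. The sketch below shows one way to guard against that, using plain `str.format`, which follows the same brace rules. The helper `escape_braces` and the sample strings are hypothetical and not part of PromptWizard or LangChain.

```python
def escape_braces(text: str) -> str:
    """Double every brace so format-style templating treats it as literal text."""
    return text.replace("{", "{{").replace("}", "}}")


# Hypothetical optimized prompt that happens to contain literal braces.
best_prompt_example = 'Return the key points, e.g. {"points": [...]}.'
human_prompt_example = "The following is a set of documents:\n{docs}\nAnswer:"

# Escape only the optimized text; keep the {docs} placeholder intact.
template = escape_braces(best_prompt_example) + "\n" + human_prompt_example
filled = template.format(docs="Article text goes here.")
print(filled)
```

Applied to this notebook, the same idea would mean escaping `best_prompt` and `expert_profile` before concatenating them with `human_prompt`.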

In [None]:
# Also available via the hub: `hub.pull("rlm/reduce-prompt")`
reduce_template = """
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary of the main themes. The summary should include the main keywords.
"""
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])

### Load documents

This example loads a single web article, but you can use any document you want to summarize. Note that document retrieval for production requires considerably more work.

In [None]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://www.hankyung.com/article/202504042119i"
loader = WebBaseLoader(url)
docs = loader.load()

In [None]:
from langchain_text_splitters import CharacterTextSplitter

chunk_size = 1000
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=chunk_size, chunk_overlap=100
)
split_docs = text_splitter.split_documents(docs)
print(f"Generated {len(split_docs)} documents.")
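
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window sketch over a list of tokens. It is an illustration only: `CharacterTextSplitter` first splits on a separator and then merges the pieces, measuring length with tiktoken, so its chunk boundaries will differ. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split tokens into windows of at most chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


tokens = [f"t{i}" for i in range(10)]
chunks = sliding_chunks(tokens, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last token of the previous one, which is the purpose of the overlap: context that straddles a boundary appears in both chunks.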

<br>

## 2. Build the Graph and Run it

----

In [None]:
import operator
from typing import Annotated, List, Literal, TypedDict

from langchain.chains.combine_documents.reduce import (
    acollapse_docs,
    split_list_of_docs,
)
from langchain_core.documents import Document
from langgraph.constants import Send
from langgraph.graph import END, START, StateGraph
from functools import partial

token_max = chunk_size


def length_function(documents: List[Document]) -> int:
    """Get number of tokens for input contents."""
    return sum(llm.get_num_tokens(doc.page_content) for doc in documents)


# This will be the overall state of the main graph.
# It will contain the input document contents, corresponding
# summaries, and a final summary.
class OverallState(TypedDict):
    # Notice that `summaries` is annotated with operator.add
    # This is because we want to combine all the summaries we generate
    # from individual nodes back into one list - this is essentially
    # the "reduce" part
    contents: List[str]
    summaries: Annotated[list, operator.add]
    collapsed_summaries: List[Document]
    final_summary: str


# This will be the state of the node that we will "map" all
# documents to in order to generate summaries
class SummaryState(TypedDict):
    content: str


# Here we generate a summary, given a document
async def generate_summary(state: SummaryState, prompt_template: ChatPromptTemplate):
    prompt = prompt_template.invoke(state["content"])
    response = await llm.ainvoke(prompt)
    return {"summaries": [response.content]}


# Here we define the logic to map out over the documents
# We will use this as an edge in the graph
def map_summaries(state: OverallState):
    # We will return a list of `Send` objects
    # Each `Send` object consists of the name of a node in the graph
    # as well as the state to send to that node
    return [
        Send("generate_summary", {"content": content}) for content in state["contents"]
    ]


def collect_summaries(state: OverallState):
    return {
        "collapsed_summaries": [Document(summary) for summary in state["summaries"]]
    }


async def _reduce(input: dict) -> str:
    prompt = reduce_prompt.invoke(input)
    response = await llm.ainvoke(prompt)
    return response.content


# Add node to collapse summaries
async def collapse_summaries(state: OverallState):
    doc_lists = split_list_of_docs(
        state["collapsed_summaries"], length_function, token_max
    )
    results = []
    for doc_list in doc_lists:
        results.append(await acollapse_docs(doc_list, _reduce))

    return {"collapsed_summaries": results}


# This represents a conditional edge in the graph that determines
# if we should collapse the summaries or not
def should_collapse(
    state: OverallState,
) -> Literal["collapse_summaries", "generate_final_summary"]:
    num_tokens = length_function(state["collapsed_summaries"])
    if num_tokens > token_max:
        return "collapse_summaries"
    else:
        return "generate_final_summary"


# Here we will generate the final summary
async def generate_final_summary(state: OverallState):
    response = await _reduce(state["collapsed_summaries"])
    return {"final_summary": response}
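
The control flow of these nodes (collapse while the combined summaries exceed `token_max`, then reduce once more) can be traced without an LLM. The sketch below is a simplified stand-in under stated assumptions: word counts replace model token counts, `fake_reduce` replaces the LLM call, and all function names are hypothetical. `max_rounds` plays the role of the graph's recursion limit.

```python
def count_tokens(texts: list[str]) -> int:
    """Stand-in for length_function: counts whitespace-separated words."""
    return sum(len(text.split()) for text in texts)


def fake_reduce(texts: list[str]) -> str:
    """Stand-in for the LLM reduce call: keeps the first three words of each text."""
    return " ".join(" ".join(text.split()[:3]) for text in texts)


def group_by_budget(texts: list[str], token_max: int) -> list[list[str]]:
    """Greedily pack consecutive texts, starting a new group when the budget would be exceeded."""
    groups, current = [], []
    for text in texts:
        if current and count_tokens(current + [text]) > token_max:
            groups.append(current)
            current = []
        current.append(text)
    if current:
        groups.append(current)
    return groups


def collapse_then_reduce(summaries: list[str], token_max: int, max_rounds: int = 10):
    """Collapse until the summaries fit within token_max, then reduce once more."""
    collapsed = list(summaries)
    rounds = 0
    # Mirrors should_collapse: loop back while the combined length is too large.
    while count_tokens(collapsed) > token_max and rounds < max_rounds:
        collapsed = [fake_reduce(group) for group in group_by_budget(collapsed, token_max)]
        rounds += 1
    # Mirrors generate_final_summary: one last reduce over what is left.
    return fake_reduce(collapsed), rounds


final_summary, rounds = collapse_then_reduce(["a b c d e f"] * 4, token_max=12)
print(rounds, final_summary)
```

In the real graph, each pass through `collapse_summaries` counts toward `recursion_limit`, so a limit that is too low for a long document ends the run before the final summary is produced.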

### Case 1. Using the heuristic prompt

In [None]:
# Construct the graph
# Nodes:
_generate_summary = partial(generate_summary, prompt_template=map_prompt)

graph = StateGraph(OverallState)
graph.add_node("generate_summary", _generate_summary)
graph.add_node("collect_summaries", collect_summaries)
graph.add_node("collapse_summaries", collapse_summaries)
graph.add_node("generate_final_summary", generate_final_summary)

# Edges:
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", should_collapse)
graph.add_conditional_edges("collapse_summaries", should_collapse)
graph.add_edge("generate_final_summary", END)

app = graph.compile()

In [None]:
from azure_genai_utils.graphs import visualize_langgraph

visualize_langgraph(app, xray=True)

In [None]:
from langchain_core.runnables import RunnableConfig
from azure_genai_utils.messages import random_uuid

inputs = {
    "contents": [doc.page_content for doc in split_docs],
}
config = RunnableConfig(recursion_limit=10, configurable={"thread_id": random_uuid()})
async for step in app.astream(inputs, config):
    curr_node = list(step.keys())[0]
    print("\n" + "=" * 50)
    print(f"🔄 Node: \033[1;36m{curr_node}\033[0m 🔄")
    print("- " * 25)
    print(list(step.values()))

In [None]:
outputs = step["generate_final_summary"]["final_summary"]
from IPython.display import display, Markdown

display(Markdown(outputs))

### Case 2. Using the Optimized prompt


In [None]:
# Construct the graph
# Nodes:
_generate_summary = partial(generate_summary, prompt_template=optimized_map_prompt)

graph = StateGraph(OverallState)
graph.add_node("generate_summary", _generate_summary)
graph.add_node("collect_summaries", collect_summaries)
graph.add_node("collapse_summaries", collapse_summaries)
graph.add_node("generate_final_summary", generate_final_summary)

# Edges:
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", should_collapse)
graph.add_conditional_edges("collapse_summaries", should_collapse)
graph.add_edge("generate_final_summary", END)

app = graph.compile()

In [None]:
from langchain_core.runnables import RunnableConfig
from azure_genai_utils.messages import random_uuid

inputs = {
    "contents": [doc.page_content for doc in split_docs],
}
config = RunnableConfig(recursion_limit=10, configurable={"thread_id": random_uuid()})
async for step in app.astream(inputs, config):
    curr_node = list(step.keys())[0]
    print("\n" + "=" * 50)
    print(f"🔄 Node: \033[1;36m{curr_node}\033[0m 🔄")
    print("- " * 25)
    print(list(step.values()))

The summary generated with the optimized prompt is better than the one generated with the heuristic prompt: it is more concise and captures the main points of the article more effectively.
Of course, you can improve the heuristic prompt through manual prompt engineering, but the automatically optimized prompt is a good starting point for building effective summarization prompts.

In [None]:
outputs = step["generate_final_summary"]["final_summary"]
from IPython.display import display, Markdown

display(Markdown(outputs))

### Further work
You can evaluate the performance of the optimized prompt on the test set with the Azure AI Evaluation SDK.
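
Before setting up a full evaluation, a rough lexical check can already compare generated summaries against the reference summaries in the test set. The sketch below computes a ROUGE-1-style unigram F1 score. It is a stand-in written for this example and does not use the Azure AI Evaluation SDK; the whitespace tokenization is an assumption that is crude for Korean text.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Harmonic mean of unigram precision and recall between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared word at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = unigram_f1("a b c d", "a b")
print(round(score, 3))
```

Averaging this score over the test set for both prompts gives a first, coarse comparison; it rewards word overlap only and says nothing about faithfulness or fluency.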