## Evals and Metaprompting

### Initial Setup
We'll start by importing a few dependencies and configuring how pandas displays DataFrames.

In [None]:
%pip install openai
%pip install pandas
%pip install ipython

In [1]:
from openai import OpenAI
from IPython.display import display, Markdown
import pandas as pd
from concurrent.futures import ThreadPoolExecutor
from functionDefinitions import TOOLS
import csv
import json
import os

client = OpenAI()
MODEL = 'o1-mini'

pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

## Generating a Routine
We'll take the Flight Cancellation Policy that we created and convert it into an LLM-based routine using the following prompt.

In [2]:
with open('originalPolicy/flightCancellationsPolicy.md', 'r') as file:
    flight_cancellation_policy = file.read()

In [3]:
CONVERSION_PROMPT = """
You are a helpful assistant tasked with taking an external facing help center article and converting it into an internal-facing programmatically executable routine optimized for an LLM.
The LLM using this routine will be tasked with reading the policy, answering incoming questions from customers, and helping drive the case toward resolution.

Please follow these instructions:
1. **Review the customer service policy carefully** to ensure every step is accounted for. It is crucial not to skip any steps or policies.
2. **Organize the instructions into a logical, step-by-step order**, using the specified format.
3. **Use the following format**:
   - **Main actions are numbered** (e.g., 1, 2, 3).
   - **Sub-actions are lettered** under their relevant main actions (e.g., 1a, 1b).
      **Sub-actions should start on new lines**
   - **Specify conditions using clear 'if...then...else' statements** (e.g., 'If the product was purchased within 30 days, then...').
   - **For instructions that require more information from the customer**, provide polite and professional prompts to ask for additional information.
   - **For actions that require data from external systems**, write a step to call a function using backticks for the function name (e.g., `call the check_delivery_date function`).
      - **If a step requires the customer service agent to take an action** (e.g., process a refund), generate a function call for this action (e.g., `call the process_refund function`).
      - **Define any new functions** by providing a brief description of their purpose and required parameters.
   - **If there is an action an assistant can perform on behalf of the user**, include a function call for this action (e.g., `call the change_email_address function`), and ensure the function is defined with its purpose and required parameters.
      - This action may not be explicitly defined in the help center article, but can be done to help the user resolve their inquiry faster
   - **The step prior to case resolution should always be to ask if there is anything more you can assist with**.
   - **End with a final action for case resolution**: calling the `case_resolution` function should always be the final step.
4. **Ensure compliance** by making sure all steps adhere to company policies, privacy regulations, and legal requirements.
5. **Handle exceptions or escalations** by specifying steps for scenarios that fall outside the standard policy.

**Important**: If at any point you are uncertain, respond with "I don't know."

Please convert the customer service policy into the formatted routine, ensuring it is easy to follow and execute programmatically.

"""

In [4]:
def generate_routine(policy):
    try:
        messages = [
            {
                "role": "user",
                "content": f"""
                    {CONVERSION_PROMPT}

                    POLICY:
                    {policy}
                """
            }
        ]

        response = client.chat.completions.create(
            model=MODEL,
            messages=messages
        )
        

        return response.choices[0].message.content 
    except Exception as e:
        print(f"An error occurred: {e}")

In [5]:
flight_cancellation_routine = generate_routine(flight_cancellation_policy)

In [6]:
display(Markdown(flight_cancellation_routine))

```markdown
1. **Confirm Customer Identity**
   a. Prompt the customer: "Could you please provide your booking reference, full name, and flight number?"
   b. `call the verify_identity function` with parameters booking_reference, full_name, flight_number
      - **verify_identity**: Validates the customer's identity based on booking reference, name, and flight number.

2. **Listen and Understand Customer Request**
   a. Ask the customer: "Are you looking to cancel your flight, make a change, or inquire about compensation?"
   b. If the customer wants to **cancel**, then proceed to step 3.
   c. Else if the customer wants to **change**, then proceed to step 4.
   d. Else if the customer inquires about **compensation**, then proceed to step 5.
   e. Else, prompt: "Could you please clarify your request?" and loop back to step 2a.

3. **Handle Cancellations**
   a. `call the check_ticket_type function` with parameter booking_reference
      - **check_ticket_type**: Determines if the ticket is refundable, non-refundable, or flexible.
   b. If the ticket is **refundable**, then:
      i. `call the process_refund function` with parameters booking_reference, refund_amount
         - **process_refund**: Initiates a refund based on the refund amount and timing.
      ii. Inform the customer: "Your refundable ticket qualifies for a full or partial refund based on the cancellation timing."
         - If **within 24 hours of booking**, then full refund is guaranteed.
         - Else, provide refund amount based on specific fare rules.
   c. Else if the ticket is **non-refundable**, then:
      i. Offer flight credit: "We can provide you with flight credit for future use, subject to an administration fee."
         - `call the offer_flight_credit function` with parameters booking_reference, administration_fee
            - **offer_flight_credit**: Issues flight credit after deducting any applicable fees.
      ii. Advise about penalty fees: "Please note that a penalty fee of [amount] applies to non-refundable tickets."
         - `call the apply_penalty_fee function` with parameters booking_reference, penalty_amount
            - **apply_penalty_fee**: Deducts the penalty fee from the available refund or credit.
   d. Else, inform the customer: "I don't know."

4. **Handle Changes**
   a. `call the check_ticket_type function` with parameter booking_reference
      - **check_ticket_type**: Determines if the ticket is refundable, non-refundable, or flexible.
   b. Ask the customer: "Would you like to make a same-day change or a change in advance?"
   c. If **same-day change**, then:
      i. If the ticket is **flexible**, then:
         - `call the process_same_day_change function` with parameters booking_reference, new_flight_details
            - **process_same_day_change**: Changes the flight without any fees, subject to availability.
      ii. Else if the ticket is **non-flexible**, then:
         - `call the apply_change_fee function` with parameters booking_reference, change_fee
            - **apply_change_fee**: Applies a change fee to the ticket.
         - Ask the customer: "There is a change fee of [amount]. Would you like to proceed and pay any fare difference?"
         - If yes, `call the process_fare_difference function` with parameters booking_reference, fare_difference
            - **process_fare_difference**: Calculates and processes any additional fare required.
   d. Else if **change in advance**, then:
      i. Ask: "Is your requested change within 7 days of departure?"
      ii. If **within 7 days**, then:
         - `call the apply_standard_change_fee function` with parameters booking_reference, standard_change_fee
            - **apply_standard_change_fee**: Applies the standard change fee for changes within 7 days.
      iii. Else (**beyond 7 days**), then:
         - If the ticket type allows, `call the apply_lesser_change_fee function` or `call the waive_change_fee function` based on ticket flexibility
            - **apply_lesser_change_fee**: Applies a reduced change fee.
            - **waive_change_fee**: Removes the change fee for eligible tickets.
   e. Else, inform the customer: "I don't know."

5. **Handle Compensation Inquiries**
   a. `call the check_compensation_eligibility function` with parameters booking_reference, delay_length, cause
      - **check_compensation_eligibility**: Determines if the customer is eligible for compensation based on delay and cause.
   b. If **eligible for compensation**, then:
      i. Inform the customer: "You are eligible for compensation based on the length of your delay and its cause."
      ii. If the **delay exceeds four hours**, then:
         - `call the provide_meal_voucher function` or `call the offer_hotel_accommodation function` based on customer needs
            - **provide_meal_voucher**: Issues meal vouchers to the customer.
            - **offer_hotel_accommodation**: Arranges hotel accommodation if an overnight stay is required.
   c. Else, inform the customer: "I don't know."

6. **Rebooking Process**
   a. `call the get_next_available_flight function` with parameter booking_reference
      - **get_next_available_flight**: Retrieves the next available flight operated by the airline.
   b. If a suitable flight is available, then:
      i. `call the process_rebooking function` with parameters booking_reference, new_flight_details
         - **process_rebooking**: Rebooks the customer on the selected flight.
      ii. Inform the customer: "You have been rebooked on the next available flight."
   c. Else, if no suitable flight is available, then:
      i. `call the check_interline_partners function` with parameter booking_reference
         - **check_interline_partners**: Checks for alternative connections with partner airlines.
      ii. If an interline option is found, then:
         - `call the process_interline_rebooking function` with parameters booking_reference, partner_flight_details
            - **process_interline_rebooking**: Rebooks the customer with a partner airline.
         - Inform the customer: "Your flight has been rebooked with our partner airline."
      iii. Else, inform the customer: "Currently, there are no available flights. Would you like to be placed on a waitlist or receive an upgrade if possible?"
         - If the customer opts for an upgrade, `call the offer_upgraded_seat function` with parameters booking_reference
            - **offer_upgraded_seat**: Offers a complimentary upgrade to a higher class if available.
   d. Else, inform the customer: "I don't know."

7. **Special Cases Handling**
   a. If the customer cites a **medical emergency**, then:
      i. Ask for documentation: "Could you please provide a medical certificate?"
      ii. If provided, `call the process_medical_cancellation function` with parameters booking_reference, medical_certificate
         - **process_medical_cancellation**: Allows full cancellation or flight credit without fees.
   b. If the customer cites a **bereavement**, then:
      i. Ask for documentation: "Could you please provide the necessary documentation for bereavement?"
      ii. If provided, `call the process_bereavement_case function` with parameters booking_reference, documentation
         - **process_bereavement_case**: Offers flexibility on cancellations or changes.
   c. If the booking is a **group booking**, then:
      i. Ask the customer: "Is this a group booking?"
      ii. If yes, `call the handle_group_booking function` with parameters booking_reference, group_request
         - **handle_group_booking**: Manages partial cancellations or name changes within the group.
   d. If the booking involves **unaccompanied minors**, then:
      i. `call the prioritize_minors function` with parameters booking_reference
         - **prioritize_minors**: Ensures unaccompanied minors are rebooked on the next available flight with proper supervision.
   e. Else, proceed to next step.

8. **Ask for Additional Assistance**
   a. Prompt the customer: "Is there anything else I can assist you with today?"
   b. If the customer has additional requests, loop back to step 2.
   c. Else, proceed to step 9.

9. **Case Resolution**
   a. `call the case_resolution function` with parameters booking_reference, final_status
      - **case_resolution**: Marks the case as resolved and logs the final status.

10. **Handle Exceptions or Escalations**
    a. If at any point the issue falls outside standard policies, then:
       i. Inform the customer: "I understand your concern. Let me escalate this to a supervisor for further assistance."
       ii. `call the escalate_to_supervisor function` with parameters booking_reference, issue_details
          - **escalate_to_supervisor**: Forwards the case to a supervisor for specialized handling.
    b. Else, inform the customer: "I don't know."
```
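
Because the conversion prompt asks for function calls written as `call the <name> function`, the names a routine references can be pulled out with a regular expression and compared with the tools the agent is actually given. The following is a minimal sketch under that assumption: `referenced_functions` is a hypothetical helper, the excerpt is copied from the routine above, and `available_tools` is an illustrative set rather than the real contents of `functionDefinitions.TOOLS`.

```python
import re

def referenced_functions(routine: str) -> set[str]:
    """Return every function name written as `call the <name> function`."""
    return set(re.findall(r"call the (\w+) function", routine))

# Excerpt from the generated routine shown above.
routine_excerpt = """
3. **Handle Cancellations**
   a. `call the check_ticket_type function` with parameter booking_reference
   b. If the ticket is **refundable**, then:
      i. `call the process_refund function` with parameters booking_reference, refund_amount
"""

# Hypothetical set of tool names available to the agent (an assumption).
available_tools = {"check_ticket_type", "process_full_refund"}

# Names the routine mentions that the agent could not actually call.
missing = referenced_functions(routine_excerpt) - available_tools
print(sorted(missing))
```

A mismatch like this is one plausible reason for a routine to score poorly on a function-calling eval, since the routine may name functions that differ from the ones the eval expects.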

## Evaluating Accuracy

Now that we have a routine generated with o1, we can run it against our evaluation suite and measure its accuracy.

We'll start by creating an agent that is equipped with the policy and a list of tools. It will be given the messages from an existing conversation and tasked with determining the next best action to take.

In [7]:
def agent_response(transcript, policy, model):
    try:
        messages = [
            {
                "role": "system",
                "content": f"""
You are a customer service agent that is responsible for handling airline related issues. Below is the exact policy that you must follow to address the customer's issue.

POLICY:
{policy}
                """
            }
        ]

        messages.extend(transcript)
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=TOOLS
        )
        
        return response.choices[0].message 
    except Exception as e:
        print(f"An error occurred: {e}")
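
The `TOOLS` list imported from `functionDefinitions` is not shown in this notebook. As a rough guide, each entry passed to the `tools` parameter of the Chat Completions API is a dictionary describing one function with a JSON Schema for its parameters. The entry below is a hypothetical sketch: the function name is taken from the eval results, but its description and parameter schema are assumptions, not the notebook's actual definitions.

```python
# Hypothetical entry; the real definitions live in functionDefinitions.py.
EXAMPLE_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "check_ticket_type",
            "description": "Check whether a ticket is refundable, non-refundable, or flexible.",
            "parameters": {
                "type": "object",
                "properties": {
                    "booking_reference": {
                        "type": "string",
                        "description": "The customer's booking reference.",
                    }
                },
                "required": ["booking_reference"],
                "additionalProperties": False,
            },
        },
    }
]

# The names the model is able to choose from when it issues a tool call.
tool_names = [tool["function"]["name"] for tool in EXAMPLE_TOOLS]
print(tool_names)
```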

We will process each row in parallel to reduce runtime and compare the function call + inputs that the model selects against our expected function + parameters.
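
The comparison described above is strict: a row counts as correct only when the function name matches and the parsed arguments equal the expected inputs exactly. A minimal sketch of that check, using a hypothetical helper `is_match` and illustrative values:

```python
import json

def is_match(function_name, arguments_json, expected_function, expected_inputs):
    """True only when the name matches and the parsed arguments are equal."""
    # Tool-call arguments arrive as a JSON string, so parse before comparing.
    arguments = json.loads(arguments_json)
    return function_name == expected_function and arguments == expected_inputs

expected = {"booking_reference": "ABC123"}

# Same function, same arguments: counted as correct.
print(is_match("check_ticket_type", '{"booking_reference": "ABC123"}',
               "check_ticket_type", expected))

# An extra argument makes the dictionaries unequal: counted as incorrect.
print(is_match("check_ticket_type", '{"booking_reference": "ABC123", "note": "x"}',
               "check_ticket_type", expected))
```

Exact dictionary equality keeps the metric simple, at the cost of penalizing calls that are otherwise reasonable but include an additional or differently named argument.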

In [8]:
def process_row(row_number, row, policy, model):
    try:
        # Extract values from the current row
        conversation_str = row['conversation']
        expected_function = row['expected_function']
        expected_inputs_str = row['expected_inputs']

        # Parse the conversation JSON
        try:
            conversation = json.loads(conversation_str)
        except json.JSONDecodeError as e:
            print(f"Error parsing 'conversation' in row {row_number}: {e}")
            conversation = None

        # Parse the expected_inputs JSON
        try:
            expected_inputs = json.loads(expected_inputs_str)
            # If expected_inputs is a string (double-encoded), parse it again
            if isinstance(expected_inputs, str):
                expected_inputs = json.loads(expected_inputs)
        except json.JSONDecodeError as e:
            print(f"Error parsing 'expected_inputs' in row {row_number}: {e}")
            expected_inputs = None

        # Get the agent's next action; agent_response returns None if the API call failed
        response = agent_response(conversation, policy, model)
        assistant_message_content = response.content if response else None
        tool_calls = response.tool_calls if response else None

        # If the response contains only an assistant message and no tool call, query once more to get a tool call.
        if not tool_calls and assistant_message_content:
            # Append the assistant's message content to the conversation
            conversation.append({"role": "assistant", "content": assistant_message_content})
            # Make another call to agent_response
            response = agent_response(conversation, policy, model)
            tool_calls = response.tool_calls if response else None

        if not tool_calls:
            actual_function = None
            actual_inputs = None
            is_correct = False
        else:
            tool_call = tool_calls[0]  # Assuming we're only interested in the first tool call
            function_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)

            is_correct = (function_name == expected_function) and (arguments == expected_inputs)
            actual_function = function_name
            actual_inputs = arguments

        return {
            'expected_function': expected_function,
            'expected_inputs': expected_inputs,
            'actual_function': actual_function,
            'actual_inputs': actual_inputs,
            'is_correct': is_correct,
            'assistant_message_content': assistant_message_content
        }

    except Exception as e:
        print(f"Error processing row {row_number}: {e}")
        return {
            'expected_function': row.get('expected_function'),
            'expected_inputs': row.get('expected_inputs'),
            'actual_function': None,
            'actual_inputs': None,
            'is_correct': False,
            'assistant_message_content': None
        }

def evaluate_function_calls(file_path, policy, model):
    records = []

    # Fail fast if the file is missing; callers unpack a (df, accuracy) tuple
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"The file '{file_path}' does not exist.")

    try:
        with open(file_path, 'r', newline='', encoding='utf-8') as csvfile:
            # Initialize the CSV reader with pipe as delimiter
            reader = csv.DictReader(csvfile, delimiter='|', quotechar='"', escapechar='\\')

            # Use ThreadPoolExecutor to process rows in parallel
            with ThreadPoolExecutor() as executor:
                futures = {executor.submit(process_row, row_number, row, policy, model): row_number for row_number, row in enumerate(reader, start=2)}
                for future in futures:
                    record = future.result()
                    records.append(record)

    except Exception as e:
        print(f"An unexpected error occurred while reading the CSV file: {e}")
        # Re-raise so callers that unpack (df, accuracy) do not fail on None
        raise

    df = pd.DataFrame(records)
    total_accuracy = df['is_correct'].mean()
    return df, total_accuracy
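
The eval file itself is not shown here. From the reader settings and the columns accessed in `process_row`, it appears to be a pipe-delimited CSV whose `conversation` and `expected_inputs` columns hold JSON strings. A minimal sketch of one such row, built in memory with hypothetical content and read back with the same reader settings:

```python
import csv
import io
import json

# Build one hypothetical eval row in memory using the pipe delimiter.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["conversation", "expected_function", "expected_inputs"],
    delimiter="|",
)
writer.writeheader()
writer.writerow({
    "conversation": json.dumps(
        [{"role": "user", "content": "I want to cancel. My booking reference is ABC123."}]
    ),
    "expected_function": "check_ticket_type",
    "expected_inputs": json.dumps({"booking_reference": "ABC123"}),
})

# Read it back with the same settings evaluate_function_calls uses.
buffer.seek(0)
reader = csv.DictReader(buffer, delimiter="|", quotechar='"', escapechar="\\")
rows = list(reader)

print(json.loads(rows[0]["conversation"])[0]["role"])
print(json.loads(rows[0]["expected_inputs"]))
```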


Let's take a look at our results.

In [9]:
# Assuming the CSV file is located at 'evals/functionCallingEval.csv'
df, accuracy = evaluate_function_calls('evals/functionCallingEval.csv', flight_cancellation_routine, 'gpt-4o-mini-2024-07-18')

# Display the accuracy as a mini header
display(Markdown(f"### Accuracy: {accuracy:.2%}"))

display(df)

### Accuracy: 70.59%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},check_ticket_type,{'booking_reference': 'GHI789'},False,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"You have a non-refundable ticket. I can provide you with flight credit for future use, subject to an administration fee. Additionally, please note that a penalty fee applies to non-refundable tickets.\n\nWould you like me to proceed with offering you flight credit, or do you need any further information?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},,,False,Would you like to make a same-day change or a change in advance?
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},process_partial_group_cancellation,{'booking_reference': 'BCD890'},False,
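
The accuracy figure is the mean of the `is_correct` column, that is, the fraction of rows where the agent chose the expected function and inputs. A minimal sketch of that arithmetic in plain Python, using a hypothetical split of 12 correct rows out of 17 (the actual row count of the eval file is an assumption here):

```python
# Hypothetical results: 12 correct rows out of 17.
is_correct = [True] * 12 + [False] * 5

# The mean of a boolean column is the fraction of True values.
accuracy = sum(is_correct) / len(is_correct)
print(f"{accuracy:.2%}")
```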


## Metaprompting

Let's now use o1 again in a metaprompting loop to see if we can improve the routine's accuracy on our evals.

We'll take the following multi-step approach:
- We'll pass the current routine and eval results to o1 and ask it to analyze the results and update the routine accordingly
- Since o1 does not currently support structured outputs, we'll chain its output with gpt-4o to enforce a schema we can parse
- Finally, we take the new routine and run it back through our eval to generate new results

We'll run this loop a fixed number of times and see what improvements we can make.
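
The steps above amount to an evaluate-then-rewrite loop. Since accuracy is not guaranteed to rise on every pass, one possible variation is to remember the best-scoring routine rather than only the latest one. The sketch below illustrates that idea with stand-in functions; `refine`, `fake_evaluate`, and `fake_improve` are hypothetical names, and the scores are made up for illustration.

```python
from typing import Callable, Tuple

def refine(initial: str,
           evaluate: Callable[[str], float],
           improve: Callable[[str, float], str],
           iterations: int = 5) -> Tuple[str, float]:
    """Run evaluate -> improve repeatedly and return the best-scoring routine."""
    current = initial
    best, best_score = initial, float("-inf")
    for _ in range(iterations):
        score = evaluate(current)
        # Keep the routine only if it beats every earlier one.
        if score > best_score:
            best, best_score = current, score
        current = improve(current, score)
    return best, best_score

# Stand-ins for the eval run and the model rewrite step (both hypothetical).
scores = {"v0": 0.6, "v1": 1.0, "v2": 0.8}
next_version = {"v0": "v1", "v1": "v2", "v2": "v2"}

def fake_evaluate(routine: str) -> float:
    return scores[routine]

def fake_improve(routine: str, score: float) -> str:
    return next_version[routine]

result = refine("v0", fake_evaluate, fake_improve, iterations=3)
print(result)
```

In a real run, `evaluate` would wrap the eval harness and `improve` would wrap the model calls that rewrite the routine.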

In [10]:
def metaprompt(messages):
    try:
        response = client.chat.completions.create(
            model='o1-preview',
            messages=messages,
        )
        
        return response.choices[0].message.content
    except Exception as e:
        print(f"An error occurred: {e}")

In [11]:
def enforce_schema(updated_prompt):
    try:
        messages = [
            {
                "role": "system",
                "content": f"""
You will be given a response from an LLM that just generated a policy for flight cancellations.
Your task is to take just the policy exactly as it is written and return it in the defined json. Remove all parts from the LLM's answer that are not part of the policy.

LLM RESPONSE:
{updated_prompt}
                """
            }
        ]

        response = client.chat.completions.create(
            model='gpt-4o-2024-08-06',
            messages=messages,
            response_format= {
                "type": "json_schema",
                "json_schema": {
                    "name": "policy_output",
                    "schema": {
                    "type": "object",
                    "properties": {
                        "final_answer": { "type": "string" }
                    },
                    "required": ["final_answer"],
                    "additionalProperties": False
                    },
                    "strict": True
                }
            }

        )
        
        return response.choices[0].message.content
    except Exception as e:
        print(f"An error occurred: {e}") 
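
`enforce_schema` returns the message content as a JSON string with a single `final_answer` field, so the policy text can be recovered with `json.loads`. A minimal sketch of that parsing step, using a hypothetical helper `extract_final_answer` that also tolerates an optional Markdown code fence around the payload (an assumption about possible model output, not something the schema produces):

```python
import json

def extract_final_answer(raw: str) -> str:
    """Parse the schema-enforced response and return the policy text."""
    text = raw.strip()
    # Tolerate an optional Markdown code fence around the JSON payload.
    if text.startswith("```"):
        text = text.removeprefix("```json").removeprefix("```")
        text = text.removesuffix("```").strip()
    return json.loads(text)["final_answer"]

# Hypothetical response content matching the policy_output schema.
sample = '{"final_answer": "1. Confirm the customer identity."}'
print(extract_final_answer(sample))
```

Note that `str.strip` with an argument removes any of the given characters from both ends rather than a literal prefix or suffix, which is why `removeprefix` and `removesuffix` (Python 3.9+) are used here.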

In [12]:
updated_policy = flight_cancellation_routine
messages = [
    {
        "role": "user",
        "content": f"""
You are an agent that is responsible for improving the quality of routine instructions that are provided to a customer service LLM agent.

I am going to give you the policy for the customer service agent that contains detailed instructions on how to handle flight cancellations and changes.

You will also be provided with the results from an eval set that include the following:
    - conversation history: This is the conversation that we present to the LLM along with the system prompt
    - expected_function: This is the function we expect the LLM to call
    - expected_inputs: This is the input we expect the LLM to provide to the function
    - actual_function: This is the actual function the LLM called
    - actual_inputs: This is the actual input the LLM provided
    - assistant_message_content: This is the message the LLM generated when it returned its response
    - is_correct: True/False value depending on if the model responded correctly

Carefully analyze the instructions provided as well as the results of the eval. Get a firm understanding of the failures in the policy.

Return an updated policy that will perform better against the dataset.

Here is the current policy:
{flight_cancellation_policy}
"""
    }
]

for _ in range(5):
    # Evaluate the function calls with the current policy
    df, accuracy = evaluate_function_calls('evals/functionCallingEval.csv', updated_policy, 'gpt-4o-mini-2024-07-18')
    
    # Display the accuracy as a mini header
    display(Markdown(f"### Accuracy: {accuracy:.2%}"))
    display(df)
    results_json = df.to_json(orient='records')

    messages.append({
        "role": "user",
        "content": f"""
Here are the results based on the current policy:
{results_json}
"""
    })
    # Use the metaprompt function to get an updated policy
    temp_policy_json = enforce_schema(metaprompt(messages))
    # Structured outputs return a bare JSON string, so it can be parsed directly
    temp_policy = json.loads(temp_policy_json)["final_answer"]
    print(f"Corrected Policy: {temp_policy}")

    messages.append({
        "role": "assistant",
        "content": f"""
{temp_policy}
"""
    })

    # Update the policy for the next iteration
    updated_policy = temp_policy


### Accuracy: 64.71%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},check_ticket_type,{'booking_reference': 'GHI789'},False,
3,offer_flight_credit,{'booking_reference': 'JKL012'},,,False,"Since you have a non-refundable ticket, we can offer you flight credit for future use, subject to an administration fee. Please note that a penalty fee will also apply to non-refundable tickets.\n\nWould you like to proceed with the flight credit option?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},,,False,Would you like to make a same-day change or a change in advance?
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},check_ticket_type,{'booking_reference': 'BCD890'},False,


Corrected Policy: **Internal Flight Cancellations and Changes Policy**

**Purpose**: This document serves as a detailed guide for internal support agents to handle flight cancellations and changes. The focus is on providing clear instructions, ensuring efficiency, consistency, and customer satisfaction.

**Note**: Always maintain a calm, empathetic tone while assisting customers, especially during stressful situations involving cancellations or major changes.

## **Table of Contents**

1. General Guidelines for Handling Customer Requests  
2. Functions and Procedures  
3. Cancellations: Types and Policies  
4. Changes: Types and Policies  
5. Rebooking Guidelines  
6. Compensation and Refund Rules  
7. Special Cases  
8. FAQs for Common Scenarios

---

### **1\. General Guidelines for Handling Customer Requests**

- **Confirm Identity**: Verify the customer's identity by asking for their booking reference and any additional required details (e.g., name and flight number).
  
- **Listen

### Accuracy: 100.00%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,"Could you please provide the new date you would like to change your flight to? Additionally, I need to check the type of ticket you have to proceed. I will do that now."
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Since your ticket is non-refundable, you will be offered flight credit for future use. I'll proceed with that now."
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},process_change_no_fee,{'booking_reference': 'VWX234'},True,
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: - **Under "Minimal Dialogue":**
  - **Do Not Mention Internal Processes:**
    - *Guideline:* Avoid stating internal actions or checks being performed. Focus communication on the customer's needs and the information required from them.
  - **Example Adjustment:**
    - Instead of saying, "I need to check the type of ticket you have to proceed. I will do that now," simply proceed with the necessary function call or ask for essential information if required.

- **Emphasize Clarity and Conciseness:**
  - Ensure that all communications are clear, necessary, and contribute directly to resolving the customer's request.


### Accuracy: 82.35%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},process_change_no_fee,{'booking_reference': 'DEF456'},False,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Since you have a non-refundable ticket, please let me know how you would like to proceed. You may consider options such as offering a flight credit or assessing your eligibility for any potential exceptions."
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},rebook_without_fee,{'booking_reference': 'VWX234'},False,
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},process_partial_group_cancellation,{'booking_reference': 'BCD890'},False,


Corrected Policy: **Updated Internal Flight Cancellations and Changes Policy**

**Purpose**: This document serves as a detailed guide for internal support agents to handle flight cancellations and changes. The focus is on providing clear instructions, ensuring efficiency, consistency, and customer satisfaction.

## **Table of Contents**

1. General Guidelines for Handling Customer Requests  
2. Functions and Procedures  
3. Cancellations: Types and Policies  
4. Changes: Types and Policies  
5. Rebooking Guidelines  
6. Compensation and Refund Rules  
7. Special Cases  
8. FAQs for Common Scenarios

### **1\. General Guidelines for Handling Customer Requests**

- **Confirm Identity**: Verify the customer's identity by asking for their booking reference and any additional required details (e.g., name and flight number).

- **Listen and Understand**: Determine if the customer wants to cancel, change, or inquire about compensation.

- **Use Appropriate Functions**: Directly call the relev

### Accuracy: 100.00%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},process_change_no_fee,{'booking_reference': 'VWX234'},True,
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: **Internal Flight Cancellations and Changes Policy**

- **Sequential Function Calls:** Emphasized the importance of calling **`check_ticket_type(booking_reference)`** before processing any changes or cancellations. This ensures that agents handle requests according to the correct ticket policies.

- **Correct Function Usage:** Clarified the distinction between similar functions (e.g., **`process_change_no_fee`** vs. **`rebook_without_fee`**) to prevent confusion and ensure appropriate actions are taken.

- **Minimal Dialogue:** Reinforced guidance on avoiding unnecessary conversation and internal process mentions, focusing communication on essential information required from the customer.

- **Comprehensive Function List:** Included all relevant functions, such as **`process_partial_group_cancellation(booking_reference)`**, to cover various scenarios agents may encounter.

- **Action-Oriented Instructions:** Provided clear, direct actions for agents to follow in each 

### Accuracy: 94.12%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Your ticket is non-refundable. Unfortunately, this means that I cannot process a cancellation for you. Would you like to explore alternate options, such as flight credit or changes with applicable fees?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},rebook_without_fee,{'booking_reference': 'VWX234'},False,
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ## **Policies for Flight Cancellations and Changes**

### **3\. Cancellations: Types and Policies**

#### **3.1 Customer-Initiated Cancellations**

- **Refundable Tickets**:

  - **Within 24 Hours of Booking**:
    - **Action**: Call **`process_full_refund(booking_reference)`**.

  - **Beyond 24 Hours**:
    - **Action**: First, call **`check_ticket_type(booking_reference)`** to confirm refundable status, then proceed with **`process_full_refund(booking_reference)`** if eligible.

- **Non-Refundable Tickets**:

  - **Action**: Offer flight credit by calling **`offer_flight_credit(booking_reference)`**.

- **Flexible Tickets**:

  - **Action**: Call **`process_full_refund(booking_reference)`** or **`offer_flight_credit(booking_reference)`** based on customer preference.

### **4\. Changes: Types and Policies**

#### **4.1 Customer-Initiated Changes**

- **Important**: Always **call **`check_ticket_type(booking_reference)`** before processing any changes** to determine 
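
Accuracy is not monotonic across iterations: in the outputs above it goes from 82.35% to 100.00% and then back down to 94.12% after the next rewrite. One option is to remember the best-scoring policy rather than keeping whichever came last. The sketch below illustrates that idea under stated assumptions and is not the notebook's method: `refine_policy`, `evaluate`, and `rewrite` are hypothetical names, with `evaluate` standing in for a call that returns accuracy as a fraction and `rewrite` standing in for the metaprompt step.

```python
def refine_policy(policy, evaluate, rewrite, iterations=5):
    """Run an evaluate-then-rewrite loop and return the best policy seen.

    evaluate(policy) -> accuracy as a fraction between 0 and 1
    rewrite(policy, accuracy) -> a new candidate policy
    """
    best_policy, best_accuracy = policy, float("-inf")
    for i in range(iterations):
        accuracy = evaluate(policy)
        if accuracy > best_accuracy:
            best_policy, best_accuracy = policy, accuracy
        # Only rewrite if the new candidate will be scored in a later pass.
        if i < iterations - 1:
            policy = rewrite(policy, accuracy)
    return best_policy, best_accuracy


# Toy stand-ins that replay the three accuracies printed above.
scores = {"v0": 0.8235, "v1": 1.0, "v2": 0.9412}
versions = iter(["v1", "v2"])

best, accuracy = refine_policy(
    "v0",
    evaluate=lambda p: scores[p],
    rewrite=lambda p, acc: next(versions),
    iterations=3,
)
print(best, accuracy)
```

In the notebook, `evaluate` could wrap `evaluate_function_calls` and return its accuracy, and `rewrite` could wrap the metaprompt call. Skipping the rewrite after the final evaluation also means the returned policy is always one that has actually been scored.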

## Distilling Down to a Smaller Model

Each time we release a new model snapshot, it can be a challenge to make sure that your existing prompt still works with it.

In this example, we'll simulate that work by adapting the routine to our older GPT-3.5 Turbo model (`gpt-3.5-turbo-0125`).

In [13]:
messages = [
    {
        "role": "user",
        "content": f"""
You are an agent that is responsible for improving the quality of routine instructions that are provided to a customer service LLM agent.

I am going to give you the policy for the customer service agent that contains detailed instructions on how to handle flight cancellations and changes.

You will also be provided with the results from an eval set that include the following:
    - conversation history: This is the conversation that we present to the LLM along with the system prompt
    - expected_function: This is the function we expect the LLM to call
    - expected_inputs: This is the input we expect the LLM to provide to the function
    - actual_function: This is the actual function the LLM called
    - actual_inputs: This is the actual input the LLM provided
    - assistant_message_content: This is the message the LLM generated when it returned its response
    - is_correct: True/False value depending on if the model responded correctly

Carefully analyze the instructions provided as well as the results of the eval. Get a firm understanding of the failures in the policy.

Return an updated policy that will perform better against the dataset.

Here is the current policy:
{updated_policy}
"""
    }
]

for _ in range(5):
    # Evaluate the function calls with the current policy
    df, accuracy = evaluate_function_calls('evals/functionCallingEval.csv', updated_policy, 'gpt-3.5-turbo-0125')
    
    # Display the accuracy as a mini header
    display(Markdown(f"### Accuracy: {accuracy:.2%}"))
    display(df)

    results_json = df.to_json(orient='records')

    messages.append({
        "role": "user",
        "content": f"""
Here are the results based on the current policy:
{results_json}
"""
    })
    # Use the metaprompt function to get an updated policy
    temp_policy_json = enforce_schema(metaprompt(messages))
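    # str.strip() removes a set of characters rather than a literal prefix/suffix,
    # so peel the code fence off explicitly (removeprefix/removesuffix need Python 3.9+)
    temp_policy_str = (
        temp_policy_json.strip()
        .removeprefix("```json")
        .removeprefix("```")
        .removesuffix("```")
        .strip()
    )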
    temp_policy = json.loads(temp_policy_str)["final_answer"]
    print(f"Corrected Policy: {temp_policy}")

    messages.append({
        "role": "assistant",
        "content": f"""
{temp_policy}
"""
    })

    # Update the policy for the next iteration
    updated_policy = temp_policy
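
Model replies that carry JSON are often wrapped in a Markdown code fence, and the fence has to come off before `json.loads` can parse the payload. When doing that, keep in mind that `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal prefix or suffix. A minimal sketch of a more explicit approach follows; `extract_final_answer` is a hypothetical helper, and the sample strings only assume what a fenced reply might look like, since the exact output of `enforce_schema` is not shown here.

```python
import json
import re

# Built programmatically so this example can itself sit inside a fenced block.
TICKS = "`" * 3
FENCE = re.compile(
    r"^" + TICKS + r"(?:json)?\s*(.*?)\s*" + TICKS + r"$",
    re.DOTALL,
)


def extract_final_answer(raw: str) -> str:
    """Return the "final_answer" field from a JSON reply, fenced or not."""
    text = raw.strip()
    match = FENCE.match(text)
    if match:
        # Keep only what sits between the opening and closing fence.
        text = match.group(1)
    return json.loads(text)["final_answer"]


# Assumed reply shapes, for illustration only.
bare = '{"final_answer": "Call check_ticket_type first."}'
fenced = TICKS + "json\n" + bare + "\n" + TICKS

print(extract_final_answer(fenced))
print(extract_final_answer(bare))
```

Because the pattern is anchored at both ends, text that merely contains the letters `j`, `s`, `o`, or `n` near its edges is left untouched, which is not guaranteed with a character-set strip.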


### Accuracy: 94.12%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"I have confirmed that you have a non-refundable ticket. Since your ticket is non-refundable, I can offer you flight credit for future use. Would you like to proceed with flight credit?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},rebook_without_fee,{'booking_reference': 'VWX234'},False,"Great news! Your ticket is a flexible ticket, which means we can proceed with rebooking your flight without any additional fees. Let's go ahead and rebook your flight on the next available option."
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ## **Policies for Flight Cancellations and Changes**

### **2\. Important Clarifications**

#### **2.1 Difference Between `process_change_no_fee` and `rebook_without_fee`**

- **`process_change_no_fee(booking_reference)`**:
  - **Use When**: The customer initiates a change to their booking and holds a **flexible ticket**.
  - **Purpose**: Processes the customer's requested change **without any additional fees**.
  - **Examples**:
    - Customer wants to change the departure time or date.
    - Customer wishes to change to a different flight on the same day.

- **`rebook_without_fee(booking_reference)`**:
  - **Use When**: The **airline initiates a change** or there is an **airline-induced disruption** (e.g., cancellations, significant delays).
  - **Purpose**: Rebooks the customer on an alternative flight **without any additional charges**.
  - **Examples**:
    - Flight is canceled by the airline.
    - Flight is significantly delayed (≥ 2 hours).
    - Offering an u

### Accuracy: 94.12%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Your ticket is non-refundable. In this case, I can offer you flight credit for future use. Would you like to proceed with that option?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},rebook_without_fee,{'booking_reference': 'VWX234'},False,"Great news! You have a flexible ticket, which means you are eligible for rebooking without any additional fees. Let's proceed with rebooking your flight."
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ## **Policies for Flight Cancellations and Changes**

### **2. Important Clarifications**

#### **2.1 Distinguishing Between Customer-Initiated and Airline-Initiated Actions**

- **Customer-Initiated Actions**:
  - **Definition**: Actions where the customer requests changes or cancellations to their booking.
  - **Functions to Use**:
    - **`process_change_no_fee(booking_reference)`**
    - **`apply_change_fee(booking_reference)`**
    - **`process_full_refund(booking_reference)`**
    - **`offer_flight_credit(booking_reference)`**

- **Airline-Initiated Actions**:
  - **Definition**: Actions where the airline changes or cancels flights due to operational reasons.
  - **Functions to Use**:
    - **`rebook_without_fee(booking_reference)`**
    - **`process_full_refund(booking_reference)`**
    - **`offer_flight_credit(booking_reference)`**

#### **2.2 Clarifying Function Usage: `process_change_no_fee` vs. `rebook_without_fee`**

- **`process_change_no_fee(booking_refe

### Accuracy: 94.12%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"I have confirmed that you have a non-refundable ticket. Since the ticket is non-refundable, I will proceed with offering you flight credit for future use. Let me process that for you."
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},rebook_without_fee,{'booking_reference': 'VWX234'},False,"Great news! You have a flexible ticket, which means you can rebook your flight without any additional fees. Let's proceed with the rebooking process."
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ## **Policies for Flight Cancellations and Changes**

### **3. Cancellations: Types and Policies**

#### **3.1 Customer-Initiated Cancellations**

- **Refundable Tickets**:

  - **Within 24 Hours of Booking**:
    - **Action**: Call **`process_full_refund(booking_reference)`**.

  - **Beyond 24 Hours**:
    - **Action**:
      1. Call **`check_ticket_type(booking_reference)`** to confirm refundable status.
      2. Proceed with **`process_full_refund(booking_reference)`** if eligible.

- **Non-Refundable Tickets**:

  - **Action**: Offer flight credit by calling **`offer_flight_credit(booking_reference)`**.

- **Flexible Tickets**:

  - **Action**: Offer options based on customer preference:
    - **`process_full_refund(booking_reference)`**
    - **`offer_flight_credit(booking_reference)`**

### **4. Changes: Types and Policies**

#### **4.1 Customer-Initiated Changes**

- **Important**: Always **call `check_ticket_type(booking_reference)` before processing any chang

### Accuracy: 100.00%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Your ticket is non-refundable. As per our policy for non-refundable tickets, I can offer you flight credit for future use. Would you like to proceed with that option?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},process_change_no_fee,{'booking_reference': 'VWX234'},True,"Great news! Your ticket is flexible, which means you can rebook your flight without any additional fees. Let's proceed with the rebooking process."
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ### **3. Cancellations: Types and Policies**

#### **3.1 Customer-Initiated Cancellations**

- **Refundable Tickets**:

  - **Within 24 Hours of Booking**:
    - **Action**: Call **`process_full_refund(booking_reference)`**.
    - **Language**: "I will process a full refund for your booking."

  - **Beyond 24 Hours**:
    - **Action Steps**:
      1. Call **`check_ticket_type(booking_reference)`** to confirm refundable status.
      2. If eligible, proceed with **`process_full_refund(booking_reference)`**.
    - **Language**: "Your ticket is refundable. I will proceed with a full refund."

- **Non-Refundable Tickets**:

  - **Action**: Offer flight credit by calling **`offer_flight_credit(booking_reference)`**.
  - **Language**: "Your ticket is non-refundable. I can offer you flight credit for future travel."

- **Flexible Tickets**:

  - **Action**: Offer options based on customer preference:
    - **`process_full_refund(booking_reference)`**
    - **`offer_flight_cr

### Accuracy: 100.00%

Unnamed: 0,expected_function,expected_inputs,actual_function,actual_inputs,is_correct,assistant_message_content
0,check_ticket_type,{'booking_reference': 'ABC123'},check_ticket_type,{'booking_reference': 'ABC123'},True,
1,check_ticket_type,{'booking_reference': 'DEF456'},check_ticket_type,{'booking_reference': 'DEF456'},True,
2,process_full_refund,{'booking_reference': 'GHI789'},process_full_refund,{'booking_reference': 'GHI789'},True,
3,offer_flight_credit,{'booking_reference': 'JKL012'},offer_flight_credit,{'booking_reference': 'JKL012'},True,"Your ticket is non-refundable. However, I can offer you flight credit for future travel. Would you like to proceed with flight credit?"
4,check_ticket_type,{'booking_reference': 'MNO345'},check_ticket_type,{'booking_reference': 'MNO345'},True,
5,prioritize_missed_connections,{'booking_reference': 'PQR678'},prioritize_missed_connections,{'booking_reference': 'PQR678'},True,
6,check_compensation_eligibility,{'booking_reference': 'STU901'},check_compensation_eligibility,{'booking_reference': 'STU901'},True,
7,process_change_no_fee,{'booking_reference': 'VWX234'},process_change_no_fee,{'booking_reference': 'VWX234'},True,"Great news! Your ticket is flexible, which means you can rebook your flight without any additional fees. Would you like to proceed with rebooking your flight?"
8,process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",process_flexible_cancellation,"{'booking_reference': 'YZA567', 'medical_certificate': 'medical_certificate_001'}",True,
9,permit_name_change,{'booking_reference': 'BCD890'},permit_name_change,{'booking_reference': 'BCD890'},True,


Corrected Policy: ## **Policies for Flight Cancellations and Changes**

### **2. Important Clarifications**

#### **2.1 Use of Terms in Customer Communications**

- **"Change"**:
  - **Use When**: The **customer initiates** modifications to their booking (e.g., changing flight dates, times, destinations).
  - **Instructions**:
    - **Always use the term "change"** when assisting with customer-initiated requests.
    - **Do Not Use**: The term "rebook" in these scenarios.
  - **Examples**:
    - **Correct**: "You can **change** your flight without any additional fees."
    - **Incorrect**: "You can **rebook** your flight without any additional fees."

- **"Rebook"**:
  - **Use When**: The **airline initiates** changes due to cancellations, significant delays, or disruptions.
  - **Instructions**:
    - Use the term "**rebook**" when the airline is adjusting the booking.
  - **Examples**:
    - "We will **rebook** you on the next available flight at no additional charge."

#### **2.2 Fu