# Summarise Personalised Reviews

In [1]:
%load_ext autoreload
%autoreload 2

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## Set up Azure OpenAI

In [2]:
import os
import openai
from dotenv import load_dotenv

# Set up Azure OpenAI
load_dotenv()
openai.api_type = "azure"
openai.api_base = "https://tutorial-openai-01-2023.openai.azure.com/"
openai.api_version = "2022-12-01"
openai.api_key = os.getenv("OPENAI_API_KEY")

True

## Deploy a Language Model
ref: 
- https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models
- https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models#text-search-embedding

In [3]:
# id of desired_model
query_model = "text-search-davinci-query-001" # suitable for Search, context relevance, information retrieval

# list models deployed
deployment_id = None
result = openai.Deployment.list()

for deployment in result.data:
    if deployment["status"] != "succeeded":
        continue
    
    model = openai.Model.retrieve(deployment["model"])
    if model["id"] == query_model:
        deployment_id = deployment["id"]
        
# if no matching model is deployed, deploy one
if not deployment_id:
    print('No deployment with status: succeeded found.')

    # Now let's create the deployment
    print(f'Creating a new deployment with model: {query_model}')
    result = openai.Deployment.create(model=query_model, scale_settings={"scale_type":"standard"})
    deployment_id = result["id"]
    print(f'Successfully created {query_model} with deployment_id {deployment_id}')
else:
    print(f'Found a succeeded deployment of "{query_model}" that supports text search with id: {deployment_id}.')

Found a succeeded deployment of "text-search-davinci-query-001" that supports text search with id: deployment-33421982e4f64be9a3a9b2bb9af45def.


## Create Embeddings for Reviews

See [01-get-embeddings.ipynb](./01-get-embeddings.ipynb) for how to generate the embeddings.

This notebook loads precomputed embeddings from a file.

## Load Data

In [4]:
import pandas as pd
fname = '../data/rottentomatoes-20movies-embeddings.csv'
df_orig = pd.read_csv(fname, delimiter='\t', index_col=False) # This takes approx 46 sec.

In [8]:
import numpy as np

DEVELOPMENT = False  # Set to True for development using a subset of data

if DEVELOPMENT:
    # Sub-sample for development
    df = df_orig.sample(n=50, replace=False, random_state=9).copy()
else:
    df = df_orig.copy()

# drop rows with NaN
df.dropna(inplace=True)

# convert string to array
df["embedding"] = df['embedding'].apply(eval).apply(np.array) # This takes approx 3 min 13 sec
df.head()
df.shape

Unnamed: 0,Movie,Publish,Review,Date,Score,Word_Count,embedding
0,SOLO: A STAR WARS STORY,Stuff.co.nz,The formula is strong with this one.,24/05/2018,70.0,7,"[-0.013439337722957134, 0.006897337269037962, ..."
1,BLACK PANTHER,Gone With The Twins,Just about the same as every other Marvel title.,05/12/2020,50.0,9,"[-0.006859001703560352, 0.0037438718136399984,..."
2,DUNKIRK,Screen Zealots,This is one heck of a stunning war picture.,20/12/2018,80.0,9,"[-0.003785009030252695, 0.004640915431082249, ..."
3,KNIVES OUT,Student Edge,Don't fear: No spoilers here. All you need to ...,26/11/2019,80.0,17,"[0.0009526872891001403, 0.016423344612121582, ..."
4,KNIVES OUT,Deep Focus Review,"Sharp and funny, Knives Out exceeds expectatio...",23/02/2022,100.0,29,"[-0.005653353873640299, 0.010235012508928776, ..."


(6640, 7)
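
The `eval` call in the cell above executes whatever text is stored in the CSV column. As a hedged alternative, here is a minimal sketch that assumes each cell holds a plain bracketed list of numbers, so `json.loads` can parse it without running any code. The helper name `parse_embedding` and the sample string are illustrative and not part of the original notebook.

```python
import json
from typing import List


def parse_embedding(cell: str) -> List[float]:
    """Parse a stringified list of floats such as '[0.1, -0.2]' without eval."""
    values = json.loads(cell)
    if not isinstance(values, list):
        raise ValueError("expected a JSON list of numbers")
    return [float(v) for v in values]


# Illustrative cell content (shortened values, not taken from the dataset)
sample = "[-0.0134, 0.0069, 0.0012]"
vector = parse_embedding(sample)
print(len(vector), vector[0])
```

In the notebook this could stand in for `eval`, for example `df['embedding'].apply(parse_embedding).apply(np.array)`, provided the stored strings really are plain lists.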

In [10]:
df['Movie'].value_counts()

JOKER                               380
CAPTAIN MARVEL                      373
ONCE UPON A TIME... IN HOLLYWOOD    372
AVENGERS: ENDGAME                   370
US                                  358
STAR WARS: THE RISE OF SKYWALKER    351
A STAR IS BORN                      340
BLACK PANTHER                       339
AVENGERS: INFINITY WAR              329
SOLO: A STAR WARS STORY             321
STAR WARS: THE LAST JEDI            320
SPIDER-MAN: FAR FROM HOME           319
DUNKIRK                             316
KNIVES OUT                          311
TOY STORY 4                         309
READY PLAYER ONE                    308
WONDER WOMAN                        307
1917                                306
FIRST MAN                           306
ROGUE ONE: A STAR WARS STORY        305
Name: Movie, dtype: int64

## Count Tokens

In [11]:
import tiktoken
encoding = tiktoken.get_encoding('gpt2')

df['token_count'] = 0

# Assign with df.loc[row, column]; chained indexing (df[col].loc[row] = ...) triggers SettingWithCopyWarning
for idx, review in zip(df.index, df['Review']):
    df.loc[idx, 'token_count'] = len(encoding.encode(review))

df.head()

Unnamed: 0,Movie,Publish,Review,Date,Score,Word_Count,embedding,token_count
0,SOLO: A STAR WARS STORY,Stuff.co.nz,The formula is strong with this one.,24/05/2018,70.0,7,"[-0.013439337722957134, 0.006897337269037962, ...",8
1,BLACK PANTHER,Gone With The Twins,Just about the same as every other Marvel title.,05/12/2020,50.0,9,"[-0.006859001703560352, 0.0037438718136399984,...",10
2,DUNKIRK,Screen Zealots,This is one heck of a stunning war picture.,20/12/2018,80.0,9,"[-0.003785009030252695, 0.004640915431082249, ...",10
3,KNIVES OUT,Student Edge,Don't fear: No spoilers here. All you need to ...,26/11/2019,80.0,17,"[0.0009526872891001403, 0.016423344612121582, ...",23
4,KNIVES OUT,Deep Focus Review,"Sharp and funny, Knives Out exceeds expectatio...",23/02/2022,100.0,29,"[-0.005653353873640299, 0.010235012508928776, ...",37


## Find documents whose embeddings are similar to the question's embedding

In [12]:
import numpy as np

def get_embedding(text, deployment_id=deployment_id):
    """ 
    Get embeddings for an input text. 
    """
    result = openai.Embedding.create(
      deployment_id=deployment_id,
      input=text
    )
    result = np.array(result["data"][0]["embedding"])
    return result

def vector_similarity(x, y):
    """
    Returns the similarity between two vectors.    
    Because OpenAI Embeddings are normalized to length 1, the cosine similarity is the same as the dot product.
    """
    similarity = np.dot(x, y)
    return similarity 

def order_document_sections_by_query_similarity(query, contexts):
    """
    Find the query embedding for the supplied query, and compare it against all of the pre-calculated document embeddings
    to find the most relevant sections. 
    Return the list of document sections, sorted by relevance in descending order.
    """
    query_embedding = get_embedding(query)

    document_similarities = sorted([
        (vector_similarity(query_embedding, doc_embedding), doc_index) for doc_index, doc_embedding in contexts.items()
    ], reverse=True)
    
    return document_similarities
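
To make the docstring's claim concrete, the sketch below checks on toy vectors that the dot product equals the cosine similarity once vectors are scaled to length 1, and then ranks documents the same way `order_document_sections_by_query_similarity` does (sorting `(similarity, index)` tuples in descending order). The vectors and keys are made up for illustration, and plain Python is used instead of NumPy so the snippet has no dependencies.

```python
import math


def dot(x, y):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def normalize(x):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(x, x))
    return [a / norm for a in x]


def cosine(x, y):
    """Cosine similarity computed from the full definition."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))


# Toy 2-D "embeddings" (illustrative values only)
query = normalize([3.0, 4.0])
docs = {
    "a": normalize([4.0, 3.0]),
    "b": normalize([-3.0, 4.0]),
    "c": normalize([3.0, 4.0]),
}

# Same ranking pattern as the notebook: sort (similarity, key) pairs, best first
ranked = sorted(((dot(query, vec), key) for key, vec in docs.items()), reverse=True)

for score, key in ranked:
    # For unit vectors the dot product and the cosine similarity agree
    assert math.isclose(score, cosine(query, docs[key]))

print([key for _, key in ranked])
```

If the stored embeddings were not unit length, the dot product would also reward longer vectors, so the shortcut only holds under the normalization assumption stated in the docstring.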

## Construct prompt
Add relevant document sections to the query prompt.

In [39]:
MAX_SECTION_LEN = 500
SEPARATOR = "\n* "
ENCODING = "gpt2"  # approximation for text-davinci-003, which tiktoken maps to "p50k_base"

encoding = tiktoken.get_encoding(ENCODING)
separator_len = len(encoding.encode(SEPARATOR))

In [40]:
def construct_prompt(query: str, context_embeddings: pd.Series, df: pd.DataFrame) -> str:
    """
    Append sections of document that are most similar to the query.
    """
    most_relevant_document_sections = order_document_sections_by_query_similarity(query, context_embeddings)
    
    chosen_sections = []
    chosen_sections_len = 0
    chosen_sections_indexes = []
     
    for _, section_index in most_relevant_document_sections:
        # Add contexts until we run out of space.        
        document_section = df.loc[section_index]
        
        chosen_sections_len += document_section['token_count'] + separator_len
        if chosen_sections_len > MAX_SECTION_LEN:
            break
            
        chosen_sections.append(SEPARATOR + 
                               'movie title: ' + document_section['Movie'] + ' ' +
                               document_section['Review'].replace("\n", " "))
        
        chosen_sections_indexes.append(str(section_index))
            
    # Diagnostic information
    print(f"Selected {len(chosen_sections)} document sections, with indexes:")    
    for i in chosen_sections_indexes:
        print(i + " " + df['Movie'].loc[int(i)])
    
    header = """Answer the question truthfully using context, if unsure, say "I don't know."\n\nContext:\n"""
    prompt = header + "".join(chosen_sections) + "\n\n Q: " + query + "\n A:"
    
    return prompt
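
The selection loop in `construct_prompt` is a greedy fill of a token budget. The sketch below isolates that logic with made-up token counts so the stopping behaviour is easy to see. The function name `select_within_budget` and the sample numbers are assumptions for illustration only.

```python
def select_within_budget(ranked_ids, token_counts, budget, separator_len):
    """Take documents in ranked order until the next one would exceed the budget."""
    chosen = []
    used = 0
    for doc_id in ranked_ids:
        # Each section costs its own tokens plus the separator that precedes it
        used += token_counts[doc_id] + separator_len
        if used > budget:
            break  # stop at the first overflow, as construct_prompt does
        chosen.append(doc_id)
    return chosen


# Hypothetical token counts, already ordered by relevance
token_counts = {"r1": 40, "r2": 25, "r3": 50, "r4": 5}
picked = select_within_budget(["r1", "r2", "r3", "r4"], token_counts,
                              budget=100, separator_len=3)
print(picked)
```

Here `r4` would still fit in the remaining budget but is never considered, because the loop breaks at the first section that overflows. Replacing `break` with `continue` (and undoing the addition) would pack in smaller, less relevant sections instead; which behaviour is preferable depends on whether relevance or coverage matters more.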

### Example prompts

In [41]:
query = 'Summarise reviews of Captain Marvel.'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)

Selected 11 document sections, with indexes:
3025 CAPTAIN MARVEL
2821 CAPTAIN MARVEL
2801 CAPTAIN MARVEL
2724 CAPTAIN MARVEL
2817 CAPTAIN MARVEL
2955 CAPTAIN MARVEL
2991 CAPTAIN MARVEL
3023 CAPTAIN MARVEL
2960 CAPTAIN MARVEL
2868 CAPTAIN MARVEL
2984 CAPTAIN MARVEL
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: CAPTAIN MARVEL A plucky and pleasing, if predictable, excursion that burns brightly, if briefly.
* movie title: CAPTAIN MARVEL It's perfectly possible to point out the narrative hiccups of the movie and, at the same time, to stress how important its existence and success are. [Full review in Portuguese]
* movie title: CAPTAIN MARVEL Captain Marvel is an unremarkable, passable time killer...[Brie] Larson's performance is wooden...while the film's big action scenes are so lacking in imagination the screen often looks as though it's being continually doused in technicolour vomit.
* movie title: CAPTAIN MARVEL Unfortunately, the

## Retrieve Information

In [42]:
def retrieve_information(prompt):
    try:
        # Request API
        response = openai.Completion.create(
            deployment_id= "text-davinci-003", # Assumed already deployed
            prompt=prompt,
            temperature=1,
            max_tokens=100,
            top_p=1.0,
            frequency_penalty=0.0,
            presence_penalty=1
        )

        # response
        result = response['choices'][0]['text']; print(result)
    except Exception as err:
        print(f"Unexpected {err=}, {type(err)=}")

    return 

## Example Queries

In [43]:
query = 'Summarise reviews of Captain Marvel.'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 11 document sections, with indexes:
3025 CAPTAIN MARVEL
2821 CAPTAIN MARVEL
2801 CAPTAIN MARVEL
2724 CAPTAIN MARVEL
2817 CAPTAIN MARVEL
2955 CAPTAIN MARVEL
2991 CAPTAIN MARVEL
3023 CAPTAIN MARVEL
2960 CAPTAIN MARVEL
2868 CAPTAIN MARVEL
2984 CAPTAIN MARVEL
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: CAPTAIN MARVEL A plucky and pleasing, if predictable, excursion that burns brightly, if briefly.
* movie title: CAPTAIN MARVEL It's perfectly possible to point out the narrative hiccups of the movie and, at the same time, to stress how important its existence and success are. [Full review in Portuguese]
* movie title: CAPTAIN MARVEL Captain Marvel is an unremarkable, passable time killer...[Brie] Larson's performance is wooden...while the film's big action scenes are so lacking in imagination the screen often looks as though it's being continually doused in technicolour vomit.
* movie title: CAPTAIN MARVEL Unfortunately, the

In [44]:
query = 'Should I watch Ready Player One?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 13 document sections, with indexes:
4755 READY PLAYER ONE
4908 READY PLAYER ONE
4872 READY PLAYER ONE
4980 READY PLAYER ONE
4843 READY PLAYER ONE
4819 READY PLAYER ONE
4949 READY PLAYER ONE
4999 READY PLAYER ONE
4782 READY PLAYER ONE
4795 READY PLAYER ONE
4801 READY PLAYER ONE
4802 READY PLAYER ONE
4966 READY PLAYER ONE
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: READY PLAYER ONE It's a film to watch and enjoy once, but probably not to return to again and again, like Spielberg's best. It's fixated on easter eggs and it's like an easter egg itself: shiny and pretty, inducing a brief sugar high, but ultimately hollow.
* movie title: READY PLAYER ONE ...this failed the watch test miserably as I didn't know if I would actually survive almost 2 1/2 hours of watching all this nonsensical gobbledygook. But it will probably gross a billion dollars.
* movie title: READY PLAYER ONE Escapist fun made for mandatory repeat viewings

In [45]:
query = 'Should I watch Spiderman Far from Home? I am big fan of visual effects.'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 12 document sections, with indexes:
2093 SPIDER-MAN: FAR FROM HOME
2228 SPIDER-MAN: FAR FROM HOME
2148 SPIDER-MAN: FAR FROM HOME
2238 SPIDER-MAN: FAR FROM HOME
2116 SPIDER-MAN: FAR FROM HOME
2348 SPIDER-MAN: FAR FROM HOME
2253 SPIDER-MAN: FAR FROM HOME
2169 SPIDER-MAN: FAR FROM HOME
2044 SPIDER-MAN: FAR FROM HOME
2050 SPIDER-MAN: FAR FROM HOME
2207 SPIDER-MAN: FAR FROM HOME
2211 SPIDER-MAN: FAR FROM HOME
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: SPIDER-MAN: FAR FROM HOME Goodness, this is cleverly constructed stuff - full of fun, humour, youth and big visual effects, but underpinned by a screenplay that does a really well thought-through job of linking the MCU's future to its hugely popular past.
* movie title: SPIDER-MAN: FAR FROM HOME It has it all: great battles and special effects, a touching human tale, dramatic twists, wit and humor, stunning scenery, super-cool gadgets, a clear place in the Marvel mythology.
*

In [46]:
query = 'Why shouldn\'t I watch spiderman? I am big fan of visual effects.'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 11 document sections, with indexes:
2093 SPIDER-MAN: FAR FROM HOME
2309 SPIDER-MAN: FAR FROM HOME
2253 SPIDER-MAN: FAR FROM HOME
2090 SPIDER-MAN: FAR FROM HOME
2044 SPIDER-MAN: FAR FROM HOME
2228 SPIDER-MAN: FAR FROM HOME
2050 SPIDER-MAN: FAR FROM HOME
2110 SPIDER-MAN: FAR FROM HOME
2169 SPIDER-MAN: FAR FROM HOME
2310 SPIDER-MAN: FAR FROM HOME
2116 SPIDER-MAN: FAR FROM HOME
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: SPIDER-MAN: FAR FROM HOME Goodness, this is cleverly constructed stuff - full of fun, humour, youth and big visual effects, but underpinned by a screenplay that does a really well thought-through job of linking the MCU's future to its hugely popular past.
* movie title: SPIDER-MAN: FAR FROM HOME The story is pyrotechnical. The soundtrack is multi-decibel. There is no room for wit, thought, emotion or seriously challenging novelty. But, simultaneously, I'd rather watch Holland do this rubbish than most movi

In [47]:
query = 'Why shouldn\'t I watch 1917?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 15 document sections, with indexes:
932 1917
778 1917
821 1917
879 1917
831 1917
851 1917
773 1917
694 1917
685 1917
686 1917
665 1917
861 1917
702 1917
909 1917
674 1917
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: 1917 Tense, powerful and breathtaking, 1917 is a visceral movie you won't forget in a hurry - and it's the first must-see movie of next year.
* movie title: 1917 From the sole perspective of the filmmaking craft, "1917" is worth a watch.
* movie title: 1917 This is a movie one does not watch so much as witness. It simply must be seen.
* movie title: 1917 Sitting through it is like watching someone else playing a video game for two solid hours, and not an especially compelling one at that.
* movie title: 1917 Full of thrills and spills yet surprisingly intimate and personal, 1917 is a breathtaking, profound tour de force, easily sealing its position as one of the greatest war films of all time.
* movie title:

In [48]:
query = 'I am not a big fan of lengthy movie, should I watch 1917?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 13 document sections, with indexes:
674 1917
879 1917
759 1917
719 1917
949 1917
952 1917
932 1917
734 1917
831 1917
862 1917
738 1917
701 1917
694 1917
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: 1917 The film's final hour loses steam and is beset by more than a few narrative lapses it ultimately can't overcome. Still, this is a worthwhile epic best seen on the big screen.
* movie title: 1917 Sitting through it is like watching someone else playing a video game for two solid hours, and not an especially compelling one at that.
* movie title: 1917 In other words, "1917" often seems built more to wow audience than make them feel. And it may well have been a better film set around extended cuts than fully committing to the one-take gimmick.
* movie title: 1917 It proves that its long takes, instead of being a way of making things more difficult or 'artistic', are a vital component for this particular story. [Full review 

In [49]:
query = 'I love visual effects, should I watch Captain Marvel or Spiderman?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 11 document sections, with indexes:
2823 CAPTAIN MARVEL
2093 SPIDER-MAN: FAR FROM HOME
2148 SPIDER-MAN: FAR FROM HOME
2857 CAPTAIN MARVEL
2116 SPIDER-MAN: FAR FROM HOME
3091 CAPTAIN MARVEL
2785 CAPTAIN MARVEL
2253 SPIDER-MAN: FAR FROM HOME
2875 CAPTAIN MARVEL
2851 CAPTAIN MARVEL
3005 CAPTAIN MARVEL
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: CAPTAIN MARVEL Great visual effects and acting by Brie Larson make for an enjoyable watch that embraces a confident and smart woman character, something rarely seen in this genre.
* movie title: SPIDER-MAN: FAR FROM HOME Goodness, this is cleverly constructed stuff - full of fun, humour, youth and big visual effects, but underpinned by a screenplay that does a really well thought-through job of linking the MCU's future to its hugely popular past.
* movie title: SPIDER-MAN: FAR FROM HOME Special credit must be given to the effects team, which brings to life Mysterio's trippy, percep

In [None]:
query = 'I love emotional movies, what movie should I watch?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 15 document sections, with indexes:
5719 DUNKIRK
5938 DUNKIRK
2463 AVENGERS: ENDGAME
920 1917
713 1917
726 1917
699 1917
2369 AVENGERS: ENDGAME
943 1917
672 1917
1817 TOY STORY 4
860 1917
842 1917
734 1917
5157 BLACK PANTHER
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: DUNKIRK It's an emotional gauntlet, as you'll be glued to the edge of your seats with your eyes staring at the screening.
* movie title: DUNKIRK The most emotional moments in Nolan's film are the smaller ones, a soldier slipping into the sea swimming towards home, a young hero recognized in a local newspaper.
* movie title: AVENGERS: ENDGAME Emotional heft combines with a sweeping sense of the epic, often within the same scene.
* movie title: 1917 Emotionally hollow, 1917 is the big what if of 2019.
* movie title: 1917 Emotional and poetic in its spectacular mixture of noise, chaos and almost abstract calm. [Full Review in Spanish]
* movie title: 1917 I w

In [None]:
query = 'Is Joker a scary movie?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 13 document sections, with indexes:
1289 JOKER
1184 JOKER
1245 JOKER
1025 JOKER
1295 JOKER
991 JOKER
1082 JOKER
1336 JOKER
1240 JOKER
1269 JOKER
1332 JOKER
1319 JOKER
1121 JOKER
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: JOKER This movie is indeed terrifying, but that's because it's horrifically, unintentionally insightful of how such men are viewed, both from without and within.
* movie title: JOKER The social commentary of "Joker" is laughably one-dimensional and Phoenix's Fleck isn't that scary. Really, he's more sad than anything.
* movie title: JOKER The Joker is deadly serious, a bleak but oddly beautiful horror film that evokes the nightmarish nihilism of Martin Scorsese's Taxi Driver.
* movie title: JOKER A disturbing and frightening origin story that reflects on sensitive issues that affect today's society. Joaquin Phoenix's bravura performance as the popular comic book villain is nothing short of extraordina

In [None]:
query = 'What type of movie is Joker?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 13 document sections, with indexes:
991 JOKER
1347 JOKER
1105 JOKER
1119 JOKER
1121 JOKER
1126 JOKER
1336 JOKER
1314 JOKER
1078 JOKER
1086 JOKER
1090 JOKER
1180 JOKER
1297 JOKER
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: JOKER Joker is dark, dense, violent, and monstrous. A descent into Hades that is unlike anything we've seen on screen that's inspired by a comic [Full review in Spanish]
* movie title: JOKER Simply calling Joker a "comic book movie" does it a disservice; it is a story that feels like it could be about any number of disaffected people who are marginalized by the ruthless world in which we live.
* movie title: JOKER "Joker" is a character study laced with societal commentary, more art-house than blockbuster. And in a world of ready-made superhero franchises, it's so bold, it's shocking.
* movie title: JOKER A dark, deranged and often mesmerizing take on the superhero genre. The sort of movie that crawls

## Failed Examples

In the examples below, retrieval returned reviews for only one of the two movie titles named in the query, so the responses are deemed unreliable.

In [50]:
query = 'I love emotional movies, should I watch Knives Out or A Star Is Born?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 11 document sections, with indexes:
303 KNIVES OUT
225 KNIVES OUT
296 KNIVES OUT
279 KNIVES OUT
36 KNIVES OUT
148 KNIVES OUT
131 KNIVES OUT
147 KNIVES OUT
60 KNIVES OUT
273 KNIVES OUT
57 KNIVES OUT
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: KNIVES OUT If there's a case for a film that should be experienced purely with an audience, Rian Johnson's "Knives Out" would be a spectacular example. His latest feature is a deliciously entertaining, hysterical whodunit from start to finish.
* movie title: KNIVES OUT This film is super satisfying. It's funny, the mystery plot really works, and it also has something to say, which it does with enjoyably righteous anger. Go for the production design and the plot, leave being blown away by De Armas.
* movie title: KNIVES OUT One of the funniest films of the year, this tale of deception and intrigue packs an all-star cast at their very best and some theatrical filmmaking from a techni

In [55]:
query = 'How is Knives Out compared to Joker?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 12 document sections, with indexes:
4 KNIVES OUT
102 KNIVES OUT
288 KNIVES OUT
290 KNIVES OUT
147 KNIVES OUT
213 KNIVES OUT
32 KNIVES OUT
269 KNIVES OUT
104 KNIVES OUT
184 KNIVES OUT
225 KNIVES OUT
264 KNIVES OUT
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: KNIVES OUT Sharp and funny, Knives Out exceeds expectations by proving to be more than its surface implies, even as Johnson demonstrates his first-rate skill in the story's maneuvers and charades.
* movie title: KNIVES OUT On the performance front, the Knives Out show is comprehensively stolen - and never once returned - by a wired, inspired and kookily amusing Daniel Craig.
* movie title: KNIVES OUT Similar to Ready or Not earlier this year, Knives Out is an unabashed commentary on the current state of affairs in America.
* movie title: KNIVES OUT Through clever plotting and smart sleight of hand, Johnson has crafted a whodunit that's worthy of Agatha Christie or Ar

In [60]:
query = 'Is the movie Joker better or 1917?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 12 document sections, with indexes:
759 1917
943 1917
831 1917
922 1917
830 1917
932 1917
777 1917
738 1917
671 1917
861 1917
748 1917
750 1917
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: 1917 In other words, "1917" often seems built more to wow audience than make them feel. And it may well have been a better film set around extended cuts than fully committing to the one-take gimmick.
* movie title: 1917 It's both stylish and gimmicky, while also emotionally tense and incredibly clever, resulting in a unique cinematic experience that's frequently harrowing, propulsive, and often emotional excruciating.
* movie title: 1917 Full of thrills and spills yet surprisingly intimate and personal, 1917 is a breathtaking, profound tour de force, easily sealing its position as one of the greatest war films of all time.
* movie title: 1917 With incredibly immersive visuals, courtesy of the great Roger Deakins, 1917 is a major cinem

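In the example below, the dataset contains no movie titled Superman, yet retrieval returns 'Wonder Woman' reviews that mention Superman, and the response is based on them.

Mitigation: Consider filtering the candidate reviews by the `Movie` column before ranking them by similarity.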
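
One way the title filter could look is sketched below, assuming exact (case-insensitive, punctuation-insensitive) title matching is good enough. The helpers `normalise` and `titles_in_query` and the short title list are hypothetical and not part of the original notebook.

```python
import re


def normalise(text):
    """Lower-case the text and collapse anything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def titles_in_query(query, titles):
    """Return the known titles whose normalised form appears as whole words in the query."""
    padded_query = f" {normalise(query)} "
    return [t for t in titles if f" {normalise(t)} " in padded_query]


# A few titles from the dataset, used here only as sample input
known_titles = ["WONDER WOMAN", "JOKER", "KNIVES OUT", "1917"]

print(titles_in_query("What are the reviews of Superman?", known_titles))
print(titles_in_query("How is Knives Out compared to Joker?", known_titles))
```

An empty result could short-circuit to "I don't know." without calling the model. Otherwise the matched titles could restrict the candidates, for example `subset = df[df['Movie'].isin(matched)]` followed by `construct_prompt(query, subset['embedding'], subset)`. Exact matching has limits: a query that says "Spiderman" will not match `SPIDER-MAN: FAR FROM HOME`, and the title `US` would match the ordinary word "us", so an alias table or fuzzy matching would likely be needed in practice.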

In [61]:
query = 'What are the reviews of Superman?'
prompt = construct_prompt(query=query, context_embeddings=df['embedding'], df=df); print(prompt)
retrieve_information(prompt=prompt)

Selected 12 document sections, with indexes:
6146 WONDER WOMAN
6206 WONDER WOMAN
6149 WONDER WOMAN
6116 WONDER WOMAN
6220 WONDER WOMAN
6202 WONDER WOMAN
6107 WONDER WOMAN
6201 WONDER WOMAN
6039 WONDER WOMAN
6311 WONDER WOMAN
6231 WONDER WOMAN
6119 WONDER WOMAN
Answer the question truthfully using context, if unsure, say "I don't know."

Context:

* movie title: WONDER WOMAN Wonder Woman is a great action movie, full of adventure and light drama, similar to films like Christopher Reeve's Superman. [Full review in Spanish]
* movie title: WONDER WOMAN There is an engaging, old‑school feel to Wonder Woman, a film that has some of the sincerity and charm of the early Christopher Reeve Superman epics.
* movie title: WONDER WOMAN DC finally breaks its long losing streak with a satisfying effort- and all it took was changing absolutely everything (Splice Today)
* movie title: WONDER WOMAN A well-intentioned film that in an age of cynicism dares to recapitulate the importance of the primeval id