notebooks/search/10-semantic-reranking-retriever-cohere.ipynb

{ "cells": [ { "cell_type": "markdown", "id": "c2907fddfeac343a", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "# Semantic Reranking with Cohere Reranker\n", "\n", "<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/10-semantic-reranking-retriever-cohere.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n", "\n", "This example will show how to combine search and [semantic reranking](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-reranking.html) to improve the accuracy of your search results. We'll be using the [rerank feature from Cohere](https://cohere.com/rerank).\n", "\n", "Note: for a complete integration with Cohere please refer to [this notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/integrations/cohere/cohere-elasticsearch.ipynb). This example focuses on Cohere reranking only through an Elastic [retriever](https://www.elastic.co/guide/en/elasticsearch/reference/current/retrievers-overview.html) query." 
] }, { "cell_type": "markdown", "id": "3db37d2cf8264468", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "# Requirements\n", "\n", "For this example, you will need:\n", "\n", "- An Elastic deployment:\n", "\n", " - We'll be using [Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html) for this example (available with a [free trial](https://cloud.elastic.co/registration?onboarding_token=vectorsearch&utm_source=github&utm_content=elasticsearch-labs-notebook))\n", "\n", "- Elasticsearch 8.15 or above, or [Elasticsearch serverless](https://www.elastic.co/elasticsearch/serverless)\n", "\n", "- Cohere API key" ] }, { "cell_type": "markdown", "id": "7fe1ed0703a8d1d3", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Create Elastic Cloud deployment\n", "\n", "If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?onboarding_token=vectorsearch&utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial." 
] }, { "cell_type": "markdown", "id": "f9c8bd62c8241f90", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Install packages and connect with Elasticsearch Client\n", "\n", "To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.15.0 or above).\n", "Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.\n", "\n", "First we need to `pip` install the `elasticsearch` package:" ] }, { "cell_type": "code", "execution_count": null, "id": "13fdf7656ced2da3", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "!pip install elasticsearch" ] }, { "cell_type": "markdown", "id": "9d54b112361d2f3d", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Next, we import the modules we need.\n", "\n", "🔐 NOTE: `getpass` lets us prompt the user for credentials without echoing them to the terminal." ] }, { "cell_type": "code", "execution_count": null, "id": "9a60627704e77ff6", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from elasticsearch import Elasticsearch, exceptions\n", "from urllib.request import urlopen\n", "from getpass import getpass\n", "import json" ] }, { "cell_type": "markdown", "id": "eb9498124146d8bb", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Now we can instantiate the Python Elasticsearch client.\n", "\n", "First we prompt the user for their Cloud ID and API key.\n", "Then we create a `client` object, an instance of the `Elasticsearch` class." 
] }, { "cell_type": "code", "execution_count": null, "id": "6e14437dcce0f235", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n", "ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n", "\n", "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n", "ELASTIC_API_KEY = getpass(\"Elastic API Key: \")\n", "\n", "# Create the client instance\n", "client = Elasticsearch(\n", "    # For local development\n", "    # hosts=[\"http://localhost:9200\"]\n", "    cloud_id=ELASTIC_CLOUD_ID,\n", "    api_key=ELASTIC_API_KEY,\n", ")" ] }, { "cell_type": "markdown", "id": "89b6b7721f6d8599", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Enable telemetry\n", "\n", "Knowing that you are using this notebook helps us decide where to invest our efforts to improve our products. We would like to ask you to run the following code so that we can gather anonymous usage statistics. See [telemetry.py](https://github.com/elastic/elasticsearch-labs/blob/main/telemetry/telemetry.py) for details. Thank you!" ] }, { "cell_type": "code", "execution_count": null, "id": "5a7af618fb61f358", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "!curl -O -s https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/telemetry/telemetry.py\n", "from telemetry import enable_telemetry\n", "\n", "client = enable_telemetry(client, \"10-semantic-reranking-retriever-cohere\")" ] }, { "cell_type": "markdown", "id": "cbbdaf9118a97732", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Test the Client\n", "\n", "Before you continue, confirm that the client has connected with this test." 
] }, { "cell_type": "code", "execution_count": null, "id": "4cb0685fae12e034", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "print(client.info())" ] }, { "cell_type": "markdown", "id": "59e2223bf2c4331", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Refer to [the documentation](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect to a self-managed deployment.\n", "\n", "Read [this page](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect using API keys." ] }, { "cell_type": "markdown", "id": "a45f9857", "metadata": {}, "source": [ "## Set up Cohere Inference Endpoint\n", "\n", "We'll be using the [Cohere rerank](https://cohere.com/rerank) feature to perform semantic reordering of search hits through an Elasticsearch [inference endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html).\n", "\n", "Go to the [Cohere website](https://cohere.com/) and create an API key, then set it here." 
] }, { "cell_type": "code", "execution_count": null, "id": "13ddd1ed", "metadata": {}, "outputs": [], "source": [ "COHERE_API_KEY = getpass(\"Cohere API key: \")" ] }, { "cell_type": "markdown", "id": "22fa643780acd44a", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Create the Inference Endpoint\n", "\n", "Let's create the inference endpoint by using the [Create inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html).\n", "\n", "For this example we'll use the [Cohere service](https://www.elastic.co/guide/en/elasticsearch/reference/current/infer-service-cohere.html), but the inference API also supports [many other inference services](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc)." ] }, { "cell_type": "code", "execution_count": null, "id": "b02700b4", "metadata": {}, "outputs": [], "source": [ "# Delete any endpoint left over from a previous run\n", "try:\n", "    client.inference.delete(inference_id=\"cohere-rerank-inference\")\n", "except exceptions.NotFoundError:\n", "    # Inference endpoint does not exist\n", "    pass\n", "\n", "try:\n", "    client.options(\n", "        request_timeout=60, max_retries=3, retry_on_timeout=True\n", "    ).inference.put(\n", "        task_type=\"rerank\",\n", "        inference_id=\"cohere-rerank-inference\",\n", "        inference_config={\n", "            \"service\": \"cohere\",\n", "            \"service_settings\": {\n", "                \"api_key\": COHERE_API_KEY,\n", "                \"model_id\": \"rerank-english-v3.0\",\n", "            },\n", "        },\n", "    )\n", "    print(\"Inference endpoint created successfully\")\n", "except exceptions.BadRequestError as e:\n", "    if e.error == \"resource_already_exists_exception\":\n", "        print(\"Inference endpoint already exists\")\n", "    else:\n", "        raise e" ] }, { "cell_type": "markdown", "id": "818f7a72a83b5776", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Create the Index\n", "\n", "Now we need to create an index. 
Let's create one that enables us to perform search and semantic reranking on text articles." ] }, { "cell_type": "code", "execution_count": null, "id": "ace87760606f67c6", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "client.indices.delete(index=\"semantic-reranking-articles\", ignore_unavailable=True)\n", "client.indices.create(\n", " index=\"semantic-reranking-articles\",\n", " mappings={\n", " \"properties\": {\n", " \"title\": {\"type\": \"text\"},\n", " \"text\": {\"type\": \"text\"},\n", " },\n", " },\n", ")" ] }, { "cell_type": "markdown", "id": "2b5a46b60660a489", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Populate the Index\n", "\n", "Let's populate the index with a couple of random article fragments from Wikipedia." ] }, { "cell_type": "code", "execution_count": null, "id": "24f0133923553d28", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "url = \"https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/articles-wikipedia.json\"\n", "response = urlopen(url)\n", "articles = json.loads(response.read())\n", "\n", "operations = []\n", "for article in articles:\n", " operations.append({\"index\": {\"_index\": \"semantic-reranking-articles\"}})\n", " operations.append(article)\n", "client.bulk(index=\"semantic-reranking-articles\", operations=operations, refresh=True)" ] }, { "cell_type": "markdown", "id": "6fff5932fcbac1b0", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Search without reranking\n", "\n", "First let's run a classic search that uses lexical text matching.\n", "\n", "### Aside: Pretty printing Elasticsearch search results\n", "\n", "Your `search` API calls will return hard-to-read nested JSON.\n", "We'll create a little function called `pretty_search_response` to return nice, human-readable outputs from our 
examples." ] }, { "cell_type": "code", "execution_count": null, "id": "ad417b4b3f50c889", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "def pretty_search_response(response):\n", "    if len(response[\"hits\"][\"hits\"]) == 0:\n", "        print(\"Your search returned no results.\")\n", "    else:\n", "        for hit in response[\"hits\"][\"hits\"]:\n", "            id = hit[\"_id\"]\n", "            score = hit[\"_score\"]\n", "            title = hit[\"_source\"][\"title\"]\n", "            text = hit[\"_source\"][\"text\"]\n", "\n", "            pretty_output = f\"\\nID: {id}\\nScore: {score}\\nTitle: {title}\\nText: {text}\"\n", "\n", "            print(pretty_output)" ] }, { "cell_type": "markdown", "id": "926c77e0", "metadata": {}, "source": [ "Assume we're interested in learning about solar eclipses, but we don't know the exact name of this phenomenon. We'll perform a classic search that matches the text _\"the Moon covers the Sun\"_. Let's see what results this finds:" ] }, { "cell_type": "code", "execution_count": 67, "id": "7d5a50e3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "ID: yyKfN5EB0RS1pNqNt7y0\n", "Score: 2.0718374\n", "Title: Cheshire\n", "Text: Cheshire is a county in England. It is the North West part of the country. It is most famous for making salt and cheese. Cheshire is made up of lots of little towns including the Borough of Macclesfield which covers a large area of plains. The main attraction is in Kerridge where there is the famous landmark 'White Nancy.'\n", "\n", "ID: 1CKfN5EB0RS1pNqNt7y0\n", "Score: 0.8610966\n", "Title: Sun Moon Lake\n", "Text: Sun Moon Lake (; Thao: \"Zintun\") is a lake in Nantou County, Taiwan. It is the largest lake in Taiwan. Sun Moon Lake is one of the Eight Views of Taiwan. 
The lake was named because the east side of the lake looks like a sun, and the west side of the lake looks like a moon.\n", "\n", "ID: 1SKfN5EB0RS1pNqNt7y0\n", "Score: 0.83579814\n", "Title: Unification Church\n", "Text: The Unification Church is a religious movement started by Sun Myung Moon in Korea in the 1940s. It officially began as a church in 1954 in Seoul, South Korea. On October 12, 2009, it was announced that Sun Myung Moon was given the church to his sons, Moon Hyung-jin, Moon Kook-jin, and Moon Hyun-jin.\n", "\n", "ID: zyKfN5EB0RS1pNqNt7y0\n", "Score: 0.782294\n", "Title: Phases of the Moon\n", "Text: As the Moon orbits around the Earth, the half of the Moon that faces the Sun will be lit up. The different shapes of the lit portion of the Moon that can be seen from Earth are known as phases of the Moon. Each phase repeats itself every 29.5 days.\n", "\n", "ID: ziKfN5EB0RS1pNqNt7y0\n", "Score: 0.77926123\n", "Title: Orbital revolution\n", "Text: Orbital revolution is the movement of a planet around a star, or a moon around a planet. For example, the Earth revolves around the Sun, and the Moon revolves about the Earth.\n", "\n", "ID: 0iKfN5EB0RS1pNqNt7y0\n", "Score: 0.75159883\n", "Title: Solar eclipse of December 14, 2020\n", "Text: A total solar eclipse occurred on Monday, December 14, 2020. A solar eclipse occurs when the Moon passes between Earth and the Sun, which will cover the image of the Sun for a viewer on Earth.\n", "\n", "ID: 0SKfN5EB0RS1pNqNt7y0\n", "Score: 0.73564994\n", "Title: Solar eclipse\n", "Text: As seen from Earth, a solar eclipse /\"ee-klips\"/ happens when the Moon is directly between the Earth and the Sun. This makes the Moon fully or partially (partly) cover the sun. Solar eclipses can only happen during a new moon. Every year there are about two solar eclipses. Sometimes there are even five solar eclipses in a year. 
However, only two of these can be total solar eclipses, and often a year will pass without a total eclipse.\n", "\n", "ID: zCKfN5EB0RS1pNqNt7y0\n", "Score: 0.7309638\n", "Title: Mundilfari\n", "Text: Mundilfari (Mundilfäri) (Old Norse, possibly \"the one moving according to particular times\") is in Norse mythology a father of Sól (Sun) and Máni (Moon). One moon is named after him.\n", "\n", "ID: 0CKfN5EB0RS1pNqNt7y0\n", "Score: 0.71921694\n", "Title: Pokémon Ultra Sun and Ultra Moon\n", "Text: Pokémon Ultra Sun and Ultra Moon is a game in the 7th generation of \"Pokémon\". It was released on the Nintendo 3DS in 2017.\n", "\n", "ID: 0yKfN5EB0RS1pNqNt7y0\n", "Score: 0.6899033\n", "Title: Sun and moon letters\n", "Text: In Arabic and Maltese, consonants are divided into two groups: the sun/solar letters ( ', Maltese: konsonanti xemxin) and moon/lunar letters ( ', Maltese: konsonanti qamrin).\n" ] } ], "source": [ "query = {\"match\": {\"text\": \"the Moon covers the Sun\"}}\n", "response = client.search(\n", "    index=\"semantic-reranking-articles\",\n", "    query=query,\n", ")\n", "\n", "pretty_search_response(response)" ] }, { "cell_type": "markdown", "id": "783f82a4", "metadata": {}, "source": [ "The top hits - Cheshire, Sun Moon Lake, Unification Church and so on - all come up because their text shares words with our query, for example \"covers\" or \"sun\". However, their content is unrelated to the _meaning_ of our query. Further down, results #6 and #7 _are_ articles about solar eclipses, but they're lost among the many other hits.\n", "\n", "Can we somehow get more relevant results?" ] }, { "cell_type": "markdown", "id": "14d6c1f1", "metadata": {}, "source": [ "## Search with reranking\n", "\n", "Enter semantic reranking! We'll instruct Elasticsearch to run the same query, but this time also perform semantic reranking on the top results. 
For this we need to wrap our query in a `text_similarity_reranker` [retriever](https://www.elastic.co/guide/en/elasticsearch/reference/current/retrievers-overview.html), and reference the previously created Cohere inference endpoint that will do the reranking." ] }, { "cell_type": "code", "execution_count": 68, "id": "141457d6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "ID: 0SKfN5EB0RS1pNqNt7y0\n", "Score: 0.9812029\n", "Title: Solar eclipse\n", "Text: As seen from Earth, a solar eclipse /\"ee-klips\"/ happens when the Moon is directly between the Earth and the Sun. This makes the Moon fully or partially (partly) cover the sun. Solar eclipses can only happen during a new moon. Every year there are about two solar eclipses. Sometimes there are even five solar eclipses in a year. However, only two of these can be total solar eclipses, and often a year will pass without a total eclipse.\n", "\n", "ID: 0iKfN5EB0RS1pNqNt7y0\n", "Score: 0.9078038\n", "Title: Solar eclipse of December 14, 2020\n", "Text: A total solar eclipse occurred on Monday, December 14, 2020. A solar eclipse occurs when the Moon passes between Earth and the Sun, which will cover the image of the Sun for a viewer on Earth.\n", "\n", "ID: zyKfN5EB0RS1pNqNt7y0\n", "Score: 0.41584742\n", "Title: Phases of the Moon\n", "Text: As the Moon orbits around the Earth, the half of the Moon that faces the Sun will be lit up. The different shapes of the lit portion of the Moon that can be seen from Earth are known as phases of the Moon. 
Each phase repeats itself every 29.5 days.\n" ] } ], "source": [ "query = {\"match\": {\"text\": \"the Moon covers the Sun\"}}\n", "response = client.search(\n", "    index=\"semantic-reranking-articles\",\n", "    retriever={\n", "        \"text_similarity_reranker\": {\n", "            \"retriever\": {\"standard\": {\"query\": query}},\n", "            \"field\": \"text\",\n", "            \"rank_window_size\": 20,\n", "            \"inference_id\": \"cohere-rerank-inference\",\n", "            \"inference_text\": \"the Moon covers the Sun\",\n", "            \"min_score\": 0.40,\n", "        },\n", "    },\n", ")\n", "\n", "pretty_search_response(response)" ] }, { "cell_type": "markdown", "id": "22c4d4d395adb472", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Much better! Not only are the top results semantically close to our query _\"the Moon covers the Sun\"_, but the irrelevant results with a score below `min_score` were also discarded from the response. As a result, the articles we ended up with are indeed those that best answer our question.\n", "\n", "What's also great about reranking is that it can be used on top of existing search solutions out of the box. Under the hood, the same lexical search was executed as before (the one that resulted in mixed hits), then Cohere took the texts of the top `rank_window_size` articles and reordered them according to how closely they relate to our query's meaning.\n", "\n", "Whether your search application uses lexical, vector or hybrid search, reranking can improve your results." ] }, { "cell_type": "markdown", "id": "78be304240d6c695", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Conclusion\n", "\n", "[Semantic reranking](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-reranking.html) is a powerful tool for improving the relevance of a search experience or a RAG tool. It lets us immediately add semantic search capabilities to existing Elasticsearch installations." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 5 }
