demo-python/code/embeddings/cohere-embeddings/cohere-embeddings.ipynb (533 lines of code) (raw):

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up a Python virtual environment in Visual Studio Code\n", "\n", "1. Open the Command Palette (Ctrl+Shift+P).\n", "1. Search for **Python: Create Environment**.\n", "1. Select **Venv**.\n", "1. Select a Python interpreter. Choose 3.10 or later.\n", "\n", "It can take a minute to set up. If you run into problems, see [Python environments in VS Code](https://code.visualstudio.com/docs/python/environments)." ] }, { "cell_type": "markdown", "metadata": { "vscode": { "languageId": "plaintext" } }, "source": [ "### Install packages" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "! pip install -r cohere-embeddings-requirements.txt --quiet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Provision the sample\n", "\n", "This sample uses [`azd`](https://learn.microsoft.com/azure/developer/azure-developer-cli/), a bicep template, and a custom post-provision hook to provision the sample. The sample uses API keys for authentication.\n", "\n", "1. Open a PowerShell command prompt in the cohere-embeddings folder.\n", "\n", "1. Run `azd config set defaults.subscription <yourSubscriptionID>` to set the subscription if you have multiple Azure subscriptions.\n", "1. Run `azd env new <your-environment-name>` to create a new `azd` environment. This environment will contain a set of environment variables used to provision the sample.\n", "1. If you want to use an existing resource, set the corresponding `azd` environment variables before deployment:\n", " 1. Existing Search resource:\n", " 1. `azd env set AZURE_SEARCH_SERVICE <your-search-service-name>`\n", " 1. `azd env set AZURE_SEARCH_SERVICE_LOCATION <your-search-service-location>`\n", " 1. `azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP <your-search-service-resource-group>`\n", " 1. `azd env set AZURE_SEARCH_SERVICE_SKU <your-search-service-sku>`\n", " 1. 
`azd env set AZURE_SEARCH_SERVICE_SEMANTIC_RANKER <your-semantic-ranker-sku>`\n", " 1. You can use an existing AI Hub, AI Project, or Cohere Serverless Endpoint by setting the following environment variables:\n", " 1. `azd env set AZUREAI_HUB_NAME <ai-hub-name>`\n", " 1. `azd env set AZUREAI_PROJECT_NAME <ai-project-name>`\n", " 1. `azd env set AZUREAI_SERVERLESS_ENDPOINT_NAME <cohere-serverless-endpoint-names>`\n", " 1. `azd env set AZUREAI_MARKETPLACE_SUBSCRIPTION <cohere-marketplace-subscription-name>`\n", "1. Run `azd provision`.\n", " 1. Enter a development environment name.\n", " 1. Enter a region for the deployment.\n", "\n", "This step may take a few minutes to complete." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieve environment variables after provisioning\n", "\n", "The included `azd` bicep template saves all required environment variables for the notebook automatically." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Load all environment variables from the azd deployment\n", "import subprocess\n", "from io import StringIO\n", "from dotenv import load_dotenv\n", "import os\n", "result = subprocess.run([\"azd\", \"env\", \"get-values\"], stdout=subprocess.PIPE, cwd=os.getcwd())\n", "load_dotenv(override=True, stream=StringIO(result.stdout.decode(\"utf-8\")))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validate the indexer has completed successfully\n", "\n", "An indexer runs in the background to chunk and vectorize all the sample documents. Validate that it has completed without any errors before trying to search the sample index. If there are any errors, they may be due to temporary throttling from the embeddings endpoint. Try re-running the indexer to resolve this issue."
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Status: success\n" ] } ], "source": [ "from azure.search.documents.indexes import SearchIndexerClient\n", "from azure.identity import AzureCliCredential\n", "from azure.core.credentials import AzureKeyCredential\n", "from azure.mgmt.search import SearchManagementClient\n", "\n", "# Look up the search service admin key with the signed-in Azure CLI identity\n", "credential = AzureCliCredential(tenant_id=os.getenv(\"AZURE_TENANT_ID\", None))\n", "search_mgmt_client = SearchManagementClient(credential=credential, subscription_id=os.getenv(\"AZURE_SUBSCRIPTION_ID\"))\n", "key = search_mgmt_client.admin_keys.get(resource_group_name=os.getenv(\"AZURE_RESOURCE_GROUP\"), search_service_name=os.getenv(\"AZURE_SEARCH_SERVICE\")).primary_key\n", "\n", "# Check the result of the indexer's most recent run\n", "search_indexer_client = SearchIndexerClient(endpoint=os.getenv(\"AZURE_SEARCH_ENDPOINT\"), credential=AzureKeyCredential(key))\n", "status = search_indexer_client.get_indexer_status(name=os.getenv(\"AZURE_SEARCH_INDEXER\"))\n", "print(f\"Status: {status.last_result.status}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Perform a vector search by vectorizing your text query\n", "\n", "Perform a vector search to find the most relevant document chunks based on the text query.\n", "\n", "Vector queries use [VectorizableTextQuery](https://learn.microsoft.com/python/api/azure-search-documents/azure.search.documents.models.vectorizabletextquery) to vectorize a query text string, which is then matched against the vectorized document chunks in the index. VectorizableTextQuery uses the vectorizer defined in the index, which is a Cohere serverless endpoint hosted in Azure AI Studio."
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "File: Benefit_Options.pdf\n", "Score: 0.8044232\n", "Content: a variety of in-network providers, including primary care \n", "physicians, specialists, hospitals, and pharmacies. This plan does not offer coverage for emergency \n", "services, mental health and substance abuse coverage, or out-of-network services.\n", "\n", "Comparison of Plans \n", "Both plans offer coverage for routine physicals, well-child visits, immunizations, and other preventive \n", "care services. The plans also cover preventive care services such as mammograms, colonoscopies, and \n", "other cancer screenings. \n", "\n", "Northwind Health Plus offers more comprehensive coverage than Northwind Standard. This plan offers \n", "coverage for emergency services, both in-network and out-of-network, as well as mental health and \n", "substance abuse coverage. Northwind Standard does not offer coverage for emergency services, mental \n", "health and substance abuse coverage, or out-of-network services. \n", "\n", "Both plans offer coverage for prescription drugs. Northwind Health Plus offers a wider range of \n", "prescription drug coverage than Northwind Standard. Northwind Health Plus covers generic, brand-\n", "name, and specialty drugs, while Northwind Standard only covers generic and brand-name drugs. \n", "\n", "Both plans offer coverage for vision and dental services. Northwind Health Plus offers coverage for vision \n", "exams, glasses, and contact lenses, as well as dental exams, cleanings, and fillings. Northwind Standard \n", "only offers coverage for vision exams and glasses. \n", "\n", "Both plans offer coverage for medical services. Northwind Health Plus offers coverage for hospital stays, \n", "doctor visits, lab tests, and X-rays. Northwind Standard only offers coverage for doctor visits and lab \n", "tests. 
\n", "\n", "Northwind Health Plus is a comprehensive plan that offers more coverage than Northwind Standard. \n", "Northwind Health Plus offers coverage for emergency services, mental health and substance abuse \n", "coverage, and out-of-network services, while Northwind Standard does not. Northwind Health Plus also\n", "\n", "\n", "File: Northwind_Health_Plus_Benefits_Details.pdf\n", "Score: 0.7902214\n", "Content: This \n", "\n", "means that you must obtain approval from Northwind Health Plus prior to receiving the \n", "\n", "service. If pre-authorization or pre-certification is not obtained, you may be responsible for \n", "\n", "the full cost of the services. \n", "\n", "It is important to understand that the Allowed Amount does not include any applicable \n", "\n", "copays, coinsurance, or deductibles that may be due. It is also important to understand that \n", "\n", "the Allowed Amount may vary depending on the type of care received and the type of \n", "\n", "provider that is providing the care. Therefore, it is important to check with the provider \n", "\n", "prior to receiving services to determine the Allowed Amount that Northwind Health Plus \n", "\n", "will pay for the services you are receiving. \n", "\n", "Finally, it is important to keep track of your out-of-pocket expenses. This includes any \n", "\n", "copays, coinsurance, or deductibles that you may be required to pay. It is important to \n", "\n", "understand what your financial responsibility is when receiving care under Northwind \n", "\n", "Health Plus, so that you can plan accordingly and make sure that you are meeting your \n", "\n", "financial obligations. \n", "\n", "IMPORTANT PLAN INFORMATION \n", "\n", "\n", "\n", "Northwind Health Plus is a comprehensive health plan that offers coverage for medical, \n", "\n", "vision, and dental services. It also provides coverage for prescription drugs, mental health \n", "\n", "and substance abuse services, and preventive care. 
You can choose from a variety of in-\n", "\n", "network providers, including primary care physicians, specialists, hospitals, and \n", "\n", "pharmacies. Emergency services are also covered, both in-network and out-of-network. \n", "\n", "Co-pays, deductibles, and out-of-pocket maximums may apply to your plan. Your plan may \n", "\n", "also include separate deductibles for different services, such as prescription drugs and \n", "\n", "hospitalization. It is important to know what your plan covers and what the cost-sharing \n", "\n", "requirements are. To get more information, please visit the Northwind Health website or \n", "\n", "contact them directly.\n", "\n", "\n", "File: Benefit_Options.pdf\n", "Score: 0.7864126\n", "Content: for medical services. Northwind Health Plus offers coverage for hospital stays, \n", "doctor visits, lab tests, and X-rays. Northwind Standard only offers coverage for doctor visits and lab \n", "tests. \n", "\n", "Northwind Health Plus is a comprehensive plan that offers more coverage than Northwind Standard. \n", "Northwind Health Plus offers coverage for emergency services, mental health and substance abuse \n", "coverage, and out-of-network services, while Northwind Standard does not. Northwind Health Plus also \n", "\n", "\n", "\n", "offers a wider range of prescription drug coverage than Northwind Standard. Both plans offer coverage \n", "for vision and dental services, as well as medical services. \n", "\n", "Cost Comparison\n", "Contoso Electronics deducts the employee's portion of the healthcare cost from each paycheck. This \n", "means that the cost of the health insurance will be spread out over the course of the year, rather \n", "than being paid in one lump sum. The employee's portion of the cost will be calculated based on the \n", "selected health plan and the number of people covered by the insurance. 
The table below shows a \n", "cost comparison between the different health plans offered by Contoso Electronics:\n", "\n", "Next Steps \n", "We hope that this information has been helpful in understanding the differences between Northwind \n", "Health Plus and Northwind Standard. We are confident that you will find the right plan for you and \n", "your family. Thank you for choosing Contoso Electronics!\n", "\n", "\n" ] } ], "source": [ "from azure.search.documents import SearchClient\n", "from azure.search.documents.models import VectorizableTextQuery\n", "\n", "# Query text; the vectorizer defined in the index embeds it at query time\n", "query = \"What is included in my Northwind Health Plus plan that is not in standard?\"\n", "\n", "# Initialize the SearchClient\n", "search_client = SearchClient(endpoint=os.getenv(\"AZURE_SEARCH_ENDPOINT\"), index_name=os.getenv(\"AZURE_SEARCH_INDEX\"), credential=AzureKeyCredential(key))\n", "vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields=\"embedding\")\n", "\n", "# Perform vector search\n", "results = search_client.search(\n", "    search_text=None,\n", "    vector_queries=[vector_query],\n", "    select=[\"metadata_storage_path\", \"content\"],\n", "    top=3\n", ")\n", "\n", "# Print the search results\n", "for result in results:\n", "    print(f\"File: {os.path.basename(result['metadata_storage_path'])}\")\n", "    print(f\"Score: {result['@search.score']}\")\n", "    print(f\"Content: {result['content']}\")\n", "    print(\"\\n\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Perform a hybrid search with semantic reranking\n", "\n", "Combine the vector search with a keyword (text) search, then apply semantic reranking to the merged results for higher relevance." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "File: Northwind_Standard_Benefits_Details.pdf\n", "Semantic score: 2.844975471496582\n", "Score: 0.027364110574126244\n", "Content: includes counselling, psychotherapy, \n", "\n", 
"and other treatments related to mental health and substance abuse. \n", "\n", "\n", "\n", "Out-of-Network Services: The Northwind Standard plan does not cover any services that are \n", "\n", "provided by a provider that is not part of the Northwind Health network. This includes \n", "\n", "doctors, hospitals, and other healthcare providers who are not part of the Northwind Health \n", "\n", "network. \n", "\n", "Tips \n", "\n", "When selecting a healthcare plan, it is important to be aware of the exclusions in the plan. \n", "\n", "Here are some tips to help you understand the exclusions in the Northwind Standard plan: \n", "\n", "1. Understand the types of services that are not covered by the Northwind Standard plan. Be \n", "\n", "sure to familiarize yourself with the list of exclusions and make sure that any services you \n", "\n", "might require are covered. \n", "\n", "2. If you require emergency services, be sure to check with your provider to see if they are \n", "\n", "part of the Northwind Health network. If they are not, you will be responsible for the full \n", "\n", "cost of those services. \n", "\n", "3. If you require mental health or substance abuse treatments, be sure to check with your \n", "\n", "provider to see if they are part of the Northwind Health network. These services are not \n", "\n", "covered by the Northwind Standard plan. \n", "\n", "4. If you require services from a provider that is not part of the Northwind Health network, \n", "\n", "you will be responsible for the full cost of those services. \n", "\n", "By understanding the exclusions in the Northwind Standard plan, you can make informed \n", "\n", "decisions about your healthcare. Be sure to read the plan document carefully to make sure \n", "\n", "that the plan meets your healthcare needs. \n", "\n", "WHAT IF I HAVE OTHER COVERAGE? \n", "\n", "Coordinating Benefits With Other Health Care Plans \n", "\n", "WHAT IF I HAVE OTHER COVERAGE? 
\n", "\n", "Coordinating Benefits With Other Health Care Plans \n", "\n", "It may be possible to coordinate benefits with other health care plans if you have other \n", "\n", "coverage. Coordinating benefits allows you to receive payments from each health plan\n", "\n", "\n", "File: Benefit_Options.pdf\n", "Semantic score: 2.764169692993164\n", "Score: 0.016129031777381897\n", "Content: for medical services. Northwind Health Plus offers coverage for hospital stays, \n", "doctor visits, lab tests, and X-rays. Northwind Standard only offers coverage for doctor visits and lab \n", "tests. \n", "\n", "Northwind Health Plus is a comprehensive plan that offers more coverage than Northwind Standard. \n", "Northwind Health Plus offers coverage for emergency services, mental health and substance abuse \n", "coverage, and out-of-network services, while Northwind Standard does not. Northwind Health Plus also \n", "\n", "\n", "\n", "offers a wider range of prescription drug coverage than Northwind Standard. Both plans offer coverage \n", "for vision and dental services, as well as medical services. \n", "\n", "Cost Comparison\n", "Contoso Electronics deducts the employee's portion of the healthcare cost from each paycheck. This \n", "means that the cost of the health insurance will be spread out over the course of the year, rather \n", "than being paid in one lump sum. The employee's portion of the cost will be calculated based on the \n", "selected health plan and the number of people covered by the insurance. The table below shows a \n", "cost comparison between the different health plans offered by Contoso Electronics:\n", "\n", "Next Steps \n", "We hope that this information has been helpful in understanding the differences between Northwind \n", "Health Plus and Northwind Standard. We are confident that you will find the right plan for you and \n", "your family. 
Thank you for choosing Contoso Electronics!\n", "\n", "\n", "File: Northwind_Health_Plus_Benefits_Details.pdf\n", "Semantic score: 2.643535614013672\n", "Score: 0.011904762126505375\n", "Content: to answer any questions or address any \n", "\n", "concerns you may have about your plan. \n", "\n", "Healthcare Providers - Independent Contractors \n", "\n", "OTHER INFORMATION ABOUT THIS PLAN \n", "\n", "Healthcare Providers - Independent Contractors \n", "\n", "The Northwind Health Plus plan includes coverage for healthcare services provided by \n", "\n", "independent contractors. This means that services provided by independent contractors \n", "\n", "may be covered under the Northwind Health Plus plan, provided that the service is \n", "\n", "medically necessary. \n", "\n", "Independent contractors are healthcare providers that are not employed by Northwind \n", "\n", "Health or any other company or organization. They are self-employed and provide services \n", "\n", "on a contract basis. These services can include medical, vision, and dental services, as well \n", "\n", "as prescription drug coverage and mental health and substance abuse coverage. \n", "\n", "It is important to note that services provided by independent contractors are not covered \n", "\n", "under the Northwind Health Plus plan unless they are necessary to treat an illness or injury. \n", "\n", "For example, a physical therapist who is an independent contractor may be covered under \n", "\n", "the plan if the services are necessary to treat an illness or injury. However, services \n", "\n", "provided by an independent contractor that are not medically necessary, such as a massage \n", "\n", "therapist or acupuncturist, are not covered under the plan. \n", "\n", "When selecting a healthcare provider, it is important to make sure that the provider is an \n", "\n", "independent contractor and is covered under the Northwind Health Plus plan. 
You can do \n", "\n", "this by checking the provider’s website or calling the provider’s office to confirm that they \n", "\n", "are an independent contractor and that their services are covered under the Northwind \n", "\n", "Health Plus plan. \n", "\n", "It is also important to note that any services that you receive from an independent \n", "\n", "contractor may be subject to a deductible or coinsurance. This means that you may be \n", "\n", "responsible for a portion of the cost of the service.\n", "\n", "\n" ] } ], "source": [ "from azure.search.documents import SearchClient\n", "from azure.search.documents.models import VectorizableTextQuery\n", "from azure.search.documents.models import HybridSearch\n", "\n", "# Query text; used for the keyword search, the vector query, and the semantic reranker\n", "query = \"What is included in my Northwind Health Plus plan that is not in standard?\"\n", "\n", "# Initialize the SearchClient\n", "search_client = SearchClient(endpoint=os.getenv(\"AZURE_SEARCH_ENDPOINT\"), index_name=os.getenv(\"AZURE_SEARCH_INDEX\"), credential=AzureKeyCredential(key))\n", "vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields=\"embedding\")\n", "\n", "# Perform hybrid (keyword + vector) search with semantic reranking\n", "results = search_client.search(\n", "    search_text=query,\n", "    vector_queries=[vector_query],\n", "    semantic_query=query,\n", "    hybrid_search=HybridSearch(max_text_recall_size=50),\n", "    select=[\"metadata_storage_path\", \"content\"],\n", "    semantic_configuration_name='semantic-config',\n", "    top=3\n", ")\n", "\n", "# Print the search results\n", "for result in results:\n", "    print(f\"File: {os.path.basename(result['metadata_storage_path'])}\")\n", "    print(f\"Semantic score: {result['@search.reranker_score']}\")\n", "    print(f\"Score: {result['@search.score']}\")\n", "    print(f\"Content: {result['content']}\")\n", "    print(\"\\n\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 2 }