tool_use/tool_choice.ipynb (726 lines of code) (raw):

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tool choice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tool use supports a parameter called `tool_choice` that allows you to specify how you want Claude to call tools. In this notebook, we'll take a look at how it works and when to use it. Before going any further, make sure you are comfortable with the basics of tool use with Claude.\n", "\n", "When working with the `tool_choice` parameter, we have three possible options: \n", "\n", "* `auto` allows Claude to decide whether to call any provided tools or not\n", "* `tool` allows us to force Claude to always use a particular tool\n", "* `any` tells Claude that it must use one of the provided tools, but doesn't force a particular tool\n", "\n", "Let's take a look at each option in detail. We'll start by importing the Anthropic SDK:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "from anthropic import Anthropic\n", "client = Anthropic()\n", "MODEL_NAME = \"claude-3-sonnet-20240229\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Auto\n", "\n", "Setting `tool_choice` to `auto` allows the model to automatically decide whether to use tools or not. This is the default behavior when working with tools. \n", "\n", "To demonstrate this, we're going to provide Claude with a fake web search tool. We will ask Claude questions, some of which would require calling the web search tool and other which Claude should be able to answer on its own.\n", "\n", "Let's start by defining a tool called `web_search`. Please note, to keep this demo simple, we're not actually searching the web here:" ] }, { "cell_type": "code", "execution_count": 137, "metadata": {}, "outputs": [], "source": [ "def web_search(topic):\n", " print(f\"pretending to search the web for {topic}\")\n", "\n", "web_search_tool = {\n", " \"name\": \"web_search\",\n", " \"description\": \"A tool to retrieve up to date information on a given topic by searching the web\",\n", " \"input_schema\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"topic\": {\n", " \"type\": \"string\",\n", " \"description\": \"The topic to search the web for\"\n", " },\n", " },\n", " \"required\": [\"topic\"]\n", " }\n", "}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we write a function that accepts a user_query and passes it along to Claude, along with the `web_search_tool`. \n", "\n", "We also set `tool_choice` to `auto`:\n", "\n", "```py\n", "tool_choice={\"type\": \"auto\"}\n", "```\n", "\n", "Here's the complete function:" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [], "source": [ "from datetime import date\n", "\n", "def chat_with_web_search(user_query):\n", " messages = [{\"role\": \"user\", \"content\": user_query}]\n", "\n", " system_prompt=f\"\"\"\n", " Answer as many questions as you can using your existing knowledge. \n", " Only search the web for queries that you can not confidently answer.\n", " Today's date is {date.today().strftime(\"%B %d %Y\")}\n", " If you think a user's question involves something in the future that hasn't happened yet, use the search tool.\n", " \"\"\"\n", "\n", " response = client.messages.create(\n", " system=system_prompt,\n", " model=MODEL_NAME,\n", " messages=messages,\n", " max_tokens=1000,\n", " tool_choice={\"type\": \"auto\"},\n", " tools=[web_search_tool]\n", " )\n", " last_content_block = response.content[-1]\n", " if last_content_block.type == \"text\":\n", " print(\"Claude did NOT call a tool\")\n", " print(f\"Assistant: {last_content_block.text}\")\n", " elif last_content_block.type == \"tool_use\":\n", " print(\"Claude wants to use a tool\")\n", " print(last_content_block)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start with a question Claude should be able to answer without using the tool:" ] }, { "cell_type": "code", "execution_count": 139, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Claude did NOT call a tool\n", "Assistant: The sky appears blue during the day. This is because the Earth's atmosphere scatters more blue light from the sun than other colors, making the sky look blue.\n" ] } ], "source": [ "chat_with_web_search(\"What color is the sky?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we ask \"What color is the sky?\", Claude does not use the tool. Let's try asking something that Claude should use the web search tool to answer:" ] }, { "cell_type": "code", "execution_count": 140, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Claude wants to use a tool\n", "ToolUseBlock(id='toolu_staging_018nwaaRebX33pHqoZZXDaSw', input={'topic': '2024 Miami Grand Prix winner'}, name='web_search', type='tool_use')\n" ] } ], "source": [ "chat_with_web_search(\"Who won the 2024 Miami Grand Prix?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we ask \"Who won the 2024 Miami Grand Prix?\", Claude uses the web search tool! \n", "\n", "Let's try a few more examples:" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Claude did NOT call a tool\n", "Assistant: The Los Angeles Rams won Super Bowl LVI in 2022, defeating the Cincinnati Bengals by a score of 23-20. The game was played on February 13, 2022 at SoFi Stadium in Inglewood, California.\n" ] } ], "source": [ "# Claude should NOT need to use the tool for this:\n", "chat_with_web_search(\"Who won the superbowl in 2022?\")" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Claude wants to use a tool\n", "ToolUseBlock(id='toolu_staging_016XPwcprHAgYJBtN7A3jLhb', input={'topic': '2024 Super Bowl winner'}, name='web_search', type='tool_use')\n" ] } ], "source": [ "# Claude SHOULD use the tool for this:\n", "chat_with_web_search(\"Who won the superbowl in 2024?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Your Prompt Matters!\n", "\n", "When working with `tool_choice` of `auto`, it's important that you spend time to write a detailed prompt. Often, Claude can be over-eager to call tools. Writing a detailed prompt helps Claude determine when to call a tool and when not to. In the above example, we included specific instructions in the system prompt: \n", "\n", "\n", "```py\n", " system_prompt=f\"\"\"\n", " Answer as many questions as you can using your existing knowledge. \n", " Only search the web for queries that you can not confidently answer.\n", " Today's date is {date.today().strftime(\"%B %d %Y\")}\n", " If you think a user's question involves something in the future that hasn't happened yet, use the search tool.\n", "\"\"\"\n", "```\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forcing a specific tool\n", "\n", "We can force Claude to use a particular tool using `tool_choice`. In the example below, we've defined two simple tools: \n", "* `print_sentiment_scores` - a tool that \"tricks\" Claude into generating well-structured JSON output containing sentiment analysis data. For more info on this approach, see [Extracting Structured JSON using Claude and Tool Use](https://github.com/anthropics/anthropic-cookbook/blob/main/tool_use/extracting_structured_json.ipynb)\n", "* `calculator` - a very simple calculator tool that takes two numbers and adds them together \n" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [], "source": [ "\n", "tools = [\n", " {\n", " \"name\": \"print_sentiment_scores\",\n", " \"description\": \"Prints the sentiment scores of a given tweet or piece of text.\",\n", " \"input_schema\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"positive_score\": {\"type\": \"number\", \"description\": \"The positive sentiment score, ranging from 0.0 to 1.0.\"},\n", " \"negative_score\": {\"type\": \"number\", \"description\": \"The negative sentiment score, ranging from 0.0 to 1.0.\"},\n", " \"neutral_score\": {\"type\": \"number\", \"description\": \"The neutral sentiment score, ranging from 0.0 to 1.0.\"}\n", " },\n", " \"required\": [\"positive_score\", \"negative_score\", \"neutral_score\"]\n", " }\n", " },\n", " {\n", " \"name\": \"calculator\",\n", " \"description\": \"Adds two number\",\n", " \"input_schema\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"num1\": {\"type\": \"number\", \"description\": \"first number to add\"},\n", " \"num2\": {\"type\": \"number\", \"description\": \"second number to add\"},\n", " },\n", " \"required\": [\"num1\", \"num2\"]\n", " }\n", " }\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our goal is to write a function called `analyze_tweet_sentiment` that takes a tweet and prints a basic sentiment analysis of that tweet. Eventually we will \"force\" Claude to use our sentiment analysis tool, but we'll start by showing what happens when we **do not** force the tool use. \n", "\n", "In this first \"bad\" version of the `analyze_tweet_sentiment` function, we provide Claude with both tools. For the sake of comparison, we'll start by setting tool_choice to \"auto\":\n", "\n", "```py\n", "tool_choice={\"type\": \"auto\"}\n", "```\n", "\n", "Please note that we are deliberately not providing Claude with a well-written prompt, to make it easier to see the impact of forcing the use of a particular tool." ] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [], "source": [ "def analyze_tweet_sentiment(query):\n", " response = client.messages.create(\n", " model=MODEL_NAME,\n", " max_tokens=4096,\n", " tools=tools,\n", " tool_choice={\"type\": \"auto\"},\n", " messages=[{\"role\": \"user\", \"content\": query}]\n", " )\n", " print(response)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see what happens when we call the function with the tweet \"Holy cow, I just made the most incredible meal!\"" ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ToolsBetaMessage(id='msg_staging_01ApgXx7W7qsDugdaRWh6p21', content=[TextBlock(text=\"That's great to hear! I don't actually have the capability to assess sentiment from text, but it sounds like you're really excited and proud of the incredible meal you made. Cooking something delicious that you're proud of can definitely give a sense of accomplishment and happiness. Well done on creating such an amazing dish!\", type='text')], model='claude-3-sonnet-20240229', role='assistant', stop_reason='end_turn', stop_sequence=None, type='message', usage=Usage(input_tokens=429, output_tokens=69))\n" ] } ], "source": [ "analyze_tweet_sentiment(\"Holy cow, I just made the most incredible meal!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude does not call our sentiment analysis tool:\n", "> \"That's great to hear! I don't actually have the capability to assess sentiment from text, but it sounds like you're really excited and proud of the incredible meal you made\n", "\n", "Next, let's imagine someone tweets this: \"I love my cats! I had four and just adopted 2 more! Guess how many I have now?\"" ] }, { "cell_type": "code", "execution_count": 128, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ToolsBetaMessage(id='msg_staging_018gTrwrx6YwBR2jjhdPooVg', content=[TextBlock(text=\"That's wonderful that you love your cats and adopted two more! To figure out how many cats you have now, I can use the calculator tool:\", type='text'), ToolUseBlock(id='toolu_staging_01RFker5oMQoY6jErz5prmZg', input={'num1': 4, 'num2': 2}, name='calculator', type='tool_use')], model='claude-3-sonnet-20240229', role='assistant', stop_reason='tool_use', stop_sequence=None, type='message', usage=Usage(input_tokens=442, output_tokens=101))\n" ] } ], "source": [ "analyze_tweet_sentiment(\"I love my cats! I had four and just adopted 2 more! Guess how many I have now?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude wants to call the calculator tool:\n", "\n", "> ToolUseBlock(id='toolu_staging_01RFker5oMQoY6jErz5prmZg', input={'num1': 4, 'num2': 2}, name='calculator', type='tool_use')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clearly, this current implementation is not doing what we want (mostly because we set it up to fail). \n", "\n", "Next, let's force Claude to **always** use the `print_sentiment_scores` tool by updating `tool_choice`:\n", "\n", "```py\n", "tool_choice={\"type\": \"tool\", \"name\": \"print_sentiment_scores\"}\n", "```\n", "\n", "In addition to setting `type` to `tool`, we must provide a particular tool name." ] }, { "cell_type": "code", "execution_count": 132, "metadata": {}, "outputs": [], "source": [ "def analyze_tweet_sentiment(query):\n", " response = client.messages.create(\n", " model=MODEL_NAME,\n", " max_tokens=4096,\n", " tools=tools,\n", " tool_choice={\"type\": \"tool\", \"name\": \"print_sentiment_scores\"},\n", " messages=[{\"role\": \"user\", \"content\": query}]\n", " )\n", " print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now if we try prompting Claude with the same prompts from earlier, it's always going to call the `print_sentiment_scores` tool:" ] }, { "cell_type": "code", "execution_count": 133, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ToolsBetaMessage(id='msg_staging_018GtYk8Xvee3w8Eeh6pbgoq', content=[ToolUseBlock(id='toolu_staging_01FMRQ9pZniZqFUGQwTcFU4N', input={'positive_score': 0.9, 'negative_score': 0.0, 'neutral_score': 0.1}, name='print_sentiment_scores', type='tool_use')], model='claude-3-sonnet-20240229', role='assistant', stop_reason='tool_use', stop_sequence=None, type='message', usage=Usage(input_tokens=527, output_tokens=79))\n" ] } ], "source": [ "analyze_tweet_sentiment(\"Holy cow, I just made the most incredible meal!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude calls our `print_sentiment_scores` tool:\n", "\n", "> ToolUseBlock(id='toolu_staging_01FMRQ9pZniZqFUGQwTcFU4N', input={'positive_score': 0.9, 'negative_score': 0.0, 'neutral_score': 0.1}, name='print_sentiment_scores', type='tool_use')\n", "\n", "Even if we try to trip up Claude with a \"Math-y\" tweet, it still always calls the `print_sentiment_scores` tool:" ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ToolsBetaMessage(id='msg_staging_01RACamfrHdpvLxWaNwDfZEF', content=[ToolUseBlock(id='toolu_staging_01Wb6ZKSwKvqVSKLDAte9cKU', input={'positive_score': 0.8, 'negative_score': 0.0, 'neutral_score': 0.2}, name='print_sentiment_scores', type='tool_use')], model='claude-3-sonnet-20240229', role='assistant', stop_reason='tool_use', stop_sequence=None, type='message', usage=Usage(input_tokens=540, output_tokens=79))\n" ] } ], "source": [ "analyze_tweet_sentiment(\"I love my cats! I had four and just adopted 2 more! Guess how many I have now?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though we're forcing Claude to call our `print_sentiment_scores` tool, we should still employ some basic prompt engineering:" ] }, { "cell_type": "code", "execution_count": 135, "metadata": {}, "outputs": [], "source": [ "def analyze_tweet_sentiment(query):\n", "\n", " prompt = f\"\"\"\n", " Analyze the sentiment in the following tweet: \n", " <tweet>{query}</tweet>\n", " \"\"\"\n", " \n", " response = client.messages.create(\n", " model=MODEL_NAME,\n", " max_tokens=4096,\n", " tools=tools,\n", " tool_choice={\"type\": \"auto\"},\n", " messages=[{\"role\": \"user\", \"content\": prompt}]\n", " )\n", " print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Any\n", "\n", "The final option for `tool_choice` is `any` which allows us to tell Claude \"you must call a tool, but you can pick which one\". Imagine we want to create a SMS chatbot using Claude. The only way for this chatbot to actually \"communicate\" with a user is via SMS text message. \n", "\n", "In the example below, we make a very simple text-messaging assistant that has access to two tools:\n", "* `send_text_to_user` sends a text message to a user\n", "* `get_customer_info` looks up customer data based on a username\n", "\n", "The idea is to create a chatbot that always calls one of these tools and never responds with a non-tool response. In all situations, Claude should either respond back by trying to send a text message or calling `get_customer_info` to get more customer information.\n", "\n", "Most importantly, we set `tool_choice` to \"any\":\n", "\n", "```py\n", "tool_choice={\"type\": \"any\"}\n", "```" ] }, { "cell_type": "code", "execution_count": 162, "metadata": {}, "outputs": [], "source": [ "def send_text_to_user(text):\n", " # Sends a text to the user\n", " # We'll just print out the text to keep things simple:\n", " print(f\"TEXT MESSAGE SENT: {text}\")\n", "\n", "def get_customer_info(username):\n", " return {\n", " \"username\": username,\n", " \"email\": f\"{username}@email.com\",\n", " \"purchases\": [\n", " {\"id\": 1, \"product\": \"computer mouse\"},\n", " {\"id\": 2, \"product\": \"screen protector\"},\n", " {\"id\": 3, \"product\": \"usb charging cable\"},\n", " ]\n", " }\n", "\n", "tools = [\n", " {\n", " \"name\": \"send_text_to_user\",\n", " \"description\": \"Sends a text message to a user\",\n", " \"input_schema\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"text\": {\"type\": \"string\", \"description\": \"The piece of text to be sent to the user via text message\"},\n", " },\n", " \"required\": [\"text\"]\n", " }\n", " },\n", " {\n", " \"name\": \"get_customer_info\",\n", " \"description\": \"gets information on a customer based on the customer's username. Response includes email, username, and previous purchases. Only call this tool once a user has provided you with their username\",\n", " \"input_schema\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"username\": {\"type\": \"string\", \"description\": \"The username of the user in question. \"},\n", " },\n", " \"required\": [\"username\"]\n", " }\n", " },\n", "]\n", "\n", "system_prompt = \"\"\"\n", "All your communication with a user is done via text message.\n", "Only call tools when you have enough information to accurately call them. \n", "Do not call the get_customer_info tool until a user has provided you with their username. This is important.\n", "If you do not know a user's username, simply ask a user for their username.\n", "\"\"\"\n", "\n", "def sms_chatbot(user_message):\n", " messages = [{\"role\": \"user\", \"content\":user_message}]\n", "\n", " response = client.messages.create(\n", " system=system_prompt,\n", " model=MODEL_NAME,\n", " max_tokens=4096,\n", " tools=tools,\n", " tool_choice={\"type\": \"any\"},\n", " messages=messages\n", " )\n", " if response.stop_reason == \"tool_use\":\n", " last_content_block = response.content[-1]\n", " if last_content_block.type == 'tool_use':\n", " tool_name = last_content_block.name\n", " tool_inputs = last_content_block.input\n", " print(f\"=======Claude Wants To Call The {tool_name} Tool=======\")\n", " if tool_name == \"send_text_to_user\":\n", " send_text_to_user(tool_inputs[\"text\"])\n", " elif tool_name == \"get_customer_info\":\n", " print(get_customer_info(tool_inputs[\"username\"]))\n", " else:\n", " print(\"Oh dear, that tool doesn't exist!\")\n", " \n", " else:\n", " print(\"No tool was called. This shouldn't happen!\")\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start simple:" ] }, { "cell_type": "code", "execution_count": 163, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=======Claude Wants To Call The send_text_to_user Tool=======\n", "TEXT MESSAGE SENT: Hello! I'm doing well, thanks for asking. How can I assist you today?\n" ] } ], "source": [ "sms_chatbot(\"Hey there! How are you?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude responds back by calling the `send_text_to_user` tool.\n", "\n", "Next, we'll ask Claude something a bit trickier:" ] }, { "cell_type": "code", "execution_count": 164, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=======Claude Wants To Call The send_text_to_user Tool=======\n", "TEXT MESSAGE SENT: Hi there, to look up your order details I'll need your username first. Can you please provide me with your username?\n" ] } ], "source": [ "sms_chatbot(\"I need help looking up an order\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude wants to send a text message, asking a user to provide their username.\n", "\n", "Now, let's see what happens when we provide Claude with our username:" ] }, { "cell_type": "code", "execution_count": 165, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=======Claude Wants To Call The get_customer_info Tool=======\n", "{'username': 'jenny76', 'email': 'jenny76@email.com', 'purchases': [{'id': 1, 'product': 'computer mouse'}, {'id': 2, 'product': 'screen protector'}, {'id': 3, 'product': 'usb charging cable'}]}\n" ] } ], "source": [ "sms_chatbot(\"I need help looking up an order. My username is jenny76\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claude calls the `get_customer_info` tool, just as we hoped! \n", "\n", "Even if we send Claude a gibberish message, it will still call one of our tools:" ] }, { "cell_type": "code", "execution_count": 166, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=======Claude Wants To Call The send_text_to_user Tool=======\n", "TEXT MESSAGE SENT: I'm afraid I didn't understand your query. Could you please rephrase what you need help with?\n" ] } ], "source": [ "sms_chatbot(\"askdj aksjdh asjkdbhas kjdhas 1+1 ajsdh\")" ] } ], "metadata": { "kernelspec": { "display_name": "py311", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 2 }