quick_start/v1/04_tokens_and_usage.ipynb

{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# OpenAI Tokens" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Overview \n", "Tokens are the basic units that OpenAI models use to process text. \\\n", "Understanding how to encode and decode tokens is essential for efficiently utilizing OpenAI services, especially when working with prompts and responses.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Import helper libraries and instantiate credentials and model" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import re\n", "import requests\n", "import sys\n", "import os\n", "from openai import AzureOpenAI\n", "import tiktoken\n", "from dotenv import load_dotenv\n", "load_dotenv()\n", "\n", "client = AzureOpenAI(\n", " azure_endpoint = os.getenv(\"AZURE_OPENAI_ENDPOINT\"), \n", " api_key=os.getenv(\"AZURE_OPENAI_KEY\"), \n", " api_version=\"2024-02-15-preview\"\n", ")\n", "\n", "CHAT_COMPLETIONS_MODEL = os.getenv('AZURE_OPENAI_DEPLOYMENT_NAME')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tokenizing a Prompt\n", "This example demonstrates how to encode a prompt into tokens, count the total number of tokens, and decode them back into words using the `tiktoken` library:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of tokens: 9\n", "Tokens : [79207, 5377, 15836, 2532, 374, 3331, 16528, 1457, 0]\n", "Words : ['Azure', ' Open', 'AI', ' service', ' is', ' General', ' Available', ' now', '!']\n" ] } ], "source": [ "encoding=tiktoken.encoding_for_model(\"gpt-3.5-turbo\")\n", "prompt = \"Azure OpenAI service is General Available now!\"\n", "tokens = encoding.encode(prompt)\n", "print('Total number of tokens:', len(tokens))\n", "print('Tokens :', tokens)\n", "print('Words : ', [encoding.decode([t]) for t in tokens])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "response = client.chat.completions.create(\n", " model=CHAT_COMPLETIONS_MODEL,\n", " messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n", " {\"role\":\"user\",\"content\": prompt}],\n", " max_tokens=60,\n", " n=2,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Use cases to explore\n", "1. **Prompt Engineering** \\\n", "Efficiently design and test prompts by understanding their tokenization.\n", "\n", "2. **Cost Management** \\\n", "Manage API costs by controlling the number of tokens in requests and responses.\n", "\n", "3. **Text Analysis** \\\n", "Analyze and preprocess text data by tokenizing and decoding content." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Show 2 returned results" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "============================== ANSWER #1 ==============================\n", "Yes, the Azure OpenAI Service became generally available in January 2023. This service enables businesses and developers to access OpenAI's powerful models, such as GPT-3, Codex, and DALL-E, through Microsoft's Azure platform. It provides the scalability, security, and integration capabilities of\n", "============================== ANSWER #2 ==============================\n", "Yes, the Azure OpenAI Service became generally available in January 2023. 
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Show 2 returned results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "============================== ANSWER #1 ==============================\n",
      "Yes, the Azure OpenAI Service became generally available in January 2023. This service enables businesses and developers to access OpenAI's powerful models, such as GPT-3, Codex, and DALL-E, through Microsoft's Azure platform. It provides the scalability, security, and integration capabilities of\n",
      "============================== ANSWER #2 ==============================\n",
      "Yes, the Azure OpenAI Service became generally available in January 2023. This service allows businesses and developers to integrate OpenAI's powerful language models, such as GPT-3, Codex, and other advanced AI capabilities, into their applications through the Azure cloud platform.\n",
      "\n",
      "With Azure OpenAI Service\n"
     ]
    }
   ],
   "source": [
    "print('='*30, 'ANSWER #1', '='*30)\n",
    "print(response.choices[0].message.content)\n",
    "print('='*30, 'ANSWER #2', '='*30)\n",
    "print(response.choices[1].message.content)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Understanding Token Usage in OpenAI Completions\n",
    "When using OpenAI models, it's important to understand how tokens are used and billed. \\\n",
    "The `CompletionUsage` object returned with each response reports how many tokens were used in the prompt, in the completion, and in the request as a whole. \\\n",
    "This helps in managing costs and optimizing the efficiency of your API calls."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CompletionUsage(completion_tokens=120, prompt_tokens=26, total_tokens=146, completion_tokens_details=None, prompt_tokens_details=None)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response.usage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Explanation of Token Usage\n",
    "The `CompletionUsage` output provides detailed information about token usage in a request:\n",
    "\n",
    "`completion_tokens`: \\\n",
    "Number of tokens used in the generated completion.\n",
    "\n",
    "`prompt_tokens`: \\\n",
    "Number of tokens used in the input prompt.\n",
    "\n",
    "`total_tokens`: \\\n",
    "Total number of tokens used in the request (the sum of prompt tokens and completion tokens)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Use cases to explore\n",
    "1. **Cost Management** \\\n",
    "OpenAI API costs are based on the number of tokens processed. By monitoring token usage, you can manage and optimize your costs.\n",
    "\n",
    "2. **Performance Optimization** \\\n",
    "Understanding token usage helps in optimizing your prompts and ensuring that responses stay within the desired length.\n",
    "\n",
    "3. **Quota Management** \\\n",
    "If you have a usage quota, tracking token usage ensures you stay within your allocated limits.\n",
    "\n",
    "#### Best Practices\n",
    "1. **Optimize Prompts** \\\n",
    "Keep your prompts concise to minimize token usage while maintaining clarity.\n",
    "\n",
    "2. **Set Token Limits** \\\n",
    "Use parameters like `max_tokens` to control the maximum number of tokens in the response.\n",
    "\n",
    "3. **Monitor Usage** \\\n",
    "Regularly monitor token usage to identify patterns and opportunities for optimization."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}