{
"cells": [
{
"cell_type": "markdown",
"id": "1e99a5b8",
"metadata": {},
"source": [
"# 🍎 Phi-4 Model with AIProjectClient 🍏\n",
"\n",
"**Phi-4** is a next-generation open model that aims to provide near GPT-4o capabilities at a fraction of the cost, making it ideal for many enterprise or personal use cases. It's especially great for chain-of-thought reasoning and RAG (Retrieval Augmented Generation) scenarios.\n",
"\n",
"In this notebook, you'll see how to:\n",
"1. **Initialize** an `AIProjectClient` for your Azure AI Foundry environment.\n",
"2. **Chat** with the **Phi-4** model using `azure-ai-inference`.\n",
"3. **Show** a Health & Fitness example, featuring disclaimers and wellness Q&A.\n",
"4. **Enjoy** the value proposition of a cheaper alternative to GPT-4 with strong reasoning capabilities. 🏋️\n",
"\n",
"> **Disclaimer**: This is not medical advice. Please consult professionals.\n",
"\n",
"## Why Phi-4?\n",
"Phi-4 is a 14B-parameter model trained on curated data for high reasoning performance.\n",
"- **Cost-Effective**: Get GPT-4-level performance for many tasks without the GPT-4 price.\n",
"- **Reasoning & RAG**: Perfect for chain-of-thought reasoning steps and retrieval augmented generation workflows.\n",
"- **Generous Context Window**: 16K tokens, enabling more context or longer user conversations.\n",
"\n",
"<img src=\"./seq-diagrams/4-phi-4.png\" width=\"30%\"/>\n"
]
},
{
"cell_type": "markdown",
"id": "e93357dd",
"metadata": {},
"source": [
"## 1. Setup\n",
"\n",
"Below, we'll install and import the necessary libraries:\n",
"- **azure-ai-projects**: For the `AIProjectClient`.\n",
"- **azure-ai-inference**: For calling your model, specifically the chat completions.\n",
"- **azure-identity**: For `DefaultAzureCredential`.\n",
"\n",
"Ensure you have a `.env` file with:\n",
"```bash\n",
"PROJECT_CONNECTION_STRING=<your-conn-string>\n",
"SERVERLESS_MODEL_NAME=phi-4\n",
"```\n",
"\n",
"> **Note**: It's recommended to complete the [`3-basic-rag.ipynb`](./3-basic-rag.ipynb) notebook before this one, as it covers important concepts that will be helpful here."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b5634a0",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from dotenv import load_dotenv\n",
"from pathlib import Path\n",
"from azure.identity import DefaultAzureCredential\n",
"from azure.ai.projects import AIProjectClient\n",
"from azure.ai.inference.models import SystemMessage, UserMessage, AssistantMessage\n",
"\n",
"from pathlib import Path\n",
"\n",
"# Load environment variables\n",
"notebook_path = Path().absolute()\n",
"parent_dir = notebook_path.parent\n",
"load_dotenv(parent_dir / '.env')\n",
"\n",
"conn_string = os.getenv(\"PROJECT_CONNECTION_STRING\")\n",
"phi4_deployment = os.getenv(\"SERVERLESS_MODEL_NAME\", \"phi-4\")\n",
"\n",
"try:\n",
" project_client = AIProjectClient.from_connection_string(\n",
" credential=DefaultAzureCredential(),\n",
" conn_str=conn_string,\n",
" )\n",
" print(\"✅ AIProjectClient created successfully!\")\n",
"except Exception as e:\n",
" print(\"❌ Error creating AIProjectClient:\", e)"
]
},
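{
"cell_type": "markdown",
"id": "c0ffee01",
"metadata": {},
"source": [
"### Quick config check (optional)\n",
"Before calling the model, it can help to fail fast on missing configuration. The helper below is a minimal, hypothetical sketch (the function name `check_required_env` is invented for this notebook, not part of any SDK):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee02",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"def check_required_env(names):\n",
"    \"\"\"Return the subset of environment variable names that are unset or empty.\"\"\"\n",
"    return [name for name in names if not os.getenv(name)]\n",
"\n",
"missing = check_required_env([\"PROJECT_CONNECTION_STRING\"])\n",
"if missing:\n",
"    print(\"⚠️ Missing environment variables:\", \", \".join(missing))\n",
"else:\n",
"    print(\"✅ All required environment variables are set.\")"
]
},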
{
"cell_type": "markdown",
"id": "500d63ef",
"metadata": {},
"source": [
"## 2. Chat with Phi-4 🍏\n",
"We'll demonstrate a simple conversation using **Phi-4** in a health & fitness context. We'll define a system prompt that clarifies the role of the assistant. Then we'll ask some user queries.\n",
"\n",
"> Notice that Phi-4 is well-suited for chain-of-thought reasoning. We'll let it illustrate its reasoning steps for fun.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bcc4772",
"metadata": {},
"outputs": [],
"source": [
"def chat_with_phi4(user_question, chain_of_thought=False):\n",
" \"\"\"Send a chat request to the Phi-4 model with optional chain-of-thought.\"\"\"\n",
" # We'll define a system message with disclaimers:\n",
" system_prompt = (\n",
" \"You are a Phi-4 AI assistant, focusing on health and fitness.\\n\"\n",
" \"Remind users that you are not a medical professional, but can provide general info.\\n\"\n",
" )\n",
"\n",
" # We can optionally instruct the model to show chain-of-thought. (Use carefully in production.)\n",
" if chain_of_thought:\n",
" system_prompt += \"Please show your step-by-step reasoning in your answer.\\n\"\n",
"\n",
" # We create messages for system + user.\n",
" system_msg = SystemMessage(content=system_prompt)\n",
" user_msg = UserMessage(content=user_question)\n",
"\n",
" with project_client.inference.get_chat_completions_client() as chat_client:\n",
" response = chat_client.complete(\n",
" model=phi4_deployment,\n",
" messages=[system_msg, user_msg],\n",
" temperature=0.8, # a bit creative\n",
" top_p=0.9,\n",
" max_tokens=400,\n",
" )\n",
"\n",
" return response.choices[0].message.content\n",
"\n",
"# Example usage:\n",
"question = \"I'm training for a 5K. Any tips on a weekly workout schedule?\"\n",
"answer = chat_with_phi4(question, chain_of_thought=True)\n",
"print(\"🗣️ User:\", question)\n",
"print(\"🤖 Phi-4:\", answer)"
]
},
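{
"cell_type": "markdown",
"id": "c0ffee03",
"metadata": {},
"source": [
"### Keeping long chats inside the context window\n",
"Phi-4's 16K-token window still fills up over many turns. Below is a rough, illustrative sketch of trimming older turns to fit a token budget. It uses a crude 4-characters-per-token estimate; a real tokenizer (or the usage stats the service returns) would be more accurate, and the helper names here are invented for this notebook:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee04",
"metadata": {},
"outputs": [],
"source": [
"def estimate_tokens(text):\n",
"    # Rough heuristic: ~4 characters per token for English text.\n",
"    return max(1, len(text) // 4)\n",
"\n",
"def trim_history(messages, max_tokens=16000):\n",
"    \"\"\"Keep the system message plus the most recent turns that fit the budget.\n",
"\n",
"    `messages` is a list of (role, content) tuples; the first is the system prompt.\n",
"    \"\"\"\n",
"    system, turns = messages[0], messages[1:]\n",
"    budget = max_tokens - estimate_tokens(system[1])\n",
"    kept = []\n",
"    for role, content in reversed(turns):\n",
"        cost = estimate_tokens(content)\n",
"        if cost > budget:\n",
"            break\n",
"        kept.append((role, content))\n",
"        budget -= cost\n",
"    return [system] + list(reversed(kept))\n",
"\n",
"demo = [\n",
"    (\"system\", \"You are a helpful fitness AI.\"),\n",
"    (\"user\", \"Hi!\"),\n",
"    (\"assistant\", \"Hello! How can I help with your training?\"),\n",
"    (\"user\", \"Any tips for my first 5K?\"),\n",
"]\n",
"# A deliberately tiny budget, so we can see older turns being dropped:\n",
"print(trim_history(demo, max_tokens=20))"
]
},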
{
"cell_type": "markdown",
"id": "bc68c40d",
"metadata": {},
"source": [
"## 3. RAG-like Example (Stub)\n",
"Phi-4 also excels in retrieval augmented generation scenarios, where you provide external context and let the model reason over it. Below is a **stub** example showing how you'd pass retrieved text as context.\n",
"\n",
"> In a real scenario, you'd embed & search for relevant passages, then feed them into the user/system message.\n"
]
},
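{
"cell_type": "markdown",
"id": "c0ffee05",
"metadata": {},
"source": [
"To make the retrieval step a bit more concrete, here's a toy stand-in that ranks candidate snippets by keyword overlap with the question. A real pipeline would use embeddings and a vector index (e.g. Azure AI Search); the function name `retrieve_best_doc` and the mini-corpus are invented for this sketch:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee06",
"metadata": {},
"outputs": [],
"source": [
"def retrieve_best_doc(question, docs):\n",
"    \"\"\"Toy retrieval: rank docs by word overlap with the question.\n",
"\n",
"    A real RAG pipeline would use embeddings, not keyword overlap;\n",
"    this is only a stand-in so the demo is end-to-end runnable.\n",
"    \"\"\"\n",
"    q_words = set(question.lower().split())\n",
"    return max(docs, key=lambda doc: len(q_words & set(doc.lower().split())))\n",
"\n",
"corpus = [\n",
"    \"Recommended to run 3 times per week and mix with cross-training.\",\n",
"    \"Strength training twice a week supports running economy.\",\n",
"    \"Hydration matters most on long runs over 60 minutes.\",\n",
"]\n",
"print(retrieve_best_doc(\"How often should I run each week?\", corpus))"
]
},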
{
"cell_type": "code",
"execution_count": null,
"id": "419ea578",
"metadata": {},
"outputs": [],
"source": [
"def chat_with_phi4_rag(user_question, retrieved_doc):\n",
" \"\"\"Simulate an RAG flow by appending retrieved context to the system prompt.\"\"\"\n",
" system_prompt = (\n",
" \"You are Phi-4, helpful fitness AI.\\n\"\n",
" \"We have some context from the user's knowledge base: \\n\"\n",
" f\"{retrieved_doc}\\n\"\n",
" \"Please use this context to help your answer. If the context doesn't help, say so.\\n\"\n",
" )\n",
"\n",
" system_msg = SystemMessage(content=system_prompt)\n",
" user_msg = UserMessage(content=user_question)\n",
"\n",
" with project_client.inference.get_chat_completions_client() as chat_client:\n",
" response = chat_client.complete(\n",
" model=phi4_deployment,\n",
" messages=[system_msg, user_msg],\n",
" temperature=0.3,\n",
" max_tokens=300,\n",
" )\n",
" return response.choices[0].message.content\n",
"\n",
"# Let's define a dummy doc snippet:\n",
"doc_snippet = \"Recommended to run 3 times per week and mix with cross-training.\\n\" \\\n",
" \"Include rest days or active recovery days for muscle repair.\"\n",
"\n",
"user_q = \"How often should I run weekly to prepare for a 5K?\"\n",
"rag_answer = chat_with_phi4_rag(user_q, doc_snippet)\n",
"print(\"🗣️ User:\", user_q)\n",
"print(\"🤖 Phi-4 (RAG):\", rag_answer)"
]
},
{
"cell_type": "markdown",
"id": "3a33a375",
"metadata": {},
"source": [
"## 4. Wrap-Up & Best Practices\n",
"1. **Chain-of-Thought**: Great for debugging or certain QA tasks, but be mindful about revealing chain-of-thought to end users.\n",
"2. **RAG**: Use `azure-ai-inference` with retrieval results to ground your answers.\n",
"3. **OpenTelemetry**: Optionally integrate `opentelemetry-sdk` and `azure-core-tracing-opentelemetry` for full observability.\n",
"4. **Evaluate**: Use `azure-ai-evaluation` to measure your model’s performance.\n",
"5. **Cost & Performance**: Phi-4 aims to provide near GPT-4 performance at lower cost. Evaluate for your domain needs.\n",
"\n",
"## 🎉 Congratulations!\n",
"You've seen how to:\n",
"- Use **Phi-4** with `AIProjectClient` and `azure-ai-inference`.\n",
"- Create a **chat** flow with chain-of-thought.\n",
"- Stub a **RAG** scenario.\n",
"\n",
"> Happy hacking! 🏋️\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}