{
"cells": [
{
"cell_type": "markdown",
"id": "a3d8f7b1",
"metadata": {},
"source": [
"# 🚀 DeepSeek-R1 Model with Azure AI Inference 🧠\n",
"\n",
"**DeepSeek-R1** is a state-of-the-art reasoning model that combines reinforcement learning with supervised fine-tuning. It excels at complex, chain-of-thought reasoning tasks, activating 37B of its 671B total parameters per token and supporting a 128K-token context window.\n",
"\n",
"In this notebook, you'll learn to:\n",
"1. **Initialize** the ChatCompletionsClient for Azure serverless endpoints\n",
"2. **Chat** with DeepSeek-R1 using reasoning extraction\n",
"3. **Implement** a travel planning example with step-by-step reasoning\n",
"4. **Leverage** the 128K context window for complex scenarios\n",
"\n",
"## Why DeepSeek-R1?\n",
"- **Advanced Reasoning**: Specializes in chain-of-thought problem solving\n",
"- **Massive Context**: 128K token window for detailed analysis\n",
"- **Efficient Architecture**: 37B active parameters from 671B total\n",
"- **Safety Integrated**: Built-in content filtering capabilities\n"
]
},
{
"cell_type": "markdown",
"id": "d6e3a4c2",
"metadata": {},
"source": [
"## 1. Setup & Authentication\n",
"\n",
"Required packages:\n",
"- `azure-ai-inference`: For chat completions\n",
"- `python-dotenv`: For environment variables\n",
"\n",
".env file requirements:\n",
"```bash\n",
"AZURE_INFERENCE_ENDPOINT=<your-endpoint-url>\n",
"AZURE_INFERENCE_KEY=<your-api-key>\n",
"MODEL_NAME=DeepSeek-R1\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a53f8d4c",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import re\n",
"from dotenv import load_dotenv\n",
"from azure.ai.inference import ChatCompletionsClient\n",
"from azure.ai.inference.models import SystemMessage, UserMessage\n",
"from azure.core.credentials import AzureKeyCredential\n",
"\n",
"# Load environment\n",
"load_dotenv()\n",
"endpoint = os.getenv(\"AZURE_INFERENCE_ENDPOINT\")\n",
"key = os.getenv(\"AZURE_INFERENCE_KEY\")\n",
"model_name = os.getenv(\"MODEL_NAME\", \"DeepSeek-R1\")\n",
"\n",
"# Initialize client\n",
"try:\n",
" client = ChatCompletionsClient(\n",
" endpoint=endpoint,\n",
" credential=AzureKeyCredential(key)\n",
" )\n",
" print(\"✅ Client initialized | Model:\", client.get_model_info().model_name)\n",
"except Exception as e:\n",
" print(\"❌ Initialization failed:\", e)"
]
},
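{
"cell_type": "markdown",
"id": "b7e2c9f0",
"metadata": {},
"source": [
"### Quick Sanity Check\n",
"\n",
"A minimal single-turn request to confirm the endpoint responds (a sketch; it assumes the client above initialized successfully):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4a1d2e7",
"metadata": {},
"outputs": [],
"source": [
"# Minimal smoke test against the configured serverless endpoint\n",
"response = client.complete(\n",
"    messages=[UserMessage(content=\"What is 2 + 2? Answer briefly.\")],\n",
"    model=model_name,\n",
"    max_tokens=256\n",
")\n",
"print(response.choices[0].message.content)"
]
},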
{
"cell_type": "markdown",
"id": "3c01d5d9",
"metadata": {},
"source": [
"## 2. Intelligent Travel Planning ✈️\n",
"\n",
"Demonstrate DeepSeek-R1's reasoning capabilities for trip planning:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6a5d8d9",
"metadata": {},
"outputs": [],
"source": [
"def plan_trip_with_reasoning(query, show_thinking=False):\n",
" \"\"\"Get travel recommendations with reasoning extraction\"\"\"\n",
" messages = [\n",
" SystemMessage(content=\"You are a travel expert. Provide detailed plans with rationale.\"),\n",
" UserMessage(content=f\"{query} Include hidden gems and safety considerations.\")\n",
" ]\n",
" \n",
" response = client.complete(\n",
" messages=messages,\n",
" model=model_name,\n",
" temperature=0.7,\n",
" max_tokens=1024\n",
" )\n",
" \n",
" content = response.choices[0].message.content\n",
" \n",
" # Extract reasoning if present\n",
" if show_thinking:\n",
" match = re.search(r\"<think>(.*?)</think>(.*)\", content, re.DOTALL)\n",
" if match:\n",
" return {\"thinking\": match.group(1).strip(), \"answer\": match.group(2).strip()}\n",
" return content\n",
"\n",
"# Example usage\n",
"query = \"Plan a 5-day cultural trip to Kyoto in April\"\n",
"result = plan_trip_with_reasoning(query, show_thinking=True)\n",
"\n",
"print(\"🗺️ Query:\", query)\n",
"if isinstance(result, dict):\n",
" print(\"\\n🧠 Thinking Process:\", result[\"thinking\"])\n",
" print(\"\\n📝 Final Answer:\", result[\"answer\"])\n",
"else:\n",
" print(\"\\n📝 Response:\", result)"
]
},
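{
"cell_type": "markdown",
"id": "a9c3e5d1",
"metadata": {},
"source": [
"### Monitoring Token Usage\n",
"\n",
"Each response exposes a `usage` object with prompt, completion, and total token counts, which helps track spend under pay-as-you-go billing. A sketch, assuming the service populates `response.usage`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2f6b8a4",
"metadata": {},
"outputs": [],
"source": [
"# Report token usage for a small request\n",
"response = client.complete(\n",
"    messages=[UserMessage(content=\"Summarize Kyoto in one sentence.\")],\n",
"    model=model_name,\n",
"    max_tokens=256\n",
")\n",
"if response.usage:\n",
"    print(f\"Prompt: {response.usage.prompt_tokens} | \"\n",
"          f\"Completion: {response.usage.completion_tokens} | \"\n",
"          f\"Total: {response.usage.total_tokens}\")"
]
},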
{
"cell_type": "markdown",
"id": "5d8f1b3a",
"metadata": {},
"source": [
"## 3. Technical Problem Solving 💻\n",
"\n",
"Showcase coding/optimization capabilities:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5d4a3e1",
"metadata": {},
"outputs": [],
"source": [
"def solve_technical_problem(problem):\n",
" \"\"\"Solve complex technical problems with structured reasoning\"\"\"\n",
" response = client.complete(\n",
" messages=[\n",
" UserMessage(content=f\"{problem} Please reason step by step, and put your final answer within \\\\boxed{{}}.\")\n",
" ],\n",
" model=model_name,\n",
" temperature=0.3,\n",
" max_tokens=2048\n",
" )\n",
" \n",
" return response.choices[0].message.content\n",
"\n",
"# Database optimization example\n",
"problem = \"\"\"How can I optimize a PostgreSQL database handling 10k transactions/second?\n",
"Consider indexing strategies, hardware requirements, and query optimization.\"\"\"\n",
"\n",
"print(\"🔧 Problem:\", problem)\n",
"print(\"\\n⚙️ Solution:\", solve_technical_problem(problem))"
]
},
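{
"cell_type": "markdown",
"id": "c8e1f3a5",
"metadata": {},
"source": [
"### Streaming with Error Handling\n",
"\n",
"Reasoning models can emit long outputs, so streaming improves perceived latency; content-filter violations surface as `HttpResponseError`. A sketch combining both (assumes the client from the setup cell):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b9d4c2",
"metadata": {},
"outputs": [],
"source": [
"from azure.core.exceptions import HttpResponseError\n",
"\n",
"try:\n",
"    stream = client.complete(\n",
"        messages=[UserMessage(content=\"Explain B-tree indexes in two sentences.\")],\n",
"        model=model_name,\n",
"        max_tokens=512,\n",
"        stream=True\n",
"    )\n",
"    # Each chunk carries an incremental delta; print tokens as they arrive\n",
"    for chunk in stream:\n",
"        if chunk.choices and chunk.choices[0].delta.content:\n",
"            print(chunk.choices[0].delta.content, end=\"\")\n",
"except HttpResponseError as e:\n",
"    # Raised for HTTP-level failures, including content-filter blocks\n",
"    print(\"Request failed:\", e)"
]
},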
{
"cell_type": "markdown",
"id": "3b9f7a8c",
"metadata": {},
"source": [
"## 4. Best Practices & Considerations\n",
"\n",
"1. **Reasoning Handling**: Use a regex to separate the model's `<think>...</think>` reasoning from the final answer\n",
"2. **Safety**: Content filtering is built in; catch `HttpResponseError` for filtered or rejected requests\n",
"3. **Performance**:\n",
" - Max tokens: 4096\n",
" - Rate limit: 200K tokens/minute\n",
"4. **Cost**: Pay-as-you-go with serverless deployment\n",
"5. **Streaming**: Implement response streaming for long completions\n",
"\n",
"```python\n",
"# Streaming example\n",
"response = client.complete(..., stream=True)\n",
"for chunk in response:\n",
" print(chunk.choices[0].delta.content or \"\", end=\"\")\n",
"```\n",
"\n",
"## 🎯 Key Takeaways\n",
"- Leverage 128K context for detailed analysis\n",
"- Extract reasoning steps for debugging/analysis\n",
"- Combine with Azure AI Content Safety for production\n",
"- Monitor token usage via `response.usage`\n",
"\n",
"> Always validate model outputs for critical applications!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}