{
"cells": [
{
"cell_type": "markdown",
"id": "a3d8f7b1",
"metadata": {},
"source": [
"# 🚀 DeepSeek-R1 Model with Azure AI Inference 🧠\n",
"\n",
"**DeepSeek-R1** is a state-of-the-art reasoning model that combines reinforcement learning with supervised fine-tuning. It excels at complex, chain-of-thought reasoning tasks, activating 37B of its 671B total parameters per token and supporting a 128K-token context window.\n",
"\n",
"In this notebook, you'll learn to:\n",
"1. **Initialize** the ChatCompletionsClient for Azure serverless endpoints\n",
"2. **Chat** with DeepSeek-R1 using reasoning extraction\n",
"3. **Implement** a travel planning example with step-by-step reasoning\n",
"4. **Leverage** the 128K context window for complex scenarios\n",
"\n",
"## Why DeepSeek-R1?\n",
"- **Advanced Reasoning**: Specializes in chain-of-thought problem solving\n",
"- **Massive Context**: 128K token window for detailed analysis\n",
"- **Efficient Architecture**: 37B active parameters from 671B total\n",
"- **Safety Integrated**: Built-in content filtering capabilities\n"
]
},
{
"cell_type": "markdown",
"id": "d6e3a4c2",
"metadata": {},
"source": [
"## 1. Setup & Authentication\n",
"\n",
"Required packages:\n",
"- `azure-ai-inference`: For chat completions\n",
"- `python-dotenv`: For environment variables\n",
"\n",
".env file requirements:\n",
"```bash\n",
"AZURE_INFERENCE_ENDPOINT=<your-endpoint-url>\n",
"AZURE_INFERENCE_KEY=<your-api-key>\n",
"MODEL_NAME=DeepSeek-R1\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a53f8d4c",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import re\n",
"from dotenv import load_dotenv\n",
"from azure.ai.inference import ChatCompletionsClient\n",
"from azure.ai.inference.models import SystemMessage, UserMessage\n",
"from azure.core.credentials import AzureKeyCredential\n",
"\n",
"# Load environment\n",
"load_dotenv()\n",
"endpoint = os.getenv(\"AZURE_INFERENCE_ENDPOINT\")\n",
"key = os.getenv(\"AZURE_INFERENCE_KEY\")\n",
"model_name = os.getenv(\"MODEL_NAME\", \"DeepSeek-R1\")\n",
"\n",
"# Initialize client\n",
"try:\n",
" client = ChatCompletionsClient(\n",
" endpoint=endpoint,\n",
" credential=AzureKeyCredential(key)\n",
" )\n",
" print(\"✅ Client initialized | Model:\", client.get_model_info().model_name)\n",
"except Exception as e:\n",
" print(\"❌ Initialization failed:\", e)"
]
},
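{
"cell_type": "markdown",
"id": "b7e2c9f0",
"metadata": {},
"source": [
"### Quick Sanity Check\n",
"\n",
"A minimal single-turn request to confirm the endpoint responds (a sketch; it assumes the client above initialized successfully):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4a1d2e7",
"metadata": {},
"outputs": [],
"source": [
"# Minimal smoke test against the configured serverless endpoint\n",
"response = client.complete(\n",
"    messages=[UserMessage(content=\"What is 2 + 2? Answer briefly.\")],\n",
"    model=model_name,\n",
"    max_tokens=256\n",
")\n",
"print(response.choices[0].message.content)"
]
},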
{
"cell_type": "markdown",
"id": "3c01d5d9",
"metadata": {},
"source": [
"## 2. Intelligent Travel Planning ✈️\n",
"\n",
"Demonstrate DeepSeek-R1's reasoning capabilities for trip planning:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6a5d8d9",
"metadata": {},
"outputs": [],
"source": [
"def plan_trip_with_reasoning(query, show_thinking=False):\n",
" \"\"\"Get travel recommendations with reasoning extraction\"\"\"\n",
" messages = [\n",
" SystemMessage(content=\"You are a travel expert. Provide detailed plans with rationale.\"),\n",
" UserMessage(content=f\"{query} Include hidden gems and safety considerations.\")\n",
" ]\n",
" \n",
" response = client.complete(\n",
" messages=messages,\n",
" model=model_name,\n",
" temperature=0.7,\n",
" max_tokens=1024\n",
" )\n",
" \n",
" content = response.choices[0].message.content\n",
" \n",
" # Extract reasoning if present\n",
" if show_thinking:\n",
" match = re.search(r\"<think>(.*?)</think>(.*)\", content, re.DOTALL)\n",
" if match:\n",
" return {\"thinking\": match.group(1).strip(), \"answer\": match.group(2).strip()}\n",
" return content\n",
"\n",
"# Example usage\n",
"query = \"Plan a 5-day cultural trip to Kyoto in April\"\n",
"result = plan_trip_with_reasoning(query, show_thinking=True)\n",
"\n",
"print(\"🗺️ Query:\", query)\n",
"if isinstance(result, dict):\n",
" print(\"\\n🧠 Thinking Process:\", result[\"thinking\"])\n",
" print(\"\\n📝 Final Answer:\", result[\"answer\"])\n",
"else:\n",
" print(\"\\n📝 Response:\", result)"
]
},
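{
"cell_type": "markdown",
"id": "a9c3e5d1",
"metadata": {},
"source": [
"### Monitoring Token Usage\n",
"\n",
"Each response exposes a `usage` object with prompt, completion, and total token counts, which helps track spend under pay-as-you-go billing. A sketch, assuming the service populates `response.usage`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2f6b8a4",
"metadata": {},
"outputs": [],
"source": [
"# Report token usage for a small request\n",
"response = client.complete(\n",
"    messages=[UserMessage(content=\"Summarize Kyoto in one sentence.\")],\n",
"    model=model_name,\n",
"    max_tokens=256\n",
")\n",
"if response.usage:\n",
"    print(f\"Prompt: {response.usage.prompt_tokens} | \"\n",
"          f\"Completion: {response.usage.completion_tokens} | \"\n",
"          f\"Total: {response.usage.total_tokens}\")"
]
},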
{
"cell_type": "markdown",
"id": "5d8f1b3a",
"metadata": {},
"source": [
"## 3. Technical Problem Solving 💻\n",
"\n",
"Showcase coding/optimization capabilities:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5d4a3e1",
"metadata": {},
"outputs": [],
"source": [
"def solve_technical_problem(problem):\n",
" \"\"\"Solve complex technical problems with structured reasoning\"\"\"\n",
" response = client.complete(\n",
" messages=[\n",
" UserMessage(content=f\"{problem} Please reason step by step, and put your final answer within \\\\boxed{{}}.\")\n",
" ],\n",
" model=model_name,\n",
" temperature=0.3,\n",
" max_tokens=2048\n",
" )\n",
" \n",
" return response.choices[0].message.content\n",
"\n",
"# Database optimization example\n",
"problem = \"\"\"How can I optimize a PostgreSQL database handling 10k transactions/second?\n",
"Consider indexing strategies, hardware requirements, and query optimization.\"\"\"\n",
"\n",
"print(\"🔧 Problem:\", problem)\n",
"print(\"\\n⚙️ Solution:\", solve_technical_problem(problem))"
]
},
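{
"cell_type": "markdown",
"id": "c8e1f3a5",
"metadata": {},
"source": [
"### Streaming with Error Handling\n",
"\n",
"Reasoning models can emit long outputs, so streaming improves perceived latency; content-filter violations surface as `HttpResponseError`. A sketch combining both (assumes the client from the setup cell):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b9d4c2",
"metadata": {},
"outputs": [],
"source": [
"from azure.core.exceptions import HttpResponseError\n",
"\n",
"try:\n",
"    stream = client.complete(\n",
"        messages=[UserMessage(content=\"Explain B-tree indexes in two sentences.\")],\n",
"        model=model_name,\n",
"        max_tokens=512,\n",
"        stream=True\n",
"    )\n",
"    # Each chunk carries an incremental delta; print tokens as they arrive\n",
"    for chunk in stream:\n",
"        if chunk.choices and chunk.choices[0].delta.content:\n",
"            print(chunk.choices[0].delta.content, end=\"\")\n",
"except HttpResponseError as e:\n",
"    # Raised for HTTP-level failures, including content-filter blocks\n",
"    print(\"Request failed:\", e)"
]
},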
{
"cell_type": "markdown",
"id": "3b9f7a8c",
"metadata": {},
"source": [
"## 4. Best Practices & Considerations\n",
"\n",
"1. **Reasoning Handling**: Use a regex to separate the model's `<think>...</think>` reasoning from the final answer\n",
"2. **Safety**: Content filtering is built in; catch `HttpResponseError` for filtered or rejected requests\n",
"3. **Performance**:\n",
" - Max tokens: 4096\n",
" - Rate limit: 200K tokens/minute\n",
"4. **Cost**: Pay-as-you-go with serverless deployment\n",
"5. **Streaming**: Implement response streaming for long completions\n",
"\n",
"```python\n",
"# Streaming example\n",
"response = client.complete(..., stream=True)\n",
"for chunk in response:\n",
" print(chunk.choices[0].delta.content or \"\", end=\"\")\n",
"```\n",
"\n",
"## 🎯 Key Takeaways\n",
"- Leverage 128K context for detailed analysis\n",
"- Extract reasoning steps for debugging/analysis\n",
"- Combine with Azure AI Content Safety for production\n",
"- Monitor token usage via `response.usage`\n",
"\n",
"> Always validate model outputs for critical applications!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}