{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {
"lines_to_next_cell": 2
},
"source": [
"# HuggingFace Chat Target Testing - optional\n",
"\n",
"This notebook is designed to demonstrate **instruction models** that use a **chat template**, allowing users to experiment with structured chat-based interactions. Non-instruct models are excluded to ensure consistency and reliability in the chat-based interactions. More instruct models can be explored on Hugging Face.\n",
"\n",
"## Key Points:\n",
"\n",
"1. **Supported Instruction Models**:\n",
" - This notebook supports the following **instruct models** that follow a structured chat template. These are examples, and more instruct models are available on Hugging Face:\n",
" - `HuggingFaceTB/SmolLM-360M-Instruct`\n",
" - `microsoft/Phi-3-mini-4k-instruct`\n",
"\n",
" - `...`\n",
"\n",
"2. **Excluded Models**:\n",
" - Non-instruct models (e.g., `\"google/gemma-2b\"`, `\"princeton-nlp/Sheared-LLaMA-1.3B-ShareGPT\"`) are **not included** in this demo, as they do not follow the structured chat template required for the current local Hugging Face model support.\n",
"\n",
"3. **Model Response Times**:\n",
" - The tests were conducted using a CPU, and the following are the average response times for each model:\n",
" - `HuggingFaceTB/SmolLM-1.7B-Instruct`: 5.87 seconds\n",
" - `HuggingFaceTB/SmolLM-135M-Instruct`: 3.09 seconds\n",
" - `HuggingFaceTB/SmolLM-360M-Instruct`: 3.31 seconds\n",
" - `microsoft/Phi-3-mini-4k-instruct`: 4.89 seconds\n",
" - `Qwen/Qwen2-0.5B-Instruct`: 1.38 seconds\n",
" - `Qwen/Qwen2-1.5B-Instruct`: 2.96 seconds\n",
" - `stabilityai/stablelm-2-zephyr-1_6b`: 5.31 seconds\n",
" - `stabilityai/stablelm-zephyr-3b`: 8.37 seconds\n"
]
},
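{
"cell_type": "markdown",
"id": "0a",
"metadata": {},
"source": [
"Before running the demo, it helps to see what a **chat template** actually does. The cell below is a minimal sketch (assuming the standard `transformers` tokenizer API) that renders a structured message list into the flat prompt string an instruct model expects; the exact special tokens in the output depend on the model.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b",
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoTokenizer\n",
"\n",
"# Load the tokenizer for one of the supported instruct models\n",
"tokenizer = AutoTokenizer.from_pretrained(\"HuggingFaceTB/SmolLM-360M-Instruct\")\n",
"\n",
"# A structured chat message, as consumed by the chat template\n",
"messages = [{\"role\": \"user\", \"content\": \"What is 3*3? Give me the solution.\"}]\n",
"\n",
"# Render the messages into the model-specific prompt string (special tokens included)\n",
"print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))"
]
},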
{
"cell_type": "code",
"execution_count": null,
"id": "1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running model: Qwen/Qwen2-0.5B-Instruct\n",
"Average response time for Qwen/Qwen2-0.5B-Instruct: 2.60 seconds\n",
"\n",
"\u001b[22m\u001b[39mConversation ID: b73edd3f-7c20-42af-94ee-ab80e55e02ef\n",
"\u001b[1m\u001b[34muser: What is 3*3? Give me the solution.\n",
"\u001b[22m\u001b[33massistant: The solution for \\(3 \\times 3\\) is 9.\n",
"\n",
"To understand how multiplication works, let's simplify \\(3\\) and \\(3\\\n",
"\u001b[22m\u001b[39mConversation ID: dac45be7-5ff5-469c-a343-ad9670655fe4\n",
"\u001b[1m\u001b[34muser: What is 4*4? Give me the solution.\n",
"\u001b[22m\u001b[33massistant: The answer to \"4 * 4\" is 16.\n",
"Qwen/Qwen2-0.5B-Instruct: 2.60 seconds\n"
]
}
],
"source": [
"import time\n",
"\n",
"from pyrit.common import IN_MEMORY, initialize_pyrit\n",
"from pyrit.orchestrator import PromptSendingOrchestrator\n",
"from pyrit.prompt_target import HuggingFaceChatTarget\n",
"\n",
"initialize_pyrit(memory_db_type=IN_MEMORY)\n",
"\n",
"# models to test\n",
"model_id = \"Qwen/Qwen2-0.5B-Instruct\"\n",
"\n",
"# List of prompts to send\n",
"prompt_list = [\"What is 3*3? Give me the solution.\", \"What is 4*4? Give me the solution.\"]\n",
"\n",
"# Dictionary to store average response times\n",
"model_times = {}\n",
"\n",
"print(f\"Running model: {model_id}\")\n",
"\n",
"try:\n",
" # Initialize HuggingFaceChatTarget with the current model\n",
" target = HuggingFaceChatTarget(model_id=model_id, use_cuda=False, tensor_format=\"pt\", max_new_tokens=30)\n",
"\n",
" # Initialize the orchestrator\n",
" orchestrator = PromptSendingOrchestrator(objective_target=target, verbose=False)\n",
"\n",
" # Record start time\n",
" start_time = time.time()\n",
"\n",
" # Send prompts asynchronously\n",
" responses = await orchestrator.send_prompts_async(prompt_list=prompt_list) # type: ignore\n",
"\n",
" # Record end time\n",
" end_time = time.time()\n",
"\n",
" # Calculate total and average response time\n",
" total_time = end_time - start_time\n",
" avg_time = total_time / len(prompt_list)\n",
" model_times[model_id] = avg_time\n",
"\n",
" print(f\"Average response time for {model_id}: {avg_time:.2f} seconds\\n\")\n",
"\n",
" # Print the conversations\n",
" await orchestrator.print_conversations_async() # type: ignore\n",
"\n",
"except Exception as e:\n",
" print(f\"An error occurred with model {model_id}: {e}\\n\")\n",
" model_times[model_id] = None\n",
"\n",
"# Print the model average time\n",
"if model_times[model_id] is not None:\n",
" print(f\"{model_id}: {model_times[model_id]:.2f} seconds\")\n",
"else:\n",
" print(f\"{model_id}: Error occurred, no average time calculated.\")"
]
},
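{
"cell_type": "markdown",
"id": "1a",
"metadata": {},
"source": [
"The `model_times` dictionary above is built to hold results for more than one model. The next cell is a sketch of how the same pattern can be looped over several of the instruct models listed earlier (the model IDs here are examples only); it reuses `prompt_list` and `model_times` from the previous cell. Each model is downloaded on first use, so this can take a while on CPU.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: benchmark several instruct models using the same orchestrator pattern as above\n",
"model_ids = [\"HuggingFaceTB/SmolLM-135M-Instruct\", \"Qwen/Qwen2-1.5B-Instruct\"]\n",
"\n",
"for current_model_id in model_ids:\n",
"    print(f\"Running model: {current_model_id}\")\n",
"    try:\n",
"        target = HuggingFaceChatTarget(model_id=current_model_id, use_cuda=False, tensor_format=\"pt\", max_new_tokens=30)\n",
"        orchestrator = PromptSendingOrchestrator(objective_target=target, verbose=False)\n",
"\n",
"        start_time = time.time()\n",
"        await orchestrator.send_prompts_async(prompt_list=prompt_list)  # type: ignore\n",
"        model_times[current_model_id] = (time.time() - start_time) / len(prompt_list)\n",
"    except Exception as e:\n",
"        print(f\"An error occurred with model {current_model_id}: {e}\")\n",
"        model_times[current_model_id] = None\n",
"\n",
"# Summarize the average response times across all models run so far\n",
"for current_model_id, avg_time in model_times.items():\n",
"    if avg_time is not None:\n",
"        print(f\"{current_model_id}: {avg_time:.2f} seconds\")\n",
"    else:\n",
"        print(f\"{current_model_id}: Error occurred, no average time calculated.\")"
]
},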
{
"cell_type": "code",
"execution_count": null,
"id": "2",
"metadata": {},
"outputs": [],
"source": [
"from pyrit.memory import CentralMemory\n",
"\n",
"memory = CentralMemory.get_memory_instance()\n",
"memory.dispose_engine()"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}