api_inference/inference-api.ipynb (174 lines of code) (raw):
{
"cells": [
{
"cell_type": "markdown",
"id": "ae8374a3",
"metadata": {},
"source": [
"This notebook demonstrates the use of the Inference API to test the Llama 3.1 model with 70B parameters! You can easily query the model using huggingface_hub's [Inference Client](https://huggingface.co/docs/huggingface_hub/guides/inference).\n",
"\n",
"Ensure you have huggingface_hub library installed or run the following cell:\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8afba198",
"metadata": {},
"outputs": [],
"source": [
"!pip install -q huggingface_hub"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "709fe89b",
"metadata": {},
"outputs": [],
"source": [
"from huggingface_hub import InferenceClient, login"
]
},
{
"cell_type": "markdown",
"id": "37aedded",
"metadata": {},
"source": [
"Please, ensure you are logged in to Hugging Face or run the following cell:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6649919b",
"metadata": {},
"outputs": [],
"source": [
"login()"
]
},
{
"cell_type": "markdown",
"id": "b2084476",
"metadata": {},
"source": [
"We initialize a client for the Inference API endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "c9914c23",
"metadata": {},
"outputs": [],
"source": [
"client = InferenceClient()"
]
},
{
"cell_type": "markdown",
"id": "923e8c04",
"metadata": {},
"source": [
"We send a list of messages to the endpoint. The appropriate chat template will be automatically applied.\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "00bf8af8",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"chat_completion = client.chat.completions.create(\n",
" model=\"meta-llama/Llama-3.1-70B-Instruct\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a helpful an honest programming assistant.\"},\n",
" {\"role\": \"user\", \"content\": \"Is Rust better than Python?\"},\n",
" ],\n",
" stream=True,\n",
" max_tokens=500,\n",
")\n"
]
},
{
"cell_type": "markdown",
"id": "0472f0d2",
"metadata": {},
"source": [
"Since streaming mode was enabled, we'll receive incremental responses from the server rather than waiting for the full response. We can iterate through the stream like this:\n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "bc56aa9c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The answer ultimately depends on your specific needs, goals, and priorities. Both Rust and Python are excellent programming languages, but they have different strengths and use cases.\n",
"\n",
"**Rust's strengths:**\n",
"\n",
"1. **Memory safety**: Rust is designed with memory safety in mind, using a concept called ownership and borrowing. This prevents common errors like null pointer dereferences and data corruption.\n",
"2. **Performance**: Rust is a systems programming language that can generate highly optimized machine code. This makes it a great choice for building operating systems, file systems, and high-performance applications.\n",
"3. **Concurrency**: Rust's ownership system and explicit concurrency features make it easier to write safe and efficient concurrent programs.\n",
"4. **Error handling**: Rust's error handling system is designed to be explicit and ergonomic, making it easier to write robust code.\n",
"\n",
"**Python's strengths:**\n",
"\n",
"1. **Ease of use**: Python is a high-level language with a simple syntax and a vast number of libraries and frameworks. This makes it an excellent choice for beginners and rapid prototyping.\n",
"2. **Dynamic typing**: Python's dynamic typing system allows for more flexibility and ease of development, although it may require more testing and debugging.\n",
"3. **Large community**: Python has a massive community of developers, with many libraries, frameworks, and tools available.\n",
"4. **Data science and machine learning**: Python is a leader in the data science and machine learning communities, with popular libraries like NumPy, pandas, and scikit-learn.\n",
"\n",
"**When to use Rust:**\n",
"\n",
"1. **Systems programming**: Use Rust when building operating systems, file systems, or other low-level systems software.\n",
"2. **High-performance applications**: Use Rust when performance is critical, such as in games, scientific simulations, or high-frequency trading apps.\n",
"3. **Concurrency and parallelism**: Use Rust when building concurrent or parallel systems that require explicit control over memory and threading.\n",
"\n",
"**When to use Python:**\n",
"\n",
"1. **Rapid prototyping**: Use Python when you need to quickly test an idea or build a proof-of-concept.\n",
"2. **Data science and machine learning**: Use Python for data analysis, machine learning, and scientific computing tasks.\n",
"3. **Web development**: Use Python for web development with popular frameworks like Django, Flask, or Pyramid.\n",
"\n",
"Ultimately, Rust is not inherently \"better\" than Python or vice versa. The best language for your project depends on your specific requirements, goals, and the type of problem you're trying to solve."
]
}
],
"source": [
"for message in chat_completion:\n",
" print(message.choices[0].delta.content, end=\"\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}