# Use the OpenAI SDK with the Meta Llama 3.1 8B Instruct NIM in Azure AI Foundry and Azure ML

Use the `openai` SDK to consume Meta Llama 3.1 8B Instruct NIM deployments in Azure AI Foundry and Azure ML. The NVIDIA NIM Meta Llama 3 family of models in Azure AI and Azure ML offers an API compatible with the OpenAI Chat Completions API. This lets customers and users move from OpenAI models to Meta Llama LLMs by changing little more than the endpoint, key, and model name.

The API can be used directly with OpenAI's client libraries or with third-party tools such as LangChain or LlamaIndex.

The example below shows how to make this transition using the OpenAI Python library. Note that Llama 3 supports both the text completions and chat completions APIs.

## Prerequisites

Before you start, complete the following steps to deploy the model:

* Register for a valid Azure account with an active subscription.
* Make sure you have access to [Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/what-is-ai-studio?tabs=home).
* Create a project and a resource group.
* Select the NVIDIA NIM Meta Llama 3.1 8B Instruct model from the model catalog and deploy it.

![nim-models.png](nim-models.png)

Once the deployment succeeds, you are assigned an API endpoint and a security key for inference.

To complete this tutorial, you will need to:

* Install `openai`:

    ```bash
    pip install openai
    ```
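The `OpenAI` client class imported later in this tutorial was introduced in version 1.0 of the `openai` package, so an older installation needs upgrading first. The following is a minimal sketch for checking the installed version. The helper names `installed_openai_version` and `supports_v1_client` are illustrative and not part of the SDK.

```python
from importlib import metadata


def installed_openai_version():
    """Return the installed `openai` version string, or None if it is missing."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


def supports_v1_client(version):
    """Return True when the major version is 1 or later."""
    return int(version.split(".")[0]) >= 1


version = installed_openai_version()
if version is None:
    print("openai is not installed; run `pip install openai`")
elif not supports_v1_client(version):
    print(f"openai {version} predates the OpenAI client class; upgrade it")
else:
    print(f"openai {version} is ready to use")
```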

## Example

The following example shows how to use `openai` with a Meta Llama 3 chat model deployed in Azure AI and Azure ML:

```python
from openai import OpenAI
```

You will need the endpoint URL and the authentication key associated with that endpoint. Both are available once the deployment described in the prerequisites has completed.
To work with `openai`, configure the client as follows:

- `base_url`: Use the endpoint URL from your deployment. Include `/v1` as part of the URL.
- `api_key`: Use your API key.

```python
client = OpenAI(
    base_url="https://<endpoint>.<region>.inference.ml.azure.com/v1",
    api_key="<key>",
)
```
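Hard-coding the key in a notebook makes it easy to leak, so one option is to read both values from environment variables. The sketch below assumes two hypothetical variable names, `AZURE_NIM_ENDPOINT` and `AZURE_NIM_API_KEY`, and a hypothetical helper `client_settings`. None of these come from Azure or the SDK. The helper also appends `/v1` when the endpoint lacks it, following the `base_url` rule above.

```python
import os


def client_settings(env=os.environ):
    """Build keyword arguments for OpenAI(...) from environment variables.

    AZURE_NIM_ENDPOINT and AZURE_NIM_API_KEY are hypothetical names chosen
    for this sketch; use whatever names fit your setup.
    """
    endpoint = env["AZURE_NIM_ENDPOINT"].rstrip("/")
    # The base URL must include `/v1`; append it if it is missing.
    if not endpoint.endswith("/v1"):
        endpoint += "/v1"
    return {"base_url": endpoint, "api_key": env["AZURE_NIM_API_KEY"]}


# Demonstration with placeholder values instead of real credentials.
example_env = {
    "AZURE_NIM_ENDPOINT": "https://<endpoint>.<region>.inference.ml.azure.com",
    "AZURE_NIM_API_KEY": "<key>",
}
print(client_settings(example_env)["base_url"])
```

With the real variables set in your environment, the client could then be created with `OpenAI(**client_settings())`.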

Use the client to create chat completion requests:

```python
response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Who is the most renowned French painter? Provide a short answer.",
        }
    ],
    model="meta/llama-3.1-8b-instruct",
)
```
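The request above can also carry a system message and sampling options. The sketch below assembles the keyword arguments separately, so they can be inspected before anything is sent. `build_chat_request` is a hypothetical helper, not part of the SDK. `temperature` and `max_tokens` are standard Chat Completions parameters, but whether a given deployment honors every option is an assumption to verify against your endpoint.

```python
def build_chat_request(user_prompt, system_prompt=None,
                       model="meta/llama-3.1-8b-instruct", **options):
    """Assemble keyword arguments for client.chat.completions.create()."""
    messages = []
    if system_prompt:
        # An optional system message steers the style of the answer.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    # Extra options such as temperature or max_tokens pass through unchanged.
    return {"model": model, "messages": messages, **options}


request = build_chat_request(
    "Who is the most renowned French painter?",
    system_prompt="Answer in one short sentence.",
    temperature=0.2,
    max_tokens=64,
)
print(request)
```

The resulting dictionary can be passed to the client as `client.chat.completions.create(**request)`.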

The generated text can be accessed as follows:

```python
print(response.choices[0].message.content)
```
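The introduction notes that text completions are supported as well. The following is a minimal sketch of that call, assuming the deployment exposes the text completions route under the same base URL. `complete_text` is a hypothetical helper, and `fake_client` is a stand-in that only mimics the shape of the SDK response so the sketch runs without a live endpoint.

```python
from types import SimpleNamespace


def complete_text(client, prompt, model="meta/llama-3.1-8b-instruct", max_tokens=64):
    """Send a text completion request and return the generated text."""
    response = client.completions.create(
        model=model, prompt=prompt, max_tokens=max_tokens
    )
    # Text completions return the output in `.text`, not `.message.content`.
    return response.choices[0].text


def _fake_create(**kwargs):
    """Stand-in for the real API call; echoes the prompt back."""
    return SimpleNamespace(
        choices=[SimpleNamespace(text=" (stub) " + kwargs["prompt"])]
    )


# Offline stand-in client; replace it with the real `client` from above.
fake_client = SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
print(complete_text(fake_client, "The most renowned French painter is"))
```

Against a real deployment, pass the configured `client` instead of `fake_client`.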