# Using Elastic and OpenELM to Prototype Apple-Inspired AI

This is the supporting material for [this blog post](https://search-labs.elastic.co/search-labs/blog/using-openelm-models).


In [2]:
%pip install elasticsearch

Collecting elasticsearch
  Downloading elasticsearch-8.15.1-py3-none-any.whl.metadata (8.7 kB)
Collecting elastic-transport<9,>=8.13 (from elasticsearch)
  Downloading elastic_transport-8.15.0-py3-none-any.whl.metadata (3.6 kB)
Downloading elasticsearch-8.15.1-py3-none-any.whl (524 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 524.6/524.6 kB 31.3 MB/s eta 0:00:00
Downloading elastic_transport-8.15.0-py3-none-any.whl (64 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.4/64.4 kB 6.1 MB/s eta 0:00:00
Installing collected packages: elastic-transport, elasticsearch
Successfully installed elastic-transport-8.15.0 elasticsearch-8.15.1


In [65]:
from elasticsearch import Elasticsearch, helpers, exceptions, ConnectionTimeout
from getpass import getpass

In [43]:
# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")

# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key
ELASTIC_API_KEY = getpass("Elastic Api Key: ")

# https://huggingface.co/docs/hub/en/security-tokens
HUGGINGFACE_TOKEN = getpass("Huggingface Token: ")

# https://huggingface.co/apple/OpenELM
MODEL = "apple/OpenELM-3B-Instruct"

# Create the client instance
client = Elasticsearch(
    # For local development
    # hosts=["http://localhost:9200"]
    cloud_id=ELASTIC_CLOUD_ID,
    api_key=ELASTIC_API_KEY,
)

Elastic Cloud ID: ··········
Elastic Api Key: ··········
Huggingface Token: ··········
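The commented-out `hosts` line in the cell above points at a local alternative to Elastic Cloud. The following is a minimal sketch of switching between the two. The helper name `connection_kwargs` and its `local_url` parameter are hypothetical and not part of the original notebook.

```python
def connection_kwargs(cloud_id="", api_key="", local_url="http://localhost:9200"):
    """Return keyword arguments for Elasticsearch(...).

    Hypothetical helper: uses Elastic Cloud credentials when a cloud ID is
    given, otherwise falls back to a local node URL.
    """
    if cloud_id:
        return {"cloud_id": cloud_id, "api_key": api_key}
    return {"hosts": [local_url]}


# Placeholder values, for illustration only.
print(connection_kwargs("my-deployment:abc", "secret"))
print(connection_kwargs())
```

In the notebook this could be used as `client = Elasticsearch(**connection_kwargs(ELASTIC_CLOUD_ID, ELASTIC_API_KEY))`.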


## 2. Deploy the OpenELM Model


In [5]:
!git clone https://huggingface.co/apple/OpenELM

Cloning into 'OpenELM'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 12 (delta 4), reused 0 (delta 0), pack-reused 1 (from 1)
Unpacking objects: 100% (12/12), 8.28 KiB | 2.07 MiB/s, done.


In [18]:
prompt = "Once upon a time there was"

In [19]:
!python /content/OpenELM/generate_openelm.py --model '{MODEL}' --hf_access_token '{HUGGINGFACE_TOKEN}' --prompt '{prompt}' --generate_kwargs repetition_penalty=1.2 prompt_lookup_num_tokens=10

Loading checkpoint shards: 100% 2/2 [00:01<00:00,  1.25it/s]
2024-09-30 04:46:39.147179: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-30 04:46:39.163451: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-30 04:46:39.181587: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-30 04:46:39.186881: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-30
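The shell command above wraps `{prompt}` in single quotes, so a prompt that itself contains a single quote would break the command line. One possible way around this is to build the arguments as a list and pass it to `subprocess.run`, which does not go through a shell. The sketch below only builds the list. The function name `build_generate_command` is hypothetical; the script path and flags are the ones used in the cell above.

```python
def build_generate_command(
    model,
    hf_access_token,
    prompt,
    generate_kwargs=None,
    script="/content/OpenELM/generate_openelm.py",
):
    """Build an argument list for generate_openelm.py (hypothetical helper)."""
    command = [
        "python", script,
        "--model", model,
        "--hf_access_token", hf_access_token,
        "--prompt", prompt,
    ]
    if generate_kwargs:
        # The script takes generation options as space-separated key=value pairs.
        command.append("--generate_kwargs")
        command.extend(f"{key}={value}" for key, value in generate_kwargs.items())
    return command


command = build_generate_command(
    model="apple/OpenELM-3B-Instruct",
    hf_access_token="hf_placeholder",  # placeholder, not a real token
    prompt="Once upon a time there wasn't",  # contains a single quote on purpose
    generate_kwargs={"repetition_penalty": 1.2, "prompt_lookup_num_tokens": 10},
)
print(command)
```

To run it, pass the list to `subprocess.run(command, check=True)`; because no shell is involved, the quote in the prompt needs no escaping.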

## 3. Index Data in Elasticsearch


In [None]:
try:
    # client.options(request_timeout=5).inference.delete(inference_id="my-elser-model")
    client.options(request_timeout=5).inference.put(
        task_type="sparse_embedding",
        inference_id="my-elser-model",
        body={
            "service": "elser",
            "service_settings": {"num_allocations": 1, "num_threads": 1},
        },
    )
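except ConnectionTimeout:
    # Creating the endpoint can take longer than the 5-second request timeout;
    # the timeout is deliberately ignored so the notebook can continue.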
    pass
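Because the cell above ignores the timeout, the endpoint may not be ready when the next cells run. A generic polling loop is one way to wait for it. This is a sketch under assumptions: `wait_until` is a hypothetical helper, and the real readiness check against Elasticsearch is not shown here, so the demo uses a stand-in function.

```python
import time


def wait_until(check, timeout=120.0, interval=5.0):
    """Call check() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real readiness check: succeeds on the third call.
attempts = {"count": 0}


def fake_endpoint_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until(fake_endpoint_ready, timeout=1.0, interval=0.01)
print(ready, attempts["count"])
```

In the notebook, `check` would be replaced by a function that asks the cluster whether `my-elser-model` is usable.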

In [45]:
# Create the index
index_name = "mobile-assistant"
client.indices.delete(index=index_name, ignore_unavailable=True)
index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "description": {
                "type": "text",
                "analyzer": "english",
                "copy_to": "semantic_field",
            },
            "semantic_field": {
                "type": "semantic_text",
                "inference_id": "my-elser-model",
            },
        }
    }
}

client.indices.create(index=index_name, body=index_body)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'mobile-assistant'})

In [46]:
documents = [
    {
        "_index": index_name,
        "_id": "email1",
        "title": "Team Meeting Agenda",
        "description": "Hello team, Let's discuss our project progress in tomorrow's meeting. Please prepare your updates. Best regards, Manager",
    },
    {
        "_index": index_name,
        "_id": "email2",
        "title": "Client Proposal Draft",
        "description": "Hi, I've attached the draft of our client proposal. Could you review it and provide feedback? Thanks, Colleague",
    },
    {
        "_index": index_name,
        "_id": "email3",
        "title": "Weekly Newsletter",
        "description": "This week in tech: AI advancements, new smartphone releases, and cybersecurity updates. Read more on our website!",
    },
    {
        "_index": index_name,
        "_id": "email4",
        "title": "Urgent: Project Deadline Update",
        "description": "Dear team, Due to recent developments, we need to move up our project deadline. The new submission date is next Friday. Please adjust your schedules accordingly and let me know if you foresee any issues. We'll discuss this in detail during our next team meeting. Best regards, Project Manager",
    },
    {
        "_index": index_name,
        "_id": "email5",
        "title": "Invitation: Company Summer Picnic",
        "description": "Hello everyone, We're excited to announce our annual company summer picnic! It will be held on Saturday, July 15th, at Sunny Park. There will be food, games, and activities for all ages. Please RSVP by replying to this email with the number of guests you'll be bringing. We look forward to seeing you there! Best, HR Team",
    },
]

In [47]:
success, errors = helpers.bulk(client, documents, raise_on_error=False)
print(f"Successfully indexed {success} documents")
if errors:
    print("Errors encountered during bulk indexing:")
    for error in errors:
        print(error)

Successfully indexed 5 documents
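The documents above repeat `_index` and `_id` by hand in every entry. For larger datasets, a generator can add that metadata while the documents are streamed to `helpers.bulk`. The sketch below assumes the emails are kept in a dict keyed by ID; the name `to_bulk_actions` is hypothetical.

```python
def to_bulk_actions(index_name, emails):
    """Yield one bulk action per email, adding the _index and _id metadata."""
    for email_id, email in emails.items():
        yield {"_index": index_name, "_id": email_id, **email}


emails = {
    "email1": {
        "title": "Team Meeting Agenda",
        "description": "Hello team, Let's discuss our project progress in tomorrow's meeting.",
    },
    "email3": {
        "title": "Weekly Newsletter",
        "description": "This week in tech: AI advancements, new smartphone releases, and cybersecurity updates.",
    },
}

actions = list(to_bulk_actions("mobile-assistant", emails))
for action in actions:
    print(action["_id"], "->", action["title"])
```

Since `helpers.bulk` accepts any iterable of actions, the generator could be passed directly: `helpers.bulk(client, to_bulk_actions(index_name, emails), raise_on_error=False)`.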


## 4. Asking Questions


In [176]:
# https://github.com/riccardomusmeci/mlx-llm/blob/main/src/mlx_llm/prompt/openelm.py
def build_prompt(question, elasticsearch_documents):
    # Note: `question` is currently unused; only the retrieved emails are
    # inserted into the prompt.
    docs_text = "\n".join(
        [
            f"Subject: {doc['title']}\nBody: {doc['description']}"
            for doc in elasticsearch_documents
        ]
    )

    prompt = f"""
    You are a helpful virtual assistant.
    You must classify an email in one of the following categories:
    ['SPAM', 'Marketing', 'Project']
    Do not make up emails or email categories.
    EMAIL:
    {docs_text}
    Category:
    """

    return prompt


def retrieve_documents(question):
    # Semantic search on `semantic_field`; "size": 1 returns only the best-matching email.
    search_body = {
        "size": 1,
        "query": {"semantic": {"query": question, "field": "semantic_field"}},
    }
    response = client.search(index=index_name, body=search_body)
    return [hit["_source"] for hit in response["hits"]["hits"]]
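Because the f-string in `build_prompt` is indented inside the function, the prompt lines carry four leading spaces while the interpolated `Body:` line has none. If consistent indentation is preferred, one option is to dedent a template first and fill it in afterwards. This is an alternative sketch, not the notebook's method; `PROMPT_TEMPLATE` and `build_prompt_dedented` are hypothetical names.

```python
import textwrap

CATEGORIES = ["SPAM", "Marketing", "Project"]

# Dedent the template before formatting, so the interpolated email text
# cannot affect how much indentation is removed.
PROMPT_TEMPLATE = textwrap.dedent(
    """\
    You are a helpful virtual assistant.
    You must classify an email in one of the following categories:
    {categories}
    Do not make up emails or email categories.
    EMAIL:
    {docs_text}
    Category:
    """
)


def build_prompt_dedented(elasticsearch_documents, categories=CATEGORIES):
    """Build the classification prompt without leading indentation."""
    docs_text = "\n".join(
        f"Subject: {doc['title']}\nBody: {doc['description']}"
        for doc in elasticsearch_documents
    )
    return PROMPT_TEMPLATE.format(categories=categories, docs_text=docs_text)


sample = [{"title": "Weekly Newsletter", "description": "This week in tech: AI advancements."}]
print(build_prompt_dedented(sample))
```

Whether the model responds better to the dedented prompt is untested here; the sketch only makes the layout uniform.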

In [177]:
question = "how is the project going?"
documents = retrieve_documents(question)
prompt = build_prompt(question, documents)
prompt

"\n    You are a helpful virtual assistant.\n    You must classify an email in one of the following categories:\n    ['SPAM', 'Marketing', 'Project']\n    Do not make up emails or email categories.\n    EMAIL:\n    Subject: Urgent: Project Deadline Update\nBody: Dear team, Due to recent developments, we need to move up our project deadline. The new submission date is next Friday. Please adjust your schedules accordingly and let me know if you foresee any issues. We'll discuss this in detail during our next team meeting. Best regards, Project Manager\n    Category:\n    "

In [118]:
from OpenELM.generate_openelm import generate

In [178]:
output_text, generation_time = generate(
    prompt=prompt,
    model=MODEL,
    hf_access_token=HUGGINGFACE_TOKEN,
    generate_kwargs={"repetition_penalty": 1.2, "prompt_lookup_num_tokens": 10},
)
print("-----GENERATION TIME-----")
print(f"\033[92m {round(generation_time, 2)} \033[0m")
print("-----RESPONSE-----")
print(output_text)



Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

-----GENERATION TIME-----
 31.87
-----RESPONSE-----

    You are a helpful virtual assistant.
    You must classify an email in one of the following categories:
    ['SPAM', 'Marketing', 'Project']
    Do not make up emails or email categories.
    EMAIL:
    Subject: Urgent: Project Deadline Update
Body: Dear team, Due to recent developments, we need to move up our project deadline. The new submission date is next Friday. Please adjust your schedules accordingly and let me know if you foresee any issues. We'll discuss this in detail during our next team meeting. Best regards, Project Manager
    Category:
    1. SPAM: Email is spammy and does not meet the given criteria.
![alt text](./Assets/spam.PNG)
    2. Marketing: Email is promotional in nature and does not meet the given criteria.
![alt text](./Assets/marketing.PNG)
    3. Project: Email meets the given criteria.
<p align="center">
    <img src="./Assets/corrected_output.png" alt="Project" height="200"/>
</p>
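The response above repeats the full prompt before the newly generated text. A small post-processing step can drop that echo. The sketch below assumes the output begins with the prompt character for character, which may not hold if decoding alters whitespace; `strip_prompt_echo` is a hypothetical helper.

```python
def strip_prompt_echo(output_text, prompt):
    """Return only the generated continuation if the output starts with the prompt."""
    if output_text.startswith(prompt):
        return output_text[len(prompt):].strip()
    # Fall back to the full text when the echo does not match exactly.
    return output_text.strip()


# Shortened stand-ins for the notebook's `prompt` and `output_text`.
prompt = "EMAIL:\nSubject: Urgent: Project Deadline Update\nCategory:\n"
output_text = prompt + "3. Project: Email meets the given criteria."
print(strip_prompt_echo(output_text, prompt))
```

With the notebook's variables, the call would be `strip_prompt_echo(output_text, prompt)` right after `generate(...)` returns.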

---

###