# Named Entity Recognition on IPU using BERT - Inference

Integration of the Graphcore Intelligence Processing Unit (IPU) with the [🤗 Transformers library](https://huggingface.co/docs/transformers/index) means that it takes only a few lines of code to perform complex deep learning tasks.

In this notebook we perform **named entity recognition (NER)**, also known as token classification. NER uses natural language processing models to classify the words in a prompt.

The ease-of-use of the `pipeline` interface lets us quickly experiment with the pre-trained models and identify which one will work best.
This simple interface makes it straightforward to bring the fast inference performance of the IPU to your application.

<img src="images/name_entity_extraction.png" alt="Widget inference on a token classification task" style="width:800px;">


|  Domain | Tasks | Model | Datasets | Workflow |   Number of IPUs   | Execution time |
|---------|-------|-------|----------|----------|--------------|--------------|
| Natural language processing | Token classification | Multiple | - | Inference | 4 | ~4min |

[![Join our Slack Community](https://img.shields.io/badge/Slack-Join%20Graphcore's%20Community-blue?style=flat-square&logo=slack)](https://www.graphcore.ai/join-community)

## Environment setup

The best way to run this demo is on Paperspace Gradient's cloud IPUs because everything is already set up for you.

[![Run on Gradient](https://assets.paperspace.io/img/gradient-badge.svg)](https://ipu.dev/3XgZ7V2)

To run the demo using other IPU hardware, you need to have the Poplar SDK enabled. Refer to the [Getting Started guide](https://docs.graphcore.ai/en/latest/getting-started.html#getting-started) for your system for details on how to enable the Poplar SDK. Also refer to the [Jupyter Quick Start guide](https://docs.graphcore.ai/projects/jupyter-notebook-quick-start/en/latest/index.html) for how to set up Jupyter to be able to run this notebook on a remote IPU machine.

## Dependencies and configuration

In order to improve usability and support for future users, Graphcore would like to collect information about the
applications and code being run in this notebook. The following information will be anonymised before being sent to Graphcore:

- User progression through the notebook
- Notebook details: number of cells, code being run and the output of the cells
- Environment details

You can disable logging at any time by running `%unload_ext graphcore_cloud_tools.notebook_logging.gc_logger` from any cell.

Install the dependencies for this notebook.

In [None]:
%pip install "optimum-graphcore==0.7"
%pip install emoji==0.6.0 gradio
%pip install graphcore-cloud-tools[logger]@git+https://github.com/graphcore/graphcore-cloud-tools
%load_ext graphcore_cloud_tools.notebook_logging.gc_logger

The location of the cache directories can be configured through environment variables or directly in the notebook:

In [None]:
import os
executable_cache_dir = os.getenv("POPLAR_EXECUTABLE_CACHE_DIR", "./exe_cache/")
share_gradio = bool(os.getenv("GRADIO_SHARE_APP", False))
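
One caveat about the cell above: `bool()` returns `True` for any non-empty string, so an environment value such as `"False"` or `"0"` would still enable sharing. The sketch below shows one way such flags could be parsed more strictly. The helper `env_flag`, the set of accepted spellings and the variable name `EXAMPLE_FLAG` are illustrative assumptions and not part of the notebook or of any Graphcore tooling.

```python
import os

# Spellings treated as "on"; this set is an assumption, adjust as needed.
TRUTHY = {"1", "true", "yes", "on"}

def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag."""
    raw = os.getenv(name)
    if raw is None:
        # Variable not set at all: fall back to the default.
        return default
    return raw.strip().lower() in TRUTHY

# EXAMPLE_FLAG is a hypothetical variable used only for this demonstration.
os.environ["EXAMPLE_FLAG"] = "False"
print(bool(os.getenv("EXAMPLE_FLAG", False)))  # True: non-empty string
print(env_flag("EXAMPLE_FLAG"))                # False: parsed by value
```

With a helper like this, `share_gradio = env_flag("GRADIO_SHARE_APP")` would only enable sharing when the variable is set to one of the accepted spellings.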

## NER with the `transformers` pipelines on the IPU


The simplest way to get a model running on the IPU is through the `transformers` library, which provides the `pipeline` function that bundles together a set of models which have been validated to work on a range of different tasks. 

Let's load our model config to start using pipelines on the IPU:

In [None]:
from optimum.graphcore import pipelines
inference_config = dict(layers_per_ipu=[40], ipus_per_replica=1, enable_half_partials=True,
                        executable_cache_dir=executable_cache_dir)

For our named entity recognition (NER) task, we can use the `pipeline` function and set the task to `ner`, which loads the [TokenClassificationPipeline](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/pipelines#transformers.TokenClassificationPipeline).

The `inference_config` can now be used to initialise the pipeline on the IPU:

In [None]:
ner_pipeline = pipelines.pipeline("ner", 
                                  ipu_config=inference_config, 
                                  padding='max_length', 
                                  max_length=256)

Now we can create a prompt which we can use to test our pipeline.
The general `ner_pipeline` should identify locations, names, organisations and miscellaneous items.

In [None]:
ner_examples = [
    "My name is Janet and I live in Berlin, I work at the hospital as a Doctor.",
    "Anita was an incredible software developer working for Google, she lived in Spain but commuted to London regularly",
    "The best thing about shopping at Walmart is the options! I never got this many options when I lived in Croatia."
]

We can use our model pipeline to do NER on our examples. For instance, let's look at our model outputs for our first prompt.


In [None]:
output_ner = ner_pipeline(ner_examples[0])
output_ner

The output above lists the model's token-level predictions for the first prompt in our examples list. In this raw form it is not very intuitive or immediately useful.
Instead, let's build a quick and simple `gradio` app which runs our pipeline on the IPU and displays the results.
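
As a complement to the app, the raw output can also be made more readable in plain Python. The sketch below merges token-level predictions into entity spans, assuming each record carries the `entity`, `start` and `end` keys that the 🤗 Transformers token classification pipeline returns by default. The helper `group_entities` is hypothetical, and the sample records are hand-written for illustration (the scores are made up), not real model output.

```python
def group_entities(tokens):
    """Merge consecutive B-/I- tagged tokens into entity spans."""
    spans = []
    for tok in tokens:
        tag = tok["entity"]
        # Split "B-PER" into ("B", "PER"); treat untagged labels as a new span.
        prefix, label = (tag[0], tag[2:]) if tag[:2] in ("B-", "I-") else ("B", tag)
        previous = spans[-1] if spans else None
        continues = (
            prefix == "I"
            and previous is not None
            and previous["label"] == label
            # Allow at most one character (a space) between merged tokens.
            and tok["start"] - previous["end"] <= 1
        )
        if continues:
            previous["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    return spans

sample_text = "My name is Janet and I live in Berlin."
# Hand-written records in the pipeline's output format (illustrative only).
sample_tokens = [
    {"entity": "B-PER", "word": "Jane", "start": 11, "end": 15, "score": 0.99},
    {"entity": "I-PER", "word": "##t", "start": 15, "end": 16, "score": 0.98},
    {"entity": "B-LOC", "word": "Berlin", "start": 31, "end": 37, "score": 0.99},
]

spans = group_entities(sample_tokens)
for span in spans:
    print(span["label"], sample_text[span["start"]:span["end"]])
# PER Janet
# LOC Berlin
```

The upstream 🤗 Transformers pipeline also offers an `aggregation_strategy` argument for this kind of grouping. Whether it behaves identically on the IPU has not been checked here.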

Using `gradio`, the `app_for_pipeline` function will build a small app which includes a text prompt and will render the entities which were identified:

In [None]:
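import gradio as gr

# Default prompt shown when the app first opens.
prompt = "Let's use an app to do some named entity recognition!"
out = ner_pipeline(prompt)

def app_for_pipeline(pipeline, examples=[], description="", label_description=""):
    """Build a small gradio app that highlights the entities found by `pipeline`."""
    demo = gr.Blocks(title=description)
    with demo:
        # Free-text input box for the user's prompt.
        inputs = gr.Textbox(lines=3)
        # Entities are rendered as highlighted spans over the input text.
        # The initial value comes from the pipeline passed in, so each app
        # shows predictions from its own model.
        outputs = gr.HighlightedText(
            label=label_description,
            combine_adjacent=True,
            value=dict(text=prompt, entities=pipeline(prompt)),
        )
        examples_block = gr.Examples(examples=examples, inputs=inputs, outputs=outputs)
        # Re-run the pipeline every time the text in the input box changes.
        inputs.change(
            fn=lambda x: dict(text=x, entities=pipeline(x)),
            inputs=inputs, outputs=outputs, postprocess=True
        )
    return demo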
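
The callback in the app hands `HighlightedText` a dictionary with a `text` field and an `entities` list. The sketch below reproduces that payload without an IPU or `gradio`, which can be handy for checking the wiring. Both `to_highlight_payload` and `stub_pipeline` are hypothetical names, and the stub simply tags capitalised words, so it is a stand-in for a model and not a real NER system.

```python
def to_highlight_payload(text, pipeline):
    """Build the same dict the app's callback returns for a given text."""
    return {"text": text, "entities": pipeline(text)}

def stub_pipeline(text):
    """Hypothetical stand-in for the IPU pipeline: tags capitalised words."""
    entities, cursor = [], 0
    for word in text.split():
        # Locate the word's character offsets in the original text.
        start = text.index(word, cursor)
        cursor = start + len(word)
        if word[0].isupper():
            entities.append({"entity": "MISC", "start": start, "end": cursor})
    return entities

payload = to_highlight_payload("Anita works in London", stub_pipeline)
for ent in payload["entities"]:
    print(ent["entity"], payload["text"][ent["start"]:ent["end"]])
# MISC Anita
# MISC London
```

Each entity needs at least an `entity` label plus `start` and `end` character offsets into `text`, which is the shape the pipeline output already has.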

Now let's see what our examples look like within the app.

In [None]:
out = ner_pipeline(prompt)
demo = app_for_pipeline(ner_pipeline, ner_examples).launch(share=share_gradio)

That looks great!

With `gradio` we can see at a glance how our model categorises each word.

The IPU processes our inputs quickly and returns the model outputs fast enough to keep the interface responsive.

Next we must detach our model from the IPU to release resources. You can learn more about this in our resource management notebook `useful-tips/managing_ipu_resources.ipynb`.

In [None]:
ner_pipeline.model.detachFromDevice()

Pipelines on the IPU also provide us with the flexibility and simplicity to quickly change the task to suit our needs.

In the next sections, we will see how easy it is to swap the default `ner` model for a multilingual model and a biomedical model, letting us run experiments on prompts specific to these applications. Because these pipelines also run on the IPU, they are just as responsive and interactive as our first experiment.

### Multilingual model

The advantage of using pipelines on the IPU is that we can quickly load different models for different tasks.

The model in the first pipeline was trained specifically for English, but we can look at other model checkpoints which are able to classify inputs from multiple languages.

The [`Davlan/bert-base-multilingual-cased-ner-hrl`](https://huggingface.co/Davlan/bert-base-multilingual-cased-ner-hrl) checkpoint has been fine-tuned for 10 languages: Arabic, German, English, Spanish, French, Italian, Latvian, Dutch, Portuguese and Chinese.

This checkpoint identifies similar classes to our first pipeline: locations (LOC), organisations (ORG) and persons (PER).
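
When only some of these classes are of interest, the pipeline output can be filtered after the fact. The sketch below assumes the output records carry an `entity` tag such as `B-LOC` and a `score`, as in the 🤗 Transformers token classification format. The helper `filter_entities` is hypothetical, and the sample records are hand-written with made-up scores, not real model output.

```python
def filter_entities(entities, wanted=("LOC", "ORG", "PER"), min_score=0.0):
    """Keep records whose class is in `wanted` and whose score is high enough."""
    kept = []
    for ent in entities:
        # Drop the "B-" / "I-" prefix to get the bare class name.
        label = ent["entity"].split("-", 1)[-1]
        if label in wanted and ent.get("score", 1.0) >= min_score:
            kept.append(ent)
    return kept

# Hand-written records for illustration only.
sample = [
    {"entity": "B-LOC", "word": "Budapest", "score": 0.98},
    {"entity": "B-ORG", "word": "Marriot", "score": 0.55},
    {"entity": "B-PER", "word": "Anita", "score": 0.91},
]

locations = filter_entities(sample, wanted=("LOC",))
confident = filter_entities(sample, min_score=0.9)
print([ent["word"] for ent in locations])  # ['Budapest']
print([ent["word"] for ent in confident])  # ['Budapest', 'Anita']
```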

Let's load this checkpoint using the `pipeline` function:

In [None]:
multilingual_model = "Davlan/bert-base-multilingual-cased-ner-hrl"
ner_pipeline_multilingual = pipelines.pipeline(
    "ner", model=multilingual_model, ipu_config=inference_config,
    padding='max_length', max_length=256
)
multilingual_output = ner_pipeline_multilingual(prompt)

Now we can create some prompts that should work with this new model. The following examples are in French, Latvian and Spanish.

In [None]:
multilingual_examples = ["A Budapest, il y a une grande piscine que les touristes visitent.",
                         "Vai Marriot viesnīcā Barselonā ir palikusi brīva vieta?",
                         "Usamos la aerolínea Easy Jet para llegar allí."]

Again, we can port our model pipeline to the `gradio` app which we created earlier.


In [None]:
multilingual_demo = app_for_pipeline(ner_pipeline_multilingual, examples=multilingual_examples)
multilingual_demo.launch(share=share_gradio)

This shows how easy it is to swap to the multilingual model, which can identify and extract information from a variety of different languages.

We can now free up resources by detaching the model from the IPU.

In [None]:
ner_pipeline_multilingual.model.detachFromDevice()

### BioMedical BERT

In this section, we will see how to use pipelines to perform named entity recognition within the biomedical field.

Within the biomedical industry, hospital staff often have to read and analyse a large amount of text from patient records such as medical histories.
Being able to reliably retrieve specific information about patients is vital to their job, but it can be challenging, particularly for patients with long medical histories.

NER would be a powerful tool to utilise for assisted tagging. Highlighting the critical information within these records could enable hospital workers to analyse information with more ease and efficiency.

Thankfully, we already have a model which is trained to do exactly that. The [`d4data/biomedical-ner-all`](https://huggingface.co/d4data/biomedical-ner-all) checkpoint has been fine-tuned using biomedical case studies, and is able to extract 84 different entity types related to age, sex, medical history, symptoms, events, and many other classes.

Let's load up this checkpoint and see how it performs.

In [None]:
medical_model = "d4data/biomedical-ner-all"
ner_pipeline_medical = pipelines.pipeline(
    "ner", model=medical_model,
    ipu_config=inference_config,
    padding='max_length',
    max_length=256
)
medical_output = ner_pipeline_medical(prompt)

We can create some examples which are more focused on medical cases to see how useful the model could be.

In [None]:
medical_examples = [
"The 56 year old patient had a really bad sprain in their knee. We might have to do surgery as they have a previous history of a damaged ACL.",
"This winter there were outbreaks of Covid-19 , flu and colds. The worst cases were in those over the age of 70 with pre-existing health conditions such as heart disease.",
"The 98 year old woman was extremely healthy with very few medical conditions, just arthritis and high cholesterol as expected for her age."]

Let's see another use case for NER with a model tuned for biomedical data!

In [None]:
app_for_pipeline(
    ner_pipeline_medical, 
    examples=medical_examples,
    description="Try prompting me with some medical anecdotes!"
).launch(share=share_gradio)

After testing out our app and model we must now free up resources by detaching the model from the IPU.

In [None]:
ner_pipeline_medical.model.detachFromDevice()

As we can see, the biomedical model produces very descriptive labels, and the app lets us try out different inputs to identify important patient information.

## Conclusion

This notebook showed how easy it is to use the IPU interactively through a `gradio` app. The IPU served as a powerful backend for inference, giving us a fast, responsive interface that returns NER results on user inputs in real time.

This was done using only two lines of code! All we had to do was define the IPU config and pass it to the pipeline.

This ease-of-use allowed for flexibility when changing tasks to solve problems in the biomedical field and for multilingual inputs. Using this notebook you can go a step further by experimenting with many other NER models which are available on the [🤗 Models Hub](https://huggingface.co/models?pipeline_tag=token-classification&sort=downloads).

While this notebook is focused on using the model for inference, our token classification `other-use-cases/token_classification.ipynb` notebook will show you how to use your own dataset to fine-tune a model using the [`datasets`](https://huggingface.co/docs/datasets/index) package.

The same method of running pipelines on the IPU can be applied to other tasks such as sentiment analysis, translation and summarization, so you can get started on any task with the hundreds of models available on the [🤗 Models Hub](https://huggingface.co/models?pipeline_tag=token-classification&sort=downloads). Look at our sentiment analysis notebook `sentiment_analysis.ipynb` to try out another example.

