Copyright (c) 2023 Graphcore Ltd. All rights reserved.

For all available notebooks, check [IPU-powered Jupyter Notebooks](https://www.graphcore.ai/ipu-jupyter-notebooks) to see how IPUs perform on other tasks.

# Zero-Shot Text Classification on IPUs using MT5 - Inference

This notebook shows you how to use the multilingual variant of T5, [MT5](https://huggingface.co/models?other=arxiv:2010.11934), for [zero-shot text classification](https://huggingface.co/tasks/zero-shot-classification) in languages other than English on the Graphcore IPU. To showcase this, we use the large configuration of MT5 fine-tuned on the [XNLI corpus](https://huggingface.co/datasets/xnli).

Since MT5 has no sequence classification head, it is currently not compatible with the [zero-shot-classification pipeline in the Hugging Face pipelines API](https://huggingface.co/docs/transformers/main/main_classes/pipelines#transformers.ZeroShotClassificationPipeline). We therefore demonstrate explicitly how to perform classification through text generation. The content of this notebook is the same as that shown in the [Alan Turing Institute MT5 large XNLI fine-tuned zero-shot example](https://huggingface.co/alan-turing-institute/mt5-large-finetuned-mnli-xtreme-xnli), with minor changes for IPU execution.

|  Domain | Tasks | Model | Datasets | Workflow |   Number of IPUs   | Execution time |
|---------|-------|-------|----------|----------|--------------|--------------|
| Natural language processing | Zero-shot classification | mt5-large | - | Inference | 8 | 30min |

[![Join our Slack Community](https://img.shields.io/badge/Slack-Join%20Graphcore's%20Community-blue?style=flat-square&logo=slack)](https://www.graphcore.ai/join-community)

## Environment setup

The best way to run this demo is on Paperspace Gradient's cloud IPUs because everything is already set up for you.

[![Run on Gradient](https://assets.paperspace.io/img/gradient-badge.svg)]()

To run the demo using other IPU hardware, you need to have the Poplar SDK enabled. Refer to the [Getting Started guide](https://docs.graphcore.ai/en/latest/getting-started.html#getting-started) for your system for details on how to enable the Poplar SDK. Also refer to the [Jupyter Quick Start guide](https://docs.graphcore.ai/projects/jupyter-notebook-quick-start/en/latest/index.html) for how to set up Jupyter to be able to run this notebook on a remote IPU machine.

## Dependencies and configuration

In order to improve usability and support for future users, Graphcore would like to collect information about the
applications and code being run in this notebook. The following information will be anonymised before being sent to Graphcore:

- User progression through the notebook
- Notebook details: number of cells, code being run and the output of the cells
- Environment details

You can disable logging at any time by running `%unload_ext graphcore_cloud_tools.notebook_logging.gc_logger` from any cell.

Install the dependencies for this notebook.

In [None]:
%pip install "optimum-graphcore==0.7"
%pip install graphcore-cloud-tools[logger]@git+https://github.com/graphcore/graphcore-cloud-tools
%load_ext graphcore_cloud_tools.notebook_logging.gc_logger

In [None]:
import os

num_available_ipus = int(os.getenv("NUM_AVAILABLE_IPU", 0))
if num_available_ipus < 8:
    raise EnvironmentError(
        f"This notebook requires 8 IPUs but only {num_available_ipus} are available. "
        "Try this notebook on IPU-POD16 or Bow-POD16 on Paperspace."
    )
# os.path.join avoids a doubled "/" when the base directory ends with a slash
executable_cache_dir = os.path.join(
    os.getenv("POPLAR_EXECUTABLE_CACHE_DIR", "/tmp/exe_cache/"), "mt5_zero_shot_classification"
)

## Configure MT5 for the IPU


Ordinarily, given a fine-tuned model checkpoint and a pipeline task, inference on the IPU requires only a few changes, as shown in the example below:

```diff
-from transformers import pipeline
+from optimum.graphcore import pipeline, IPUConfig
 
-pipe = pipeline(task="text2text-generation", model="t5-small")
+ipu_config = IPUConfig.from_pretrained("Graphcore/t5-small-ipu", inference_layers_per_ipu=[3, 3, 3, 3])
+pipe = pipeline(task="text2text-generation", model="t5-small", ipu_config=ipu_config)
 pipe("Translate English to Romanian: The quick brown fox jumped over the lazy dog")
 [{'generated_text': 'vulporul brun a sărit rapid peste câinul leneş'}]
```

However, since MT5 is not a supported model for the zero-shot-classification pipeline, we cannot run inference in the same way as above. Instead, we proceed by:
1. Instantiating an MT5 model 
2. Configuring the model to run on the IPU in inference mode
3. Tokenizing input sequences for zero-shot classification
4. Obtaining output logits from the inference model to compute classification probabilities

For the first step, we load the MT5 model fine-tuned on the XNLI corpus from the Hugging Face Model Hub:


In [None]:
from transformers import MT5ForConditionalGeneration
model_checkpoint = "alan-turing-institute/mt5-large-finetuned-mnli-xtreme-xnli"
model = MT5ForConditionalGeneration.from_pretrained(model_checkpoint).eval()

To configure the model to run on the IPU, we load the corresponding `IPUConfig` from the Hugging Face Hub. It specifies how to place the constituent layers of `mt5-large` across the IPUs:

In [None]:
from optimum.graphcore import IPUConfig
ipu_config = IPUConfig.from_pretrained("Graphcore/mt5-large-ipu", executable_cache_dir=executable_cache_dir)

Now we can configure the model to run on the IPU. First, we obtain the IPU pipelined variant of the model using the `to_pipelined` function from Optimum Graphcore. Before parallelizing the model for the IPU, we also set it to run in half precision:

In [None]:
from optimum.graphcore.modeling_utils import to_pipelined
ipu_model = to_pipelined(model, ipu_config=ipu_config).half().parallelize(for_generation=True)

## Preprocessing the data 

To use MT5 for the XNLI task, every input sequence needs to be of the form "xnli: premise: {example premise} hypothesis: {example hypothesis}". Below we create an example Spanish sequence with a set of candidate labels, build one premise-hypothesis pair per label, and define a function that puts each pair in the required format.

In [None]:
sequence_to_classify = "¿A quién vas a votar en 2020?"
candidate_labels = ["Europa", "salud pública", "política"]
hypothesis_template = "Este ejemplo es {}."

# construct sequence of premise, hypothesis pairs
pairs = [(sequence_to_classify, hypothesis_template.format(label)) for label in
        candidate_labels]

print(pairs)

def process_nli(premise: str, hypothesis: str):
    """ process to required xnli format with task prefix """
    return "".join(['xnli: premise: ', premise, ' hypothesis: ', hypothesis])

# format for mt5 xnli task
seqs = [process_nli(premise=premise, hypothesis=hypothesis) for
        premise, hypothesis in pairs]

seqs

The input sequences can now be tokenized for input to the model:

In [None]:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
inputs = tokenizer.batch_encode_plus(seqs, return_tensors="pt", padding=True)
inputs
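
Because the three formatted sequences differ in length, `padding=True` pads them to a common length and returns an attention mask marking the real tokens. The following is a minimal plain-Python sketch of that idea, not the tokenizer's implementation. The token ids, the `PAD_ID` value and the `pad_batch` helper are made up for illustration.

```python
# Hypothetical token ids for three sequences of different lengths.
# Real ids come from the MT5 tokenizer; these numbers are illustrative only.
PAD_ID = 0  # assumed pad token id for this sketch


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding that the model should ignore
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


toy_batch = pad_batch([[11, 12, 13, 1], [21, 22, 1], [31, 32, 33, 34, 1]])
for ids, mask in zip(toy_batch["input_ids"], toy_batch["attention_mask"]):
    print(ids, mask)
```

All rows end up with the same length, which is what allows them to be stacked into a single tensor when `return_tensors="pt"` is requested.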

## Zero-Shot text classification

Given the example inputs and model configured for the IPU, we can now perform text generation and obtain output logit scores that can be used for classification:

In [None]:
import torch

out = ipu_model.generate(**inputs, output_scores=True, return_dict_in_generate=True,
                     num_beams=1)

# sanity check that our sequences are the expected length (1 label token + start token + end token = 3)
for i, seq in enumerate(out.sequences):
    assert len(seq) == 3, f"generated sequence {i} not of expected length, 3. Actual length: {len(seq)}"

# get the scores for our only token of interest
# we'll now treat these like the output logits of a `*ForSequenceClassification` model
scores = out.scores[0].to(torch.float32)
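
The length check above relies on each generated sequence having the layout start token, label token, end token, so the only step whose scores matter is the first generated one (`out.scores[0]`). As a rough sketch of that layout in plain Python: the ids `0` and `1` are assumed here to be the decoder start and end-of-sequence ids, `333` and `444` are made-up label ids, and `label_token` is a hypothetical helper.

```python
# Illustrative ids only: 0 and 1 are assumed to be the decoder start and
# end-of-sequence ids; 333 and 444 are made-up ids standing in for label tokens.
DECODER_START_ID = 0
EOS_ID = 1


def label_token(generated):
    """Return the single label token from a [start, label, end] sequence."""
    if len(generated) != 3:
        raise ValueError(f"expected 3 tokens, got {len(generated)}")
    start, label, end = generated
    if start != DECODER_START_ID or end != EOS_ID:
        raise ValueError("sequence is not wrapped in start/end tokens")
    return label


toy_sequences = [[0, 333, 1], [0, 444, 1]]
toy_labels = [label_token(seq) for seq in toy_sequences]
print(toy_labels)
```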

The output scores have shape (number of sequences, vocabulary size). However, for [Natural Language Inference](http://nlpprogress.com/english/natural_language_inference.html) we are only interested in the logits of the three label tokens, which stand for entailment, neutral and contradiction. In the code below these are the tokens `▁0`, `▁1` and `▁2` respectively. We subset the obtained scores to include only these tokens.

In [None]:
ENTAILS_LABEL = "▁0"
NEUTRAL_LABEL = "▁1"
CONTRADICTS_LABEL = "▁2"

label_inds = tokenizer.convert_tokens_to_ids(
    [ENTAILS_LABEL, NEUTRAL_LABEL, CONTRADICTS_LABEL])

# scores has a size of the model's vocab.
# However, for this task we have a fixed set of labels
# sanity check that these labels are always the top 3 scoring
for i, sequence_scores in enumerate(scores):
    top_scores = sequence_scores.argsort()[-3:]
    assert set(top_scores.tolist()) == set(
        label_inds
    ), f"top scoring tokens are not expected for this task. Expected: {label_inds}. Got: {top_scores.tolist()}."

# new indices of entailment and contradiction in scores
entailment_ind = 0
contradiction_ind = 2

# cut down scores to our task labels
scores = scores[:, label_inds]
scores
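
To make the indexing concrete, here is a small plain-Python sketch of the same two operations (the top-3 sanity check and the column subset) on a toy 6-token vocabulary. The logits, the label positions in `toy_label_inds` and the `top_k_indices` helper are all made up for illustration. The real cell does this with tensor operations on the full vocabulary.

```python
# Toy logits for 2 sequences over a 6-token vocabulary (made-up numbers).
toy_scores = [
    [-4.0, 2.5, -1.0, 0.3, -3.2, 1.1],
    [-2.0, 0.1, -5.0, 3.0, -1.5, 0.9],
]
# Hypothetical vocabulary positions of the entailment, neutral and
# contradiction tokens, in that order.
toy_label_inds = [1, 5, 3]


def top_k_indices(row, k):
    """Indices of the k largest values in row (ascending by value)."""
    return sorted(range(len(row)), key=lambda i: row[i])[-k:]


# Sanity check: the three label tokens are the three highest-scoring tokens.
for row in toy_scores:
    assert set(top_k_indices(row, 3)) == set(toy_label_inds)

# Keep only the label columns. After this step column 0 is entailment,
# column 1 is neutral and column 2 is contradiction.
toy_subset = [[row[i] for i in toy_label_inds] for row in toy_scores]
print(toy_subset)
```

Because the columns are selected in the order of `toy_label_inds`, the positions in the subset no longer refer to vocabulary ids, which is why the notebook defines `entailment_ind = 0` and `contradiction_ind = 2` for the reduced tensor.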

We can use the logits to show, for each input sequence, a binary classification view of entailment versus contradiction. Alternatively, a multinomial representation can be shown for the single premise across all of its hypotheses:

In [None]:
# we can show, per item, the entailment vs contradiction probabilities
entail_vs_contra_scores = scores[:, [entailment_ind, contradiction_ind]]
entail_vs_contra_probabilities = torch.nn.functional.softmax(entail_vs_contra_scores, dim=1)
for seq, binary_score in zip(seqs, entail_vs_contra_probabilities.tolist()):
    print(seq, binary_score)

# or we can show probabilities similar to `ZeroShotClassificationPipeline`
# this gives a zero-shot classification style output across labels
entail_scores = scores[:, entailment_ind]
entail_probabilities = torch.nn.functional.softmax(entail_scores, dim=0)

print(sequence_to_classify, dict(zip(candidate_labels, entail_probabilities.tolist())))
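
The two views differ only in which values are normalised together: the binary view applies softmax within each row (entailment vs contradiction for one label), while the multinomial view applies softmax down the entailment column (one premise, all labels). A minimal plain-Python sketch with made-up logits, assuming the same [entailment, neutral, contradiction] column order as above:

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up [entailment, neutral, contradiction] logits, one row per candidate label.
toy_logits = {
    "Europa": [-1.0, 0.5, 2.0],
    "salud pública": [0.0, 0.5, 1.0],
    "política": [3.0, 0.5, -2.0],
}

# Binary view: normalise entailment against contradiction within each row.
binary = {label: softmax([row[0], row[2]]) for label, row in toy_logits.items()}

# Multinomial view: normalise the entailment logits across all labels.
across = dict(zip(toy_logits, softmax([row[0] for row in toy_logits.values()])))

for label in toy_logits:
    print(label, binary[label], across[label])
```

In the binary view each label is judged independently, so several labels can have a high entailment probability at once. In the multinomial view the probabilities sum to one across labels, so they behave like a ranking of the candidates.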

The cell below detaches the MT5 model from the device, allowing you to use available IPUs for other workloads.

In [None]:
ipu_model.detachFromDevice()

## Conclusion and next steps

We have demonstrated how to use MT5 to perform zero-shot text classification on the IPU with a small number of modifications. In this notebook we used a fine-tuned checkpoint available on the Hugging Face Hub. If you would like to see how to fine-tune MT5, take a look at our Machine Translation on IPUs using MT5 - Fine-tuning `mt5_translation.ipynb` notebook. For all available notebooks, check [IPU-powered Jupyter Notebooks](https://www.graphcore.ai/ipu-jupyter-notebooks) to see how IPUs perform on other tasks.


Have a question? Please contact us on our [Graphcore community channel](https://www.graphcore.ai/join-community).