# Managing IPU Resources from Notebooks

Because of the way IPUs and notebooks execute code, experimenting with different models
can leave hardware locked while it sits idle, preventing other users from using it.
Your own experiments can also fail because too few IPUs are available.
Releasing hardware is particularly important in notebooks because the long lifetime of the
underlying `ipython` kernel can keep a lock on IPUs long after you have finished interacting
with the hardware.

The Graphcore frameworks follow a one-model-per-device architecture (1 model = 1 IPU device):
each model attaches to specific IPUs and only releases them when
that model goes out of scope or when the resources are explicitly released.

In this notebook you will learn how to:

- Monitor how many IPUs your notebook is currently using
- Release IPUs by detaching a model
- Reattach a model to IPUs to continue using it after a period of inactivity

For more information on the basics of the IPU computational architecture, refer to
the [IPU Programmer's Guide](https://docs.graphcore.ai/projects/ipu-programmers-guide/en/latest/ipu_introduction.html).

## Environment setup

The best way to run this demo is on Paperspace Gradient's cloud IPUs because everything is already set up for you.

To run the demo using other IPU hardware, you need to have the Poplar SDK enabled with PopTorch installed. Refer to the [Getting Started guide](https://docs.graphcore.ai/en/latest/getting-started.html#getting-started) for your system for details on how to enable the Poplar SDK and install PopTorch. Also refer to the [Jupyter Quick Start guide](https://docs.graphcore.ai/projects/jupyter-notebook-quick-start/en/latest/index.html) for how to set up Jupyter to be able to run this notebook on a remote IPU machine.

In order to improve usability and support for future users, Graphcore would like to collect information about the
applications and code being run in this notebook. The following information will be anonymised before being sent to Graphcore:

- User progression through the notebook
- Notebook details: number of cells, code being run and the output of the cells
- Environment details

You can disable logging at any time by running `%unload_ext graphcore_cloud_tools.notebook_logging.gc_logger` from any cell.

## Dependencies and configuration

Install the dependencies for this notebook.

In [None]:
%pip install "optimum-graphcore==0.7"
%pip install "graphcore-cloud-tools[logger]@git+https://github.com/graphcore/graphcore-cloud-tools"
%load_ext graphcore_cloud_tools.notebook_logging.gc_logger

In [None]:
import os

## Monitoring resources

Graphcore provides the `gc-monitor` utility for inspecting the number of available IPUs and their usage:

In [None]:
!gc-monitor

In a notebook, we can run this Bash command using `!` in a regular code cell. It provides detailed information on the IPUs that exist in the current partition.
The first section of the output is the `card-info`: generic information about the IP addresses and serial numbers of all the cards visible to the process.
The second section shows usage information for each IPU, including the user, host and PID of the process attached to it.

When monitoring IPUs, it can be useful to run `gc-monitor` without displaying the static IPU information:

In [None]:
!gc-monitor --no-card-info

Finally, we can write a command that will monitor only the IPUs which are attached from this specific notebook. We do that by only displaying the IPUs attached to a specific PID:

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

Since we've not attached to any IPUs yet, there is no output.
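
The same filter can also be written in Python, which avoids relying on shell interpolation. The following is a minimal sketch, not part of the Graphcore tooling: `lines_for_pid` and `ipu_usage_for_this_process` are hypothetical helper names, and the only assumption about `gc-monitor --no-card-info` is the one the `grep` above already makes, namely that the PID of an attached process appears on the relevant output lines.

```python
import os
import re
import shutil
import subprocess
from typing import List


def lines_for_pid(monitor_output: str, pid: int) -> List[str]:
    """Return the lines of `monitor_output` that contain `pid` as a whole number."""
    # (?<!\d) and (?!\d) stop 123 from matching inside 41235.
    pattern = re.compile(rf"(?<!\d){pid}(?!\d)")
    return [line for line in monitor_output.splitlines() if pattern.search(line)]


def ipu_usage_for_this_process() -> List[str]:
    """Run gc-monitor and keep only the lines that mention this process."""
    if shutil.which("gc-monitor") is None:
        return []  # gc-monitor is not on PATH, so there is nothing to report
    result = subprocess.run(
        ["gc-monitor", "--no-card-info"],
        capture_output=True,
        text=True,
        check=False,
    )
    return lines_for_pid(result.stdout, os.getpid())


print("\n".join(ipu_usage_for_this_process()))
```

Matching the PID as a whole number is a small design choice: a plain substring search could also match a longer PID that happens to contain the same digits.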

Beyond `gc-monitor`, Graphcore also provides a library for monitoring usage called `gcipuinfo` which can be used in Python. This library is not covered in this tutorial but [examples are available in the `gcipuinfo` documentation](https://docs.graphcore.ai/projects/gcipuinfo/en/latest/examples.html).

### Creating models

Now let's create some models and attach them to IPUs. The simplest way to create a small model is using the inference `pipeline` provided by the `optimum-graphcore` library.

In [None]:
from optimum.graphcore import pipelines
sentiment_pipeline = pipelines.pipeline("sentiment-analysis")
sentiment_pipeline(["IPUs are great!", "Notebooks are easy to program in"])

Now let's check how many IPUs are in use:

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

These IPUs will be associated with the model in the pipeline until:

- The `sentiment_pipeline` object goes out of scope or
- The model is explicitly detached from the IPU.

By remaining attached to the IPUs, the model can respond quickly to new prompts:

In [None]:
%%timeit
sentiment_pipeline(["IPUs are fast once the pipeline is attached", "and Notebooks are easy to program in"])

If you are testing different models you might have multiple pipelines using IPUs:

In [None]:
sentiment_pipeline_2 = pipelines.pipeline("text-classification")
sentiment_pipeline_2(["IPUs are great!", "Notebooks are easy to program in"])

Checking the IPU usage we can see that we are now using four IPUs:

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

## Managing resources

The `gc-monitor` output shows that we are using four IPUs, two per active pipeline. While it may make sense to keep both pipelines active if we are testing both at the same time, we may need to free up resources to continue experimenting with more models.

To do that we can call the `detachFromDevice` method on the model:

In [None]:
sentiment_pipeline.model.detachFromDevice()

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

This method has freed up the IPU resources while keeping the pipeline object available, meaning that we can quickly reattach the same pipeline to an IPU simply by calling it:

In [None]:
simple_test_data = ["I love you.", "I hate you!"]

In [None]:
%%time
sentiment_pipeline(simple_test_data)

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

The first call after reattaching is slow because the model has to be loaded onto the accelerator, but subsequent calls are fast:

In [None]:
%%time
sentiment_pipeline(simple_test_data)
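
The detach and reattach behaviour observed above can be mimicked with a plain Python stand-in. The sketch below is a toy model of the pattern, not the `optimum-graphcore` implementation: `LazyDevicePipeline` and its methods are hypothetical names, and the "inference" is a placeholder. It shows why only the first call after a detach pays the attachment cost.

```python
class LazyDevicePipeline:
    """Toy stand-in: attaches on the first call and stays attached until detached."""

    def __init__(self):
        self.attached = False
        self.attach_count = 0

    def _attach(self):
        # On real hardware this would be the slow step (loading the model).
        self.attached = True
        self.attach_count += 1

    def detach(self):
        # Release the device but keep the object, so it can be called again later.
        self.attached = False

    def __call__(self, texts):
        if not self.attached:
            self._attach()
        return [len(text) for text in texts]  # placeholder for real inference


pipe = LazyDevicePipeline()
pipe(["first call attaches"])
pipe(["second call reuses the attachment"])
pipe.detach()
pipe(["this call attaches again"])
print(pipe.attach_count)
```

Three calls trigger only two attachments: one for the very first call and one for the first call after `detach`.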

The other way to release resources is to let the object that the `sentiment_pipeline` Python variable refers to go out of scope.
There are two main ways to do that:

1. If you want to use the resources for another pipeline, you can assign another object to the same name:

In [None]:
sentiment_pipeline = sentiment_pipeline_2

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

2. Explicitly use `del` to delete the variables:

In [None]:
# Note that after the assignment sentiment_pipeline and sentiment_pipeline_2
# refer to the same object so both symbols must be deleted to release the resources
del sentiment_pipeline
del sentiment_pipeline_2

In [None]:
!gc-monitor --no-card-info | grep {os.getpid()}

As expected, no IPUs are used by the process anymore.
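
The reason both names had to be deleted can be reproduced without any hardware. The sketch below uses a hypothetical `Resource` class as a stand-in for an object holding IPUs and `weakref.finalize` from the standard library to record when the object is actually destroyed. It assumes CPython, where an object is destroyed as soon as its last reference disappears; the `gc.collect()` calls are only there as a precaution for other interpreters.

```python
import gc
import weakref


class Resource:
    """Toy stand-in for an object that holds IPUs while it is alive."""


released = []

first = Resource()
# Record a message when the object is destroyed, not when a name is deleted.
weakref.finalize(first, released.append, "released")

second = first  # a second name for the same object

del first
gc.collect()
after_first_del = list(released)   # the object is still alive through `second`

del second
gc.collect()
after_second_del = list(released)  # the last reference is gone

print(after_first_del, after_second_del)
```

Deleting one name only removes one reference; the object, and anything it holds, is released once no name refers to it any more.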

Alternatively, all IPUs will be released when the notebook kernel is restarted. This can be done from the Notebook graphical user interface by clicking on `Kernel > Restart`:

![Restart ipykernel](images/restart_kernel.png)


## Conclusion

In this simple tutorial we saw how to manage IPU resources from a notebook to make sure that we do not try to use more IPUs than are available on a single system.

Refer to the guide [Using IPUs from Jupyter Notebooks](https://github.com/graphcore/examples/tree/master/tutorials/tutorials/standard_tools/using_jupyter) for more information on using IPUs and the Poplar SDK through Jupyter notebooks.