# Tutorial notebook for modular `BoTorchModel` customization

NOTE: The functionality in this tutorial is still in its alpha stages.

Contents:
1. Overview of modular `BoTorchModel`
2. `BoTorchModel` instantiation
3. Use a custom BoTorch `AcquisitionFunction`
   1. [Path 1] Use the default Ax `Acquisition` class
   2. [Path 2] Create a custom Ax `Acquisition` subclass
   3. Set up storage for the new setup
4. Details of `BoTorchModel` Subcomponent Classes

# Overview of modular `BoTorchModel`

**`BoTorchModel` = `Surrogate` + `Acquisition`**

A `BoTorchModel` consists of two main subcomponents: a surrogate model and an acquisition function. A surrogate model is represented as an instance of [Ax’s `Surrogate` class](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/surrogate.py), which is a wrapper around [BoTorch's `Model` class](https://github.com/pytorch/botorch/blob/main/botorch/models/model.py). The acquisition function is represented as an instance of [Ax’s `Acquisition` class](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/acquisition.py), a wrapper around [BoTorch's `AcquisitionFunction` class](https://github.com/pytorch/botorch/blob/main/botorch/acquisition/acquisition.py). These two subcomponents are described in greater detail at the bottom of this tutorial.

**Core methods of `BoTorchModel`:** <br>
`fit` calls `Surrogate.fit` <br>
`predict` calls `Surrogate.predict` <br>
`gen` calls `Acquisition.optimize`
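The delegation above can be sketched with plain-Python stand-ins. The classes `ToySurrogate`, `ToyAcquisition`, and `ToyModel` below are hypothetical and are not part of Ax; they only mirror which subcomponent each core method forwards to.

```python
# Hypothetical stand-ins; none of these classes exist in Ax.
class ToySurrogate:
    def __init__(self):
        self.calls = []   # records which methods were invoked
        self.mean = None

    def fit(self, ys):
        self.calls.append("fit")
        self.mean = sum(ys) / len(ys)

    def predict(self, xs):
        self.calls.append("predict")
        # A constant-mean "model": every point gets the training mean.
        return [self.mean for _ in xs]


class ToyAcquisition:
    def __init__(self, surrogate):
        self.surrogate = surrogate

    def optimize(self, n):
        # A real acquisition would maximize an acquisition function;
        # this simply returns n placeholder candidates.
        return [float(i) for i in range(n)]


class ToyModel:
    """Mirrors the `Surrogate` + `Acquisition` composition."""

    def __init__(self, surrogate):
        self.surrogate = surrogate

    def fit(self, ys):
        self.surrogate.fit(ys)

    def predict(self, xs):
        return self.surrogate.predict(xs)

    def gen(self, n):
        acquisition = ToyAcquisition(self.surrogate)
        return acquisition.optimize(n)


model = ToyModel(ToySurrogate())
model.fit([1.0, 3.0])
predictions = model.predict([0.2, 0.8])
candidates = model.gen(n=2)
```

The point of the composition is that either subcomponent can be swapped without touching the other, which is what the rest of this tutorial relies on.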

# `BoTorchModel` instantiation

In [1]:
from ax.models.torch.botorch_modular.acquisition import Acquisition
from ax.models.torch.botorch_modular.kg import KnowledgeGradient
from ax.models.torch.botorch_modular.model import BoTorchModel
from ax.models.torch.botorch_modular.surrogate import Surrogate
from botorch.models.gp_regression import FixedNoiseGP, SingleTaskGP

# Explicit instantiation of `BoTorchModel`.
model = BoTorchModel(
    surrogate=Surrogate(FixedNoiseGP),
    acquisition_class=KnowledgeGradient,     # This is a subclass of `Acquisition`.
)

If `surrogate` and/or `acquisition_class` are not passed into the constructor, they will be auto-selected based on properties of the experiment, the search space, and the available data, using the logic [in this file](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/utils.py).

In [2]:
# The surrogate is not specified, so it will be auto-selected during `model.fit`.
model = BoTorchModel(
    acquisition_class=KnowledgeGradient
)

# The acquisition class is not specified, so it will be auto-selected during `model.gen`.
model = BoTorchModel(
    surrogate=Surrogate(FixedNoiseGP)
)

# Both the surrogate and acquisition class will be auto-selected.
model = BoTorchModel()
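As a rough picture of what "auto-selected" could mean, the function below picks a surrogate class name from one property of the data. The function `choose_surrogate_name` and its rule (use a fixed-noise model only when every observation has a known noise variance) are assumptions made for illustration; the actual selection logic lives in the linked `utils.py` and may use different criteria.

```python
import math
from typing import List, Optional


def choose_surrogate_name(
    yvars: List[float], explicit: Optional[str] = None
) -> str:
    """Hypothetical selection rule; not the Ax implementation."""
    # An explicitly passed choice always wins over auto-selection.
    if explicit is not None:
        return explicit
    # Assumed rule: known noise for every observation -> fixed-noise model.
    if yvars and all(not math.isnan(v) for v in yvars):
        return "FixedNoiseGP"
    # Otherwise fall back to a model that infers the noise level.
    return "SingleTaskGP"


known_noise = choose_surrogate_name([0.1, 0.2, 0.1])
unknown_noise = choose_surrogate_name([float("nan"), float("nan")])
user_choice = choose_surrogate_name([0.1], explicit="SingleTaskGP")
```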

To use expected improvement (`qExpectedImprovement`) or noisy expected improvement (`qNoisyExpectedImprovement`), initialize the `BoTorchModel` with the kwarg `botorch_acqf_class` instead of `acquisition_class`. By default, `acquisition_class` will be set to the base Ax `Acquisition` class.

In [3]:
from botorch.acquisition.monte_carlo import qExpectedImprovement
from botorch.acquisition.monte_carlo import qNoisyExpectedImprovement

EI_model = BoTorchModel(
    surrogate=Surrogate(FixedNoiseGP),
    botorch_acqf_class=qExpectedImprovement
)
NEI_model = BoTorchModel(
    surrogate=Surrogate(SingleTaskGP),
    botorch_acqf_class=qNoisyExpectedImprovement
)

# Use a custom BoTorch `AcquisitionFunction`

## Choose between the default Ax `Acquisition` class and creating a custom `Acquisition` subclass

In many cases, even when you want to use a custom BoTorch `AcquisitionFunction`, the default Ax `Acquisition` class may be enough **[Path 1]**. A custom Ax `Acquisition` subclass **[Path 2]** will be needed only if:

- a custom acquisition function optimization method is required
- a custom “model dependency” is required, where a “model dependency” is defined as any value that is computed based on the state or properties of the `Surrogate` model and needs to be passed into the constructor for the `AcquisitionFunction`

## [Path 1] Use the default Ax `Acquisition` class

Construct your model the same way the `qExpectedImprovement` and `qNoisyExpectedImprovement` models are constructed above, passing your custom BoTorch `AcquisitionFunction` class as `botorch_acqf_class`. Then, if you want to set up storage for the model, skip to the section after **[Path 2]**.

## [Path 2] Create a custom Ax `Acquisition` subclass

To start, here is the inheritance tree for the `KnowledgeGradient` and `MultiFidelityKnowledgeGradient` subclasses. 

`Acquisition` <br>
↳  `MultiFidelityAcquisition(Acquisition)` <br>
↳  `KnowledgeGradient(Acquisition)` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;↳  `MultiFidelityKnowledgeGradient(MultiFidelityAcquisition, KnowledgeGradient)` <br>
↳  `MyAcquisition(Acquisition)`    **← your new subclass**
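The diamond in this tree matters because Python resolves `super()` calls along the method resolution order (MRO). The sketch below uses hypothetical `Toy*` classes (not the Ax classes) with the same shape as the tree, to show the order in which cooperative `super()` calls run.

```python
# Hypothetical classes with the same inheritance shape as the tree above.
class ToyAcquisition:
    def describe(self):
        return ["base"]


class ToyMultiFidelity(ToyAcquisition):
    def describe(self):
        return ["multi-fidelity"] + super().describe()


class ToyKnowledgeGradient(ToyAcquisition):
    def describe(self):
        return ["knowledge-gradient"] + super().describe()


class ToyMFKG(ToyMultiFidelity, ToyKnowledgeGradient):
    pass


# The MRO lists the classes in the order `super()` will visit them.
mro_names = [cls.__name__ for cls in ToyMFKG.__mro__]
# Each class contributes once, and the shared base runs last.
chain = ToyMFKG().describe()
```

Here `mro_names` is `['ToyMFKG', 'ToyMultiFidelity', 'ToyKnowledgeGradient', 'ToyAcquisition', 'object']` and `chain` is `['multi-fidelity', 'knowledge-gradient', 'base']`, so a subclass that always calls `super()` composes cleanly with its siblings.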

The `Acquisition` class defines a default `optimize` function and `compute_model_dependencies` function. <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
**`optimize`**: makes a call to BoTorch's acquisition function optimizer with a specific set of kwargs specified by this function. <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
**`compute_model_dependencies`**: returns a dict of inputs to the BoTorch `AcquisitionFunction`.

Creating a custom `Acquisition` subclass involves overriding either (or both) of these two functions.

First, create the structure for your `Acquisition` subclass. Each `Acquisition` subclass must have a BoTorch `AcquisitionFunction` class associated with it. We will add **`optimize`** and **`compute_model_dependencies`** to this class.

In [4]:
# `qKnowledgeGradient` is being used as a placeholder here.

# from botorch.acquisition.my_acquisition import qMyAcquisition
from botorch.acquisition.knowledge_gradient import qKnowledgeGradient

class MyAcquisition(Acquisition):
    # default_botorch_acqf_class = qMyAcquisition
    default_botorch_acqf_class = qKnowledgeGradient


### [Path 2: Step 1] Override `Acquisition.optimize`

Typically, `Acquisition` subclasses call `super().optimize` and supply their own `optimizer_options` (if needed). These `optimizer_options` are then passed into [BoTorch's `optimize_acqf` optimizer](https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).

The following arguments are always passed into the optimizer:
- `bounds`
- `q`
- `inequality_constraints`
- `fixed_features`
- `post_processing_func`

Any kwargs that `optimize_acqf` takes in that are not part of this list can be set by `optimizer_options`.

**NOTE:** If the optimizer for the BoTorch `AcquisitionFunction` that you want to use does not require any kwargs other than those in the list, then you do not need to override `Acquisition.optimize`.
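One way to picture how `optimizer_options` combines with the always-passed arguments is a dictionary merge. The helper `build_optimizer_kwargs` below is hypothetical, and its choice to reject options that clash with the fixed arguments is an assumption made for this sketch, not documented Ax behavior.

```python
from typing import Any, Dict, Optional

# The arguments listed above as always passed into the optimizer.
FIXED_ARGUMENT_NAMES = (
    "bounds",
    "q",
    "inequality_constraints",
    "fixed_features",
    "post_processing_func",
)


def build_optimizer_kwargs(
    fixed: Dict[str, Any],
    optimizer_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: merge fixed arguments with extra options."""
    options = dict(optimizer_options or {})  # copy; never mutate the input
    clashes = sorted(set(options) & set(FIXED_ARGUMENT_NAMES))
    if clashes:
        # Assumed policy for this sketch: fixed arguments cannot be overridden.
        raise ValueError(f"Options override fixed arguments: {clashes}")
    return {**fixed, **options}


fixed = {
    "bounds": [(0.0, 1.0)],
    "q": 3,
    "inequality_constraints": None,
    "fixed_features": None,
    "post_processing_func": None,
}
kwargs = build_optimizer_kwargs(fixed, {"sequential": True, "num_restarts": 10})
```

The resulting `kwargs` holds the five fixed entries plus `sequential` and `num_restarts`, which is the shape of call the override in the next cell is meant to produce.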

As an example, for `MaxValueEntropySearch`, we want to use "sequential greedy" optimization of the acquisition function with a batch of `q > 1` candidates. So, we want `sequential=True` to be passed into `optimize_acqf`:

In [5]:
from typing import Any, Callable, Dict, List, Optional, Tuple
from ax.models.torch.botorch_modular.acquisition import Optimizer
from torch import Tensor

def optimize(
    self,
    bounds: Tensor,
    n: int,
    optimizer_class: Optional[Optimizer] = None,
    inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None,
    fixed_features: Optional[Dict[int, float]] = None,
    rounding_func: Optional[Callable[[Tensor], Tensor]] = None,
    optimizer_options: Optional[Dict[str, Any]] = None,
) -> Tuple[Tensor, Tensor]:
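    # Copy the options so the caller's dict is not mutated, then force
    # sequential greedy optimization.
    optimizer_options = {**(optimizer_options or {}), "sequential": True}
    # Forward every argument unchanged except the updated `optimizer_options`.
    return super().optimize(
        bounds=bounds,
        n=n,
        optimizer_class=optimizer_class,
        inequality_constraints=inequality_constraints,
        fixed_features=fixed_features,
        rounding_func=rounding_func,
        optimizer_options=optimizer_options,
    )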

And with that, we are done overriding `Acquisition.optimize`.

### [Path 2: Step 2] Override `Acquisition.compute_model_dependencies`

As with `Acquisition.optimize`, `Acquisition` subclasses call `super().compute_model_dependencies` and then add their own dependencies (if any) to the returned dictionary. This `model_deps` dictionary is then passed into the BoTorch `AcquisitionFunction` constructor as `**model_deps`.

**NOTE:** If the BoTorch `AcquisitionFunction` that you want to use does not require any special `__init__` arguments other than `model`, `objective`, `X_pending`, and `X_baseline`, then you do not need to override `Acquisition.compute_model_dependencies`.

As an example, for `MaxValueEntropySearch`, we must pass into `qMaxValueEntropy.__init__` a `candidate_set: Tensor` and we also want to specify `maximize: bool`. For the exact code, [see here](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/mes.py).

In [6]:
@classmethod
def compute_model_dependencies(
    cls,
    surrogate: Surrogate,
    bounds: List[Tuple[float, float]],
    objective_weights: Tensor,
    pending_observations: Optional[List[Tensor]] = None,
    outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None,
    linear_constraints: Optional[Tuple[Tensor, Tensor]] = None,
    fixed_features: Optional[Dict[int, float]] = None,
    target_fidelities: Optional[Dict[int, float]] = None,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:

    # Get the dependencies of the parent class.
    dependencies = super().compute_model_dependencies(...)

    # Calculate `candidate_set`.
    candidate_set = ...
    # Calculate `maximize`.
    maximize = ...

    # Update and return the model dependencies.
    dependencies.update(
        {"candidate_set": candidate_set, "maximize": maximize}
    )
    return dependencies

And with that, we are done overriding `Acquisition.compute_model_dependencies`.
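To see the whole round trip run end to end, here is a minimal sketch with stand-in classes. `ToyAcqf`, `ToyAcquisition`, and `ToyEntropyAcquisition` are hypothetical, and the default `candidate_set` values are placeholders; only the pattern (extend the parent's dict, then unpack it into the constructor) follows the text above.

```python
from typing import Any, Dict, List, Optional


class ToyAcqf:
    """Hypothetical stand-in for a BoTorch acquisition function."""

    def __init__(self, model: Any, candidate_set: List[float], maximize: bool = True):
        self.model = model
        self.candidate_set = candidate_set
        self.maximize = maximize


class ToyAcquisition:
    @classmethod
    def compute_model_dependencies(
        cls, surrogate: Any, options: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        # The base class contributes no extra constructor arguments.
        return {}


class ToyEntropyAcquisition(ToyAcquisition):
    @classmethod
    def compute_model_dependencies(
        cls, surrogate: Any, options: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        # Start from the parent's dependencies, then add our own.
        dependencies = super().compute_model_dependencies(
            surrogate=surrogate, options=options
        )
        options = options or {}
        dependencies.update(
            {
                # Placeholder default; a real subclass would compute this.
                "candidate_set": options.get("candidate_set", [0.0, 0.5, 1.0]),
                "maximize": options.get("maximize", True),
            }
        )
        return dependencies


model_deps = ToyEntropyAcquisition.compute_model_dependencies(
    surrogate="toy-model", options={"maximize": False}
)
# The dict is unpacked into the constructor, as with `**model_deps` above.
acqf = ToyAcqf(model="toy-model", **model_deps)
```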

### [Path 2: Step 3] Put it all together and try it out

In [7]:
# `qKnowledgeGradient` is being used as a placeholder here.

# from botorch.acquisition.my_acquisition import qMyAcquisition
from botorch.acquisition.knowledge_gradient import qKnowledgeGradient

class MyAcquisition(Acquisition):
    # default_botorch_acqf_class = qMyAcquisition
    default_botorch_acqf_class = qKnowledgeGradient
    
    def optimize(
        self,
        # other kwargs,
    ) -> Tuple[Tensor, Tensor]:
        ...
        return super().optimize(...)
    
    @classmethod
    def compute_model_dependencies(
        cls,
        # other kwargs,
    ) -> Dict[str, Any]:
        dependencies = super().compute_model_dependencies(...)
        ...
        return dependencies

Now, we can use the new custom `Acquisition` subclass in the same way as we do for `KnowledgeGradient` in the **`BoTorchModel` instantiation** section.

## Set up storage for the new setup

Optionally, to make the Ax models serializable (and so allow resumable optimization via JSON or SQL storage), register the new classes in the [modular BoTorch registry](https://github.com/facebook/Ax/blob/main/ax/storage/botorch_modular_registry.py):

1. If you created a new `Acquisition` subclass, add it to `ACQUISITION_REGISTRY`.
2. Add the corresponding BoTorch `AcquisitionFunction` class to `ACQUISITION_FUNCTION_REGISTRY`.

```
ACQUISITION_REGISTRY: Dict[Type[Acquisition], int] = {
    Acquisition: 0,
    KnowledgeGradient: 1,
    ...
    MyAcquisition: 5,            # Add this line.
}

ACQUISITION_FUNCTION_REGISTRY: Dict[Type[AcquisitionFunction], int] = {
    qExpectedImprovement: 0,
    qNoisyExpectedImprovement: 1,
    ...
    qMyAcquisition: 6,           # Add this line.
}
```
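The reason for registering a class under an integer is that storage can save the integer and later map it back to the class. The sketch below shows that round trip with hypothetical names (`TOY_REGISTRY`, `encode`, `decode`, and the `Toy*` classes); it is not the Ax storage code, and the uniqueness check is an assumption about what a safe registry needs.

```python
from typing import Dict, Type


# Hypothetical classes standing in for `Acquisition` and its subclasses.
class ToyAcquisition:
    pass


class ToyKnowledgeGradient(ToyAcquisition):
    pass


class MyToyAcquisition(ToyAcquisition):
    pass


TOY_REGISTRY: Dict[Type[ToyAcquisition], int] = {
    ToyAcquisition: 0,
    ToyKnowledgeGradient: 1,
    MyToyAcquisition: 2,  # the newly registered subclass
}

# Each class needs a unique index, or decoding would be ambiguous.
assert len(set(TOY_REGISTRY.values())) == len(TOY_REGISTRY)

REVERSE_TOY_REGISTRY = {index: cls for cls, index in TOY_REGISTRY.items()}


def encode(cls: Type[ToyAcquisition]) -> int:
    """Return the integer stored in place of the class."""
    if cls not in TOY_REGISTRY:
        raise ValueError(f"{cls.__name__} is not registered")
    return TOY_REGISTRY[cls]


def decode(index: int) -> Type[ToyAcquisition]:
    """Recover the class from its stored integer."""
    return REVERSE_TOY_REGISTRY[index]


stored = encode(MyToyAcquisition)
restored = decode(stored)
```

An unregistered class fails at `encode` time, which is why a custom subclass has to be added to the registry before a model that uses it can be saved.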

# Details of `BoTorchModel` Subcomponent Classes

## class `Surrogate`

Ax wrapper for BoTorch GP `Model` classes. Optionally, a GPyTorch `MarginalLogLikelihood` class can also be passed into the `Surrogate` constructor.

**Core methods:** `fit`, `predict`, `update`, `construct`, `best_in_sample_point`, `best_out_of_sample_point` <br>
These core methods are all called by `BoTorchModel` internally.

In [8]:
from botorch.models.gp_regression import FixedNoiseGP
from gpytorch.mlls.exact_marginal_log_likelihood import ExactMarginalLogLikelihood

surrogate = Surrogate(
    botorch_model_class=FixedNoiseGP,       # required kwarg
    mll_class=ExactMarginalLogLikelihood,   # optional kwarg
)

## class `Acquisition`

Base Ax class for BoTorch `AcquisitionFunction` classes.

**Core methods:** `optimize`, `compute_model_dependencies`