---
title: "mlrMBO: A brief introduction"
vignette: >
  %\VignetteIndexEntry{Quick introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE, cache = FALSE}
library(mlrMBO)
library(rgenoud)
set.seed(123)
knitr::opts_chunk$set(cache = TRUE, collapse = FALSE)
knitr::knit_hooks$set(document = function(x){
  gsub("```\n*```r*\n*", "", x)
})
```

# Overview

The main goal of `mlrMBO` is to optimize *expensive black-box functions* by *model-based optimization* (aka Bayesian optimization) and to provide a unified interface for different optimization tasks and algorithmic MBO variants.
Supported are, among other things:

- Efficient global optimization (EGO) of problems with a numerical domain and Kriging as surrogate
- Arbitrary regression models from [mlr](https://github.com/mlr-org/mlr/) as surrogates
- Built-in parallelization using multi-point proposals
- Mixed-space optimization with categorical and subordinate parameters, for parameter configuration and tuning
- Multi-criteria optimization

This vignette gives a brief overview of the features of `mlrMBO`.
More detailed documentation can be found at <https://mlrmbo.mlr-org.com/>.

# Quickstart

## Prerequisites

Installing `mlrMBO` will also install and load the dependencies `mlr`, `ParamHelpers`, and `smoof`.
For this tutorial, you also need the additional packages `DiceKriging` and `randomForest`.

```{r load_package}
library(mlrMBO)
```

## General MBO workflow

1. Define **objective function** and its parameters using the package `smoof`.
2. Generate **initial design** (optional).
3. Define `mlr` learner for **surrogate model** (optional).
4. Set up an **MBO control** object.
5. Start the optimization with `mbo()`.

As a simple example we minimize a cosine-like function with an initial design of 5 points and 10 sequential MBO iterations.
Thus, the optimizer is allowed 15 evaluations of the objective function in total to approximate the optimum.

### Objective Function

Instead of manually defining the objective, we use the [*smoof*](https://cran.r-project.org/package=smoof) package, which offers many toy and benchmark functions for optimization.

```{r cosine_fun}
obj.fun = makeCosineMixtureFunction(1)
obj.fun = convertToMinimization(obj.fun)
print(obj.fun)
ggplot2::autoplot(obj.fun)
```

You are not limited to these test functions but can define arbitrary objective functions with *smoof*.

```{r smoof_custom_objective}
makeSingleObjectiveFunction(
  name = "my_sphere",
  fn = function(x) {
    sum(x*x) + 7
  },
  par.set = makeParamSet(
    makeNumericVectorParam("x", len = 2L, lower = -5, upper = 5)
  ),
  minimize = TRUE
)
```

Check `?smoof::makeSingleObjectiveFunction` for further details.

### Initial Design

Before MBO can really start, it needs a set of already evaluated points - the *initial design* - as we first have to fit a regression model that can propose new points for evaluation.
If no design is given (i.e. `design = NULL`), `mbo()` will use a *Maximin Latin Hypercube* `lhs::maximinLHS()` design with `n = 4 * getNumberOfParameters(obj.fun)` points.
If the design does not include function outcomes, `mbo()` will evaluate the design first before starting with the MBO algorithm.
In this example we generate our own design.

```{r}
des = generateDesign(n = 5, par.set = getParamSet(obj.fun), fun = lhs::randomLHS)
```

We will also precalculate the results:

```{r}
des$y = apply(des, 1, obj.fun)
```

_Note:_ *mlrMBO* uses `y` as a default name for the outcome of the objective function.
This can be changed in the control object.
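For example, a minimal sketch of renaming the outcome column via the `y.name` argument of `makeMBOControl()` (the name `"cost"` is purely illustrative):

```{r y_name_sketch, eval = FALSE}
# Sketch: use "cost" instead of the default "y" as outcome name.
ctrl = makeMBOControl(y.name = "cost")
# The design then needs a matching outcome column:
des.cost = generateDesign(n = 5, par.set = getParamSet(obj.fun), fun = lhs::randomLHS)
des.cost$cost = apply(des.cost, 1, obj.fun)
```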
### Surrogate Model

We decide to use Kriging as our surrogate model because it has proven to be quite effective for numerical domains.
The surrogate must be specified as an mlr regression learner:

```{r}
surr.km = makeLearner("regr.km", predict.type = "se", covtype = "matern3_2", control = list(trace = FALSE))
```

_Note:_ If no surrogate learner is defined, `mbo()` automatically uses Kriging for a numerical domain and *random forest regression* otherwise.

### MBOControl

The `MBOControl` object allows customization of the optimization run and the algorithmic behavior of MBO.
It is created with `makeMBOControl()` and can be modified with the following setter functions:

* `setMBOControlTermination()`: It is obligatory to define a termination criterion, such as the number of MBO iterations.
* `setMBOControlInfill()`: It is recommended to set the infill criterion. For learners that support `predict.type = "se"`, the Confidence Bound `"cb"` and the Expected Improvement `"ei"` are good choices.
* `setMBOControlMultiPoint()`: Needed in case you want to evaluate more than one point per MBO iteration; this is especially useful for parallelization.
* `setMBOControlMultiObj()`: Needed in case you want to optimize a multi-objective target function.

```{r cosine_setup}
control = makeMBOControl()
control = setMBOControlTermination(control, iters = 10)
control = setMBOControlInfill(control, crit = makeMBOInfillCritEI())
```

### Start the optimization

Finally, we start the optimization process and print the result object.
It contains the best found solution and its corresponding objective value.

```{r cosine_run, results='hold'}
run = mbo(obj.fun, design = des, learner = surr.km, control = control, show.info = TRUE)
print(run)
```

## Visualization

For more insights into the MBO process, we can also start the previous optimization with the function `exampleRun()` instead of `mbo()`.
This augments the result of `mbo()` with additional information for plotting.
Here, we plot the optimization state at iterations 1, 2, and 10.

```{r cosine_examplerun, results="hide"}
run = exampleRun(obj.fun, learner = surr.km, control = control, show.info = FALSE)
```

```{r cosine_plot_examplerun, warning=FALSE}
print(run)
plotExampleRun(run, iters = c(1L, 2L, 10L), pause = FALSE)
```
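Beyond the plots, the result object of a plain `mbo()` run also records every evaluated point in its `opt.path` slot. The following sketch (not evaluated here) shows how the result of the optimization above could be inspected further:

```{r inspect_result_sketch, eval = FALSE}
# Sketch: re-run the plain MBO optimization and inspect its result.
res = mbo(obj.fun, design = des, learner = surr.km, control = control, show.info = FALSE)
res$x  # best found parameter setting
res$y  # corresponding objective value
# All evaluated points with their outcomes and per-iteration diagnostics:
head(as.data.frame(res$opt.path))
```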