---
title: "mlrMBO: A brief introduction"
vignette: >
  %\VignetteIndexEntry{Quick introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE, cache = FALSE}
library(mlrMBO)
library(rgenoud)
set.seed(123)
knitr::opts_chunk$set(cache = TRUE, collapse = FALSE)
knitr::knit_hooks$set(document = function(x){
  gsub("```\n*```r*\n*", "", x)
})
```

# Overview

The main goal of `mlrMBO` is to optimize *expensive black-box functions* by *model-based optimization* (aka Bayesian optimization) and to provide a unified interface for different optimization tasks and algorithmic MBO variants.
Supported are, among other things:

- Efficient global optimization (EGO) of problems with a numerical domain and Kriging as surrogate
- Arbitrary regression models from [mlr](https://github.com/mlr-org/mlr/) as surrogates
- Built-in parallelization using multi-point proposals
- Mixed-space optimization with categorical and subordinate parameters, for parameter configuration and tuning
- Multi-criteria optimization

This vignette gives a brief overview of the features of `mlrMBO`.
More detailed documentation can be found at <https://mlrmbo.mlr-org.com/>.

# Quickstart

## Prerequisites

Installing `mlrMBO` will also install and load the dependencies `mlr`, `ParamHelpers`, and `smoof`.
For this tutorial, you also need the additional packages `DiceKriging` and `randomForest`.

```{r load_package}
library(mlrMBO)
```

## General MBO workflow

1. Define **objective function** and its parameters using the package `smoof`.
2. Generate **initial design** (optional).
3. Define `mlr` learner for **surrogate model** (optional).
4. Set up an **MBO control** object.
5. Start the optimization with `mbo()`.

As a simple example we minimize a cosine-like function with an initial design of 5 points and 10 sequential MBO iterations.
Thus, the optimizer is allowed 15 evaluations of the objective function in total to approximate the optimum.

### Objective Function

Instead of manually defining the objective, we use the [*smoof*](https://cran.r-project.org/package=smoof) package, which offers many toy and benchmark functions for optimization.

```{r cosine_fun}
obj.fun = makeCosineMixtureFunction(1)
obj.fun = convertToMinimization(obj.fun)
print(obj.fun)
ggplot2::autoplot(obj.fun)
```

You are not limited to these test functions but can define arbitrary objective functions with *smoof*.

```{r smoof_custom_objective}
makeSingleObjectiveFunction(
  name = "my_sphere",
  fn = function(x) {
    sum(x*x) + 7
  },
  par.set = makeParamSet(
    makeNumericVectorParam("x", len = 2L, lower = -5, upper = 5)
  ),
  minimize = TRUE
)
```

Check `?smoof::makeSingleObjectiveFunction` for further details.

### Initial Design

Before MBO can really start, it needs a set of already evaluated points - the *initial design* - as we first have to fit a regression model that can propose new points for evaluation.
If no design is given (i.e. `design = NULL`), `mbo()` will use a *Maximin Latin Hypercube* `lhs::maximinLHS()` design with `n = 4 * getNumberOfParameters(obj.fun)` points.
If the design does not include function outcomes, `mbo()` will evaluate the design first before starting with the MBO algorithm.
In this example we generate our own design.

```{r}
des = generateDesign(n = 5, par.set = getParamSet(obj.fun), fun = lhs::randomLHS)
```

We will also precalculate the results:

```{r}
des$y = apply(des, 1, obj.fun)
```

_Note:_ *mlrMBO* uses `y` as a default name for the outcome of the objective function.
This can be changed in the control object.
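For example, a minimal sketch of renaming the outcome column via the `y.name` argument of `makeMBOControl()` (the name `"cost"` is purely illustrative):

```{r y_name_sketch, eval = FALSE}
# Sketch: use "cost" instead of the default "y" as outcome name.
ctrl = makeMBOControl(y.name = "cost")
# The design then needs a matching outcome column:
des.cost = generateDesign(n = 5, par.set = getParamSet(obj.fun), fun = lhs::randomLHS)
des.cost$cost = apply(des.cost, 1, obj.fun)
```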
### Surrogate Model

We decide to use Kriging as our surrogate model because it has proven to be quite effective for numerical domains.
The surrogate must be specified as an mlr regression learner:

```{r}
surr.km = makeLearner("regr.km", predict.type = "se", covtype = "matern3_2", control = list(trace = FALSE))
```

_Note:_ If no surrogate learner is defined, `mbo()` automatically uses Kriging for a numerical domain and *random forest regression* otherwise.

### MBOControl

The `MBOControl` object allows customization of the optimization run and the algorithmic behavior of MBO.
It is created with `makeMBOControl()` and can be modified with the following setter functions:

* `setMBOControlTermination()`: It is obligatory to define a termination criterion, such as the number of MBO iterations.
* `setMBOControlInfill()`: It is recommended to set the infill criterion. For learners that support `predict.type = "se"`, the Confidence Bound `"cb"` and the Expected Improvement `"ei"` are good choices.
* `setMBOControlMultiPoint()`: Needed in case you want to evaluate more than one point per MBO iteration; this is especially useful for parallelization.
* `setMBOControlMultiObj()`: Needed in case you want to optimize a multi-objective target function.

```{r cosine_setup}
control = makeMBOControl()
control = setMBOControlTermination(control, iters = 10)
control = setMBOControlInfill(control, crit = makeMBOInfillCritEI())
```

### Start the optimization

Finally, we start the optimization process and print the result object.
It contains the best found solution and its corresponding objective value.

```{r cosine_run, results='hold'}
run = mbo(obj.fun, design = des, learner = surr.km, control = control, show.info = TRUE)
print(run)
```

## Visualization

For more insights into the MBO process, we can also start the previous optimization with the function `exampleRun()` instead of `mbo()`.
This augments the result of `mbo()` with additional information for plotting.
Here, we plot the optimization state at iterations 1, 2, and 10.

```{r cosine_examplerun, results="hide"}
run = exampleRun(obj.fun, learner = surr.km, control = control, show.info = FALSE)
```

```{r cosine_plot_examplerun, warning=FALSE}
print(run)
plotExampleRun(run, iters = c(1L, 2L, 10L), pause = FALSE)
```
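Beyond the plots, the result object of a plain `mbo()` run also records every evaluated point in its `opt.path` slot. The following sketch (not evaluated here) shows how the result of the optimization above could be inspected further:

```{r inspect_result_sketch, eval = FALSE}
# Sketch: re-run the plain MBO optimization and inspect its result.
res = mbo(obj.fun, design = des, learner = surr.km, control = control, show.info = FALSE)
res$x  # best found parameter setting
res$y  # corresponding objective value
# All evaluated points with their outcomes and per-iteration diagnostics:
head(as.data.frame(res$opt.path))
```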