To illustrate the MEFISTO method in MOFA2 we simulate a small example
data set with 4 different views and one covariate defining a timeline
using make_example_data. The simulation is based on 4
factors, two of which vary smoothly along the covariate (with different
lengthscales) and two are independent of the covariate.
set.seed(2020)
# set number of samples and time points
N <- 200
time <- seq(0,1,length.out = N)
# generate example data
dd <- make_example_data(sample_cov = time, n_samples = N,
                        n_factors = 4, n_features = 200, n_views = 4,
                        lscales = c(0.5, 0.2, 0, 0))
# input data
data <- dd$data
# covariate matrix with samples in columns
time <- dd$sample_cov
rownames(time) <- "time"
Let’s have a look at the simulated latent temporal processes, which we want to recover:
df <- data.frame(dd$Z, t(time))
df <- gather(df, key = "factor", value = "value", starts_with("simulated_factor"))
ggplot(df, aes(x = time, y = value)) + geom_point() + facet_grid(~factor)
Using the MEFISTO framework is very similar to using MOFA2. In addition to the omics data, however, we now also specify the time point of each sample. If you are not familiar with the MOFA2 framework, it may help to look at the MOFA2 tutorials first.
To create the MOFA object we need to specify the training data and
the covariates for pattern detection and inference of smooth factors.
Here, sample_cov is a matrix with samples in columns and
one row containing the time points. The sample order must match the
order of the columns in data. Alternatively, a data frame can be provided
containing a column named sample whose entries match
the sample names in the data.
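Because a mismatch between covariate and data columns would assign time points to the wrong samples, it can be worth checking the order before building the model. The following is a minimal sketch in base R on toy matrices. The objects toy_data and toy_cov and the helper check_sample_order are made up for illustration and are not part of MOFA2.

```r
# Toy stand-ins for one view of `data` and for `sample_cov` (hypothetical values)
samples <- paste0("sample_", 1:5)
toy_data <- matrix(rnorm(3 * 5), nrow = 3,
                   dimnames = list(paste0("feature_", 1:3), samples))
toy_cov <- matrix(seq(0, 1, length.out = 5), nrow = 1,
                  dimnames = list("time", samples))

# Hypothetical helper: TRUE only if both matrices name the same samples
# in the same column order
check_sample_order <- function(data_mat, cov_mat) {
  identical(colnames(data_mat), colnames(cov_mat))
}

check_sample_order(toy_data, toy_cov)

# If the names agree but the order differs, reorder the covariate columns
# by the column names of the data
shuffled_cov <- toy_cov[, rev(samples), drop = FALSE]
aligned_cov <- shuffled_cov[, colnames(toy_data), drop = FALSE]
check_sample_order(toy_data, aligned_cov)
```

Indexing by column name rather than by position keeps the alignment correct even when the covariate was assembled from a differently sorted source.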
First, we start by creating a standard MOFA model.
## Creating MOFA object from a list of matrices (features as rows, sample as columns)...
Now we can add the temporal covariate that we want to use for training.
## Untrained MEFISTO model with the following characteristics:
## Number of views: 4
## Views names: view_1 view_2 view_3 view_4
## Number of features (per view): 200 200 200 200
## Number of groups: 1
## Groups names: group1
## Number of samples (per group): 200
## Number of covariates per sample: 1
##
We have now successfully created a MOFA object that contains 4 views, 1 group and 1 covariate giving the time point for each sample.
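The calls that produce the two outputs above are not shown. A plausible reconstruction is sketched below, assuming the create_mofa and set_covariates functions exported by MOFA2 and the same simulated data as at the beginning of this section.

```r
library(MOFA2)

# Simulate the example data as at the start of this section
set.seed(2020)
N <- 200
time <- seq(0, 1, length.out = N)
dd <- make_example_data(sample_cov = time, n_samples = N,
                        n_factors = 4, n_features = 200, n_views = 4,
                        lscales = c(0.5, 0.2, 0, 0))
time <- dd$sample_cov
rownames(time) <- "time"

# Standard MOFA object from the list of view matrices
sm <- create_mofa(data = dd$data)

# Attach the covariate (one row, samples in columns); printing the object
# should then report an untrained MEFISTO model with one covariate
sm <- set_covariates(sm, covariates = time)
sm
```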
Before training, we can specify various options for the model, the
training and the data preprocessing. If no options are specified, the
model will use the default options. See also
get_default_data_options,
get_default_model_options and
get_default_training_options to have a look at the defaults
and change them where required. For illustration, we only use a small
number of iterations.
Importantly, to activate the use of the covariate for a functional
decomposition (MEFISTO), we need to specify mefisto_options in addition
to the standard MOFA options. The default options
(get_default_mefisto_options) are sufficient unless
you want to make use of advanced options such as alignment across
groups.
data_opts <- get_default_data_options(sm)
model_opts <- get_default_model_options(sm)
model_opts$num_factors <- 4
train_opts <- get_default_training_options(sm)
train_opts$maxiter <- 100
mefisto_opts <- get_default_mefisto_options(sm)
sm <- prepare_mofa(sm, model_options = model_opts,
                   mefisto_options = mefisto_opts,
                   training_options = train_opts,
                   data_options = data_opts)
Now, the MOFA object is ready for training. Using
run_mofa we can fit the model, which is saved in the file
specified as outfile. If none is specified the output is
saved in a temporary location.
Using plot_variance_explained we can explore which
factor is active in which view. plot_factor_cor shows us
whether the factors are correlated.
The MOFA model has learnt a scale parameter for each factor. These scales lie between 0 and 1 and indicate how smoothly each factor varies along the covariate (here time). A scale of 0 means that the factor captures variation independent of time, whereas a value close to 1 means that the factor varies very smoothly along time.
## Factor1 Factor2 Factor3 Factor4
## 0.00000000 0.01082415 0.99875934 0.99602044
In this example, we find two factors that are non-smooth and two
smooth factors. Using plot_factors_vs_cov we can plot the
factors along the timeline and distinguish smooth from
non-smooth variation along time.
For more customized plots, we can extract the underlying data containing the factor and covariate values for each sample.
## sample covariate value.covariate factor value.factor group color_by
## 1 sample_1 time 0.00000000 Factor1 -4.7768636 group1 0.00000000
## 2 sample_1 time 0.00000000 Factor2 0.5450544 group1 0.00000000
## 3 sample_1 time 0.00000000 Factor4 -1.4026482 group1 0.00000000
## 4 sample_1 time 0.00000000 Factor3 -2.5425510 group1 0.00000000
## 5 sample_10 time 0.04522613 Factor4 -1.4980567 group1 0.04522613
## 6 sample_10 time 0.04522613 Factor2 0.1091706 group1 0.04522613
## shape_by
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
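The calls behind the training and inspection steps above are not shown. The following end-to-end sketch reconstructs them from the functions named in the text, together with get_scales from MOFA2 for the scale parameters. The output path, the use of basilisk for the Python backend and the choice of color_by = "time" are assumptions, and exact numbers will depend on the installed versions.

```r
library(MOFA2)

# Re-create the untrained MEFISTO object from the simulated data
set.seed(2020)
N <- 200
dd <- make_example_data(sample_cov = seq(0, 1, length.out = N), n_samples = N,
                        n_factors = 4, n_features = 200, n_views = 4,
                        lscales = c(0.5, 0.2, 0, 0))
time <- dd$sample_cov
rownames(time) <- "time"
sm <- create_mofa(data = dd$data)
sm <- set_covariates(sm, covariates = time)

# Options as in the text: 4 factors, few iterations, default MEFISTO options
model_opts <- get_default_model_options(sm)
model_opts$num_factors <- 4
train_opts <- get_default_training_options(sm)
train_opts$maxiter <- 100
sm <- prepare_mofa(sm, model_options = model_opts,
                   mefisto_options = get_default_mefisto_options(sm),
                   training_options = train_opts,
                   data_options = get_default_data_options(sm))

# Assumption: write the model to a temporary file and let basilisk provide
# the Python backend; use_basilisk = FALSE if mofapy2 is already configured
outfile <- file.path(tempdir(), "model.hdf5")
sm <- run_mofa(sm, outfile = outfile, use_basilisk = TRUE)

plot_variance_explained(sm)   # which factor is active in which view
plot_factor_cor(sm)           # correlation between the factors
get_scales(sm)                # one scale per factor, between 0 and 1

# Factors along the timeline, first as a plot, then as a data frame
plot_factors_vs_cov(sm, color_by = "time")
df <- plot_factors_vs_cov(sm, color_by = "time", return_data = TRUE)
head(df)
```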
We can compare the above plots to the factors that were simulated above and find that the model recovered the two smooth as well as the two non-smooth patterns in time. Note that factors are only determined up to their sign: e.g. Factor 4 is the negative of the simulated factor, but we can simply multiply a factor and its weights by -1 to obtain exactly the simulated factor.
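The sign ambiguity arises because only the product of factors and weights enters the reconstruction of the data. A minimal sketch in base R, using small random matrices Z (samples by factors) and W (features by factors) as hypothetical stand-ins for the fitted factors and weights of one view:

```r
set.seed(1)
Z <- matrix(rnorm(6 * 2), nrow = 6)   # 6 samples, 2 factors (toy values)
W <- matrix(rnorm(4 * 2), nrow = 4)   # 4 features, 2 factors (toy values)

# Flip the sign of the second factor and of its weights
Z_flipped <- Z
W_flipped <- W
Z_flipped[, 2] <- -Z[, 2]
W_flipped[, 2] <- -W[, 2]

# The reconstruction Z %*% t(W) is the same before and after the flip
all.equal(Z %*% t(W), Z_flipped %*% t(W_flipped))
```

Flipping only the factor, without its weights, would change the reconstruction, which is why both have to be multiplied by -1 together.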
As with standard MOFA, we can now look deeper into the meaning of these factors by exploring the weights or performing feature set enrichment analysis.
In addition, we can take a look at the top feature values per factor along time and see that their patterns are in line with the pattern of the corresponding Factor (here Factor 3).
Furthermore, we can interpolate or extrapolate a factor to new
values. Here, we only show the mean of the prediction. To obtain
uncertainties, you need to specify the new values before training in
get_default_mefisto_options(sm)$new_values.
sm <- interpolate_factors(sm, new_values = seq(0,1.1,0.01))
plot_interpolation_vs_covariate(sm, covariate = "time",
                                factors = "Factor3")
## R version 4.6.0 (2026-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] pheatmap_1.0.13 lubridate_1.9.5 forcats_1.0.1
## [4] stringr_1.6.0 dplyr_1.2.1 purrr_1.2.2
## [7] readr_2.2.0 tidyr_1.3.2 tibble_3.3.1
## [10] tidyverse_2.0.0 data.table_1.18.2.1 MOFA2_1.22.0
## [13] ggplot2_4.0.3 BiocStyle_2.40.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 farver_2.1.2 filelock_1.0.3
## [4] S7_0.2.2 fastmap_1.2.0 GGally_2.4.0
## [7] digest_0.6.39 timechange_0.4.0 lifecycle_1.0.5
## [10] magrittr_2.0.5 compiler_4.6.0 rlang_1.2.0
## [13] sass_0.4.10 tools_4.6.0 yaml_2.3.12
## [16] corrplot_0.95 knitr_1.51 ggsignif_0.6.4
## [19] S4Arrays_1.12.0 labeling_0.4.3 reticulate_1.46.0
## [22] DelayedArray_0.38.0 plyr_1.8.9 RColorBrewer_1.1-3
## [25] abind_1.4-8 HDF5Array_1.40.0 Rtsne_0.17
## [28] withr_3.0.2 BiocGenerics_0.58.0 sys_3.4.3
## [31] grid_4.6.0 stats4_4.6.0 ggpubr_0.6.3
## [34] Rhdf5lib_1.32.0 scales_1.4.0 mvtnorm_1.3-7
## [37] cli_3.6.6 rmarkdown_2.31 generics_0.1.4
## [40] otel_0.2.0 RSpectra_0.16-2 tzdb_0.5.0
## [43] reshape2_1.4.5 cachem_1.1.0 rhdf5_2.54.1
## [46] splines_4.6.0 parallel_4.6.0 BiocManager_1.30.27
## [49] XVector_0.52.0 matrixStats_1.5.0 basilisk_1.24.0
## [52] vctrs_0.7.3 Matrix_1.7-5 jsonlite_2.0.0
## [55] carData_3.0-6 dir.expiry_1.20.0 car_3.1-5
## [58] hms_1.1.4 IRanges_2.46.0 S4Vectors_0.50.0
## [61] ggrepel_0.9.8 rstatix_0.7.3 Formula_1.2-5
## [64] maketools_1.3.2 h5mread_1.4.0 jquerylib_0.1.4
## [67] glue_1.8.1 codetools_0.2-20 ggstats_0.13.0
## [70] cowplot_1.2.0 uwot_0.2.4 RcppAnnoy_0.0.23
## [73] stringi_1.8.7 gtable_0.3.6 pillar_1.11.1
## [76] rappdirs_0.3.4 htmltools_0.5.9 rhdf5filters_1.22.0
## [79] R6_2.6.1 evaluate_1.0.5 lattice_0.22-9
## [82] png_0.1-9 backports_1.5.1 broom_1.0.12
## [85] bslib_0.10.0 Rcpp_1.1.1-1.1 nlme_3.1-169
## [88] SparseArray_1.12.0 mgcv_1.9-4 xfun_0.57
## [91] MatrixGenerics_1.24.0 buildtools_1.0.0 pkgconfig_2.0.3