---
output: 
  rmarkdown::html_vignette:
    toc: true
    keep_md: true
vignette: >
  %\VignetteIndexEntry{Manual for the combi package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# combi package: vignette

\setcounter{tocdepth}{5}
\tableofcontents

# Introduction

This package implements a novel data integration model for sample-wise integration of different views. It accounts for compositionality and employs a non-parametric mean-variance trend for sequence count data. The fitted model can be plotted conveniently, allowing exploratory visualization of the variability shared across the different views.

<!-- # Publication -->

<!-- The underlying method of the combi package is described in detail in the following article:  -->

# Installation

The package can be installed and loaded using the following commands:

```{r load-packages, warning=FALSE, message=FALSE, echo=FALSE}
knitr::opts_chunk$set(cache = FALSE, autodep = TRUE, warning = FALSE, 
                      message = FALSE, echo = TRUE, eval = TRUE, 
                      tidy = TRUE, fig.width = 9, fig.height = 6, purl = TRUE, 
                      fig.show = "hold", cache.lazy = FALSE)
palStore = palette()
#Load all fits, to avoid refitting every time rebuilding the vignette
load(system.file("extdata", "zhangFits.RData", package = "combi"))
```

```{r install, eval = FALSE}
library(BiocManager)
BiocManager::install("combi", update = FALSE)
```

```{r installDevtools, eval = FALSE}
library(devtools)
install_github("CenterForStatistics-UGent/combi")
```

```{r loadcombipackage}
suppressPackageStartupMessages(library(combi))
cat("combi package version", 
    as.character(packageVersion("combi")), "\n")
```

```{r loadData}
data(Zhang)
```

## Unconstrained integration

For an unconstrained ordination, a named list of datasets with overlapping samples must be supplied. The datasets can currently be supplied as a raw data matrix (with features in the columns), or as a phyloseq, SummarizedExperiment or ExpressionSet object. In addition, the required distribution ("quasi" for quasi-likelihood fitting, "gaussian" for normal data) and the compositional nature (TRUE/FALSE) of each dataset should be supplied.

```{r unconstr}
microMetaboInt = combi(
 list("microbiome" = zhangMicrobio, "metabolomics" = zhangMetabo),
 distributions = c("quasi", "gaussian"), compositional = c(TRUE, FALSE),
 logTransformGaussian = FALSE)
```
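
The input format described above can be tried out on toy data before fitting. The following is a minimal sketch in base R only. The object names (`toyCounts`, `toyIntensities`, `toyViews`) and the simulated values are hypothetical and not part of the package. It builds two matrices with samples in the rows and features in the columns, combines them into a named list, and identifies the samples shared by both views. Whether and how `combi()` itself matches samples is not shown here.

```{r toyInputSketch, eval = FALSE}
set.seed(1)
# Hypothetical count view: 6 samples (rows) by 4 features (columns)
toyCounts = matrix(rpois(24, lambda = 10), nrow = 6,
    dimnames = list(paste0("sample", 1:6), paste0("taxon", 1:4)))
# Hypothetical continuous view: 5 samples (rows) by 3 features (columns)
toyIntensities = matrix(rnorm(15), nrow = 5,
    dimnames = list(paste0("sample", 2:6), paste0("metabolite", 1:3)))
# Named list of views, one name per dataset
toyViews = list("microbiome" = toyCounts, "metabolomics" = toyIntensities)
# Samples present in every view
sharedSamples = Reduce(intersect, lapply(toyViews, rownames))
sharedSamples
# Restrict every view to the shared samples, in the same order
toyViewsMatched = lapply(toyViews, function(x) x[sharedSamples, , drop = FALSE])
sapply(toyViewsMatched, dim)
```

In this toy example, the vectors passed as `distributions` and `compositional` would have one entry per list element, in the same order as the list.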

Basic information about the ordination can be printed:

```{r show}
microMetaboInt
```

A simple plot function is available for the result. To colour or shape the sample points by a sample variable, a data frame of sample variables should also be supplied:

```{r simplePlot}
plot(microMetaboInt)
```

```{r colourPlot}
plot(microMetaboInt, samDf = zhangMetavars, samCol = "ABX")
```

By default, only the most important features (furthest away from the origin) are shown. To show all features, one can resort to point cloud plots or density plots as follows:

```{r cloudPlot}
plot(microMetaboInt, samDf = zhangMetavars, samCol = "ABX", 
     featurePlot = "points")
```

```{r denPlot}
plot(microMetaboInt, samDf = zhangMetavars, samCol = "ABX", 
     featurePlot = "density")
```

The drawback is that now no feature labels are shown.
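
The default selection rule mentioned above, showing only the features furthest away from the origin, can be illustrated with a minimal sketch in base R. The coordinates below are hypothetical and are not taken from a combi fit, and the code is not the package's internal implementation. The variable `featNum` mirrors the name of the plotting argument used elsewhere in this vignette, on the assumption that it sets the number of features shown.

```{r featureDistanceSketch, eval = FALSE}
# Hypothetical two-dimensional feature coordinates (not from a combi fit)
toyFeatCoords = rbind(featA = c(0.1, -0.2), featB = c(1.5, 0.3),
    featC = c(-0.4, 2.0), featD = c(0.05, 0.02), featE = c(-1.2, -1.1))
colnames(toyFeatCoords) = c("Dim1", "Dim2")
# Euclidean distance of every feature to the origin
distToOrigin = sqrt(rowSums(toyFeatCoords^2))
# Keep the featNum features furthest away from the origin
featNum = 3
topFeatures = names(sort(distToOrigin, decreasing = TRUE))[seq_len(featNum)]
topFeatures
```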

## Adding projections

As an aid to the interpretation of compositional views, links between features can be plotted, and samples can be projected onto them, by providing names or approximate coordinates:

```{r projections}
# First define the plot and return the coordinates
mmPlot = plot(microMetaboInt, samDf = zhangMetavars, samCol = "ABX",
              returnCoords = TRUE, featNum = 10)
# Here the features are given by name and the samples by coordinates,
# but any combination is allowed
addLink(mmPlot, links = cbind("Staphylococcus_819c11", "OTU929ffc"),
        Views = 1, samples = c(0, 1))
```

## Coordinates

Finally, the coordinates can be extracted for use in third-party software:

```{r extractCoords}
coords = extractCoords(microMetaboInt, Dim = c(1,2))
```

## Constrained integration

For a constrained ordination, a data frame of sample variables should also be supplied:

```{r constr}
microMetaboIntConstr = combi(
     list("microbiome" = zhangMicrobio, "metabolomics" = zhangMetabo),
     distributions = c("quasi", "gaussian"), compositional = c(TRUE, FALSE),
     logTransformGaussian = FALSE, covariates = zhangMetavars)
```
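
Before supplying sample variables, it can help to check that they line up with the samples in the views. The following is a minimal sketch in base R on hypothetical toy objects (`toyView`, `toyCovariates`). It assumes that the rows of the data frame correspond to samples and are identified by row names; how `combi()` matches covariates to samples internally is not covered here.

```{r covariateCheckSketch, eval = FALSE}
set.seed(2)
toySamples = paste0("sample", 1:5)
# Hypothetical view: 5 samples (rows) by 2 features (columns)
toyView = matrix(rnorm(10), nrow = 5,
    dimnames = list(toySamples, c("feat1", "feat2")))
# Hypothetical sample variables, deliberately stored in reverse sample order
toyCovariates = data.frame(
    treatment = factor(c("A", "B", "A", "B", "A")),
    age = c(3, 5, 4, 6, 2),
    row.names = rev(toySamples))
# Do the view and the data frame describe the same set of samples?
setequal(rownames(toyView), rownames(toyCovariates))
# Reorder the sample variables so that their rows follow the view
toyCovariatesMatched = toyCovariates[rownames(toyView), , drop = FALSE]
identical(rownames(toyCovariatesMatched), rownames(toyView))
```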

As before, we can print a quick overview

```{r printConstr}
microMetaboIntConstr
```

and plot the ordination:

```{r colourPlotConstr}
plot(microMetaboIntConstr, samDf = zhangMetavars, samCol = "ABX")
```

## Diagnostics

Convergence of the iterative algorithm can be assessed as follows:

```{r convPlot}
convPlot(microMetaboInt)
```

The influence of the different views can be investigated as follows:

```{r inflPlot}
inflPlot(microMetaboInt, samples = 1:20, plotType = "boxplot")
```

# Session info

This vignette was generated with the following version of R:

```{r sessionInfo}
sessionInfo()
```