---
title: "cytofkit: Quick Start"
output:
  BiocStyle::html_document:
    toc: true
---
<!--
%% \VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{Quick Start}
-->


# Install and Load the Package


To install **cytofkit** package, start R and run the following codes on R console:

```{r, eval=FALSE}
source("https://bioconductor.org/biocLite.R")
biocLite("cytofkit")
```

`Notes`: **cytofkit** GUI is dependent on XQuartz windowing system (X Windows) on Mac (OS X > 10.7). Install XQuartz from http://xquartz.macosforge.org.

* Download the disk image (dmg) file for [XQuartz](https://dl.bintray.com/xquartz/downloads/XQuartz-2.7.9.dmg).
* Double-clicking on the file to start the installation, you can do it safely by clicking through all the defaults.
* After the installation, you'll have to log out and back on to your Mac OS X account.

Load the Package:

```{r, message=FALSE}
library("cytofkit") 
```

Read the package description:

```{r, eval=FALSE}
?"cytofkit-package"
```


# Options for Using **cytofkit** Package

**cytofkit** provides three ways to employ the workforce of this package:

## Run with GUI 

The easiest way to use **cytofkit** package is through the GUI. The GUI provides all main options of **cytofkit** on a visual interface. To launch the GUI, load the package and type the following command:

```{r, eval=FALSE}
cytofkit_GUI()  
```

The interface will appear like below, you can click the information button **!** to check the explanation for each entry and customize your own analysis.

![cytofkit GUI](cytofkit_GUI.png)

Start your analysis as simply as following:   

- Choose the input fcs files from the directory where you store FCS data; 
- Select the markers from the auto-generated marker list; 
- Choose the directory where to save your output; 
- Give a project name as a prefix for the names of result files; 
- Select a data merging method if you have multiple FCS files;
- Select your clustering method(s), visualization method(s), progression estimation method(s)

Then submit it, that's all. 

Depends on the size of your data, it will take some time to run the analysis. Once done, a window will pop up, showing you the path where the results have been stored, and asking you if open the shiny web APP. If YES, the shiny APP will be deployed locally and opened in your default web browser. Among the saved results, a special R data object with suffix of **_.RData_** is for loading the results into the shiny APP. Choose the **_.RData_** file on the shiny APP then submit it, your journey of exploring the reulsts starts.


## Run with the Core Function 

**Cytofkit** provides a core function `cytofkit()` to drive the analysis pipeline of mass cytometry data. Users only need to define several key parameters to make their analysis automatically. One simple example of running cytofkit using the core function is like this:

```{r, eval=FALSE}
set.seed(100)
dir <- system.file('extdata',package='cytofkit')
file <- list.files(dir ,pattern='.fcs$', full=TRUE)
parameters <- list.files(dir, pattern='.txt$', full=TRUE)
res <- cytofkit(fcsFiles = file, 
                markers = parameters, 
                projectName = 'cytofkit_test',
                transformMethod = "cytofAsinh", 
                mergeMethod = "all",
                dimReductionMethod = "tsne",
                clusterMethods = c("Rphenograph", "ClusterX"),    ## accept multiple methods
                visualizationMethods = c("tsne", "pca"),          ## accept multiple methods
                progressionMethod = "isomap",
                clusterSampleSize = 500,
                resultDir = getwd(),
                saveResults = TRUE, 
                saveObject = TRUE, 
                saveToFCS = TRUE)
```

You can customize the parameters for your own need, run `?cytofkit` to get information of all the parameters for `cytofkit`. As running with GUI, once the analysis is done, the results will be saved under `resultDir` automatically.
      

## Run with Commands (Step-by-Step) 

You can make use of the functions exported from cytofkit to make your analysis more flexible and fit your own need. Here we use a sample data for demo:

### Pre-processing

```{r}
## Loading the FCS data:  
dir <- system.file('extdata',package='cytofkit')
file <- list.files(dir ,pattern='.fcs$', full=TRUE)
paraFile <- list.files(dir, pattern='.txt$', full=TRUE)
parameters <- as.character(read.table(paraFile, header = TRUE)[,1])

## File name
file

## parameters
parameters
```

```{r}
## Extract the expression matrix with transformation
data_transformed <- cytof_exprsExtract(fcsFile = file, markers = parameters, 
                                       comp = FALSE, 
                                       transformMethod = "cytofAsinh")
## If analysing flow cytoetry data, you can set comp to TRUE or 
## provide a transformation matrix to apply compensation

## If you have multiple FCS files, expression can be extracted and combined
combined_data_transformed <- cytof_exprsMerge(fcsFiles = file, comp=FALSE,
                                              markers = parameters, 
                                              transformMethod = "cytofAsinh",
                                              mergeMethod = "all")
## change mergeMethod to apply different combination strategy

## Take a look at the extracted expression matrix
head(data_transformed[ ,1:3])
```

### Cell Subset Detection

```{r, message=FALSE, }
## use clustering algorithm to detect cell subsets
## to speed up our test here, we only use 1000 cells
data_transformed_1k <- data_transformed[1:1000, ]

## run PhenoGraph
cluster_PhenoGraph <- cytof_cluster(xdata = data_transformed_1k, method = "Rphenograph")

## run ClusterX
data_transformed_1k_tsne <- cytof_dimReduction(data=data_transformed_1k, method = "tsne")
cluster_ClusterX <- cytof_cluster(ydata = data_transformed_1k_tsne,  method="ClusterX")
```

```{r, eval=FALSE}
## run DensVM (takes long time, we skip here)
cluster_DensVM <- cytof_cluster(xdata = data_transformed_1k, 
                                ydata = data_transformed_1k_tsne, method = "DensVM")
```

```{r, message=FALSE}
## run FlowSOM with cluster number 15
cluster_FlowSOM <- cytof_cluster(xdata = data_transformed_1k, method = "FlowSOM", FlowSOM_k = 12)

## combine data
data_1k_all <- cbind(data_transformed_1k, data_transformed_1k_tsne, 
                     PhenoGraph = cluster_PhenoGraph, ClusterX=cluster_ClusterX, 
                     FlowSOM=cluster_FlowSOM)
data_1k_all <- as.data.frame(data_1k_all)
```


### Cell Subset Visualization and Interpretation 

```{r, message=FALSE}
## PhenoGraph plot on tsne
cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", 
                  cluster="PhenoGraph", sampleLabel = FALSE)

## PhenoGraph cluster heatmap
PhenoGraph_cluster_median <- aggregate(. ~ PhenoGraph, data = data_1k_all, median)
cytof_heatmap(PhenoGraph_cluster_median[, 2:37], baseName = "PhenoGraph Cluster Median")
```

```{r}
## ClusterX plot on tsne
cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", cluster="ClusterX", sampleLabel = FALSE)

## ClusterX cluster heatmap
ClusterX_cluster_median <- aggregate(. ~ ClusterX, data = data_1k_all, median)
cytof_heatmap(ClusterX_cluster_median[, 2:37], baseName = "ClusterX Cluster Median")
```


```{r}
## FlowSOM plot on tsne
cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", 
                  cluster="FlowSOM", sampleLabel = FALSE)

## FlowSOM cluster heatmap
FlowSOM_cluster_median <- aggregate(. ~ FlowSOM, data = data_1k_all, median)
cytof_heatmap(FlowSOM_cluster_median[, 2:37], baseName = "FlowSOM Cluster Median")
```

### Inference of Subset Progression

```{r, message=FALSE}
## Inference of PhenoGraph cluster relatedness
PhenoGraph_progression <- cytof_progression(data = data_transformed_1k, 
                                            cluster = cluster_PhenoGraph, 
                                            method="isomap", clusterSampleSize = 50, 
                                            sampleSeed = 5)
p_d <- data.frame(PhenoGraph_progression$sampleData, 
                  PhenoGraph_progression$progressionData, 
                  cluster = PhenoGraph_progression$sampleCluster, 
                  check.names = FALSE)

## cluster relatedness plot
cytof_clusterPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", 
                  cluster="cluster", sampleLabel = FALSE)

## marker expression profile
markers <- c("(Sm150)Di<GranzymeB>", "(Yb173)Di<Perforin>")

cytof_colorPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[1])
cytof_colorPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[2])

cytof_progressionPlot(data=p_d, markers=markers, orderCol="isomap_1", clusterCol = "cluster")
```


```{r, message=FALSE}
## Inference of ClusterX cluster relatedness
ClusterX_progression <- cytof_progression(data = data_transformed_1k, 
                                          cluster = cluster_ClusterX, 
                                          method="isomap", 
                                          clusterSampleSize = 30, 
                                          sampleSeed = 3)
c_d <- data.frame(ClusterX_progression$sampleData, 
                  ClusterX_progression$progressionData,
                  cluster=ClusterX_progression$sampleCluster, 
                  check.names = FALSE)

## cluster relatedness plot
cytof_clusterPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", 
                  cluster="cluster", sampleLabel = FALSE)

## marker expression profile
markers <- c("(Sm150)Di<GranzymeB>", "(Yb173)Di<Perforin>")

cytof_colorPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[1])
cytof_colorPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[2])

cytof_progressionPlot(data=c_d, markers, orderCol="isomap_1", clusterCol = "cluster")
```


### Post-processing

```{r, eval=FALSE}
## save analysis results to FCS file
cytof_addToFCS(data_1k_all, rawFCSdir=dir, analyzedFCSdir="analysed_FCS", 
               transformed_cols = c("tsne_1", "tsne_2"), 
               cluster_cols = c("PhenoGraph", "ClusterX", "FlowSOM"))
```

 
# Session Information

```{r}
sessionInfo()
```