---
title: "A quick tour of RCSL"
author: "Qinglin Mei"
date: "`r Sys.Date()`"
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{RCSL package manual}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
  
```{r knitr-options, echo=FALSE, message=FALSE, warning=FALSE, include = FALSE}
knitr::opts_chunk$set(tidy = FALSE,
                      cache = FALSE,
                      dev = "png",
                      message = FALSE, error = FALSE, warning = TRUE)
```

# Introduction
`RCSL` is an R toolkit for single-cell clustering and trajectory analysis using single-cell RNA-seq data.

# Installation
### Install RCSL package and other requirements
`RCSL` can be installed directly from GitHub with 'devtools'.

```{r, eval=FALSE}
library(devtools)
devtools::install_github("QinglinMei/RCSL")
```

Now we can load `RCSL`. We also load the `SingleCellExperiment`, `ggplot2` and `igraph`  package.
```{r, results="hide"}
library(RCSL)
library(SingleCellExperiment)
library(ggplot2)
library(igraph)
library(umap)
```

# Run RCSL
## Load dataset (yan)
We illustrate the usage of RCSL on a human preimplantation embryos and embryonic stem cells(*Yan et al., (2013)*). The yan data is distributed together with the RCSL package, with 90 cells and 20,214 genes:
```{r}
data(yan, package = "RCSL")
head(ann)
yan[1:3, 1:3]
origData <- yan
label <- ann$cell_type1
```

## 1. Pre-processing
In practice, we find it always beneficial to pre-process single-cell RNA-seq datasets, including:
1. Log transformation.
2. Gene filter

```{r, cache=TRUE}
data <- log2(as.matrix(origData) + 1)
gfData <- GenesFilter(data)
```

## 2. Calculate the initial similarity matrix S
```{r, cache=TRUE}
resSimS <- SimS(gfData)
```

## 3. Estimate the number of clusters C
```{r, cache=TRUE}
Estimated_C <- EstClusters(resSimS$drData,resSimS$S)
```

## 4. Calculate the block diagonal matrix B
```{r, cache=TRUE}
resBDSM <- BDSM(resSimS$S, Estimated_C)
```

# Calculate accuracy of the clustering
```{r, cache=TRUE}
ARI_RCSL <- igraph::compare(resBDSM$y, label, method = "adjusted.rand")
```

# Trajectory analysis to time-series datasets
```{r, cache=TRUE}
DataName <- "Yan"
res_TrajecAnalysis <- TrajectoryAnalysis(gfData, resSimS$drData, resSimS$S,
                                         clustRes = resBDSM$y, TrueLabel = label, 
                                         startPoint = 1, dataName = DataName)
```										 

# Display the constructed MST 
```{r, cache=TRUE}
res_TrajecAnalysis$MSTPlot
```										 

# Display the plot of the pseudo-temporal ordering 
```{r, cache=TRUE}
res_TrajecAnalysis$PseudoTimePlot
```	

# Display the plot of the inferred developmental trajectory
```{r, cache=TRUE}
res_TrajecAnalysis$TrajectoryPlot
```