---
title: "A simple introduction to the ImmuneSpaceR package"
date: "`r Sys.Date()`"
output: html_document
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{An introduction to the ImmuneSpaceR package}
---
```{r knitr, echo=FALSE, cache = FALSE}
library(knitr)
library(rmarkdown)
opts_chunk$set(cache = FALSE)
```
```{r netrc_req, echo = FALSE}
# This chunk is only useful for BioConductor checks and shouldn't affect any other setup
ISR_login <- Sys.getenv("ISR_login")
ISR_pwd <- Sys.getenv("ISR_pwd")
if(ISR_login != "" & ISR_pwd != ""){
netrc_file <- tempfile("ImmuneSpaceR_tmp_netrc")
netrc_string <- paste("machine www.immunespace.org login", ISR_login, "password", ISR_pwd)
write(x = netrc_string, file = netrc_file)
labkey.netrc.file <- netrc_file
}
```
This package provides a *thin* wrapper around `Rlabkey` and connects to the **ImmuneSpace** database, making it easier to fetch datasets, including gene expression data, hai, and so forth, from specific studies.
## Contents
1. [Configuration](#configuration)
2. [Connections](#connections)
3. [Datasets](#datasets)
4. [Gene expression](#ge)
5. [Plots](#quickplot)
6. [Cross study connections](#crossstudy)
7. [sessionInfo](#sessioninfo)
## Configuration
In order to connect to ImmuneSpace, you will need a `.netrc` file in your home
directory that will contain a `machine` name (hostname of ImmuneSpace), and
`login` and `password`. See [here](https://www.labkey.org/wiki/home/Documentation/page.view?name=netrc) for more information.
A netrc file may look like this:
```
machine www.immunespace.org
login myuser@domain.com
password supersecretpassword
```
### Set up your netrc file now
Put it in your home directory.
If you type:
```
ls ~/.netrc
```
at the command prompt, you should see it there. If it's not there, create one
now. Make sure you have a valid login and password. If you don't have one, go to
[ImmuneSpace](http://www.immunespace.org) now and set yourself up with an
account.
[Back to top](#contents)
## Instantiate a connection
We'll be looking at study `SDY269`. If you want to use a different study, change
that string. The connections have state, so you can instantiate multiple
connections to different studies simultaneously.
```{r CreateConnection, cache=FALSE, message=FALSE}
library(ImmuneSpaceR)
sdy269 <- CreateConnection(study = "SDY269")
sdy269
```
The call to `CreateConnection` instantiates the connection Printing the object
shows where it's connected, to what study, and the available data sets and gene
expression matrices.
Note that when a script is running on ImmuneSpace, some variables set in the
global environments will automatically indicate which study should be used and
the `study` argument can be skipped.
[Back to top](#contents)
## Fetching data sets
We can grab any of the datasets listed in the connection.
```{r getDataset}
sdy269$getDataset("hai")
```
The *sdy269* object is an **R5** class, so it behaves like a true object.
Functions (like `getDataset`) are members of the object, thus the `$` semantics
to access member functions.
The first time you retrieve a data set, it will contact the database. The data
is cached locally, so the next time you call `getDataset` on the same dataset,
it will retrieve the cached local copy. This is much faster.
To get only a subset of the data and speed up the download, filters can be
passed to `getDataset`. The filters are created using the `makeFilter` function
of the `Rlabkey` package.
```{r getDataset-filter, message = FALSE}
library(Rlabkey)
myFilter <- makeFilter(c("gender", "EQUAL", "Female"))
hai <- sdy269$getDataset("hai", colFilter = myFilter)
```
See `?makeFilter` for more information on the syntax.
For more information about `getDataset`'s options, refer to the dedicated vignette.
[Back to top](#contents)
## Fetching expression matrices
We can also grab a gene expression matrix
```{r getGEMatrix}
sdy269$getGEMatrix("LAIV_2008")
```
The object contacts the DB and downloads the matrix file. This is stored and
cached locally as a `data.table`. The next time you access it, it will be much
faster since it won't need to contact the database again.
It is also possible to call this function using multiple matrix names. In this
case, all the matrices are downloaded and combined into a single `ExpressionSet`.
```{r getGEMatrix-multiple}
sdy269$getGEMatrix(c("TIV_2008", "LAIV_2008"))
```
Finally, the summary argument will let you download the matrix with gene symbols
in place of priobe ids.
```{r getGEMatrix-summary}
gs <- sdy269$getGEMatrix("TIV_2008", summary = TRUE)
```
If the connection was created with `verbose = TRUE`, some functions will display
additional informations such as the valid dataset names.
[Back to top](#contents)
## Quick plots
A quick plot of a data set can be generated using the `quick_plot` function.
`quick_plot` automatically chooses the type of plot depending on the selected
dataset.
```{r, dev='png', fig.width=15}
sdy269$quick_plot("hai")
sdy269$quick_plot("elisa")
```
However, the `type` argument can be used to manually select from "boxplot",
"heatmap", "violin" and "line".
[Back to top](#contents)
## Cross study connections
To fetch data from multiple studies, simply create a connection at the project level.
```{r, cross-connection}
con <- CreateConnection("")
```
This will instantiate a connection at the `Studies` level. Most functions work
cross study connections just like they do on single studies.
You can get a list of datasets and gene expression matrices available accross
all studies.
```{r, cross-connection-print}
con
```
In cross-study connections, `getDataset` and `getGEMatrix` will combine the
requested datasets or expression matrices. See the dedicated vignettes for more
information.
Likewise, `quick_plot` will plot accross studies. Note that in most cases the
datasets will have too many cohorts/subjects, making the filtering of the data
a necessity. The `colFilter` argument can be used here, as described in the
`getDataset` section.
```{r cross-connection-qplot, dev='png', fig.align="center"}
plotFilter <- makeFilter(c("cohort", "IN", "TIV 2010;TIV Group 2008"))
con$quick_plot("elispot", filter = plotFilter)
```
The figure above shows the ELISPOT results for two different years of TIV
vaccine cohorts from two different studies.
[Back to top](#contents)
## sessionInfo()
```{r sessionInfo}
sessionInfo()
```