In this document, I aim to show a typical analysis of a spectral cytometry file, including the construction of the spectral decomposition matrix, the actual decomposition, correction of the resulting file (as there are generally minor differences between the single-stained controls and the fully stained sample), and finally conversion of the resulting flowFrame or flowSet to a data frame that can be used for any downstream application. Note: This package depends heavily on flowCore, and much of its functionality works as an extension of the basic flowCore functionality.
This is how to install the package, if that has not already been done:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("flowSpecs")
The dataset used in this vignette, which is also the general example dataset of the package, is a PBMC sample stained with 12 fluorochrome-conjugated antibodies against a wide range of leukocyte antigens. Also included is a set of single-stained controls, which serve the same purpose in spectral cytometry as they do in conventional flow cytometry. The files were generated on a 44 channel Cytek Aurora® instrument on 2018-10-25.
library(flowSpecs)
library(flowCore)
data("unmixCtrls")
unmixCtrls
## A flowSet with 15 experiments.
##
## column names(47): Time SSC-H ... R9-A R10-A
data('fullPanel')
fullPanel[,seq(4,7)]
## flowFrame object 'fullPanel.fcs'
## with 8000 cells and 4 observables:
## name desc range minRange maxRange
## $P4 V1-A NA 4194304 -111 4194303
## $P5 V2-A NA 4194304 -111 4194303
## $P6 V3-A NA 4194304 -111 4194303
## $P7 V4-A NA 4194304 -111 4194303
## 416 keywords are stored in the 'description' slot
As can be noted, flowSpecs adheres to flowCore standards, and thus uses flowFrames and flowSets as input to all user functions.
To construct the spectral decomposition matrix, we need the single-stained unmixing controls. As the fluorescent sources can be of different kinds, such as antibodies, fluorescent proteins, or dead cell markers, the specMatCalc function accepts any number of such groups. However, the files within each group need to share a common part of their names. If this was not the case during acquisition, the names of the fcs files can always be changed afterwards. To check the names, run the sampleNames function from flowCore:
sampleNames(unmixCtrls)
## [1] "Beads_AF647_IgM.fcs" "Beads_AF700_CD4.fcs" "Beads_APCCy7_CD19.fcs"
## [4] "Beads_BV605_CD14.fcs" "Beads_BV650_CD56.fcs" "Beads_BV711_CD11c.fcs"
## [7] "Beads_BV785_CD8a.fcs" "Beads_FITC_CD41b.fcs" "Beads_PB_CD3.fcs"
## [10] "Beads_PE_X.fcs" "Beads_PECy7_CD45RA.fcs" "Beads_unstained.fcs"
## [13] "Dead_PO_DCM.fcs" "Dead_unstained.fcs" "PBMC_unstained.fcs"
This shows that we have three groups of samples: “Beads”, “Dead” and “PBMC”. The first two groups define the fluorochromes from the antibodies and the dead cell marker (which is Pacific Orange-NHS in this case). The last one, “PBMC”, will be used for autofluorescence correction. The autofluorescence control should always be of the same sample type as the samples that will be analyzed downstream, since autofluorescence differs between sample types. With this knowledge about the groups of samples, we can now create the matrix:
specMat <- specMatCalc(unmixCtrls, groupNames = c("Beads_", "Dead_"),
                       autoFluoName = "PBMC_unstained.fcs")
str(specMat)
## num [1:13, 1:42] 0.000058 0.001032 0.000649 0.029998 0.052618 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:13] "AF647_IgM" "AF700_CD4" "APCCy7_CD19" "BV605_CD14" ...
## ..$ : chr [1:42] "V1-A" "V2-A" "V3-A" "V4-A" ...
Here we can see that a matrix has been created with the original fluorescence detector names as column names and the new fluorochrome/marker names as row names. The function does a lot of preprocessing, including automatic gating of the most dominant population, to ensure the best possible resolution and consistency in the determination of the matrix.
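To make the structure of such a matrix more concrete, here is a minimal base R sketch on simulated data. It is not the specMatCalc implementation: the file names, the helper `simCtrl`, the median-minus-unstained calculation and the scaling of each spectrum to a maximum of 1 are all assumptions made for illustration, and no gating is performed.

```r
# Toy illustration in base R only; this is NOT how specMatCalc works internally.
set.seed(1)
detectors <- c("V1-A", "V2-A", "V3-A", "V4-A")

# Hypothetical control files that share the group prefix "Beads_".
ctrlNames <- c("Beads_PB_CD3.fcs", "Beads_FITC_CD41b.fcs", "Beads_unstained.fcs")

# Strip the group prefix and the file extension to obtain the marker names.
markerNames <- sub("\\.fcs$", "", sub("^Beads_", "", ctrlNames))

# Invented emission signatures, one row per control.
trueSig <- rbind(PB_CD3     = c(1000, 400, 100, 20),
                 FITC_CD41b = c(10, 50, 600, 1200),
                 unstained  = c(0, 0, 0, 0))

# Simulate 500 events per control: background noise plus the signature.
simCtrl <- function(sig) {
  bg <- matrix(rnorm(500 * 4, mean = 50, sd = 5), ncol = 4)
  sweep(bg, 2, sig, "+")
}
ctrls <- lapply(seq_len(nrow(trueSig)), function(i) simCtrl(trueSig[i, ]))
names(ctrls) <- markerNames

# Spectrum = median of the stained control minus median of the unstained
# control, scaled so the brightest detector equals 1 (assumed normalisation).
colMedians <- function(m) apply(m, 2, median)
negMed <- colMedians(ctrls[["unstained"]])
stained <- setdiff(markerNames, "unstained")
toySpecMat <- t(sapply(stained, function(n) {
  s <- colMedians(ctrls[[n]]) - negMed
  s / max(s)
}))
colnames(toySpecMat) <- detectors
round(toySpecMat, 2)
```

The result has one row per fluorochrome/marker and one column per detector, which is the same layout as the specMat shown above.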
Now it is time to apply the newly constructed specMat to the fully stained sample. This is done in the following way:
fullPanelUnmix <- specUnmix(fullPanel, specMat)
fullPanelUnmix
## flowFrame object 'fullPanel.fcs'
## with 8000 cells and 18 observables:
## name desc range minRange maxRange
## $P1 Time Time 16777216 0 16777215
## $P2 SSC-H SSC-H 4194304 0 4194303
## $P3 SSC-A SSC-A 4194304 0 4194303
## $P4 FSC-H FSC-H 4194304 0 4194303
## $P5 FSC-A FSC-A 4194304 0 4194303
## ... ... ... ... ... ...
## $P14 PB_CD3 PB_CD3 4194304 -111 4194303
## $P15 PE_X PE_X 4194304 -111 4194303
## $P16 PECy7_CD45RA PECy7_CD45RA 4194304 -111 4194303
## $P17 PO_DCM PO_DCM 4194304 -111 4194303
## $P18 Autofluo Autofluo 4194304 -111 4194303
## 416 keywords are stored in the 'description' slot
Note that the detector channel names have now been replaced by the names of the fluorescent molecules. The algorithm underlying this function is currently least squares regression.
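The idea of least squares unmixing can be sketched in a few lines of base R. This is a toy example under stated assumptions, not the specUnmix code: the matrices `S`, `A` and `raw` are simulated, and ordinary (unweighted, unconstrained) least squares via `qr.solve` is assumed.

```r
# Toy example in base R; all values below are invented for illustration.
set.seed(42)

# Spectral matrix: 2 fluorochromes (rows) x 4 detectors (columns).
S <- rbind(dyeA = c(1.0, 0.4, 0.1, 0.0),
           dyeB = c(0.0, 0.1, 0.5, 1.0))
colnames(S) <- paste0("D", 1:4)

# Simulated true abundances for 100 events, and the mixed detector
# signals they would produce, with a little measurement noise added.
A <- cbind(dyeA = runif(100, 0, 1000), dyeB = runif(100, 0, 1000))
raw <- A %*% S + matrix(rnorm(100 * 4, sd = 1), ncol = 4)

# Ordinary least squares: for each event, find the abundances x that
# minimise the squared difference between t(S) %*% x and the raw signal.
# qr.solve returns the least squares fit for this overdetermined system.
unmixed <- t(qr.solve(t(S), t(raw)))
colnames(unmixed) <- rownames(S)

# Largest deviation between the estimates and the simulated abundances.
max(abs(unmixed - A))
```

Each event's detector signal is thus modelled as a weighted sum of the reference spectra, and the weights become the new fluorochrome columns.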
As with all cytometry data, the data needs to be transformed with one of the lin-log functions for correct interpretation. As the arcsinh function is widely used and has a single cofactor that controls the level of compression around zero, it is used in this package. The arcTrans function has a number of built-in features, such as automatic detection of whether the file comes from mass or flow cytometry, and it applies different cofactors accordingly. It is, however, always best practice to set the cofactors individually, to ensure that no artifactual populations are created, which can happen if there is too much resolution around zero. One automated strategy for this, which would make the arcTrans function unnecessary, is to use the flowVS package.
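The role of the cofactor can be illustrated with base R alone. The helper `arcsinhTrans` below is hypothetical and only shows the general form asinh(x / cofactor); the cofactor values are arbitrary, and arcTrans may differ in its details.

```r
# Hypothetical helper: approximately linear for |x| much smaller than the
# cofactor, approximately logarithmic for |x| much larger than the cofactor.
arcsinhTrans <- function(x, cofactor) asinh(x / cofactor)

# A few raw intensities, including a negative value and zero.
x <- c(-100, 0, 100, 1000, 100000)

# A small cofactor spreads out values near zero; a large one compresses them.
low  <- arcsinhTrans(x, cofactor = 5)
high <- arcsinhTrans(x, cofactor = 500)
round(rbind(low, high), 2)
```

Comparing the two rows shows why a cofactor that is too small can split the noise around zero into what looks like separate populations.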
The arcTrans function requires the names of the variables that should be transformed to be specified.
fullPanelTrans <- arcTrans(fullPanelUnmix,
                           transNames = colnames(fullPanelUnmix)[6:18])
par(mfrow = c(1, 2))
hist(exprs(fullPanelUnmix)[, 7], main = "Pre transformation",
     xlab = "AF700_CD4", breaks = 200)
hist(exprs(fullPanelTrans)[, 7], main = "Post transformation",
     xlab = "AF700_CD4", breaks = 200)
As can be seen in the histograms, the ranges, scales and resolution have now changed dramatically. (Biologically, the three peaks correspond to CD4- cells, CD4+ myeloid cells and CD4+ T cells, respectively.)
An important step in the early processing of cytometry files is to investigate if, or rather where, unmixing artifacts have arisen. There are multiple reasons for the occurrence of such artifacts, but listing them is outside the scope of this vignette. The package has one function that is well suited for this task: oneVsAllPlot. When used without specifying a marker, the function will create a folder and save all possible combinations of markers to that folder. Looking at them gives a good overview of the data. In this case, for the purpose of the vignette, I am only plotting one of the multi-graphs.
oneVsAllPlot(fullPanelTrans, "AF647_IgM", saveResult = FALSE)
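For readers who want to see the one-versus-all idea in isolation, here is a minimal base R sketch on simulated data. The helper `plotOneVsAll` and the marker names are hypothetical; it only mimics the general layout and does not reproduce the appearance or options of oneVsAllPlot.

```r
# Toy data: 300 events and 4 hypothetical markers (base R only).
set.seed(7)
dat <- matrix(rnorm(300 * 4), ncol = 4,
              dimnames = list(NULL, c("M1", "M2", "M3", "M4")))

# Hypothetical helper: put one marker on the y axis and draw one panel
# for each of the remaining markers on the x axis.
plotOneVsAll <- function(m, yMarker) {
  others <- setdiff(colnames(m), yMarker)
  oldPar <- par(mfrow = c(1, length(others)))
  on.exit(par(oldPar))  # restore the previous plot layout afterwards
  for (xm in others) {
    plot(m[, xm], m[, yMarker], pch = ".", xlab = xm, ylab = yMarker)
  }
  invisible(others)
}

panels <- plotOneVsAll(dat, "M1")
```

In such a layout, an unmixing artifact typically shows up as a diagonal or curved smear between two markers that should be independent of each other.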