A public server for DiscoRhythm is available; however, local usage is advised for improved performance (see 1.2).
See the guide in section 4 for details on usage of the application.
To run the application locally or use DiscoRhythm with R, the DiscoRhythm R package must be installed.
DiscoRhythm and its dependencies can be installed by executing the following in R:
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("DiscoRhythm")
Note: Manual installation of pandoc is required in order to use the report generation features of DiscoRhythm.
Running the command library(DiscoRhythm); discoApp() will then launch the application. The application can be parallelized on multiple cores for improved performance using discoApp(ncores=number_of_available_cores).
Alternatively, if Docker is installed, the DiscoRhythm container on Docker Hub can be pulled and used to run the web application locally, avoiding the need to install DiscoRhythm and its dependencies (see instructions on Docker Hub).
The same computations performed in the web application can be executed directly in R. This may be necessary for:
Section 5 describes this usage in more detail.
DiscoRhythm is a set of statistical tools for analyzing large scale temporal biological experiments with a hypothesized periodicity (e.g. circadian transcriptomic experiments). The main goal of this package is to take a normalized data matrix and characterize the rhythmicity of the features. The entire workflow can be run interactively in the web application or run directly in R to perform:
Section 3 describes the expected input to the DiscoRhythm workflow.
Next, section 4 will describe these steps and their use in the web application. Section 5 will then describe how to generate the same results using the DiscoRhythm R package directly.
Below is a small simulated circadian transcriptomic dataset, generated using the simphony package, which follows the expected input format for DiscoRhythm. The dataset was generated to contain ~50% rhythmic transcripts with a diversity of phases of oscillation.
IDs | CT0_1_A | CT0_2_B | CT0_3_C | CT4_4_D | CT4_5_E |
---|---|---|---|---|---|
nonRhythmGene_Id1 | 6.3717658 | 4.0909466 | 8.3010577 | 7.0240708 | -4.6510406 |
nonRhythmGene_Id2 | -2.4340798 | -5.6695362 | -3.0800470 | -0.1800393 | 3.1282445 |
nonRhythmGene_Id3 | -9.8373929 | -5.8738313 | -0.7793628 | -2.7236067 | 2.6574806 |
nonRhythmGene_Id4 | 4.5951368 | 0.2952068 | 3.5660223 | -1.2458999 | 6.1385621 |
nonRhythmGene_Id5 | 0.3980836 | 3.9928686 | 1.5626314 | 0.0613089 | 0.7013376 |
nonRhythmGene_Id6 | -0.2139567 | -2.4819291 | -6.5602675 | -1.9901769 | -7.5526251 |
The first column should contain unique feature IDs (e.g. gene names in this case).
All subsequent columns contain experimental sample data.
Sample metadata is extracted from the column names of the matrix.
Names are expected to follow the pattern:
[Prefix][Time*]_[Unique Id]_[Replicate Id]
Field | Description | Examples |
---|---|---|
Prefix | A unit of time. The web application will use this as the unit of time by default. | hr, ZT, CT |
Time * | Indicates the time of collection for the respective sample. Can only be positive values. | 20, 2.1, 0.3 |
Unique Id | A free field used to uniquely identify samples in visualizations and summaries. | GSM3186429, sample1, subjectA, AX |
Replicate Id | Used to identify each biological sample uniquely when combined with Time. | 1, A, rep1 |
Time is the only required field (e.g. 32, CT32, CT32_AS_1 and 32_AS_1 are all valid naming styles) and the other fields are used to obtain a few pieces of sample metadata if necessary:
Biological vs Technical Replicates - Time + Replicate Id are used to identify independent samples collected at the same timepoint (biological replicates). Samples with the same Time and Replicate Id are assumed to be technical replicates from a single biological sample. If no Replicate Id is provided, all samples are assumed to be independent biological replicates.
Unique Sample Identity - Each column name should be unique such that all samples can be uniquely identified to the user. The Unique Id field is intended for this purpose. If column names are not unique, the Unique Id field will be generated to provide unique sample names during usage of DiscoRhythm.
Note: all fields should contain only alphanumeric values (with the exception of ‘.’ in the Time field, which is allowed for decimal values).
The data extracted above is stored in the DiscoRhythm application as:
ID | Time | ReplicateID |
---|---|---|
CT0_1_A | 0 | A |
CT0_2_B | 0 | B |
CT0_3_C | 0 | C |
CT4_4_D | 4 | D |
CT4_5_E | 4 | E |
CT4_6_F | 4 | F |
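
The naming pattern from this section can be illustrated with a short base-R sketch. The regular expression and the parse_sample_names helper below are assumptions made for illustration only; they are not the expressions DiscoRhythm itself uses (see ?discoParseMeta for those).

```r
# Hypothetical helper (not part of DiscoRhythm): split column names such as
# "CT32_AS_1" into Prefix, Time, Unique Id and Replicate Id.
parse_sample_names <- function(x) {
  # Letters (optional Prefix), a number with optional decimal part (Time),
  # then up to two optional "_"-separated alphanumeric fields.
  pattern <- "^([A-Za-z]*)([0-9]+\\.?[0-9]*)(_([A-Za-z0-9]+))?(_([A-Za-z0-9]+))?$"
  m <- regmatches(x, regexec(pattern, x))
  ok <- lengths(m) == 7L
  if (!all(ok)) stop("Unparseable names: ", paste(x[!ok], collapse = ", "))
  m <- do.call(rbind, m)
  data.frame(ID = x,
             Prefix = m[, 2],
             Time = as.numeric(m[, 3]),
             UniqueID = m[, 5],
             ReplicateID = m[, 7],
             stringsAsFactors = FALSE)
}

meta <- parse_sample_names(c("CT0_1_A", "CT4_4_D", "32", "CT32_AS_1"))
print(meta)
```

Fields that are absent from a name (for example the Prefix in "32") come back as empty strings in this sketch.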
Note: If there are more groupings in the dataset or if the dataset does not fit into this design, it may be appropriate to split the dataset into subgroups that fit this design (each subgroup as one input dataset to DiscoRhythm).
DiscoRhythm defines time in an input dataset in one of two ways:
Linear time exists in systems where an experiment start time is meaningful (often setting t=0 to some specific event). Circular time exists in experiments where the start of experiment is not meaningful or left unobserved (e.g. time-of-day in a cross-sectional study). One of these two types must be specified for the input dataset and will influence how DiscoRhythm analysis is performed.
Example of linear time in hours: 1,2,3 ... 24,25,26
Example of circular time, time-of-day in hours: 1,2,3 ... 24,1,2
For example, consider a dataset obtained from mice entrained on a 12-h light, 12-h dark schedule and then released into total darkness prior to collection. The presence of a specific event (release into total darkness) would make the dataset suitable for the “Linear time” setting of DiscoRhythm.
Note: If samples were collected during the entrainment to the light/dark base-cycle, “Circular time” would be appropriate as mice sampled at the same point in the cycle on different days may be treated as biological replicates.
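
As a rough illustration of the difference between the two time types, linear collection times can be folded onto a base-cycle with a modulo operation. This is a minimal sketch, assuming a 24-hour base-cycle; it is not taken from the DiscoRhythm source.

```r
# Illustrative only: fold linear collection times onto a 24-hour base-cycle.
period <- 24
linear_t <- c(1, 2, 3, 24, 25, 26)

# With the modulo convention, hour 24 is written as 0 rather than 24.
circular_t <- linear_t %% period
print(circular_t)

# Samples sharing a circular time can then be treated as replicates.
print(table(circular_t))
```

After folding, the samples collected at hours 1 and 25 share the same circular time, which is why they may be treated as biological replicates under the “Circular time” setting.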
This section will walk the user through each section of the DiscoRhythm web application. It is recommended to keep this documentation open during first usage of the application to read details for each section.
The DiscoRhythm web interface is a dashboard where the sections of the analysis can be accessed in the sidebar and progress in a sequential fashion with data from the previous step being fed into the next.
There are interactive controls to set parameters for each section’s analysis. When parameters relevant to a figure change, the corresponding figure will dynamically update to reflect the newly calculated results. Since DiscoRhythm is intended for use sequentially through the sections, returning to earlier sections of the analysis and modifying parameters may result in unexpected behaviour.
Various download buttons are available throughout the application for archival of both plot outputs and numerical results.
Purpose: To upload, clean, and summarize the experimental design for the input dataset.
An input dataset is expected to be in comma separated value (CSV) format as specified in section 3. Upload the dataset using the “upload CSV” input method.
The simulated dataset (from section 3.1) is available to test the features of DiscoRhythm.
Messages or warnings may be seen at this point as DiscoRhythm imports the dataset and performs a few cleaning tasks:
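Row-wise checks for missing and constant values, and checks on sample-name validity and uniqueness (generating the Unique Id field where needed; see 3.3).
Specify the other analysis options for the dataset: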
Time Type - See 3.4.
Period of Interest - The main hypothesized period should be specified in order to set appropriate defaults throughout the application. If unknown, set to the range of the sample collection times.
Time Unit/Observation Unit - Units to display in the axis labels throughout the application.
If the “Sampling Summary” table does not seem to accurately reflect the data, please refer back to section 3.3. It is also a good idea to expand the “Inspect Input Data Matrix” and “Inspect Parsed Metadata” boxes to ensure the data has been read correctly.
Experimental artifacts or errors commonly cause data from collected samples to not accurately reflect the true biological phenomenon. This can often be observed as systematic signals from a single sample which lack biological plausibility. DiscoRhythm attempts to detect such systematic outliers by two methods:
Each method is applied independently to the dataset to detect outliers and then the filtering summary section is used to decide which detected outliers to remove. A reasonable standard deviation threshold for both methods would be around 2 to 3.
By default, the DiscoRhythm web application sets the threshold such that no outliers are flagged for removal.
Purpose: Samples are pairwise correlated using either the Pearson or Spearman method of correlation to detect outliers.
Heatmap: The values of these pairwise correlations can be visualized in this tab, where samples with similar correlation values are grouped together using clustering.
Outlier Detection: The average correlation value for each sample is used as a metric of its overall similarity to all other samples and is summarized in this figure.
Samples whose average correlation falls far below the mean are flagged as outliers; the user may specify the threshold as a number of standard deviations below the mean.
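
The idea behind this step can be sketched in base R on simulated data. The code below is an illustration under stated assumptions (simulated matrix, a threshold of 2 standard deviations, sample names S1 to S12); it is not the implementation used by DiscoRhythm.

```r
# Illustrative sketch: flag samples by their average pairwise correlation.
set.seed(1)
n_feat <- 200
n_samp <- 12

# Eleven samples share a common signal; columns are samples, rows are features.
signal <- rnorm(n_feat)
mat <- sapply(seq_len(n_samp), function(i) signal + rnorm(n_feat, sd = 0.3))
colnames(mat) <- paste0("S", seq_len(n_samp))
mat[, "S12"] <- rnorm(n_feat)  # replace one sample with unrelated noise

# Pairwise correlations between samples, ignoring self-correlation.
cors <- cor(mat, method = "pearson")
diag(cors) <- NA
avg_cor <- rowMeans(cors, na.rm = TRUE)

# Flag samples more than `threshold` standard deviations below the mean.
threshold <- 2
cutoff <- mean(avg_cor) - threshold * sd(avg_cor)
outliers <- avg_cor < cutoff
print(names(outliers)[outliers])
```

With few samples, a single extreme sample inflates the standard deviation itself, which limits how many standard deviations below the mean it can sit; this is one reason a threshold in the range of 2 to 3 is a reasonable starting point.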
Purpose: Utilize principal component analysis (PCA) to detect outliers.
PCA is used to extract the strongest recurring patterns in the dataset. Outliers detected in these patterns (PC scores) are flagged by their deviation from the mean where again the user may specify a threshold in units of standard deviations to use.
Scale Before PCA: Whether to scale rows to a standard deviation of 1 prior to PCA such that all rows are on an equal scale. Scaling is usually advisable.
PCs to Use For Outlier Detection: Click to change the list of PCs to use for outlier removal in the case a PC is determined inappropriate for use in outlier detection. You can remove unwanted PCs by pressing “delete” and add extra ones by typing their number.
Before CSV/After CSV: Downloadable summaries of the PCA before and after the detected outliers are removed.
Figures:
Distributions: The distributions of PC scores used to detect outliers. Only the PCs colored darkly are used for the final outlier flagging. Outliers will be shown with an ‘x’.
Scree: PCs are numbered where the amount of variance explained by each PC (therefore their ‘importance’) decreases with increasing PC number. This can be seen in the “Scree” figure. Users should choose an appropriate number of PCs to use for outlier detection by the shape of this scree plot.
One Pair and All Pairs: Plotting the PC scores of the components versus one another may reveal grouping that cannot be determined from simple analysis of individual PCs.
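
A minimal base-R sketch of PCA-based outlier flagging is given below. The simulated data, the choice of PCs and the threshold are assumptions for illustration; the sketch uses stats::prcomp directly and is not the DiscoRhythm implementation.

```r
# Illustrative sketch: flag samples whose PC scores deviate strongly.
set.seed(2)
n_feat <- 100
n_samp <- 20
mat <- matrix(rnorm(n_feat * n_samp), nrow = n_feat,
              dimnames = list(paste0("gene", seq_len(n_feat)),
                              paste0("S", seq_len(n_samp))))
mat[, "S7"] <- mat[, "S7"] + 6  # systematic shift in one sample

# Samples must be in rows for prcomp; scale. = TRUE puts every feature
# (a row of the original matrix) on a standard deviation of 1.
pca <- prcomp(t(mat), center = TRUE, scale. = TRUE)

# Standardize the scores of the chosen PCs and flag large deviations.
pcs_to_use <- c("PC1", "PC2")
threshold <- 3
z <- scale(pca$x[, pcs_to_use, drop = FALSE])
outliers <- apply(abs(z) > threshold, 1, any)
print(names(outliers)[outliers])
```

A sample is flagged here if it exceeds the threshold on any of the selected PCs, which is why the choice of PCs (guided by the scree plot) matters.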
Purpose: Determine how to proceed with outlier removal.
The user may at this point choose to remove the flagged outliers or may disregard these flags if it is suspected the dissimilarity of these samples may be biologically relevant. The user may also remove samples which they deem to be unreliable for further analysis by other metrics (e.g. experimental quality metrics).
Raw Distributions: A boxplot for each individual sample which can be used to further evaluate sample selection.
Input and Final: Shows summary tables for data before and after outlier removal.
Purpose: Utilize any technical replicates present in the dataset to quantify the signal-to-noise for each row. Combine technical replicates for downstream analysis.
Technical replicates are not useful for the statistical tests used by DiscoRhythm for oscillation detection as they are not representative of the population variance of the data (i.e. do not satisfy the independence assumptions). They will instead be used to identify rows of the dataset where the biological variation is greater than the technical variation (i.e. high signal-to-noise). A hypothesis test is performed using ANOVA procedures to determine whether there may be a real biological signal in the row.
ANOVA Method: 3 options are available for ANOVA:
F-statistic Cutoff: The user may choose to filter rows by the magnitude of the signal-to-noise rather than by statistical significance.
Replicates should be combined for downstream rhythmicity analysis. DiscoRhythm provides three methods for combining technical replicates:
Note: Users may also choose to not combine technical replicates. This is only advisable if the technical replicates do in fact represent independent samples of the population/dataset (i.e. if they were erroneously labelled in section 3).
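
The signal-to-noise idea can be sketched for a single row with a classical one-way ANOVA in base R. This is an illustration only: it assumes the grouping factor is the biological sample and that variances are equal, and the simulated values and the row_anova helper are hypothetical.

```r
# Illustrative sketch: one-way ANOVA of one row, grouped by biological sample.
set.seed(3)
bio_id <- factor(rep(c("A", "B", "C", "D"), each = 3))  # 3 technical reps each

# A row with strong biological differences and little technical noise,
# and a row that is technical noise only.
bio_effect <- c(A = 0, B = 2, C = -2, D = 4)
high_snr <- unname(bio_effect[as.character(bio_id)]) + rnorm(12, sd = 0.2)
low_snr <- rnorm(12, sd = 1)

# F statistic = between-sample variation / within-sample (technical) variation.
row_anova <- function(y, g) {
  tab <- anova(lm(y ~ g))
  c(Fstat = tab[["F value"]][1], pvalue = tab[["Pr(>F)"]][1])
}
res <- rbind(high_snr = row_anova(high_snr, bio_id),
             low_snr = row_anova(low_snr, bio_id))
print(res)

# Combine technical replicates, here by taking the median per sample.
combined <- tapply(high_snr, bio_id, median)
print(combined)
```

Rows could then be kept either by p-value or by the size of the F statistic, mirroring the two filtering options described above.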
Purpose: Summarize the strength of multiple periodicities across the entire dataset.
Purpose: Detecting the strength of a range of periods of oscillation across the entire dataset using a Cosinor based approach.
Spectral analysis will be limited to periods from 3 times the sampling interval up to the sampling duration, and 12 periods, evenly spaced across this range, will be tested. For circular time, only harmonics of the base-cycle will be available for testing.
Note: Experiments typically have a hypothesized period length (such as the organism's period of activity) and that period should be used for analysis in section 4.6 when possible. When using this tool for determining which periodicity to use, follow-up experiments may be appropriate for validation.
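
The range of periods described above can be sketched as follows for linear time. The sketch assumes that the sampling interval is the smallest gap between distinct collection times and that the sampling duration is the difference between the last and first collection times; DiscoRhythm's exact definitions may differ.

```r
# Illustrative sketch: 12 evenly spaced periods between 3x the sampling
# interval and the sampling duration (assumed definitions, see lead-in).
times <- c(0, 4, 8, 12, 16, 20)

interval <- min(diff(sort(unique(times))))
duration <- diff(range(times))

periods <- seq(3 * interval, duration, length.out = 12)
print(round(periods, 2))
```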
To determine the presence of a single rhythmic pattern in the data, PCA is performed to visualize and test the summarized temporal signal for rhythmicity (using the Cosinor method).
Purpose: Individually quantify rhythmicity of remaining rows of the dataset where each row will be tested for rhythmicity using methods suitable for the sample collections present.
The user must choose a single period of oscillation to test across all rows of the dataset. If it is unknown which periodicity to test, start with the dominant period seen in section 4.5. The application may show warnings/messages regarding the choice of period. By default, the period input in the ‘Select Data’ section will be chosen.
JTK Cycle, Lomb-Scargle, and ARSER results are all obtained through the MetaCycle R package (meta2d function using minper=maxper). Cosinor is a built-in function of DiscoRhythm. A brief summary of each method:
Cosinor - a.k.a. “Harmonic Regression”. Fits a sinusoid with a free phase parameter.
JTK Cycle - non-parametric test of rhythmicity robust to outliers.
Lomb-Scargle - an approach using spectral power density.
ARSER - removes linear trends and performs the Cosinor test.
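
To make the Cosinor idea concrete, a sinusoid with a free phase can be fitted as a linear model on sine and cosine terms. The cosinor_fit helper below is a hypothetical base-R sketch, not DiscoRhythm's built-in function, and its conventions (for instance how amplitude is reported) may differ from the package output.

```r
# Illustrative sketch of harmonic regression (Cosinor) for one row.
cosinor_fit <- function(y, t, period) {
  omega <- 2 * pi / period
  # y = mesor + b_sin * sin(omega * t) + b_cos * cos(omega * t)
  fit <- lm(y ~ sin(omega * t) + cos(omega * t))
  b <- unname(coef(fit))
  fstat <- summary(fit)$fstatistic
  list(
    mesor = b[1],
    amplitude = sqrt(b[2]^2 + b[3]^2),                   # peak height above mesor
    acrophase = (atan2(b[2], b[3]) / omega) %% period,   # time of the peak
    pvalue = unname(pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE))
  )
}

# Simulated row: mesor 2, amplitude 3, peak at t = 6, period 24.
set.seed(4)
t <- rep(seq(0, 20, by = 4), each = 3)
y <- 2 + 3 * cos(2 * pi * (t - 6) / 24) + rnorm(length(t), sd = 0.1)

fit <- cosinor_fit(y, t, period = 24)
str(fit)
```

Because the phase enters only through the two linear coefficients, no non-linear optimization is needed; the p-value here comes from the overall F-test of the model against an intercept-only fit.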
Exclusion Criteria Matrix: A table is presented describing the criteria that exclude a method from use; only criteria that are true for the loaded dataset are shown (if none apply, the table will be absent). A method may be excluded for either computational reasons (it causes errors under the given conditions) or statistical reasons (requirements on study design).
Criteria | Description |
---|---|
missing_value | Rows contain missing values. |
with_bio_replicate | Biological replicates are present. |
non_integer_interval | The spacing between samples is not an integer value. |
uneven_interval | Time between collections is not uniform. |
circular_t | Time is circular (see 3.4). |
invalidPeriod | Chosen period to test is not valid. |
invalidJTKperiod | Chosen period to test is not valid for JTK Cycle. |
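
Several of these criteria depend only on the collection times and the data matrix, so they can be illustrated with a small base-R sketch. The flag_design helper is hypothetical and assumes technical replicates have already been combined, so repeated times indicate biological replicates; it does not reproduce DiscoRhythm's own checks.

```r
# Illustrative sketch: derive some design flags from collection times.
flag_design <- function(times, mat = NULL) {
  steps <- diff(sort(unique(times)))  # spacing between distinct timepoints
  c(
    missing_value = !is.null(mat) && anyNA(mat),
    with_bio_replicate = any(duplicated(times)),
    non_integer_interval = any(abs(steps - round(steps)) > 1e-8),
    uneven_interval = length(unique(round(steps, 8))) > 1
  )
}

# Evenly spaced design without replicates: no criteria are triggered.
print(flag_design(c(0, 4, 8, 12)))

# Replicated timepoint and an irregular, non-integer final gap.
print(flag_design(c(0, 4, 8, 8, 13.5)))
```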
Once rhythmicity computation is completed, 3 sections become available for viewing the results:
Individual Models: Allows inspection of the raw data for individual rows of the dataset. The user may click a row of the table to display the raw data and a fitted curve for that row. If the Cosinor method is being viewed, the line will be the Cosinor fit; all other methods utilize a loess fit. If the error bar option is selected, a 95% confidence interval on the mean will be displayed for each timepoint.
Summary: Summarizes calculated rhythm parameters across all tested rows by all executed methods.
Method Comparison: Offers pairwise comparison of rhythmic parameters calculated by each method to determine the degree of agreement between methods.
Purpose: Archive the results of the DiscoRhythm session in a reproducible report and download R data associated with the session.
HTML Report: R code is provided in this report to reproduce all results. The data can be reprocessed in the future using this code.
Session Results: Alternatively, data may be downloaded directly through the app and accessed from the disco_ro object in R.
Session Inputs: Users may also simply download an R data file containing their input dataset and all parameters such that results can be reprocessed using the DiscoRhythm R package.
This section will detail usage of the R functions used to perform the analysis in section 4. For each of the sections below, refer to the DiscoRhythm R package manual for specific technical details on usage, arguments and methods, or use ? to access individual manual pages. For instance, get more help for the function discoBatch() with the command ?discoBatch.
SummarizedExperiment objects such as those generated by other Bioconductor packages should be suitable inputs to DiscoRhythm functions once modified to contain the following required data:
colData(se) - Stores sample metadata (see 3.3.1). Three columns are required: ID, Time, and ReplicateID.
Objects with this structure will be used throughout usage of the package.
Note that at present DiscoRhythm will use the first assay (i.e. assays(se)[[1]]) of the SummarizedExperiment and all others will be ignored.
The CSV inputs to the DiscoRhythm web interface described in section 3 can be read into R as a data.frame. So that users of the web application can use the same input for analysis in R, the function discoDFtoSE is available to convert the tabular input into the appropriate format for analysis in R described in 5.1 above.
The sample metadata will be extracted from the column names by matching the format described in section 3.3 (see ?discoParseMeta for regular expression specifications). The checks for validity and uniqueness mentioned in section 4.2 will also be performed. Alternatively, this metadata may also be input directly to discoDFtoSE as a data.frame.
Loading in the same example dataset as section 3 using discoGetSimu() to read in the CSV system file as a data.frame:
library(DiscoRhythm)
indata <- discoGetSimu()
knitr::kable(head(indata[,1:6]), format = "markdown") # Inspect the data
IDs | CT0_1_A | CT0_2_B | CT0_3_C | CT4_4_D | CT4_5_E |
---|---|---|---|---|---|
nonRhythmGene_Id1 | 6.3717658 | 4.0909466 | 8.3010577 | 7.0240708 | -4.6510406 |
nonRhythmGene_Id2 | -2.4340798 | -5.6695362 | -3.0800470 | -0.1800393 | 3.1282445 |
nonRhythmGene_Id3 | -9.8373929 | -5.8738313 | -0.7793628 | -2.7236067 | 2.6574806 |
nonRhythmGene_Id4 | 4.5951368 | 0.2952068 | 3.5660223 | -1.2458999 | 6.1385621 |
nonRhythmGene_Id5 | 0.3980836 | 3.9928686 | 1.5626314 | 0.0613089 | 0.7013376 |
nonRhythmGene_Id6 | -0.2139567 | -2.4819291 | -6.5602675 | -1.9901769 | -7.5526251 |
And importing as a SummarizedExperiment:
se <- discoDFtoSE(indata)
The reverse operation, discoSEtoDF, is also available and is mainly intended for the purpose of exporting data as CSV for input to the web application.
write.csv(discoSEtoDF(se),file = "DiscoRhythmInputFile.csv",row.names = FALSE)
This command exports the data as a CSV file for use in the web application.
The discoCheckInput function performs the row-wise checks for missing values and constant values as mentioned in section 4.2.
selectDataSE <- discoCheckInput(se)
The sample collection information present in colData(selectDataSE) can be summarized by the discoDesignSummary function to detail the number of biological and technical replicates available at each collection time. The number of technical replicates is shown in brackets.
library(SummarizedExperiment)
Metadata <- colData(selectDataSE)
knitr::kable(discoDesignSummary(Metadata),format = "markdown")
| | 0 | 4 | 8 | 12 | 16 | 20 |
|---|---|---|---|---|---|---|
| Total | 9 | 9 | 9 | 9 | 9 | 9 |
| | Biological Sample A (3) | Biological Sample D (3) | Biological Sample G (3) | Biological Sample J (3) | Biological Sample M (3) | Biological Sample P (3) |
| | Biological Sample B (3) | Biological Sample E (3) | Biological Sample H (3) | Biological Sample K (3) | Biological Sample N (3) | Biological Sample Q (3) |
| | Biological Sample C (3) | Biological Sample F (3) | Biological Sample I (3) | Biological Sample L (3) | Biological Sample O (3) | Biological Sample R (3) |
The discoInterCorOutliers function performs the analysis described in 4.3.1 and returns some intermediate results along with a vector indicating which samples were determined to be outliers.
CorRes <- discoInterCorOutliers(selectDataSE,
                                cor_method="pearson",
                                threshold=3,
                                thresh_type="sd")
The discoPCAoutliers function performs the analysis described in 4.3.2 and returns some intermediate results along with a vector indicating which samples were determined to be outliers.
PCAres <- discoPCAoutliers(selectDataSE,
                           threshold=3,
                           scale=TRUE,
                           pcToCut = c("PC1","PC2","PC3","PC4"))
A light wrapper was written for the stats::prcomp function for better use with the web application and it can be utilized as:
discoPCAres <- discoPCA(selectDataSE)
This returns the same output as prcomp with the addition of a reformatted summary table (available as discoPCAres$table).
Below, the results of the outlier detection analysis (CorRes and PCAres) are used to subset the data to remove outliers:
FilteredSE <- selectDataSE[,!PCAres$outliers & !CorRes$outliers]
DT::datatable(as.data.frame(
    colData(selectDataSE)[PCAres$outliers | CorRes$outliers,]
))
knitr::kable(discoDesignSummary(colData(FilteredSE)),format = "markdown")
| | 0 | 4 | 8 | 12 | 16 | 20 |
|---|---|---|---|---|---|---|
| Total | 9 | 8 | 9 | 8 | 9 | 9 |
| | Biological Sample A (3) | Biological Sample D (3) | Biological Sample G (3) | Biological Sample J (3) | Biological Sample M (3) | Biological Sample P (3) |
| | Biological Sample B (3) | Biological Sample E (3) | Biological Sample H (3) | Biological Sample K (3) | Biological Sample N (3) | Biological Sample Q (3) |
| | Biological Sample C (3) | Biological Sample F (2) | Biological Sample I (3) | Biological Sample L (2) | Biological Sample O (3) | Biological Sample R (3) |
The discoRepAnalysis function performs the analysis described in 4.4, returning the results of the ANOVA test and the se data object in which technical replicates are combined.
ANOVAres <- discoRepAnalysis(FilteredSE,
                             aov_method="Equal Variance",
                             aov_pcut=0.05,
                             aov_Fcut=1,
                             avg_method="Median")
FinalSE <- ANOVAres$se
The discoPeriodDetection function performs the analysis described in 4.5 and returns a data.frame of Cosinor fits across a range of periods.
PeriodRes <- discoPeriodDetection(FinalSE,
                                  timeType="linear",
                                  main_per=24)
The main period of interest is fit using a Cosinor model to principal component scores as described in 4.5.
OVpca <- discoPCA(FinalSE)
OVpcaSE <- discoDFtoSE(data.frame("PC"=1:ncol(OVpca$x),t(OVpca$x)),
                       colData(FinalSE))
knitr::kable(discoODAs(OVpcaSE,period = 24,method = "CS")$CS,
             format = "markdown")
acrophase | amplitude | Rsq | pvalue | intercept | sincoef | coscoef | qvalue |
---|---|---|---|---|---|---|---|
17.71281 | 17.6386786 | 0.9855849 | 0.0000000 | 0 | -8.7944228 | -0.6624756 | 0.0000000 |
23.68809 | 10.9255889 | 0.9182224 | 0.0000000 | 0 | -0.4455825 | 5.4445918 | 0.0000000 |
14.95058 | 0.4453926 | 0.0024716 | 0.9816116 | 0 | -0.1554194 | -0.1594943 | 0.9973953 |
10.36796 | 0.2569763 | 0.0008546 | 0.9936085 | 0 | 0.0532436 | -0.1169372 | 0.9973953 |
12.87527 | 0.7831344 | 0.0087132 | 0.9364723 | 0 | -0.0889421 | -0.3813321 | 0.9973953 |
15.09117 | 0.5088081 | 0.0041057 | 0.9696152 | 0 | -0.1841327 | -0.1755465 | 0.9973953 |
23.56921 | 0.9746275 | 0.0160074 | 0.8860104 | 0 | -0.0548425 | 0.4842179 | 0.9973953 |
23.14554 | 0.4729064 | 0.0040864 | 0.9697559 | 0 | -0.0524537 | 0.2305618 | 0.9973953 |
14.21972 | 0.6315381 | 0.0078395 | 0.9426803 | 0 | -0.1733449 | -0.2639350 | 0.9973953 |
20.71661 | 0.1314298 | 0.0003477 | 0.9973953 | 0 | -0.0497840 | 0.0428952 | 0.9973953 |
The discoODAs function performs the analysis described in 4.6.1, here using just the Cosinor method. discoODAs will automatically run all appropriate methods if none are provided.
discoODAres <- discoODAs(FinalSE,
                         period=24,
                         method="CS",
                         ncores=1,
                         circular_t=FALSE)
The entire analysis performed in section 5 may be run through a single call to discoBatch() to obtain the final discoODAres results.
discoBatch(indata=indata,
           report="discoBatch_example.html",
           ncores=1,
           main_per=24,
           timeType="linear",
           cor_threshold=3,
           cor_method="pearson",
           cor_threshType="sd",
           pca_threshold=3,
           pca_scale=TRUE,
           pca_pcToCut=paste0("PC",seq_len(4)),
           aov_method="None",
           aov_pcut=0.05,
           aov_Fcut=0,
           avg_method="Median",
           osc_method="CS",
           osc_period=24)
This command will generate an HTML report called “discoBatch_example.html” which includes the visualizations seen in the DiscoRhythm application. indata may be in either of the two input formats described in 5.1 (data.frame or SummarizedExperiment).
sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.9-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.9-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] SummarizedExperiment_1.14.0 DelayedArray_0.10.0
## [3] BiocParallel_1.18.0 matrixStats_0.54.0
## [5] Biobase_2.44.0 GenomicRanges_1.36.0
## [7] GenomeInfoDb_1.20.0 IRanges_2.18.0
## [9] S4Vectors_0.22.0 BiocGenerics_0.30.0
## [11] DiscoRhythm_1.0.0 BiocStyle_2.12.0
##
## loaded via a namespace (and not attached):
## [1] colorspace_1.4-1 class_7.3-15 modeltools_0.2-22
## [4] mclust_5.4.3 futile.logger_1.4.3 XVector_0.24.0
## [7] rstudioapi_0.10 flexmix_2.3-15 DT_0.5
## [10] mvtnorm_1.0-10 xml2_1.2.0 codetools_0.2-16
## [13] robustbase_0.93-4 knitr_1.22 jsonlite_1.6
## [16] broom_0.5.2 cluster_2.0.9 kernlab_0.9-27
## [19] shinydashboard_0.7.1 shiny_1.3.2 readr_1.3.1
## [22] BiocManager_1.30.4 compiler_3.6.0 httr_1.4.0
## [25] backports_1.1.4 assertthat_0.2.1 Matrix_1.2-17
## [28] lazyeval_0.2.2 later_0.8.0 formatR_1.6
## [31] htmltools_0.3.6 tools_3.6.0 gtable_0.3.0
## [34] glue_1.3.1 GenomeInfoDbData_1.2.1 reshape2_1.4.3
## [37] dplyr_0.8.0.1 Rcpp_1.0.1 trimcluster_0.1-2.1
## [40] gdata_2.18.0 nlme_3.1-139 crosstalk_1.0.0
## [43] iterators_1.0.10 fpc_2.1-11.2 xfun_0.6
## [46] stringr_1.4.0 rvest_0.3.3 miniUI_0.1.1.1
## [49] mime_0.6 shinycssloaders_0.2.0 gtools_3.8.1
## [52] dendextend_1.10.0 DEoptimR_1.0-8 MASS_7.3-51.4
## [55] zlibbioc_1.30.0 scales_1.0.0 TSP_1.1-6
## [58] shinyBS_0.61 hms_0.4.2 promises_1.0.1
## [61] lambda.r_1.2.3 RColorBrewer_1.1-2 yaml_2.2.0
## [64] heatmaply_0.15.2 gridExtra_2.3 ggplot2_3.1.1
## [67] UpSetR_1.3.3 ggExtra_0.8 stringi_1.4.3
## [70] highr_0.8 gclus_1.3.2 foreach_1.4.4
## [73] seriation_1.2-3 caTools_1.17.1.2 rlang_0.3.4
## [76] pkgconfig_2.0.2 prabclus_2.2-7 bitops_1.0-6
## [79] evaluate_0.13 lattice_0.20-38 purrr_0.3.2
## [82] htmlwidgets_1.3 tidyselect_0.2.5 plyr_1.8.4
## [85] magrittr_1.5 bookdown_0.9 R6_2.4.0
## [88] magick_2.0 gplots_3.0.1.1 generics_0.0.2
## [91] pillar_1.3.1 whisker_0.3-2 RCurl_1.95-4.12
## [94] nnet_7.3-12 tibble_2.1.1 crayon_1.3.4
## [97] futile.options_1.0.1 KernSmooth_2.23-15 plotly_4.9.0
## [100] rmarkdown_1.12 viridis_0.5.1 grid_3.6.0
## [103] data.table_1.12.2 matrixTests_0.1.2 digest_0.6.18
## [106] diptest_0.75-7 webshot_0.5.1 xtable_1.8-4
## [109] VennDiagram_1.6.20 tidyr_0.8.3 httpuv_1.5.1
## [112] munsell_0.5.0 kableExtra_1.1.0 registry_0.5-1
## [115] viridisLite_0.3.0 shinyjs_1.0