Help for package WallomicsData

Title:

Datasets for Multi-Omics Integration in a Plant Abiotic Stress Context

Version:

1.0

Description:

Datasets from the WallOmics project. Contains phenomics, metabolomics, proteomics and transcriptomics data collected from two organs of five ecotypes of the model plant Arabidopsis thaliana exposed to two temperature growth conditions. Exploratory and integrative analyses of these data are presented in Durufle et al (2020) <doi:10.1093/bib/bbaa166> and Durufle et al (2020) <doi:10.3390/cells9102249>.

License:

Encoding:

UTF-8

RoxygenNote:

7.1.2

Depends:

R (≥ 2.10)

LazyData:

true

NeedsCompilation:

no

Packaged:

2022-04-14 12:41:58 UTC; sdejean

Author:

Sébastien Déjean [aut, cre], Harold Duruflé [aut], Aurélie Mercadié [aut]

Maintainer:

Sébastien Déjean <sebastien.dejean@math.univ-toulouse.fr>

Repository:

CRAN

Date/Publication:

2022-04-15 10:32:29 UTC

WallomicsData: Datasets for Multi-Omics Integration in a Plant Abiotic Stress Context

Description

Datasets from the WallOmics project. Contains phenomics, metabolomics, proteomics and transcriptomics data collected from two organs of five ecotypes of the model plant Arabidopsis thaliana exposed to two temperature growth conditions. Exploratory and integrative analyses of these data are presented in Durufle et al (2020) <doi:10.1093/bib/bbaa166> and Durufle et al (2020) <doi:10.3390/cells9102249>.

Author(s)

Maintainer: Sébastien Déjean sebastien.dejean@math.univ-toulouse.fr

Authors:

Harold Duruflé harold.durufle@inrae.fr
Aurélie Mercadié aurelie.mercadie@inrae.fr

Altitude Cluster

Description

The Altitude Cluster factor identifies the altitude of the environment from which a given plant in the study originates: high altitude (denoted High), moderate altitude (Low), or the altitude of the reference group (Col, the lowest of all).

Usage

data("Altitude_Cluster")

Format

A factor with 3 levels.

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Altitude_Cluster")

# Count how many samples are in each group
table(Altitude_Cluster)

Ecotype

Description

The Ecotype factor identifies the A. thaliana genotype, adapted to a given ecosystem, from which the studied sample comes. The study includes one reference population and four newly described Pyrenean populations:

Columbia, denoted Col (originating from Poland, acts as the reference ecotype)
Grip, denoted Grip
Herran, denoted Hern
L’Hospitalet-près-l’Andorre, denoted Hosp
Chapelle Saint Roch, denoted Roch

Usage

data("Ecotype")

Format

A factor with 5 levels of A. thaliana genotypes.

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Ecotype")

# Count how many samples are in each group
table(Ecotype)

Genetic Cluster

Description

The Genetic Cluster factor identifies the genetic group to which the studied sample belongs: Genetics Cluster 1 (made up of the Grip and Roch genotypes), Genetics Cluster 2 (Hern and Hosp genotypes) or Genetics Cluster 3 (Col genotype). See Ecotype for more information on genotypes.

Usage

data("Genetic_Cluster")

Format

A factor with 3 levels.

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Genetic_Cluster")

# Count how many samples are in each group
table(Genetic_Cluster)
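
The grouping given in the description can also be written as a lookup table. The following is a minimal sketch, assuming the ecotype codes are the short names listed in the Ecotype topic (Col, Grip, Hern, Hosp, Roch). The cluster labels "1", "2" and "3" and the stand-in vector `ecotype` are illustrative and may not match the actual levels of Genetic_Cluster.

```r
# Lookup table built from the description: ecotype code -> genetic cluster.
# The labels "1", "2", "3" are illustrative, not the package's own levels.
cluster_of <- c(Grip = "1", Roch = "1", Hern = "2", Hosp = "2", Col = "3")

# Stand-in ecotype vector (hypothetical values); with the package loaded,
# as.character(Ecotype) could be used instead.
ecotype <- c("Col", "Grip", "Hern", "Hosp", "Roch", "Col")

# Map each sample's ecotype to its cluster and cross-tabulate the result
derived_cluster <- factor(unname(cluster_of[ecotype]), levels = c("1", "2", "3"))
table(ecotype, derived_cluster)
```

Comparing such a derived factor with Genetic_Cluster is one way to check that the two factors are consistent for every sample.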

Metabolomics Rosettes

Description

A dataset containing metabolomics variables measured on rosettes of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Metabolomics_Rosettes")

Format

A data frame with 30 rows and 6 variables:

Pectin_RGI: Rhamnogalacturonan I (µg/100mg)
Pectin_HG: Homogalacturonan (µg/100mg)
XG: Xyloglucan (µg/100mg)
Pectin_linearity: Linearity of pectin (Ratio)
Contribution_RG: Contribution of rhamnogalacturonan to pectin population (Ratio)
RGI_branching: Branching of Rhamnogalacturonan I (Ratio)

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Metabolomics_Rosettes")

# Look at simple statistics
summary(Metabolomics_Rosettes)

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# A graphical representation
plot(x = as.factor(substr(row.names(Metabolomics_Rosettes), 1, 7)),
     y = Metabolomics_Rosettes$Pectin_linearity, col = "white", lty = 0,
     xlab = "Genotype x Temperature groups",
     ylab = "Pectin linearity (Ratio)",
     main = "Pectin linearity distribution by genotype and growth temperature")
grid()
abline(h = 1, lty = 2)
points(x = as.factor(substr(row.names(Metabolomics_Rosettes), 1, 7)),
       y = Metabolomics_Rosettes$Pectin_linearity, type = "p", pch = 19, lwd = 5,
       col = colors)
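
The hard-coded rep() calls in the example above only give the intended colors if the rows are stored as ten consecutive groups of three. The sketch below derives one color per row from its group label instead, so it does not depend on row order. The helper `group_colors` and the row names in `demo_names` are hypothetical; only the use of the first 7 characters of the row names as group label is taken from the example.

```r
# Hypothetical helper: one color per group label, assigned by factor level.
group_colors <- function(labels, palette) {
  groups <- factor(labels)
  stopifnot(nlevels(groups) <= length(palette))
  palette[as.integer(groups)]
}

palette10 <- c("#A6CEE3", "#1F78B4", "#B2DF8A", "#33A02C", "#FB9A99",
               "#E31A1C", "#FDBF6F", "#FF7F00", "#CAB2D6", "#6A3D9A")

# Made-up row names for illustration; with the package data one would pass
# substr(row.names(Metabolomics_Rosettes), 1, 7) as in the example above.
demo_names <- c("grpA_15_1", "grpB_22_1", "grpA_15_2", "grpB_22_2")
demo_cols <- group_colors(substr(demo_names, 1, 7), palette10)
demo_cols
```

Because colors follow the sorted factor levels, rows of the same group always share a color, whatever their position in the table.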

Metabolomics Stems

Description

A dataset containing metabolomics variables measured on floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Metabolomics_Stems")

Format

A data frame with 30 rows and 6 variables:

Pectin_RGI: Rhamnogalacturonan I (µg/100mg)
Pectin_HG: Homogalacturonan (µg/100mg)
XG: Xyloglucan (µg/100mg)
Pectin_linearity: Linearity of pectin (Ratio)
Contribution_RG: Contribution of rhamnogalacturonan to pectin population (Ratio)
RGI_branching: Branching of Rhamnogalacturonan I (Ratio)

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Metabolomics_Stems")

# Look at simple statistics
summary(Metabolomics_Stems)

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# A graphical representation
plot(x = as.factor(substr(row.names(Metabolomics_Stems), 1, 7)),
     y = Metabolomics_Stems$Pectin_linearity, col = "white", lty = 0,
     xlab = "Genotype x Temperature groups",
     ylab = "Pectin linearity (Ratio)",
     main = "Pectin linearity distribution by genotype and growth temperature")
grid()
abline(h = 1, lty = 2)
points(x = as.factor(substr(row.names(Metabolomics_Stems), 1, 7)),
       y = Metabolomics_Stems$Pectin_linearity, type = "p", pch = 19, lwd = 5,
       col = colors)

Metadata

Description

Bioinformatic annotation and description, based on the WallProtDB database, of all the Cell Wall Proteins (CWPs) identified in rosettes and floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for additional information.

Usage

data("Metadata")

Format

A data frame with 474 rows and 4 variables:

Acc_number: GenBank accession number (gene name)
Functional_classes: Functional classes of the CWPs
Protein_families: Protein families of the CWPs
Putative_functions: Putative functions of the CWPs

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Metadata")

# Look at the dataset's dimensions
dim(Metadata)
head(Metadata)

# How many proteins in each functional class?
table(Metadata$Functional_classes)

# How many proteins in each protein family?
table(Metadata$Protein_families)
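
The annotation table becomes most useful once it is matched to the variables of a quantification table. The sketch below shows one way to do this with match(). It rests on an assumption that is not stated in this manual: that the column names of the proteomics tables are the accession numbers stored in Acc_number. The objects `metadata` and `abundance` are toy stand-ins with made-up values.

```r
# Toy stand-ins (hypothetical values) mimicking the documented structure.
metadata <- data.frame(
  Acc_number = c("P1", "P2", "P3"),
  Functional_classes = c("ClassA", "ClassB", "ClassA"),
  stringsAsFactors = FALSE
)
abundance <- data.frame(P1 = c(1, 2), P3 = c(3, 4), P9 = c(5, 6))

# ASSUMPTION: the abundance column names are accession numbers.
idx <- match(colnames(abundance), metadata$Acc_number)
annotation <- data.frame(
  Acc_number = colnames(abundance),
  Functional_classes = metadata$Functional_classes[idx],
  stringsAsFactors = FALSE
)

# Accessions without an entry in the annotation table get NA
annotation
table(annotation$Functional_classes, useNA = "ifany")
```

If the assumption holds, replacing the stand-ins with Metadata and, for example, Proteomics_Rosettes_CW gives the functional class of each quantified protein.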

Phenomics Rosettes

Description

A dataset containing phenotypic variables measured on rosettes of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Phenomics_Rosettes")

Format

A data frame with 30 rows and 5 variables:

Mass: rosette mass (g)
Diameter: rosette diameter (cm)
Leaves_number: total number of leaves
Density: rosette density (g/cm²)
Area: projected rosette area (cm²)

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Phenomics_Rosettes")

# Look at simple statistics
dim(Phenomics_Rosettes)
summary(Phenomics_Rosettes)

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# A graphical representation: Leaves number distribution
plot(x = as.factor(substr(row.names(Phenomics_Rosettes), 1, 7)),
     y = Phenomics_Rosettes$Leaves_number, col = "white", lty = 0,
     xlab = "Genotype x Temperature groups",
     ylab = "Number of rosette leaves",
     main = "Rosette leaves' distribution by genotype and growth temperature"
     )
grid()
points(x = as.factor(substr(row.names(Phenomics_Rosettes), 1, 7)),
       y = Phenomics_Rosettes$Leaves_number, type = "p", pch = 19, lwd = 5,
       col = colors)

Phenomics Stems

Description

A dataset containing phenotypic variables measured on floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Phenomics_Stems")

Format

A data frame with 30 rows and 4 variables:

Mass: floral stems mass (g)
Diameter: floral stems diameter (mm)
Length: length of the floral stems (cm)
Number_lateral_stems: number of lateral stems

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Phenomics_Stems")

# Look at simple statistics
dim(Phenomics_Stems)
summary(Phenomics_Stems)

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# A graphical representation: Lateral stems distribution
plot(x = as.factor(substr(row.names(Phenomics_Stems), 1, 7)),
     y = Phenomics_Stems$Number_lateral_stems, col = "white", lty = 0,
     xlab = "Genotype x Temperature groups",
     ylab = "Number of lateral stems",
     main = "Lateral stems' distribution by genotype and growth temperature"
     )
grid()
points(x = as.factor(substr(row.names(Phenomics_Stems), 1, 7)),
       y = Phenomics_Stems$Number_lateral_stems, type = "p", pch = 19, lwd = 5,
       col = colors)

Proteomics Rosettes Cell Wall

Description

A dataset containing the identification and quantification of Cell Wall Proteins (CWPs) performed using LC-MS/MS analysis on rosettes of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for additional information.

Usage

data("Proteomics_Rosettes_CW")

Format

A data frame with 30 rows and 364 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Proteomics_Rosettes_CW")

# Look at data frame dimensions
dim(Proteomics_Rosettes_CW)

# Look at the first rows and columns
head(Proteomics_Rosettes_CW[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on proteomics
res.pca <- prcomp(Proteomics_Rosettes_CW, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Rosettes Cell Wall Proteomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)
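
A PCA scatter plot is easier to read when each axis reports the share of variance it carries. The sketch below computes these proportions from the `sdev` component returned by prcomp() and uses them in the axis labels. The table `toy` is simulated and only stands in for a samples x variables block such as Proteomics_Rosettes_CW.

```r
# Simulated numeric table standing in for a samples x variables omics block.
set.seed(1)
toy <- as.data.frame(matrix(rnorm(30 * 8), nrow = 30))

res.pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Proportion of the total variance carried by each component
explained <- res.pca$sdev^2 / sum(res.pca$sdev^2)
axis_labels <- sprintf("PC%d (%.1f%%)", 1:2, 100 * explained[1:2])

plot(res.pca$x[, "PC1"], res.pca$x[, "PC2"], pch = 19,
     xlab = axis_labels[1], ylab = axis_labels[2],
     main = "Individuals' plot (1 x 2) with explained variance")
```

The same proportions are also reported by summary(res.pca).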

Proteomics Stems Cell Wall

Description

A dataset containing the identification and quantification of Cell Wall Proteins (CWPs) performed using LC-MS/MS analysis on floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for additional information.

Usage

data("Proteomics_Stems_CW")

Format

A data frame with 30 rows and 414 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Proteomics_Stems_CW")

# Look at data frame dimensions
dim(Proteomics_Stems_CW)

# Look at the first rows and columns
head(Proteomics_Stems_CW[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on proteomics
res.pca <- prcomp(Proteomics_Stems_CW, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Stems Cell Wall Proteomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)

Temperature

Description

The Temperature factor identifies the temperature to which the studied sample was exposed throughout its growth, either 22°C (optimal condition) or 15°C (high altitude condition).

Usage

data("Temperature")

Format

A factor with 2 levels.

Source

doi: 10.3390/cells9102249

Examples

# Load the data
data("Temperature")

# Count how many samples are in each group
table(Temperature)
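
A factor such as Temperature is typically used to summarize a measured variable per growth condition. The sketch below does this with tapply() on toy values. It assumes that the factor is stored in the same sample order as the rows of the data frames, which this manual does not state explicitly. The level labels "15" and "22" and the `mass` values are made up.

```r
# Toy stand-ins: a two-level factor and one measured variable (made-up values).
temperature <- factor(c("15", "15", "22", "22"))
mass <- c(1.0, 3.0, 2.0, 6.0)

# Mean of the variable within each temperature group
group_means <- tapply(mass, temperature, mean)
group_means
```

With the package data, and if the ordering assumption holds, the same call could be written as tapply(Phenomics_Rosettes$Mass, Temperature, mean).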

Transcriptomics Rosettes

Description

A dataset containing all the transcripts obtained by RNA-seq performed, according to the standard Illumina protocols, on rosettes of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Transcriptomics_Rosettes")

Format

A data frame with 30 rows and 19763 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Transcriptomics_Rosettes")

# Look at data frame dimensions
dim(Transcriptomics_Rosettes)

# Look at the first rows and columns
head(Transcriptomics_Rosettes[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on transcriptomics
res.pca <- prcomp(Transcriptomics_Rosettes, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Rosettes' Transcriptomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)
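
With scale. = TRUE, prcomp() stops with an error when a column is constant, because a zero-variance column cannot be rescaled to unit variance. Whether the transcriptomics tables contain such columns is not stated here, so the following is only a defensive sketch. The helper `drop_constant_columns` is hypothetical and the table `toy` contains made-up values.

```r
# Hypothetical helper: keep only columns whose variance is strictly positive.
drop_constant_columns <- function(df) {
  keep <- vapply(df, function(col) isTRUE(stats::var(col, na.rm = TRUE) > 0),
                 logical(1))
  df[, keep, drop = FALSE]
}

# Toy table with one constant column (g2)
toy <- data.frame(g1 = c(1, 2, 3, 4), g2 = c(5, 5, 5, 5), g3 = c(2, 1, 4, 3))
filtered <- drop_constant_columns(toy)
colnames(filtered)

# The scaled PCA now runs without the rescaling error
res.pca <- prcomp(filtered, center = TRUE, scale. = TRUE)
```

Applying the helper before the PCA leaves a table without constant columns unchanged, so it is safe to use even when no filtering is needed.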

Transcriptomics Rosettes Cell Wall

Description

A dataset containing the transcripts encoding Cell Wall Proteins (CWPs) sorted from the 19 763 transcripts (see Transcriptomics_Rosettes) obtained by RNA-seq performed, according to the standard Illumina protocols, on rosettes of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Transcriptomics_Rosettes_CW")

Format

A data frame with 30 rows and 364 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Transcriptomics_Rosettes_CW")

# Look at data frame dimensions
dim(Transcriptomics_Rosettes_CW)

# Look at the first rows and columns
head(Transcriptomics_Rosettes_CW[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on transcriptomics
res.pca <- prcomp(Transcriptomics_Rosettes_CW, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Rosettes Cell Wall Transcriptomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)
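
Since this table and Proteomics_Rosettes_CW are both documented with 30 rows and 364 variables, a natural integrative step is to compare transcript and protein levels gene by gene. The sketch below computes a per-gene Spearman correlation on toy data. It assumes that both tables share column names and sample order, which should be checked on the real data before use. The objects `transcripts` and `proteins` are simulated stand-ins.

```r
# Simulated stand-ins for paired blocks (samples in rows, genes in columns).
set.seed(42)
transcripts <- data.frame(geneA = rnorm(10), geneB = rnorm(10))
proteins <- data.frame(geneB = rnorm(10), geneA = transcripts$geneA * 2 + 1)

# Align on shared column names, then correlate gene by gene
shared <- intersect(colnames(transcripts), colnames(proteins))
per_gene_cor <- vapply(
  shared,
  function(g) cor(transcripts[[g]], proteins[[g]], method = "spearman"),
  numeric(1)
)
per_gene_cor
```

Matching columns by name rather than by position avoids pairing the wrong transcript and protein when the two tables are ordered differently.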

Transcriptomics Stems

Description

A dataset containing all the transcripts obtained by RNA-seq performed, according to the standard Illumina protocols, on floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Transcriptomics_Stems")

Format

A data frame with 30 rows and 22570 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Transcriptomics_Stems")

# Look at data frame dimensions
dim(Transcriptomics_Stems)

# Look at the first rows and columns
head(Transcriptomics_Stems[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on transcriptomics
res.pca <- prcomp(Transcriptomics_Stems, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Stems' Transcriptomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)

Transcriptomics Stems Cell Wall

Description

A dataset containing the transcripts encoding Cell Wall Proteins (CWPs) sorted from the 22 570 transcripts (see Transcriptomics_Stems) obtained by RNA-seq performed, according to the standard Illumina protocols, on floral stems of five A. thaliana genotypes at two growth temperatures. See Ecotype and Temperature for more information.

Usage

data("Transcriptomics_Stems_CW")

Format

A data frame with 30 rows and 414 variables.

Source

doi: 10.3390/cells9102249

Examples

# Load the dataset
data("Transcriptomics_Stems_CW")

# Look at data frame dimensions
dim(Transcriptomics_Stems_CW)

# Look at the first rows and columns
head(Transcriptomics_Stems_CW[,c(1:10)])

# Create a vector of colors
colors <- c(rep("#A6CEE3",3), rep("#1F78B4",3), rep("#B2DF8A",3), rep("#33A02C",3),
            rep("#FB9A99",3), rep("#E31A1C",3), rep("#FDBF6F",3), rep("#FF7F00",3),
            rep("#CAB2D6",3), rep("#6A3D9A",3))

# PCA on transcriptomics
res.pca <- prcomp(Transcriptomics_Stems_CW, center = TRUE, scale. = TRUE)
plot(res.pca$x[,"PC1"], res.pca$x[,"PC2"], pch = 19, xlab = "PC1", ylab = "PC2", lwd = 5,
     main = "Individuals' plot (1 x 2) - PCA on Stems Cell Wall Transcriptomics",
     col = colors)
text(res.pca$x[,"PC1"], res.pca$x[,"PC2"], labels = row.names(res.pca$x), cex = 0.8, pos = 3)