1 Basics

1.1 Install chevreuldata

R is an open-source statistical environment whose functionality can be extended through packages. chevreuldata is an R package available from the Bioconductor repository. R can be installed on any operating system from CRAN, after which you can install chevreuldata by running the following commands in your R session:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("chevreuldata")

## Check that you have a valid Bioconductor installation
BiocManager::valid()

1.2 Required knowledge

chevreuldata is an ExperimentHub-based data package containing Smart-seq-based scRNA-seq data from human retinal organoids, stored as a SingleCellExperiment object. All included data were generated by the Cobrinik laboratory at Children’s Hospital Los Angeles.

1.3 Citing chevreuldata

We hope that chevreuldata will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation("chevreuldata")
#> To cite package 'chevreuldata' in publications use:
#> 
#>   Stachelek K (2024). _chevreuldata: Example data for the chevreul
#>   package_. R package version 0.99.15,
#>   <https://github.com/whtns/chevreuldata>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {chevreuldata: Example data for the chevreul package},
#>     author = {Kevin Stachelek},
#>     year = {2024},
#>     note = {R package version 0.99.15},
#>     url = {https://github.com/whtns/chevreuldata},
#>   }

2 Quick start to using chevreuldata

library("chevreuldata")

To access the data, use the package's helper functions, as below:

chevreul_sce <- chevreuldata::human_gene_transcript_sce()
#> see ?chevreuldata and browseVignettes('chevreuldata') for documentation
#> downloading 1 resources
#> retrieving 1 resource
#> loading from cache

The data have been processed using the chevreul package. Expression information is available for both gene (main experiment) and transcript (alt experiment) features:

mainExpName(chevreul_sce)
#> [1] "gene"
altExpNames(chevreul_sce)
#> [1] "transcript"
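
The gene/transcript layout described above can be illustrated on a small toy object. This is only a sketch: the counts, feature names, and cell names below are made up for illustration and are not taken from chevreuldata. It assumes the SingleCellExperiment package is installed, and it shows how an alternative experiment named "transcript" can be retrieved alongside a main experiment named "gene".

```r
library(SingleCellExperiment)

# Toy counts (hypothetical values): 3 genes and 5 transcripts in the same 4 cells
cells <- paste0("cell", 1:4)
gene_counts <- matrix(1:12, nrow = 3, dimnames = list(paste0("gene", 1:3), cells))
tx_counts <- matrix(1:20, nrow = 5, dimnames = list(paste0("tx", 1:5), cells))

# Gene-level counts form the main experiment; transcript-level counts are
# attached as an alternative experiment covering the same cells
toy_sce <- SingleCellExperiment(
    assays = list(counts = gene_counts),
    altExps = list(transcript = SummarizedExperiment(assays = list(counts = tx_counts))),
    mainExpName = "gene"
)

mainExpName(toy_sce)
altExpNames(toy_sce)

# Retrieve the transcript-level experiment and its count matrix
tx_se <- altExp(toy_sce, "transcript")
dim(tx_se)
tx_matrix <- assay(tx_se, "counts")
```

The same altExp() call should apply to chevreul_sce, since its alternative experiment is also named "transcript".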

Cell metadata include organoid age (Age), preparation method (Prep.Method), and Louvain clustering identities at multiple resolutions (gene_snn_res.x.x):

colData(chevreul_sce)
#> DataFrame with 883 rows and 49 columns
#>                  orig.ident nCount_gene nFeature_gene nCount_RNA nFeature_RNA
#>                 <character>   <numeric>     <integer>  <numeric>    <numeric>
#> ds20181001-0001  ds20181001      526209          1532    2384251         8908
#> ds20181001-0002  ds20181001      209036          1038     891703         6107
#> ds20181001-0003  ds20181001      470724          1696    1748316         9366
#> ds20181001-0004  ds20181001      780501          1723    2361597         8895
#> ds20181001-0005  ds20181001      406661          1235    1774651         7313
#> ...                     ...         ...           ...        ...          ...
#> ds20181001-1036  ds20181001      476671          1108    1921861         6284
#> ds20181001-1037  ds20181001      455445           998    1808767         5590
#> ds20181001-1038  ds20181001      223697          1036     869327         5795
#> ds20181001-1039  ds20181001      575075          1159    2110461         6610
#> ds20181001-1040  ds20181001      618681          1184    2627989         6826
#>                       sample_id     sample_id_1 tissue_type      Kit_ID
#>                     <character>     <character> <character> <character>
#> ds20181001-0001 ds20181001-0001 ds20181001-0001    organoid          1A
#> ds20181001-0002 ds20181001-0002 ds20181001-0002    organoid          1A
#> ds20181001-0003 ds20181001-0003 ds20181001-0003    organoid          1A
#> ds20181001-0004 ds20181001-0004 ds20181001-0004    organoid          1A
#> ds20181001-0005 ds20181001-0005 ds20181001-0005    organoid          1A
#> ...                         ...             ...         ...         ...
#> ds20181001-1036 ds20181001-1036 ds20181001-1036    organoid          3C
#> ds20181001-1037 ds20181001-1037 ds20181001-1037    organoid          3C
#> ds20181001-1038 ds20181001-1038 ds20181001-1038    organoid          3C
#> ds20181001-1039 ds20181001-1039 ds20181001-1039    organoid          3C
#> ds20181001-1040 ds20181001-1040 ds20181001-1040    organoid          3C
#>                 Kit_sample Seq_Number Tissue.Type Prep.Method Prep.Number
#>                  <numeric>  <numeric> <character> <character> <character>
#> ds20181001-0001          1          3    Organoid    Kuwahara         115
#> ds20181001-0002          2          3    Organoid    Kuwahara         115
#> ds20181001-0003          3          3    Organoid    Kuwahara         115
#> ds20181001-0004          4          3    Organoid    Kuwahara         115
#> ds20181001-0005          5          3    Organoid    Kuwahara         115
#> ...                    ...        ...         ...         ...         ...
#> ds20181001-1036         76          3    Organoid       Zhong        17-7
#> ds20181001-1037         77          3    Organoid       Zhong        17-7
#> ds20181001-1038         78          3    Organoid       Zhong        17-7
#> ds20181001-1039         79          3    Organoid       Zhong        17-7
#> ds20181001-1040         80          3    Organoid       Zhong        17-7
#>                       Age Time.Group Poor_Read_Number Moderate_Alignment
#>                 <numeric>  <numeric>        <logical>          <logical>
#> ds20181001-0001        56          1               NA                 NA
#> ds20181001-0002        56          1               NA                 NA
#> ds20181001-0003        56          1               NA                 NA
#> ds20181001-0004        56          1               NA                 NA
#> ds20181001-0005        56          1               NA                 NA
#> ...                   ...        ...              ...                ...
#> ds20181001-1036       225          8               NA                 NA
#> ds20181001-1037       225          8               NA                 NA
#> ds20181001-1038       225          8               NA                 NA
#> ds20181001-1039       225          8               NA                 NA
#> ds20181001-1040       225          8               NA                 NA
#>                 Rod_Cells Possible_Rods Non_Photoreceptors Collection_Method
#>                 <logical>     <logical>          <logical>       <character>
#> ds20181001-0001        NA            NA                 NA              FACS
#> ds20181001-0002        NA            NA                 NA              FACS
#> ds20181001-0003        NA            NA                 NA              FACS
#> ds20181001-0004        NA            NA                 NA              FACS
#> ds20181001-0005        NA            NA                 NA              FACS
#> ...                   ...           ...                ...               ...
#> ds20181001-1036        NA            NA                 NA              FACS
#> ds20181001-1037        NA            NA                 NA              FACS
#> ds20181001-1038        NA            NA                 NA              FACS
#> ds20181001-1039        NA            NA                 NA              FACS
#> ds20181001-1040        NA            NA                 NA              FACS
#>                  Outliers VSX2_Outlier X9_Cluster_Green_Rods HR_Cluster_LB
#>                 <logical>    <logical>             <logical>     <logical>
#> ds20181001-0001        NA           NA                    NA            NA
#> ds20181001-0002        NA           NA                    NA            NA
#> ds20181001-0003        NA           NA                    NA            NA
#> ds20181001-0004        NA           NA                    NA            NA
#> ds20181001-0005        NA           NA                    NA            NA
#> ...                   ...          ...                   ...           ...
#> ds20181001-1036        NA           NA                    NA            NA
#> ds20181001-1037        NA           NA                    NA            NA
#> ds20181001-1038        NA           NA                    NA            NA
#> ds20181001-1039        NA           NA                    NA            NA
#> ds20181001-1040        NA           NA                    NA            NA
#>                 HR_Cluster_Black HR_Cluster_LG HR_Cluster_DG HR_Cluster_Pink
#>                        <logical>     <logical>     <logical>       <logical>
#> ds20181001-0001               NA            NA            NA              NA
#> ds20181001-0002               NA            NA            NA              NA
#> ds20181001-0003               NA            NA            NA              NA
#> ds20181001-0004               NA            NA            NA              NA
#> ds20181001-0005               NA            NA            NA              NA
#> ...                          ...           ...           ...             ...
#> ds20181001-1036               NA            NA            NA              NA
#> ds20181001-1037               NA            NA            NA              NA
#> ds20181001-1038               NA            NA            NA              NA
#> ds20181001-1039               NA            NA            NA              NA
#> ds20181001-1040               NA            NA            NA              NA
#>                 Cluster_Color Fetal_Age Old_Seq_Number Old_Seq_Kit_ID
#>                     <logical> <logical>      <logical>      <logical>
#> ds20181001-0001            NA        NA             NA             NA
#> ds20181001-0002            NA        NA             NA             NA
#> ds20181001-0003            NA        NA             NA             NA
#> ds20181001-0004            NA        NA             NA             NA
#> ds20181001-0005            NA        NA             NA             NA
#> ...                       ...       ...            ...            ...
#> ds20181001-1036            NA        NA             NA             NA
#> ds20181001-1037            NA        NA             NA             NA
#> ds20181001-1038            NA        NA             NA             NA
#> ds20181001-1039            NA        NA             NA             NA
#> ds20181001-1040            NA        NA             NA             NA
#>                 excluded_because       batch           names        type
#>                      <character> <character>     <character> <character>
#> ds20181001-0001             keep    Kuwahara ds20181001-0001          PE
#> ds20181001-0002             keep    Kuwahara ds20181001-0002          PE
#> ds20181001-0003             keep    Kuwahara ds20181001-0003          PE
#> ds20181001-0004             keep    Kuwahara ds20181001-0004          PE
#> ds20181001-0005             keep    Kuwahara ds20181001-0005          PE
#> ...                          ...         ...             ...         ...
#> ds20181001-1036             keep       Zhong ds20181001-1036          PE
#> ds20181001-1037             keep       Zhong ds20181001-1037          PE
#> ds20181001-1038             keep       Zhong ds20181001-1038          PE
#> ds20181001-1039             keep       Zhong ds20181001-1039          PE
#> ds20181001-1040             keep       Zhong ds20181001-1040          PE
#>                 gene_snn_res.0.2 seurat_clusters gene_snn_res.0.4
#>                         <factor>        <factor>         <factor>
#> ds20181001-0001                0              2                 2
#> ds20181001-0002                0              2                 2
#> ds20181001-0003                3              12                5
#> ds20181001-0004                3              12                5
#> ds20181001-0005                0              2                 2
#> ...                          ...             ...              ...
#> ds20181001-1036                1               0                4
#> ds20181001-1037                1               3                1
#> ds20181001-1038                1               0                4
#> ds20181001-1039                1               0                4
#> ds20181001-1040                1               3                1
#>                 gene_snn_res.0.6 gene_snn_res.0.8 gene_snn_res.1  read_count
#>                         <factor>         <factor>       <factor> <character>
#> ds20181001-0001                2                2              2          NA
#> ds20181001-0002                2                2              2          NA
#> ds20181001-0003                5                5              6          NA
#> ds20181001-0004                5                5              6          NA
#> ds20181001-0005                2                2              2          NA
#> ...                          ...              ...            ...         ...
#> ds20181001-1036                4                4              4          NA
#> ds20181001-1037                0                4              4          NA
#> ds20181001-1038                4                4              4          NA
#> ds20181001-1039                4                4              4          NA
#> ds20181001-1040                0                0              0          NA
#>                 percent.mt nCount_transcript nFeature_transcript    ident
#>                  <numeric>         <numeric>           <integer> <factor>
#> ds20181001-0001   0.175937            525534                4290       2 
#> ds20181001-0002   0.375185            208914                2592       2 
#> ds20181001-0003   0.396471            470708                4646       12
#> ds20181001-0004   0.496239            779109                4845       12
#> ds20181001-0005   0.244852            406723                3244       2 
#> ...                    ...               ...                 ...      ...
#> ds20181001-1036   0.198974            474402                2722        0
#> ds20181001-1037   0.611019            454162                2485        3
#> ds20181001-1038   0.539674            222386                2257        0
#> ds20181001-1039   0.126259            572366                2867        0
#> ds20181001-1040   0.254418            615519                2910        3
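
As a rough sketch of how these metadata columns might be summarised, the example below uses a small base R data.frame as a stand-in for the cell metadata. The five rows are invented for illustration; only the column names (Age, Prep.Method, gene_snn_res.0.2) follow the output above. With the real object, the same steps could be applied to as.data.frame(colData(chevreul_sce)).

```r
# Toy stand-in for the cell metadata; values are illustrative only
cell_meta <- data.frame(
    Age = c(56, 56, 56, 225, 225),
    Prep.Method = c("Kuwahara", "Kuwahara", "Kuwahara", "Zhong", "Zhong"),
    gene_snn_res.0.2 = factor(c(0, 0, 3, 1, 1)),
    row.names = paste0("cell", 1:5)
)

# Number of cells per preparation method and organoid age
prep_by_age <- table(cell_meta$Prep.Method, cell_meta$Age)
prep_by_age

# Clustering columns at every resolution share the "gene_snn_res." prefix
cluster_cols <- grep("^gene_snn_res\\.", colnames(cell_meta), value = TRUE)

# Number of distinct clusters at each resolution
n_clusters <- vapply(cell_meta[cluster_cols], nlevels, integer(1))
n_clusters
```

Selecting the clustering columns by prefix avoids hard-coding each resolution, which is convenient when comparing cluster counts across resolutions.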

For more information on data generation, consult Shayler et al.: https://www.biorxiv.org/content/10.1101/2023.02.28.530247v1

3 Reproducibility

The chevreuldata package (Stachelek, 2024) was made possible thanks to:

  • R (R Core Team, 2024)
  • BiocStyle (Oleś, 2024)
  • knitr (Xie, 2024)
  • RefManageR (McLean, 2017)
  • rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024)
  • sessioninfo (Wickham, Chang, Flight, Müller, and Hester, 2021)
  • testthat (Wickham, 2011)

This package was developed using biocthis.

Code for creating the vignette:

## Create the vignette
library("rmarkdown")
system.time(render("human_gene_transcript_sce.Rmd", "BiocStyle::html_document"))

## Extract the R code
library("knitr")
knit("human_gene_transcript_sce.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2024-10-14 19:05:51 EDT"

Wallclock time spent generating the vignette.

#> Time difference of 23.441 secs

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 24.04.1 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-10-14
#>  pandoc   2.7.3 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package              * version date (UTC) lib source
#>  abind                  1.4-8   2024-09-12 [3] CRAN (R 4.4.1)
#>  AnnotationDbi          1.67.0  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  AnnotationHub        * 3.13.3  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  backports              1.5.0   2024-05-23 [3] CRAN (R 4.4.1)
#>  bibtex                 0.5.1   2023-01-26 [3] CRAN (R 4.4.1)
#>  Biobase              * 2.65.1  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  BiocFileCache        * 2.13.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  BiocGenerics         * 0.51.3  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  BiocManager            1.30.25 2024-08-28 [2] CRAN (R 4.4.1)
#>  BiocStyle            * 2.33.1  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  BiocVersion            3.20.0  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  Biostrings             2.73.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  bit                    4.5.0   2024-09-20 [3] CRAN (R 4.4.1)
#>  bit64                  4.5.2   2024-09-22 [3] CRAN (R 4.4.1)
#>  blob                   1.2.4   2023-03-17 [3] CRAN (R 4.4.1)
#>  bookdown               0.40    2024-07-02 [3] CRAN (R 4.4.1)
#>  bslib                  0.8.0   2024-07-29 [3] CRAN (R 4.4.1)
#>  cachem                 1.1.0   2024-05-16 [3] CRAN (R 4.4.1)
#>  chevreuldata         * 0.99.15 2024-10-14 [1] Bioconductor
#>  cli                    3.6.3   2024-06-21 [3] CRAN (R 4.4.1)
#>  crayon                 1.5.3   2024-06-20 [3] CRAN (R 4.4.1)
#>  curl                   5.2.3   2024-09-20 [3] CRAN (R 4.4.1)
#>  DBI                    1.2.3   2024-06-02 [3] CRAN (R 4.4.1)
#>  dbplyr               * 2.5.0   2024-03-19 [3] CRAN (R 4.4.1)
#>  DelayedArray           0.31.14 2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  digest                 0.6.37  2024-08-19 [3] CRAN (R 4.4.1)
#>  dplyr                  1.1.4   2023-11-17 [3] CRAN (R 4.4.1)
#>  evaluate               1.0.1   2024-10-10 [3] CRAN (R 4.4.1)
#>  ExperimentHub        * 2.13.1  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  fansi                  1.0.6   2023-12-08 [3] CRAN (R 4.4.1)
#>  fastmap                1.2.0   2024-05-15 [3] CRAN (R 4.4.1)
#>  filelock               1.0.3   2023-12-11 [3] CRAN (R 4.4.1)
#>  generics               0.1.3   2022-07-05 [3] CRAN (R 4.4.1)
#>  GenomeInfoDb         * 1.41.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  GenomeInfoDbData       1.2.13  2024-09-30 [3] Bioconductor
#>  GenomicRanges        * 1.57.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  glue                   1.8.0   2024-09-30 [3] CRAN (R 4.4.1)
#>  htmltools              0.5.8.1 2024-04-04 [3] CRAN (R 4.4.1)
#>  httr                   1.4.7   2023-08-15 [3] CRAN (R 4.4.1)
#>  IRanges              * 2.39.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  jquerylib              0.1.4   2021-04-26 [3] CRAN (R 4.4.1)
#>  jsonlite               1.8.9   2024-09-20 [3] CRAN (R 4.4.1)
#>  KEGGREST               1.45.1  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  knitr                  1.48    2024-07-07 [3] CRAN (R 4.4.1)
#>  lattice                0.22-6  2024-03-20 [4] CRAN (R 4.4.1)
#>  lifecycle              1.0.4   2023-11-07 [3] CRAN (R 4.4.1)
#>  lubridate              1.9.3   2023-09-27 [3] CRAN (R 4.4.1)
#>  magrittr               2.0.3   2022-03-30 [3] CRAN (R 4.4.1)
#>  Matrix                 1.7-0   2024-04-26 [4] CRAN (R 4.4.1)
#>  MatrixGenerics       * 1.17.0  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  matrixStats          * 1.4.1   2024-09-08 [3] CRAN (R 4.4.1)
#>  memoise                2.0.1   2021-11-26 [3] CRAN (R 4.4.1)
#>  mime                   0.12    2021-09-28 [3] CRAN (R 4.4.1)
#>  pillar                 1.9.0   2023-03-22 [3] CRAN (R 4.4.1)
#>  pkgconfig              2.0.3   2019-09-22 [3] CRAN (R 4.4.1)
#>  plyr                   1.8.9   2023-10-02 [3] CRAN (R 4.4.1)
#>  png                    0.1-8   2022-11-29 [3] CRAN (R 4.4.1)
#>  purrr                  1.0.2   2023-08-10 [3] CRAN (R 4.4.1)
#>  R6                     2.5.1   2021-08-19 [3] CRAN (R 4.4.1)
#>  rappdirs               0.3.3   2021-01-31 [3] CRAN (R 4.4.1)
#>  Rcpp                   1.0.13  2024-07-17 [3] CRAN (R 4.4.1)
#>  RefManageR           * 1.4.0   2022-09-30 [3] CRAN (R 4.4.1)
#>  rlang                  1.1.4   2024-06-04 [3] CRAN (R 4.4.1)
#>  rmarkdown              2.28    2024-08-17 [3] CRAN (R 4.4.1)
#>  RSQLite                2.3.7   2024-05-27 [3] CRAN (R 4.4.1)
#>  S4Arrays               1.5.11  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  S4Vectors            * 0.43.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  sass                   0.4.9   2024-03-15 [3] CRAN (R 4.4.1)
#>  sessioninfo          * 1.2.2   2021-12-06 [3] CRAN (R 4.4.1)
#>  SingleCellExperiment * 1.27.2  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  SparseArray            1.5.44  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  stringi                1.8.4   2024-05-06 [3] CRAN (R 4.4.1)
#>  stringr                1.5.1   2023-11-14 [3] CRAN (R 4.4.1)
#>  SummarizedExperiment * 1.35.4  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  tibble                 3.2.1   2023-03-20 [3] CRAN (R 4.4.1)
#>  tidyselect             1.2.1   2024-03-11 [3] CRAN (R 4.4.1)
#>  timechange             0.3.0   2024-01-18 [3] CRAN (R 4.4.1)
#>  UCSC.utils             1.1.0   2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  utf8                   1.2.4   2023-10-22 [3] CRAN (R 4.4.1)
#>  vctrs                  0.6.5   2023-12-01 [3] CRAN (R 4.4.1)
#>  withr                  3.0.1   2024-07-31 [3] CRAN (R 4.4.1)
#>  xfun                   0.48    2024-10-03 [3] CRAN (R 4.4.1)
#>  xml2                   1.3.6   2023-12-04 [3] CRAN (R 4.4.1)
#>  XVector                0.45.0  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#>  yaml                   2.3.10  2024-07-26 [2] CRAN (R 4.4.1)
#>  zlibbioc               1.51.1  2024-10-14 [3] Bioconductor 3.20 (R 4.4.1)
#> 
#>  [1] /tmp/RtmpJ1oiJ3/Rinst36a13565502393
#>  [2] /media/volume/teran2_disk/pkgbuild/packagebuilder/workers/jobs/3332/R-libs
#>  [3] /media/volume/teran2_disk/biocbuild/bbs-3.20-bioc/R/site-library
#>  [4] /media/volume/teran2_disk/biocbuild/bbs-3.20-bioc/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

4 Bibliography

This vignette was generated using BiocStyle (Oleś, 2024) with knitr (Xie, 2024) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.

Citations made with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.28. 2024. URL: https://github.com/rstudio/rmarkdown.

[2] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[3] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.33.1. 2024. DOI: 10.18129/B9.bioc.BiocStyle. URL: https://bioconductor.org/packages/BiocStyle.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.

[5] K. Stachelek. chevreuldata: Example data for the chevreul package. R package version 0.99.15. 2024. URL: https://github.com/whtns/chevreuldata.

[6] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[7] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2. 2021. URL: https://CRAN.R-project.org/package=sessioninfo.

[8] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.48. 2024. URL: https://yihui.org/knitr/.