Martin Morgan
October 29, 2014
R is a programming language
Your own functions
fun <- function(n) {
    x <- rnorm(n)
    y <- x + rnorm(n, sd=.5)
    lm(y ~ x)
}
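As a brief illustration of how such a function might be used, the sketch below calls fun() and inspects the fitted model. The seed value and the sample size of 100 are arbitrary choices made for this example, not part of the original material.

```r
## Re-define the function so this chunk runs on its own
fun <- function(n) {
    x <- rnorm(n)
    y <- x + rnorm(n, sd=.5)
    lm(y ~ x)
}

set.seed(123)        # arbitrary seed, only to make the run reproducible
fit <- fun(100)      # simulate 100 points and fit y ~ x
class(fit)           # the function returns the value of its last expression
coef(fit)            # intercept and slope estimates
```

The return value is whatever the final expression in the function body evaluates to, here the object produced by lm().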
Iteration
## 'side effects'
for (i in 1:10)
    message(i)
## result for subsequent computation
sapply(1:10, function(i) {
    sum(rnorm(1000))
})
lapply(), apply(), mapply() / Map().
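A minimal sketch of how these relatives of sapply() differ in their inputs and return values. The toy inputs and variable names are made up for illustration.

```r
## lapply(): like sapply(), but always returns a list
squares <- lapply(1:3, function(i) i^2)

## apply(): iterate over the rows (MARGIN = 1) or columns (MARGIN = 2) of a matrix
m <- matrix(1:6, nrow = 2)
row_totals <- apply(m, 1, sum)

## mapply(): iterate over several vectors in parallel, simplifying the result
sums <- mapply(function(x, y) x + y, 1:3, 4:6)

## Map(): like mapply(), but always returns a list
sums_list <- Map(function(x, y) x + y, 1:3, 4:6)
```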
Conditional execution
sapply(1:10, function(i) {
    if (sum(rnorm(1000)) > 0) {
        "hi!"
    } else {
        "low!"
    }
})
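if / else tests a single condition at a time, which is why the example above wraps it in sapply(). As a possible alternative, not shown in the original material, the vectorized ifelse() can label a whole vector in one call. The sketch below, with an arbitrary seed and made-up variable names, checks that the two approaches agree.

```r
set.seed(1)   # arbitrary seed for reproducibility
totals <- sapply(1:10, function(i) sum(rnorm(1000)))

## element-by-element, using if / else inside sapply()
labels_loop <- sapply(totals, function(s) if (s > 0) "hi!" else "low!")

## all elements at once, using the vectorized ifelse()
labels_vec <- ifelse(totals > 0, "hi!", "low!")

identical(labels_loop, labels_vec)
```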
Scripts contain a combination of data and transformations. Often the
transformations are idiosyncratic, and rely heavily on functions
provided by various packages. Sometimes the script contains a useful
chunk of code that could be reused in different places. Examples we've
encountered in this course might include GC-content of DNA sequences
(or is there a function for that already? check out the Biostrings
package!) and creating a simple 'map' from one type of
annotation to another.
It is easy and beneficial to create a package.
What is an R package?
Simple directory structure of required files. Minimal:
MyPackage
|-- DESCRIPTION (title, author, version, etc.)
|-- NAMESPACE (imports / exports)
|-- R (R function definitions)
    |-- gc.R
    |-- annoHelper.R
|-- man (help pages)
    |-- MyPackage.Rd
    |-- gc.Rd
    |-- annoHelper.Rd
Check out the RStudio package wizard!
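Outside RStudio, a similar layout can be generated with package.skeleton() from base R's utils package. The sketch below is only an illustration: it writes two stub source files into a temporary directory and builds the skeleton from them. The function names gcContent and annoHelper and their stub bodies are hypothetical placeholders, not real implementations.

```r
## Work in a temporary directory so nothing is written to the project
src_dir <- tempfile("src")
pkg_parent <- tempfile("pkgs")
dir.create(src_dir)
dir.create(pkg_parent)

## Hypothetical stub files standing in for real function definitions
gc_file <- file.path(src_dir, "gc.R")
anno_file <- file.path(src_dir, "annoHelper.R")
writeLines("gcContent <- function(x) stop('not implemented')", gc_file)
writeLines("annoHelper <- function(ids) stop('not implemented')", anno_file)

## Create DESCRIPTION, NAMESPACE, R/ and man/ under pkg_parent/MyPackage
package.skeleton(name = "MyPackage", path = pkg_parent,
                 code_files = c(gc_file, anno_file))

pkg_dir <- file.path(pkg_parent, "MyPackage")
list.files(pkg_dir, recursive = TRUE)
```

The generated DESCRIPTION and help pages are templates that still need to be edited by hand before the package can be built and checked.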
Write a function that takes a DNAStringSet and returns the GC content.
Modify the function using a conditional statement so that it works
whether it is given a DNAString or a DNAStringSet. Test the function.
Save the function in a file on your AMI.
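One possible approach to this exercise, offered as a sketch and not as the intended solution: it assumes the Biostrings package is installed and relies on its letterFrequency() function. The function name gcContent is a choice made for this example.

```r
library(Biostrings)

## GC content (fraction of G or C) for each sequence in 'x'
gcContent <- function(x) {
    ## conditional: promote a single DNAString to a DNAStringSet of length 1
    if (is(x, "DNAString"))
        x <- DNAStringSet(x)
    stopifnot(is(x, "DNAStringSet"))
    ## letterFrequency() returns a one-column matrix here; drop to a vector
    as.vector(letterFrequency(x, letters = "GC", as.prob = TRUE))
}

## Quick tests on toy sequences
gcContent(DNAString("GGCC"))
gcContent(DNAStringSet(c("AATT", "GGCC", "ACGT")))
```

Coercing the single-sequence case to a set first means the rest of the function only has to handle one type of input.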
Write a function that takes as its argument Ensembl gene identifiers
(like the rownames() of the SummarizedExperiment object in the
RNA-seq vignette from yesterday) and uses the select() method and the
annotation package org.Hs.eg.db to return a named
character vector, where the names of the vector are the Ensembl
identifiers and the values are the corresponding gene SYMBOLs. Adopt
some simple-to-implement policy for handling Ensembl identifiers that
map to more than one gene symbol. Save this function to another file.
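A minimal sketch of one way to approach this exercise, assuming the AnnotationDbi and org.Hs.eg.db packages are installed. The function name ensemblToSymbol is hypothetical, and the policy used here (keep the first symbol returned for each identifier) is just one simple choice among several.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

## Map Ensembl gene identifiers to gene symbols.
## Policy (an assumption of this sketch): when an identifier maps to
## several symbols, keep the first one returned by select().
ensemblToSymbol <- function(ids) {
    tbl <- AnnotationDbi::select(org.Hs.eg.db, keys = unique(ids),
                                 columns = "SYMBOL", keytype = "ENSEMBL")
    ## match() returns the first matching row for each identifier
    setNames(tbl$SYMBOL[match(ids, tbl$ENSEMBL)], ids)
}

ensemblToSymbol(c("ENSG00000141510", "ENSG00000012048"))
```

Identifiers without a symbol come back as NA, and select() signals an error if none of the supplied keys are valid, so a more defensive version might check the keys first.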
Use the RStudio wizard to create a package from the files containing your GC-content and annotation-helper functions.