Martin Morgan (mtmorgan@fhcrc.org), Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
24 August 2014
Part I
Part II
Part III
1 # vector of length 1
[1] 1
c(1, 1, 2, 3, 5) # vector of length 5
[1] 1 1 2 3 5
logical c(TRUE, FALSE), integer, numeric, complex, character c("A", "beta")
list(c(TRUE, FALSE), c("A", "beta"))
factor, NA
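A brief sketch, not from the original slides, of how these vector types, lists, factor, and NA behave. The variable names (lst, sex, v) are illustrative only.

```r
## atomic vectors: every element has the same type
typeof(c(TRUE, FALSE))      # "logical"
typeof(1:3)                 # "integer"
typeof(c(1, 1, 2, 3, 5))    # "double" (class "numeric")
typeof(c("A", "beta"))      # "character"

## a list can hold elements of different types
lst <- list(c(TRUE, FALSE), c("A", "beta"))
length(lst)                 # 2

## factor: categorical values drawn from a fixed set of levels
sex <- factor(c("Female", "Male", "Female"))
levels(sex)                 # "Female" "Male"

## NA: a missing value; arithmetic summaries propagate it by default
v <- c(1, NA, 3)
mean(v)                     # NA
mean(v, na.rm = TRUE)       # 2
```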
Assignment and names
x <- c(1, 1, 2, 3, 5)
y = c(5, 5, 3, 2, 1)
z <- c(Female=12, Male=3)
= and <- are the same
Operations
x + y # vectorized
[1] 6 6 5 5 6
x / 5 # ...recycling
[1] 0.2 0.2 0.4 0.6 1.0
x[c(3, 1)] # subset
[1] 2 1
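The following is a small sketch, added for illustration, of named vectors, recycling, and a few other ways to subset. It reuses x and z as defined above; nothing else is taken from the slides.

```r
x <- c(1, 1, 2, 3, 5)
z <- c(Female = 12, Male = 3)

names(z)                    # "Female" "Male"
z["Male"]                   # subset by name; the result keeps its name
z[["Male"]]                 # 3, the bare value

c(1, 2, 3, 4) * c(1, 10)    # shorter vector is recycled: 1 20 3 40

x[x > 1]                    # subset by logical vector: 2 3 5
x[-1]                       # negative index drops an element: 1 2 3 5
```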
Examples: c(), concatenate values; rnorm(), generate random normal deviates; plot()
x <- rnorm(1000) # 1000 normal deviates
y <- x + rnorm(1000, sd = 0.5)
args(rnorm)
function (n, mean = 0, sd = 1)
NULL
plot(x, y)
formula: another way, plot(y ~ x)
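As args(rnorm) shows, mean and sd have default values. A minimal sketch of how arguments can be matched by position or by name; set.seed() is used here only to make the draws reproducible and is not part of the original example.

```r
a <- rnorm(5)                          # defaults: mean = 0, sd = 1

set.seed(1)
b <- rnorm(5, 10, 2)                   # arguments matched by position

set.seed(1)
d <- rnorm(sd = 2, n = 5, mean = 10)   # matched by name, in any order

identical(b, d)                        # TRUE: same call, written two ways
names(formals(rnorm))                  # "n" "mean" "sd"
```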
Within R
?rnorm
RStudio
Main sections
Motivation: manipulate complicated data
x and y from the previous example are related to one another: same length, and element i of y is a transformation of element i of x
Solution: a “data frame” to coordinate access
df <- data.frame(X=x, Y=y)
head(df, 3)
X Y
1 -1.3692 -1.2625
2 1.9072 2.6103
3 -0.5395 -0.5987
class(df) # plain function
[1] "data.frame"
dim(df) # generic & method for data.frame
[1] 1000 2
head(df$X, 4) # column access
[1] -1.3692 1.9072 -0.5395 -1.3264
## create or update 'Z'
df$Z <- sqrt(abs(df$Y))
## subset rows and / or columns
head(df[df$X > 0, c("X", "Z")])
X Z
2 1.90720 1.6156
5 0.02705 0.5804
6 0.18376 0.5624
8 0.04149 0.2101
9 0.96177 0.3850
14 0.48720 1.0353
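The same row and column selection can be written in more than one way. This sketch assumes a fixed seed (so the values differ from those printed above) and introduces pos and pos2 as illustrative names.

```r
set.seed(123)                            # assumption: any seed works
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.5)
df <- data.frame(X = x, Y = y)
df$Z <- sqrt(abs(df$Y))                  # create column 'Z'

## rows with X > 0, columns X and Z
pos <- df[df$X > 0, c("X", "Z")]

## the same selection using subset(), which evaluates
## its arguments inside the data frame
pos2 <- subset(df, X > 0, select = c(X, Z))

all.equal(pos, pos2)                     # compare the two results
with(df, cor(X, Y))                      # refer to columns by name
```

The bracket form is explicit and safe inside functions; subset() and with() are convenient for interactive work.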
plot(Y ~ X, df) # Y ~ X, values from 'df'
## lm(): linear model, returns class 'lm'
fit <- lm(Y ~ X, df)
abline(fit) # plot regression line
anova(fit)
Analysis of Variance Table
Response: Y
           Df Sum Sq Mean Sq F value Pr(>F)
X           1   1042    1042    4417 <2e-16 ***
Residuals 998    235       0
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
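Besides anova(), several other functions extract information from a fitted model. A sketch under the assumption of a fixed seed, so the numbers will not match the table above exactly; tbl is an illustrative name.

```r
set.seed(123)                     # assumption: any seed works
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.5)
df <- data.frame(X = x, Y = y)
fit <- lm(Y ~ X, df)

coef(fit)                         # intercept near 0, slope near 1 by construction
tbl <- anova(fit)                 # the table is itself a data.frame
tbl$Df                            # degrees of freedom: 1 and 998
summary(fit)$r.squared            # proportion of variance explained
head(residuals(fit), 3)           # first few residuals
```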
fit: object of class lm
anova(): generic, with a method for class lm (the class of fit)
methods(anova)
[1] anova.glm* anova.glmlist*
[3] anova.lm* anova.lmlist*
[5] anova.loess* anova.mlm*
[7] anova.nls*
Non-visible functions are asterisked
## class of object
class(fit)
## method discovery
methods(class=class(fit))
methods(anova)
## help on generic, and specific method
?anova
?anova.lm
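To make the generic / method idea concrete, here is a toy S3 example. The generic area() and the classes circle and square are hypothetical, invented for illustration; they are not part of R or of the original slides.

```r
## generic: dispatches on the class of its first argument
area <- function(shape, ...) UseMethod("area")

## methods are named <generic>.<class>
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2
area.default <- function(shape, ...) stop("don't know how to compute area")

## constructors that set the class attribute
circle <- function(r) structure(list(r = r), class = "circle")
square <- function(side) structure(list(side = side), class = "square")

area(square(2))      # 4
area(circle(1))      # pi
```

anova() works the same way: calling it on fit dispatches to anova.lm() because class(fit) is "lm".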
Installed
length(rownames(installed.packages()))
[1] 227
Available
'Attached' (installed and available for use):
search() # attached packages
ls("package:stats") # functions in 'stats'
Attaching (make installed package available for use)
library(ggplot2)
Installing CRAN or Bioconductor packages
source("http://bioconductor.org/biocLite.R")
biocLite("GenomicRanges")
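The difference between installed and attached can be checked from code. In this sketch, is_installed() is a hypothetical helper written for illustration; requireNamespace() and search() are base R.

```r
## hypothetical helper, not part of base R
is_installed <- function(pkg) {
    requireNamespace(pkg, quietly = TRUE)
}

is_installed("stats")              # TRUE, ships with R
"package:stats" %in% search()      # TRUE, attached by default

if (is_installed("ggplot2")) {
    library(ggplot2)               # attach only when available
} else {
    message("ggplot2 is not installed")
}
```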
Packages
Best bet
R: [R]; R-help mailing list
Bioconductor
Funding
People