A Glossary of important R commands

Basic usage

The following list contains important R commands for basic usage of the language.

  • Assign values to a variable: <- (e.g., x <- 1)
  • Compute several expressions at once: ; (e.g., x <- 1; 2 + 2; 3 * 8)
  • Create vectors by concatenating numbers: c (e.g., c(1, 2, -1))
  • Create sequential integer vectors: : (e.g., 1:10)
  • Create a matrix by columns: cbind (e.g., cbind(1:3, c(0, 2, 0)))
  • Create a matrix by rows: rbind (e.g., rbind(1:3, c(0, 2, 0)))
  • Create a data frame: data.frame (e.g., data.frame(name1 = c(-1, 3), name2 = c(0.4, 1)))
  • Create a list: list (e.g., list(obj1 = c(-1, 3), obj2 = -1:5, obj3 = rbind(1:2, 3:2)))
  • Access elements of a…
    • … vector: [] (e.g., c(0.5, 2)[1]; c(0.5, 2)[-1]; c(0.5, 2)[2:1])
    • … matrix: [, ] (e.g., cbind(1:2, 3:4)[1, 2]; cbind(1:2, 3:4)[1, ])
    • … data frame: [, ] and $ (e.g., data.frame(name1 = c(-1, 3), name2 = c(0.4, 1))$name1; data.frame(name1 = c(-1, 3), name2 = c(0.4, 1))[2, 1])
    • … list: $ (e.g., list(x = 2, y = 7:0)$y)
  • Summarize any object: summary (e.g., summary(1:10))
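
For instance, the commands above can be combined in a short session like the following sketch (the object names x, m, df, and l are arbitrary choices made only for this example):

x <- c(1, 2, -1)                                       # a vector
m <- rbind(1:3, c(0, 2, 0))                            # a matrix built by rows
df <- data.frame(name1 = c(-1, 3), name2 = c(0.4, 1))  # a data frame
l <- list(obj1 = x, obj2 = m)                          # a list
x[2:1]; m[1, ]; df$name1; l$obj2                       # access elements of each object
summary(df)                                            # summarize the data frame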

Linear regression

Some useful commands for performing simple and multiple linear regression are given in the next list. We assume that:

  • dataset is an imported dataset such that
    • resp is the response variable
    • pred1 is the first predictor
    • pred2 is the second predictor
    • predk is the last predictor
  • model is the result of applying lm
  • newPreds is a data.frame with variables named as the predictors
  • num is 1, 2, or 3
  • level is a number between 0 and 1
  • Fit a simple linear model: lm(resp ~ pred1, data = dataset)
  • Fit a multiple linear model…
    • … on two predictors: lm(resp ~ pred1 + pred2, data = dataset)
    • … on all predictors: lm(resp ~ ., data = dataset)
    • … on all predictors except pred1: lm(resp ~ . - pred1, data = dataset)
  • Summarize the linear model (coefficient estimates, standard errors, \(t\)-values, \(p\)-values for \(H_0:\beta_j=0\), \(\hat\sigma\) ('Residual standard error'), degrees of freedom, \(R^2\), Adjusted \(R^2\), \(F\)-test, \(p\)-value for \(H_0:\beta_1=\ldots=\beta_k=0\)): summary(model)
  • ANOVA decomposition: anova(model)
  • CIs for the coefficients: confint(model, level = level)
  • Prediction: predict(model, newdata = newPreds)
  • CIs for the predicted mean: predict(model, newdata = newPreds, interval = "confidence", level = level)
  • CIs for the predicted response: predict(model, newdata = newPreds, interval = "prediction", level = level)
  • Variable selection: stepwise(model)
  • Multicollinearity detection: vif(model)
  • Compare the coefficients of two fitted models: compareCoefs(model1, model2)
  • Diagnostic plots: plot(model, num)
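
As an illustration of the workflow, the following sketch simulates a small dataset (the data and the variable names resp, pred1, and pred2 are assumptions made only for this example) and applies the commands above:

# Simulated data standing in for an imported dataset
set.seed(1)
dataset <- data.frame(pred1 = rnorm(100), pred2 = rnorm(100))
dataset$resp <- 1 + 2 * dataset$pred1 - dataset$pred2 + rnorm(100)

model <- lm(resp ~ ., data = dataset)  # fit on all predictors
summary(model)                         # estimates, t-tests, R^2, F-test
confint(model, level = 0.95)           # CIs for the coefficients
newPreds <- data.frame(pred1 = 0, pred2 = 1)
predict(model, newdata = newPreds, interval = "confidence", level = 0.95)
predict(model, newdata = newPreds, interval = "prediction", level = 0.95)
plot(model, 1)                         # first diagnostic plot: residuals vs. fitted values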

More basic usage

The following list contains further important R commands for basic usage. We assume the following dataset is available:

data <- data.frame(x = 1:10, y = c(-1, 2, 3, 0, 3, 1, -1, 3, 0, -1))
  • Data frame management
    • variable names: names (e.g., names(data))
    • structure: str (e.g., str(data))
    • dimensions: dim (e.g., dim(data))
    • beginning: head (e.g., head(data))
  • Vector related functions
    • create sequences: seq (e.g., seq(0, 1, l = 10); seq(0, 1, by = 0.25))
    • reverse a vector: rev (e.g., rev(1:5))
    • length of a vector: length (e.g., length(1:5))
    • count repetitions in a vector: table (e.g., table(c(1:5, 4:2)))
  • Logical conditions
    • relational operators: <, <=, >, >=, ==, != (e.g., 1 < 0; 1 <= 1; 2 > 1; 3 >= 4; 1 == 0; 1 != 0)
    • “and”: & (e.g., TRUE & FALSE)
    • “or”: | (e.g., TRUE | FALSE)
  • Subsetting
    • a vector: [] (e.g., data$x[data$x > 0]; data$x[data$x > 2 & data$x < 8])
    • a data frame: [, ] (e.g., data[data$x > 0, ]; data[data$x < 2 | data$x > 8, ])
  • Distributions
    • sampling: rxxxx (e.g., rnorm(n = 10, mean = 0, sd = 1))
    • density: dxxxx (e.g., x <- seq(-4, 4, l = 20); dnorm(x = x, mean = 0, sd = 1))
    • distribution: pxxxx (e.g., x <- seq(-4, 4, l = 20); pnorm(q = x, mean = 0, sd = 1))
    • quantiles: qxxxx (e.g., p <- seq(0.1, 0.9, l = 10); qnorm(p = p, mean = 0, sd = 1))
  • Plotting
    • scatterplot: plot (e.g., plot(rnorm(100), rnorm(100)))
    • plot a curve: plot, seq (e.g., x <- seq(0, 1, l = 100); plot(x, x^2, type = "l"))
    • add lines: lines (e.g., x <- seq(0, 1, l = 100); plot(x, x^2 + rnorm(100, sd = 0.1)); lines(x, x^2, col = 2, lwd = 2))
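
A minimal sketch combining subsetting, the normal distribution functions, and base plotting (the data object is the one defined above, restated so the snippet is self-contained):

data <- data.frame(x = 1:10, y = c(-1, 2, 3, 0, 3, 1, -1, 3, 0, -1))
data$y[data$y > 0]                # elements of y that are positive
data[data$x > 2 & data$x < 8, ]   # rows with x strictly between 2 and 8
table(data$y)                     # count repetitions in y
x <- seq(-4, 4, l = 100)
plot(x, dnorm(x), type = "l")                           # density of a N(0, 1)
lines(x, dnorm(x, mean = 0, sd = 2), col = 2, lwd = 2)  # add the density of a normal with sd = 2
pnorm(1.96) - pnorm(-1.96)   # probability that a N(0, 1) lies in (-1.96, 1.96), approx. 0.95
qnorm(0.975)                 # 0.975-quantile of a N(0, 1), approx. 1.96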

Logistic regression

Some useful commands for performing logistic regression are given in the next list. We assume that:

  • dataset is an imported dataset such that
    • resp is the binary response variable
    • pred1 is the first predictor
    • pred2 is the second predictor
    • predk is the last predictor
  • model is the result of applying glm
  • newPreds is a data.frame with variables named as the predictors
  • level is a number between 0 and 1
  • Fit a simple logistic model: glm(resp ~ pred1, data = dataset, family = "binomial")
  • Fit a multiple logistic model…
    • … on two predictors: glm(resp ~ pred1 + pred2, data = dataset, family = "binomial")
    • … on all predictors: glm(resp ~ ., data = dataset, family = "binomial")
    • … on all predictors except pred1: glm(resp ~ . - pred1, data = dataset, family = "binomial")
  • Summarize the logistic model (coefficient estimates, standard errors, Wald statistics ('z value'), \(p\)-values for \(H_0:\beta_j=0\), null deviance, deviance ('Residual deviance'), AIC, number of iterations): summary(model)
  • CIs for the coefficients: confint(model, level = level); confint.default(model, level = level)
  • CIs for the exponentiated coefficients: exp(confint(model, level = level)); exp(confint.default(model, level = level))
  • Prediction: predict(model, newdata = newPreds, type = "response")
  • CIs for the predicted probability: not immediate; use predictCIsLogistic(model, newdata = newPreds, level = level), as seen in Section 4.6
  • Variable selection: stepwise(model)
  • Multicollinearity detection: vif(model)
  • \(R^2\): not immediate; use r2Log(model = model), as seen in Section 4.8
  • Hit matrix: table(dataset$resp, model$fitted.values > 0.5)
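
A sketch of the workflow, again on simulated data (the data and the names resp, pred1, and pred2 are assumptions made only for this example):

# Simulated binary data standing in for an imported dataset
set.seed(1)
dataset <- data.frame(pred1 = rnorm(200), pred2 = rnorm(200))
p <- 1 / (1 + exp(-(-0.5 + 1.5 * dataset$pred1 - dataset$pred2)))  # true probabilities
dataset$resp <- rbinom(200, size = 1, prob = p)

model <- glm(resp ~ ., data = dataset, family = "binomial")  # fit on all predictors
summary(model)                              # Wald tests, deviances, AIC
exp(confint.default(model, level = 0.95))   # CIs for the exponentiated coefficients
newPreds <- data.frame(pred1 = 0, pred2 = 1)
predict(model, newdata = newPreds, type = "response")   # predicted probability
table(dataset$resp, model$fitted.values > 0.5)          # hit matrix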

Principal component analysis

Some useful commands for performing principal component analysis are given in the next list. We assume that:

  • dataset is an imported dataset with several non-categorical variables (the variables must be continuous or discrete).
  • pca is a PCA object, that is, the output of princomp.
  • Compute a PCA…
    • … unnormalized (if the variables have the same scale): princomp(dataset)
    • … normalized (if the variables have different scales): princomp(dataset, cor = TRUE)
  • Summarize the PCA (standard deviation of each PC, proportion of variance explained by each PC, cumulative proportion of variance explained up to a given PC): summary(pca)
  • Weights: pca$loadings
  • Scores: pca$scores
  • Standard deviations of the PCs: pca$sdev
  • Means of the original variables: pca$center
  • Screeplot: plot(pca); plot(pca, type = "l")
  • Biplot: biplot(pca)
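
For instance, on the USArrests dataset that ships with R (chosen here only because it is readily available and its variables have different scales):

pca <- princomp(USArrests, cor = TRUE)  # normalized PCA, since the variables have different scales
summary(pca)            # variance explained by each PC
pca$loadings            # weights
head(pca$scores)        # scores of the first observations
plot(pca, type = "l")   # screeplot
biplot(pca)             # biplot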