Chapter 2 R Scripts and R Packages
2.1 Objectives
At the end of this chapter, readers will be able
- to write simple R scripts
- to understand R packages
- to install R packages
- to create a new RStudio project
- to use RStudio Cloud
2.2 Introduction
An R script is simply a text file containing (almost) the same commands that you would enter on the R command line. The "almost" refers to the fact that if you are using sink() to send the output to a file, you will have to enclose some commands in print() to get the same output as on the command line.
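As a minimal sketch of the sink() and print() point above, the lines below send output to a file and read it back. The name out_file is a temporary file chosen only for illustration, and the example assumes the script is run non-interactively (for instance with source()), where results are not printed automatically.

```r
# Temporary file used only for this illustration
out_file <- tempfile(fileext = ".txt")

sink(out_file)   # start diverting console output to the file
print(2 + 3)     # explicit print() so the result is written when sourced
sink()           # stop diverting output

readLines(out_file)
```

Calling sink() with no arguments restores output to the console, so it is worth pairing every sink(file) with a closing sink().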
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).
2.3 Open a new R script
As a beginner, you may start by writing some simple code. Because this code is written in the R language and saved in a file, we call the file an R script. To open a new one, go to File, then New File, then click R Script
- File -> New File -> R Script
- In Windows, users can use the shortcut CTRL-SHIFT-N
2.3.1 Our first R script
Let us write our very first R code inside an R script.
- In Line 1, type
2 + 3
- press CTRL-ENTER (Windows) or CMD-ENTER (macOS)
- see the output in the Console pane
## [1] 5
After writing your code inside the R script, you can save the R script file. This will allow you to open it again later to continue your work.
To save the R script, go to
- File ->
- Save As ->
- Choose folder ->
- Name the file
Now, check the version of your R software. The output will look similar to this:
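The original command is not shown in this text, so the lines below are only one plausible way to query the version. They use the built-in R.version object and R.version.string, selecting fields by name rather than by position because the position of each field can differ between platforms.

```r
# Release status and major version, selected by name
R.version[c("status", "major")]

# Full version as a single string
R.version.string
```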
## _
## status
## major 4
When this section was written, the current version of R was 4.2.1.
By the way, if you are using an older version of R, we recommend that you upgrade. To upgrade your R software:
- if you are using Windows, you can use the installr package
- if you are using macOS, you may need to download R again and install it manually
You may find more information from this link.
2.3.2 Function, Argument and Parameters
R code contains
- functions
- arguments
- parameters
f <- function(<arguments>) {
## Do something interesting
}
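To make the template above concrete, here is a small sketch. The function name add_numbers and its arguments x and y are hypothetical and chosen only for illustration. In this chapter's terms, x and y are the arguments, and the values supplied in each call are the parameters.

```r
# A function with two arguments; y has a default value of 1
add_numbers <- function(x, y = 1) {
  x + y
}

add_numbers(2, 3)   # both values supplied by position
add_numbers(x = 2)  # y falls back to its default
```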
For example, to list all the arguments for a function, you may use args(). Let’s examine the arguments for the function lm(), a function to estimate parameters for a linear regression model.
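A call along the following lines prints the argument listing for lm(). The second line, using formals(), is an addition not in the original text; it returns just the argument names as a character vector.

```r
# Print the arguments of lm() together with their default values
args(lm)

# Only the argument names
names(formals(lm))
```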
## function (formula, data, subset, weights, na.action, method = "qr",
## model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
## contrasts = NULL, offset, ...)
## NULL
Once you understand the required arguments, you can supply values for them (the parameters) so that the function performs the desired task. For example:
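The call can be reconstructed from the Call: line in the printed result. It regresses weight on Time using the built-in ChickWeight dataset. The object name fit is an assumption made for this sketch.

```r
# Fit a simple linear regression of weight on Time
fit <- lm(formula = weight ~ Time, data = ChickWeight)

# Printing the fitted object shows the call and the estimated coefficients
fit
```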
##
## Call:
## lm(formula = weight ~ Time, data = ChickWeight)
##
## Coefficients:
## (Intercept) Time
## 27.467 8.803
2.3.3 If users require further help
If users would like to see a more extensive guide to a certain function, they may type ? before the function name. For example, a user who wants to know more about the function lm may type ?lm in the console. Following that, R will open a help page with a more detailed description, the usage of the function and the relevant arguments.
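As a short sketch, the two lines below are equivalent ways of asking for the lm help page. How the page is displayed (Help pane, browser or plain text) depends on how R is being run.

```r
# Shorthand form
?lm

# Equivalent function form; useful when the topic is stored in a string
help("lm")
```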
## starting httpd help server ... done
Here, we provide an example of what the Help pane looks like.
2.4 Packages
R is a programming language, and much of its functionality is delivered through packages. R packages are collections of functions and data sets developed by the community. They increase the power of R by improving existing base R functions or by adding new ones.
A package is a suitable way to organize users’ work and, if desired, share it with others. Typically, a package will include
- code (sometimes not just R code but code in other programming languages),
- documentation for the package and the functions inside,
- some tests to check that everything works as it should, and
- data sets.
Users can read more about R packages here.
2.4.1 Packages on CRAN
At the time of writing, the CRAN package repository features 12784 packages. Packages relevant to particular topics are listed on the CRAN Task Views website.
CRAN task views aim to provide some guidance on which packages on CRAN are relevant for tasks related to a certain topic. They give a brief overview of the included packages, which can be automatically installed using the ctv package.
The views are intended to have a sharp focus so that it is sufficiently clear which packages should be included (or excluded) and they are not meant to endorse the “best” packages for a given task.
2.4.2 Checking the availability of an R package
To check if the desired package is available on their machine, users can try to load it inside their R console:
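The startup banner shown here is what library(tidyverse) prints when the package is installed. The sketch below wraps that call in a check so that it also runs on a machine without the package. The names pkg and is_installed are hypothetical and used only for illustration.

```r
pkg <- "tidyverse"

# TRUE if the package can be found in one of the library folders
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
  library(pkg, character.only = TRUE)  # same effect as library(tidyverse)
} else {
  message("Package '", pkg, "' is not installed on this machine")
}
```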
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Users who have installed the package should not receive any error messages. Users who have not installed it will receive an error message telling them that the package is not available in their R installation. By default, packages are stored in library folders under the user's profile (for example, My Documents, AppData or the HOME directory) and in the R installation folder:
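The folders R searches for installed packages can be listed with .libPaths(). The paths returned will differ from machine to machine.

```r
# Library folders searched by R, in order of priority
.libPaths()
```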
## [1] "C:/Users/A C E R/AppData/Local/R/win-library/4.4"
## [2] "C:/Program Files/R/R-4.4.1/library"
2.4.3 Install an R package
To install an R package, there are two ways:
- users can run install.packages() in the console, for example install.packages("tidyverse")
- users can use the GUI in the RStudio IDE
With the GUI, type the name of the package you want to install, for example the tidyverse package, and then click the Install button. You need internet access to do this. You can also install packages from:
- a zip file (from your machine or a USB drive),
- a GitHub repository,
- other repositories
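The code route can be sketched as follows. The helper install_if_missing() is hypothetical and not part of any package. It calls install.packages() only when the package cannot already be found. The commented lines show the general shape of the other sources, but the zip path and the GitHub repository name are placeholders, and the GitHub line assumes the remotes package is installed.

```r
# Hypothetical helper: install a package from CRAN only if it is missing
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " is already installed")
    return(invisible(FALSE))
  }
  install.packages(pkg)  # needs internet access
  invisible(TRUE)
}

install_if_missing("stats")        # ships with R, so nothing is downloaded
# install_if_missing("tidyverse")  # remove the # tag to install tidyverse

# From a local zip file (placeholder path):
# install.packages("path/to/package.zip", repos = NULL)

# From a GitHub repository (placeholder name; assumes remotes is installed):
# remotes::install_github("user/repo")
```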
2.5 Working directory
Setting and knowing the R working directory is very important. Our working directory will contain the R code, the R outputs, datasets and even resources or tutorials that can help us during an R project or analysis.
The working directory is just a folder. Moreover, the folder can contain many sub-folders. We recommend that the folder contain the dataset (if you want to analyze your data locally) and other R objects. R will store many other R objects created during each R session.
Type this to locate the working directory:
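The base R function getwd() reports the current working directory. The path printed will be specific to your machine.

```r
# Print the current working directory
getwd()
```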
## [1] "D:/Data_Analysis_CRC_multivar_data_analysis_codes"
2.5.1 Starting a new R job
There are two ways to start a new R job:
- create a new R project from the RStudio IDE. This is the method that we recommend.
- set your working directory using the setwd() function.
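A minimal sketch of the setwd() route is shown below. It uses tempdir() as a stand-in for a real project folder so that it runs on any machine. In practice you would pass the path to your own folder. The names old_wd and new_wd are chosen only for illustration.

```r
old_wd <- getwd()    # remember where we started
new_wd <- tempdir()  # stand-in for a real project folder

setwd(new_wd)        # change the working directory
getwd()              # confirm the change

setwd(old_wd)        # return to the original folder
```

An RStudio project sets the working directory to the project folder automatically, which is one reason it is the recommended method.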
2.5.2 Creating a new R project
We highly encourage users to create a new R project. To do this, users can
- go to File -> New Project
- when asked for the project type, click New Project
2.5.3 Location for dataset
Many data analysts use data stored on their local machines. R will read the data and usually store it as a data frame. When you read your data into RStudio, you will see the dataset in the Environment pane. RStudio reads the original dataset and holds a copy in RAM (random access memory), so you should know how much RAM your machine has. The more RAM you have, the larger the dataset that R can read and hold in memory.
The data held in memory will disappear once you close RStudio. But the source dataset will stay in its original location, so there will be no change to your original data (be happy!) unless you save the in-memory data frame over the original file. However, we do not recommend that you do this.
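The following sketch illustrates the point with a tiny made-up dataset written to a temporary file. All names and values (csv_path, original, dat, on_disk) are invented for this example. Changing the data frame in memory leaves the file on disk untouched.

```r
# Create a small example file (made-up values)
csv_path <- tempfile(fileext = ".csv")
original <- data.frame(id = 1:3, weight = c(42.5, 51.2, 59.9))
write.csv(original, csv_path, row.names = FALSE)

# Read the file: the data now live in memory as a data frame
dat <- read.csv(csv_path)
class(dat)
print(object.size(dat), units = "Kb")  # memory used by the object

# Modify the in-memory copy only
dat$weight <- dat$weight * 2

# Re-reading the file shows the source data are unchanged
on_disk <- read.csv(csv_path)
all.equal(on_disk$weight, original$weight)
```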
2.6 Upload data to RStudio Cloud
If users want to use data in RStudio Cloud, they may have to upload the data to the RStudio Cloud directory. They may also use RStudio Cloud to read data from a Dropbox or Google Drive folder.
2.7 More resources on RStudio Cloud
There are a number of resources on RStudio Cloud. For example, on YouTube, there is RStudio Cloud for Education (https://www.youtube.com/watch?v=PviVimazpz8). Another good resource on YouTube is Working with R in Cloud (https://www.youtube.com/watch?v=SFpzr21Pavg).
2.8 Guidance and help
To see further guidance and help, users may register and join the RStudio Community. Users can also ask questions on Stack Overflow. There are also mailing lists on specific topics, but users have to subscribe to them.
2.9 Bookdown
RStudio provides a website that hosts online books, Bookdown. The books at Bookdown are freely accessible online. Some of them, such as ours, are also available as physical books on Amazon or from other booksellers.
2.10 Summary
In this chapter, we describe R scripts and R packages. We also show how to write simple R scripts, how to check whether a specific R package is available on your machine, and how to install it if it is not. We recommend using RStudio Cloud if you are very new to R. The working directory sometimes confuses new R users, so we also recommend that all R users create a new RStudio project for each new analysis task. There are resources available offline and online, and many of them are freely accessible, especially at the Bookdown website.