Chapter 3 Create and load data sets
In practice, R is used to analyse data. There are two types of data. On the one hand, we have “real” data, where “real” means that the data was observed. On the other hand, we have data that is created by the user. We need such data, for example, to analyse the small-sample behavior of estimators.
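To illustrate what analysing the small-sample behavior of an estimator can mean, here is a minimal simulation sketch. The sample size, the number of replications and the choice of the exponential distribution are assumptions made only for this illustration.

```r
set.seed(42)    # make the simulation reproducible

n    <- 10      # small sample size (assumed)
reps <- 5000    # number of simulated samples (assumed)

# For each replication, draw n exponential observations and keep the sample mean
means <- replicate(reps, mean(rexp(n, rate = 1)))

mean(means)     # should be close to the true mean, 1
var(means)      # should be close to 1 / n, the variance of the sample mean
```

Repeating the simulation with a larger `n` shows how the spread of the estimator shrinks as the sample grows.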
3.1 Create data set
We start by considering data that is created by the user. We can create data sets by using functions like rnorm(), rexp() or runif(). After the data is created, we can work with it as shown in the following:
## [1] 5.246211
## [1] 3.041183
## [1] 0.374657
## [1] 0.3600557
## [1] 3.993143
## [1] 0.6204115
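The calls that produced the output above are not shown. As a minimal sketch, assuming a sample size of 1000 and illustrative parameter values (both are assumptions, not the settings behind the numbers above), drawing samples and summarising them could look like this:

```r
set.seed(1)                              # make the random draws reproducible

n <- 1000                                # assumed sample size

x_norm <- rnorm(n, mean = 5, sd = 3)     # normal draws (assumed mean and sd)
x_exp  <- rexp(n, rate = 3)              # exponential draws (assumed rate)
x_unif <- runif(n, min = 3, max = 5)     # uniform draws (assumed bounds)

# Once created, the vectors behave like any other numeric data
mean(x_norm); sd(x_norm)
mean(x_exp);  sd(x_exp)
mean(x_unif); sd(x_unif)
```

Without `set.seed()` every run produces different draws, so the printed means and standard deviations change slightly from run to run.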
Functions like the three above exist for many distributions in R. You can find the function you need by searching the web.
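Within R itself, the help page `?Distributions` lists the distributions available in the stats package. Their functions follow a common naming pattern: the prefix `r` draws random numbers, while `d`, `p` and `q` give the density, the distribution function and the quantile function. The following sketch shows a few of them; the parameter values are chosen only for illustration.

```r
set.seed(2)                                   # reproducible draws

b <- rbinom(5, size = 10, prob = 0.5)         # binomial draws
p <- rpois(5, lambda = 2)                     # Poisson draws
s <- rt(5, df = 4)                            # Student t draws

# The same distribution name with another prefix gives related functions
dnorm(0)        # density of the standard normal at 0
pnorm(1.96)     # distribution function evaluated at 1.96
qnorm(0.975)    # 0.975 quantile of the standard normal
```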
3.2 Load dataset
It is also possible to work with observed data. Typically, such data is stored in a file and has to be loaded into R. In the example below we load a dta file, the format used by Stata.
## longname shortnam step mort logmort0 risk loggdp campaign slave source0
## 1 Angola AGO 3 280.00 5.634789 5.36 7.77 1 0 0
## 2 Argentina ARG 4 68.90 4.232656 6.39 9.13 1 0 0
## 3 Australia AUS 4 8.55 2.145931 9.32 9.90 0 0 0
## 4 Burkina Faso BFA 2 280.00 5.634789 4.45 6.85 1 0 0
## 5 Bangladesh BGD 1 71.41 4.268438 5.14 6.88 1 0 1
## 6 Bahamas BHS 4 85.00 4.442651 7.50 9.29 0 0 0
## latitude neoeuro asia africa other edes1975 malaria other2 cons90 lado1995
## 1 0.1367 0 0 1 0 0 1.000 0 3 2
## 2 0.3778 0 0 0 0 90 0.000 0 6 5
## 3 0.3000 1 0 0 1 99 0.000 1 7 6
## 4 0.1444 0 0 1 0 0 1.000 0 1 4
## 5 0.2667 0 1 0 0 0 0.158 0 2 3
## 6 0.2683 0 0 0 0 10 NA 0 NA 4
## ajr_rnd2
## 1 0
## 2 0
## 3 0
## 4 1
## 5 1
## 6 0
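The command that loaded the file is not shown above, and the file name is not known. As a sketch, assuming the `foreign` package (which provides `read.dta()`), the following example first writes a small hypothetical data frame to a temporary dta file and then loads it again. The object names `toy`, `path` and `dat` are made up for this illustration; the three values are taken from the first rows printed above.

```r
library(foreign)                      # provides read.dta() and write.dta()

# Hypothetical stand-in for a real data file: a tiny data frame saved as .dta
toy  <- data.frame(shortnam = c("AGO", "ARG", "AUS"),
                   mort     = c(280, 68.9, 8.55))
path <- tempfile(fileext = ".dta")    # temporary file, removed with the session
write.dta(toy, path)

dat <- read.dta(path)                 # load the file into a data frame
head(dat)                             # print the first rows
```

With a real file, only the `read.dta()` and `head()` lines are needed, with `path` replaced by the location of the file.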
After the file is loaded, we can work with the data:
## [1] 280.00 68.90 8.55 280.00 71.41
## [1] 280.00 68.90 8.55 280.00 71.41
## [1] 141.772
## [1] 128.6693
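The commands behind this output are not shown. A sketch of typical column access follows, using a hypothetical data frame `dat` that contains only the six `mort` values printed in the table above, so the summary values will not match the output above.

```r
# Hypothetical six-row excerpt of the loaded data
dat <- data.frame(shortnam = c("AGO", "ARG", "AUS", "BFA", "BGD", "BHS"),
                  mort     = c(280, 68.9, 8.55, 280, 71.41, 85))

dat$mort[1:5]        # first five values of a column, selected by name with $
dat[1:5, "mort"]     # the same values, selected by row index and column name

mean(dat$mort)       # sample mean of the column
sd(dat$mort)         # sample standard deviation of the column
```

Both ways of selecting the column return the same numeric vector, which would explain why two identical lines can appear in such output.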
Apart from read.dta, there also exist read.csv, read.xls and many more functions to load datasets.
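As a sketch of one of these alternatives, the example below writes a small hypothetical data frame to a temporary csv file and reads it back with `read.csv()`, which is part of a standard R installation. The names `toy`, `path` and `csv_data` are made up for this illustration. `read.xls` is assumed here to refer to the function of that name in the `gdata` package, which has to be installed separately.

```r
toy  <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))   # hypothetical data
path <- tempfile(fileext = ".csv")                       # temporary file
write.csv(toy, path, row.names = FALSE)                  # save without row names

csv_data <- read.csv(path)    # read the file back into a data frame
str(csv_data)                 # show column names and types
```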