R Markdown

Homework number 2

From Victor SCHOTT (IES21199)

Code chunk 1 for HW1

head() is a function in base-R that display only the first 6 observations

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Code chunk 2 for HW1

tidying the raw data into the tidy data using pivot_longer() and separate() functions in the tidyr package

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✔ ggplot2 3.3.2     ✔ purrr   0.3.4
## ✔ tibble  3.0.1     ✔ dplyr   1.0.2
## ✔ tidyr   1.1.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
iris %>%
  pivot_longer(cols = -Species, names_to = "Part", values_to = "Value") %>%
  separate(col = "Part", into = c("Part", "Measure"))
## # A tibble: 600 x 4
##    Species Part  Measure Value
##    <fct>   <chr> <chr>   <dbl>
##  1 setosa  Sepal Length    5.1
##  2 setosa  Sepal Width     3.5
##  3 setosa  Petal Length    1.4
##  4 setosa  Petal Width     0.2
##  5 setosa  Sepal Length    4.9
##  6 setosa  Sepal Width     3  
##  7 setosa  Petal Length    1.4
##  8 setosa  Petal Width     0.2
##  9 setosa  Sepal Length    4.7
## 10 setosa  Sepal Width     3.2
## # … with 590 more rows

Code chunk 3 for HW1

transforming our data using group_by() and summarize() functions in the dplyr package

Because we created the Part variable in our tidy data,

we can easily calculate the mean of the Value by Species and Part

iris %>%
  pivot_longer(cols = -Species, names_to = "Part", values_to = "Value") %>%
  separate(col = "Part", into = c("Part", "Measure")) %>%
  group_by(Species, Part) %>%
  summarize(m = mean(Value))
## `summarise()` regrouping output by 'Species' (override with `.groups` argument)
## # A tibble: 6 x 3
## # Groups:   Species [3]
##   Species    Part      m
##   <fct>      <chr> <dbl>
## 1 setosa     Petal 0.854
## 2 setosa     Sepal 4.22 
## 3 versicolor Petal 2.79 
## 4 versicolor Sepal 4.35 
## 5 virginica  Petal 3.79 
## 6 virginica  Sepal 4.78

Code chunk 4 for HW1

visualizing our data using ggplot() function in the ggplot2 package

iris %>%
  pivot_longer(cols = -Species, names_to = "Part", values_to = "Value") %>%
  separate(col = "Part", into = c("Part", "Measure")) %>%
  ggplot(aes(x = Value, color = Part)) + geom_boxplot()