Chapter 9 Creating tables
In our final document we will want tables to describe our data. Below we create a table that contains demographic information on our sample.
For most tables you can use the examples shown below with Rmisc and rstatix. However, for more complex tables we need to use a combination of gtsummary and flextable.
To this point I have emphasized that using spaces in column names is bad practice. However, when we get to the final form of our table, we may want to insert spaces so that the column names are displayed properly in our manuscript. I use the janitor package to make this switch as easy as possible.
tbl.demo %>%
  janitor::clean_names(case = "title")
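As a minimal sketch of what this call does, assuming a small made-up data frame (`df.example` and its columns are hypothetical, not part of the study data), the snake_case names are turned into space-separated, title-case labels:

```r
library(janitor)

# Hypothetical data frame with analysis-friendly snake_case column names
df.example <- data.frame(
  subject_id = 1:3,
  age_years  = c(24, 31, 28)
)

# Convert the names to title case with spaces for display in a manuscript
tbl.example <- janitor::clean_names(df.example, case = "title")
names(tbl.example)
```

Doing this as the very last step keeps the working data frame easy to reference in code, and only the presentation copy gets the display names.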
9.1 Simple Tables
# Method 1 using rstatix
tbl.demo <- df.nirs %>%
  distinct(Subject, .keep_all = TRUE) %>%
  group_by(group, gender) %>%
  rstatix::get_summary_stats(age, type = "mean_sd") %>%
  select(-c("variable")) %>%
  dplyr::rename(
    "Group" = "group",
    "Sex" = "gender",
    "Mean Age" = "mean",
    "SD" = "sd"
  )
# Method 2 using Rmisc
journey_time <- Rmisc::summarySE(data = df,
                                 measurevar = "journey_time_avg",
                                 groupvars = c("service", "year"),
                                 conf.interval = 0.95,
                                 na.rm = TRUE,
                                 .drop = TRUE)
# Taking the mean and confidence interval =============
# __Method #1 - Easy #########
journey_time <- Rmisc::summarySE(data = df,
                                 measurevar = "journey_time_avg",
                                 groupvars = c("service", "year"),
                                 conf.interval = 0.95,
                                 na.rm = TRUE,
                                 .drop = TRUE)
# To get a descriptive table
tbl.descFullACAP <- Rmisc::summarySE(data = df.mri %>%
                                       filter(metric == "FA", mriloc == "CC_FMajor"),
                                     measurevar = "age",
                                     groupvars = c("group", "gender"),
                                     conf.interval = 0.95,
                                     na.rm = TRUE,
                                     .drop = TRUE) %>%
  janitor::clean_names("upper_camel") %>%
  rename(
    "CI" = "Ci",
    "SE" = "Se",
    "SD" = "Sd"
  ) %>%
  # across() replaces the deprecated mutate_at()/funs() idiom
  mutate(across(c(Age, SD, SE, CI), ~ round(.x, 3)))
cgwtools::resave(tbl.descFullACAP, file = "data/tables.RData") # resave a list of tables that I'll use in the .Rmd file.
# __Method 2 - More options
df %>%
  group_by(Channel) %>%
  # across() replaces the deprecated summarise_each()/funs() idiom
  summarise(across(everything(), list(mean = mean, sd = sd)))
dt <- data.table(df)
group_by(dt, Channel_46)
# __Method 2b - Not so Easy ##################
# descriptives <- demo%>%dplyr::group_by(AthleteType)%>%
# dplyr::summarise(
# "Mean Height (cm)" = round(mean(Height_cm),2)
# , "Mean Weight (kg)" = round(mean(Weight_kg),2)
# , "Mean Age (Years)" = round(mean(AgeInYears),2)
# , "Number of Concussions" = round(mean(NumPriorConc),2)
# , "Number of Diagnosed Concussions" = round(mean(NumDiagConc),2)
# , "Number of Undiagnosed Concussions" = round(mean(NumUndiagConc),2)
# )
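For readers who want to see what the summary functions above compute, here is a hedged sketch that builds the same kind of table by hand with dplyr. It uses the built-in ToothGrowth dataset rather than the study data, and the object name `tbl.summary` is made up for illustration. The confidence interval half-width is taken from the t distribution, which is my understanding of how Rmisc::summarySE derives its `ci` column.

```r
library(dplyr)

# ToothGrowth ships with R: tooth length (len) by supplement type (supp)
tbl.summary <- ToothGrowth %>%
  group_by(supp) %>%
  summarise(
    N    = n(),                       # group size (ToothGrowth has no missing values)
    Mean = mean(len, na.rm = TRUE),
    SD   = sd(len, na.rm = TRUE),
    SE   = SD / sqrt(N),              # standard error of the mean
    CI   = qt(0.975, df = N - 1) * SE # half-width of the 95% confidence interval
  ) %>%
  mutate(across(c(Mean, SD, SE, CI), ~ round(.x, 3)))

tbl.summary
```

Writing it out this way is more verbose, but it makes it easy to add or drop statistics and to name the columns exactly as they should appear in the manuscript.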
If you have only categorical variables, then you can use the tabyl function in the janitor package.
# Example 1 using ToothGrowth dataset
tbl <- ToothGrowth %>%
  janitor::tabyl(supp)

# Example 2 using mtcars dataset
tbl <- mtcars %>%
  janitor::tabyl(cyl, am) %>%
  janitor::adorn_percentages("col") %>%
  janitor::adorn_pct_formatting(digits = 1)
9.2 Rounding numbers
In many cases you will want to control how many decimal places appear in a table, including removing leading and/or trailing zeros. A few packages claim to do this. Normally I use the following piece of code, which rounds every numeric column to four decimal places:
df %>%
  mutate(across(where(is.numeric), ~ round(.x, 4)))
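Rounding and formatting are not quite the same thing, so here is a small base R sketch of the difference. The vector `x` is made up for illustration: round() returns numbers (trailing zeros disappear when printed), while formatC() returns character strings with a fixed number of decimals, from which a leading zero can be stripped if a journal style requires it.

```r
x <- c(0.1234567, 0.5, 12)

# round() returns numbers, so trailing zeros are dropped when printed
round(x, 4)

# formatC() returns character strings with a fixed number of decimals
fixed <- formatC(x, format = "f", digits = 2)
fixed    # "0.12" "0.50" "12.00"

# Strip the leading zero from values below 1 (e.g. for p-values)
no_lead <- sub("^0\\.", ".", fixed)
no_lead  # ".12" ".50" "12.00"
```

Because the formatted values are character strings, this step is best left until the table is otherwise final.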
9.3 Saving Tables to xlsx
You may also choose to write these tables to an xlsx file. You can use the xlsx package (which has the benefit of letting you append to an existing file), but I prefer rio for all my import/export tasks. To write several data frames at once, rio takes them as a list, which is a bit different.
# using rio package
df <- iris
df1 <- mtcars
rio::export(df, "raw/test.xlsx") # uses the rio package
# a named list is written to a single workbook, one sheet per data frame
rio::export(list(iris = df, mtcars = df1), "raw/multidf.xlsx") # uses the rio package
# using xlsx package
df <- iris
xlsx::write.xlsx2(df, "test.xlsx", row.names = FALSE, sheetName = "SheetNumber1", append = TRUE) # uses the xlsx package
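To check that an export worked, it can help to read the file straight back in. The sketch below is a minimal round trip with rio, assuming a temporary file rather than the `raw/` folder used above. The names `path` and `sheets` are made up for illustration.

```r
library(rio)

# Write two data frames to one workbook; list names become sheet names
path <- tempfile(fileext = ".xlsx")
export(list(iris = iris, mtcars = mtcars), path)

# Read every sheet back into a named list of data frames
sheets <- import_list(path)
names(sheets)
dim(sheets$iris)
```

Note that xlsx files do not store R-specific attributes, so things like factor levels and row names may not survive the round trip.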
9.4 Complex Tables
For more complex tables you may want to look into gtsummary, which was a lifesaver for me on a recent manuscript. It has a ton of options you can adjust to get exactly what you are looking for. Objects from gtsummary can be converted to flextable objects, for example with as_flex_table().
pacman::p_load_gh("ddsjoberg/gtsummary")
pacman::p_load(flextable)
# Andrew's example from a recent manuscript (microstates paper)
tbl.demo2 <- tbl.demo %>%
  janitor::clean_names(case = "title") %>%
  dplyr::rename(
    "Weight (kg)" = "Weight Kg",
    "Height (m)" = "Ht m",
    "BMI" = "Bmi",
    "Age (years)" = "Age",
    "Was nitrous used?" = "Nitrous"
  ) %>%
  gtsummary::tbl_summary(by = `Anesthetic Maintenance`) %>%
  add_n() %>%
  add_overall() %>%
  modify_spanning_header(starts_with("stat") ~ "**Anesthetic Received**") %>%
  bold_labels()
Below is a reproducible example with gtsummary, taken from its documentation.
pacman::p_load(gtsummary)

trial2 <- trial %>% select(trt, age, grade)
trial2 %>% tbl_summary()

trial2 %>% tbl_summary(by = trt) %>% add_p()

trial2 %>%
  tbl_summary(by = trt) %>%
  add_p(pvalue_fun = ~style_pvalue(.x, digits = 2)) %>%
  add_overall() %>%
  add_n() %>%
  modify_header(label ~ "**Variable**") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Treatment Received**") %>%
  modify_footnote(
    all_stat_cols() ~ "Median (IQR) or Frequency (%)"
  ) %>%
  modify_caption("**Table 1. Patient Characteristics**") %>%
  bold_labels()
tbl <- trial2 %>%
  tbl_summary(by = trt) %>%
  add_p(pvalue_fun = ~style_pvalue(.x, digits = 2)) %>%
  add_overall() %>%
  add_n() %>%
  modify_header(label ~ "**Variable**") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Treatment Received**") %>%
  modify_footnote(
    all_stat_cols() ~ "Median (IQR) or Frequency (%)"
  ) %>%
  modify_caption("**Table 1. Patient Characteristics**") %>%
  bold_labels()
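Once a gtsummary table is built, one way to get it into a manuscript is to convert it to a flextable and save it as a Word document. The following is a minimal sketch of that workflow rather than the exact steps used for the paper. It uses the `trial` dataset that ships with gtsummary, writes to a temporary file, and the names `tbl.example`, `ft` and `path` are made up for illustration.

```r
library(dplyr)
library(gtsummary)
library(flextable)

# Build a small summary table from the example dataset bundled with gtsummary
tbl.example <- trial %>%
  select(trt, age, grade) %>%
  tbl_summary(by = trt) %>%
  bold_labels()

# Convert the gtsummary object to a flextable
ft <- as_flex_table(tbl.example)

# Save the flextable to a Word document (a temporary file in this sketch)
path <- tempfile(fileext = ".docx")
save_as_docx(ft, path = path)
```

After conversion, the usual flextable functions can be applied to `ft` for any final styling before the table is saved.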