1 Introduction

pacman::p_load(dplyr, simstudy, summarytools, kableExtra,
               brms, bayestestR, emmeans)
set.seed(1087) # for reproducible simulations
st_options(plain.ascii = F, footnote = NA)
knitr::opts_knit$set(root.dir = rprojroot::find_rstudio_root_file())
options(knitr.kable.NA = '')
options(mc.cores = parallel::detectCores(),
        brms.backend = "rstan")

Data is simulated using the clustering structure (vignettes clustered in respondents) of the study. The detailed factorial survey experiment design is available here.

The dependent variable is simulated as a normal distribution (\(\text{Justice} \sim N(3, 1)\)) which is a function of each factor described below and random noise. This distribution is roughly one that spans from 1 to 6, emulating a rating scale for the vignettes where each answer represents two months (~60.8 days) from December-November (1) to February-January (6). The effects of the regression coefficients in a linear (mixed) model can be regarded as the change in such a scale where, e.g., in the gender factor (1 = female, 0 = male) a change of 0.115 represents a justice evaluation that the vaccine should be delivered to women approximately one week before compared to men.

We assume the six factors of the study with the following effects:

  • Gender: no practical difference between the just priority of vaccinating men or women.
  • Age: a just priority of ~.4 is set for a unit increase of age group.
  • Work status: a just priority of ~.3 favoring health workers – compared to transport, higher education and unemployed workers – is set. No practical differences are present between the rest of the comparisons.
  • Caring duties: a just priority of ~.15 for those with caring responsibilities of elder people – compared to no caring responsibilities – is set.
  • Country of birth: a just priority of ~.25 favoring Chileans – compared to people born in Venezuela, Haiti and Spain – is set. No practical differences are present between the rest of the comparisons.
  • Comorbidity: just priority of ~.2 for those with diabetes type 2 – compared to those without comorbidities – is set.

Furthermore, for the null hypotheses a smallest effect size of interest is defined as an interval equivalent to approximately one week before or after (-0.115, 0.115). This follows the rationale that a minimally interesting practical difference is an effect of at least a one-week change considering how inoculation strategies and logistics of scheduling vaccinations operate. In other words, an estimated priority smaller than one week would be difficult to implement in inoculation plans.

# Simulate respondent data
gen.resp <- defData(varname = "female_r", dist = "binary", 
                    formula = .6, id = "id_resp")
gen.resp <- defData(gen.resp, varname = "age_r", dist = "negBinomial", 
                    formula = 44, variance = .02)
dt.resp <- genData(1000, gen.resp)

# Add vignette's factors
gen.vign <- defDataAdd(varname = "work_status_v", dist = "uniformInt", 
                       formula = "1;4")
gen.vign <- defDataAdd(gen.vign, varname = "caring_v", dist = "binary", 
                       formula = ".5")
gen.vign <- defDataAdd(gen.vign, varname = "female_v", dist = "binary", 
                       formula = ".5")
gen.vign <- defDataAdd(gen.vign, varname = "age_v", dist = "uniformInt", 
                       formula = "1;8")
gen.vign <- defDataAdd(gen.vign, varname = "diabetes_v", dist = "binary", 
                       formula = ".5")
gen.vign <- defDataAdd(gen.vign, varname = "country_v", dist = "uniformInt", 
                       formula = "1;4")

# Simulate dependent variable
## Define conditional dependence on other factors with fomula argument
# gen.vign <- defDataAdd(gen.vign, varname = "justice", dist = "categorical", 
#                       formula = "1;5")
formula <- "3 + .4 * age_v + .2 * diabetes_v + .15 * caring_v + 
            (country_v != 1)*-.25 + (work_status_v != 1)*-.3"
## Simulate justice evaluations as normal distributed response
gen.vign <- defDataAdd(gen.vign, varname = "justice", dist = "normal", 
                       formula = formula, 
                       variance = 1)
dt.vign <- genCluster(dt.resp, "id_resp", numIndsVar = 4, 
                      level1ID = "id_vign")
dt.vign <- addColumns(gen.vign, dt.vign)
cors <- dt.vign %>% 
  select(ends_with("_v")) %>% 
  cor(.) %>% 
  round(3)
dt.vign <- dt.vign %>% 
  mutate(country_v = case_when(country_v == 1 ~ "Chile",
                               country_v == 2 ~ "Venezuela",
                               country_v == 3 ~ "Haiti",
                               country_v == 4 ~ "Spain"),
         work_status_v = case_when(work_status_v == 1 ~ "Health",
                                   work_status_v == 2 ~ "Transport",
                                   work_status_v == 3 ~ "Higher ed",
                                   work_status_v == 4 ~ "Unemployed"),
         across(-c(id_resp, id_vign, justice, starts_with("age")), ~factor(.)))
print(dfSummary(dt.vign, graph.magnif = 0.75), method = 'render',
      valid.col = FALSE, na.col = FALSE)

Data Frame Summary

dt.vign

Dimensions: 4000 x 11
Duplicates: 0
No Variable Stats / Values Freqs (% of Valid) Graph
1 id_resp [integer] Mean (sd) : 500.5 (288.7) min < med < max: 1 < 500.5 < 1000 IQR (CV) : 499.5 (0.6) 1000 distinct values
2 female_r [factor] 1. 0 2. 1
1740(43.5%)
2260(56.5%)
3 age_r [integer] Mean (sd) : 44 (9.3) min < med < max: 21 < 43.5 < 79 IQR (CV) : 14 (0.2) 51 distinct values
4 id_vign [integer] Mean (sd) : 2000.5 (1154.8) min < med < max: 1 < 2000.5 < 4000 IQR (CV) : 1999.5 (0.6) 4000 distinct values (Integer sequence)
5 work_status_v [factor] 1. Health 2. Higher ed 3. Transport 4. Unemployed
980(24.5%)
1077(26.9%)
967(24.2%)
976(24.4%)
6 caring_v [factor] 1. 0 2. 1
2049(51.2%)
1951(48.8%)
7 female_v [factor] 1. 0 2. 1
2000(50.0%)
2000(50.0%)
8 age_v [integer] Mean (sd) : 4.4 (2.3) min < med < max: 1 < 4 < 8 IQR (CV) : 4 (0.5)
1:519(13.0%)
2:518(13.0%)
3:502(12.6%)
4:527(13.2%)
5:482(12.0%)
6:521(13.0%)
7:482(12.0%)
8:449(11.2%)
9 diabetes_v [factor] 1. 0 2. 1
1989(49.7%)
2011(50.3%)
10 country_v [factor] 1. Chile 2. Haiti 3. Spain 4. Venezuela
968(24.2%)
1020(25.5%)
1046(26.2%)
966(24.1%)
11 justice [numeric] Mean (sd) : 4.5 (1.4) min < med < max: -0.1 < 4.5 < 9.3 IQR (CV) : 2 (0.3) 4000 distinct values
cors[lower.tri(cors)] <- NA
diag(cors) <- NA
cors[-nrow(cors), -1] %>% 
    kbl(caption = "Pearson correlations between factors") %>% 
    kable_styling(bootstrap_options = c("striped", "bordered"))
Table 1.1: Pearson correlations between factors
caring_v female_v age_v diabetes_v country_v
work_status_v 0.04 0.024 0.019 -0.016 -0.016
caring_v -0.024 0.010 -0.010 -0.014
female_v -0.002 0.007 0.012
age_v 0.008 -0.024
diabetes_v -0.001

As expected, the experimental factors are practically independent. Also, they have small imbalances representing the expected result from a factorial survey experiment in an empirical application where this tends to occur.

2 Analysis

The model for testing the hypotheses group 1 (HG1) is defined as all main effects from the six factors. Weakly regularizing priors are used.

## Model HG1
f_hg1 <- as.formula("justice ~ female_v + age_v + work_status_v + 
                    country_v + caring_v + diabetes_v +
                    (1 | id_resp)")
## Priors
priors_hg1 <- c(set_prior("student_t(3, 3.6, 1)", class = "Intercept"),
                set_prior("student_t(3, 0, 10)", class = "b"),
                set_prior("student_t(3, 0, 3)", class = "sd"))

In order to make comparisons from the factors with more than two levels, we set orthonormal factor coding for the contrasts.

contrasts(dt.vign$work_status_v) <- contr.bayes
contrasts(dt.vign$country_v) <- contr.bayes

Fit the model using MCMC (20,000 iterations are used to compute Bayes Factors later).

m1 <- brm(f_hg1, 
          data = dt.vign,
          prior = priors_hg1, 
          iter = 20000,
          control = list(adapt_delta = .9),
          save_pars = save_pars(all = T),
          refresh = 0)
summary(m1)
## Warning: There were 91 divergent transitions after warmup. Increasing
## adapt_delta above 0.9 may help. See http://mc-stan.org/misc/
## warnings.html#divergent-transitions-after-warmup
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: justice ~ female_v + age_v + work_status_v + country_v + caring_v + diabetes_v + (1 | id_resp) 
##    Data: dt.vign (Number of observations: 4000) 
## Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
##          total post-warmup samples = 40000
## 
## Group-Level Effects: 
## ~id_resp (Number of levels: 1000) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.07      0.05     0.00     0.17 1.00     5344     9774
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept          2.57      0.04     2.48     2.66 1.00    80349    29088
## female_v1         -0.01      0.03    -0.07     0.05 1.00    76474    28786
## age_v              0.41      0.01     0.40     0.42 1.00    79298    26913
## work_status_v1    -0.07      0.03    -0.13    -0.00 1.00    80497    27230
## work_status_v2     0.28      0.03     0.21     0.34 1.00    74424    27489
## work_status_v3     0.04      0.03    -0.02     0.10 1.00    76468    26652
## country_v1        -0.02      0.03    -0.08     0.04 1.00    78687    27587
## country_v2         0.22      0.03     0.16     0.28 1.00    79310    28721
## country_v3        -0.02      0.03    -0.09     0.04 1.00    72506    26858
## caring_v1          0.08      0.03     0.02     0.15 1.00    73773    25719
## diabetes_v1        0.21      0.03     0.15     0.27 1.00    76068    27117
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.00      0.01     0.98     1.03 1.00    32556    22888
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

2.1 Bayes factors

sesoi <- c(-.115, .115) # one-week for bimonthly answers
bf_m1a <- bayesfactor_parameters(m1, null = sesoi, direction = "right")
bf_m1b <- bayesfactor_parameters(m1, null = sesoi, direction = "two-sided")
### Country
pairs_country_v <- pairs(emmeans(m1, ~country_v))
bf_m1_cntry_v <- bayesfactor_parameters(pairs_country_v, m1, 
                                      null = sesoi, direction = "two-sided")
pairs_work_v <- pairs(emmeans(m1, ~work_status_v))
bf_m1_work_v <- bayesfactor_parameters(pairs_work_v, m1, 
                                       null = sesoi, direction = "right")
# Custom BF table
bf1_alt1 <- as.data.frame(bf_m1a) %>% 
  filter(!grepl("tercept|country|female|work", Parameter)) %>% 
  mutate(Test = "Greater than")
bf1_alt2 <- as.data.frame(bf_m1_work_v) %>% 
  mutate(Test = "Greather than")
bf1_nul1 <- as.data.frame(bf_m1b) %>% 
  filter(grepl("female", Parameter)) %>% 
  mutate(Test = "Two-sided")
bf1_nul2 <- as.data.frame(bf_m1_cntry_v) %>% 
  mutate(Test = "Two-sided")

bf_sum <- bf1_alt1 %>% 
  bind_rows(bf1_nul1) %>% 
  select(Parameter, BF, Test) %>% 
  bind_rows(bf1_alt2, bf1_nul2) %>% 
  mutate(Null = "-0.115 - 0.115")

bf_sum <- bf_sum %>% 
  mutate(Result = case_when(
    grepl("than", Test) & BF > 10 ~ "Evidence for Ha",
    grepl("than", Test) & BF < 0.01 ~ "Evidence for H0",
    grepl("sided", Test) & BF < 0.01 ~ "Evidence for H0 ",
    grepl("sided", Test) & BF > 10 ~ "Evidence for Ha",
    BF >= 0.01 & BF  <= 10 ~ "Inconclusive",
    TRUE ~ NA_character_),
    Conclusion = case_when(
      grepl("than", Test) & BF > 10 ~ "Support for HG1",
      grepl("than", Test) & BF < 0.01 ~ "Contrary to HG1",
      grepl("sided", Test) & BF < 0.01 ~ "Support for HG1 ",
      grepl("sided", Test) & BF > 10 ~ "Contrary to HG1",
      BF >= 0.01 & BF  <= 10 ~ "--",
      TRUE ~ NA_character_)) %>% 
  mutate(BF = as.character(signif(BF, 2)))

bf_sum %>% 
  kbl(format = "html", digits = 3, escape = FALSE,
      caption = "Bayes factors and conclusion summary") %>%
  kable_styling(bootstrap_options = c("striped", "bordered")) %>% 
  footnote(general = paste0("BF: Bayes Factor; criteria for assessing evidence is", "$BF_{10} > 10$", " or ", "$BF_{10} < 0.01$; comparisons between levels are shown for factors with more than two levels."), 
           footnote_as_chunk = T, escape=F)
Table 2.1: Bayes factors and conclusion summary
Parameter BF Test Null Result Conclusion
b_age_v 4.1e+54 Greater than -0.115 - 0.115 Evidence for Ha Support for HG1
b_caring_v1 0.0033 Greater than -0.115 - 0.115 Evidence for H0 Contrary to HG1
b_diabetes_v1 9.3 Greater than -0.115 - 0.115 Inconclusive
b_female_v1 7.4e-06 Two-sided -0.115 - 0.115 Evidence for H0 Support for HG1
Health - Higher ed 48 Greather than -0.115 - 0.115 Evidence for Ha Support for HG1
Health - Transport 240 Greather than -0.115 - 0.115 Evidence for Ha Support for HG1
Health - Unemployed 30000 Greather than -0.115 - 0.115 Evidence for Ha Support for HG1
Higher ed - Transport 0.00045 Greather than -0.115 - 0.115 Evidence for H0 Contrary to HG1
Higher ed - Unemployed 0.0085 Greather than -0.115 - 0.115 Evidence for H0 Contrary to HG1
Transport - Unemployed 0.0026 Greather than -0.115 - 0.115 Evidence for H0 Contrary to HG1
Chile - Haiti 14 Two-sided -0.115 - 0.115 Evidence for Ha Contrary to HG1
Chile - Spain 0.81 Two-sided -0.115 - 0.115 Inconclusive
Chile - Venezuela 7.1 Two-sided -0.115 - 0.115 Inconclusive
Haiti - Spain 0.00028 Two-sided -0.115 - 0.115 Evidence for H0 Support for HG1
Haiti - Venezuela 5.8e-05 Two-sided -0.115 - 0.115 Evidence for H0 Support for HG1
Spain - Venezuela 0.00025 Two-sided -0.115 - 0.115 Evidence for H0 Support for HG1
Note: BF: Bayes Factor; criteria for assessing evidence is\(BF_{10} &gt; 10\) or \(BF_{10} &lt; 0.01\); comparisons between levels are shown for factors with more than two levels.