In this workshop, we will use the mirt package (Chalmers, 2012). I will mainly demonstrate how we can fit various types of confirmatory multidimensional item response theory (MIRT) models discussed in the lecture.
# install the package
install.packages("mirt")
# load the package
library(mirt)
## Warning: package 'mirt' was built under R version 4.2.2
## Loading required package: stats4
## Loading required package: lattice
The data we will use is an edited version of the ‘read.math’ dataset provided by the sirt package. It contains binary responses from 664 German fourth graders to 30 mathematics items. The items are classified into three domains: Arithmetic (Items 1-16), Measurement (Items 17-23), and Geometry (Items 24-30).
math <- read.table("https://raw.github.com/nkims/Classdata/mathdat/math.csv", sep=" ")
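Before fitting any models, it may help to take a quick look at the data (a sketch; output not shown):

dim(math) # should be 664 examinees by 30 items
head(math[, 1:5]) # responses are coded 0/1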
We will fit an IRT model using the mirt() function. This function requires at least two arguments: data and model.

- data: a matrix or data.frame of numerical data
- model: a numeric value indicating the number of factors to extract (exploratory; default is 1) or an object defined by mirt.model()

# for more details
?mirt
# Unidimensional model
twopl <- mirt(data = math, model = 1, verbose = F) # 2PL model
print(twopl)
##
## Call:
## mirt(data = math, model = 1, verbose = F)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 28 EM iterations.
## mirt version: 1.37.1
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -11255.16
## Estimated parameters: 60
## AIC = 22630.32
## BIC = 22900.22; SABIC = 22709.72
## G2 (1073741763) = 13906.6, p = 1
## RMSEA = 0, CFI = NaN, TLI = NaN
# Using mirt.model()
mod <- mirt.model('F = 1-30')
twopl2 <- mirt(data = math, model = mod, verbose = F)
# Compare the two models
anova(twopl, twopl2) # identical
## AIC SABIC HQ BIC logLik X2 df p
## twopl 22630.32 22709.72 22734.91 22900.22 -11255.16
## twopl2 22630.32 22709.72 22734.91 22900.22 -11255.16 0 0 NaN
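Note that mirt reports item parameters in the slope-intercept metric: the probability of a correct response is \(P(x = 1 \mid \theta) = g + (u - g)\frac{1}{1 + e^{-(a_1\theta + d)}}\), where \(g = 0\) and \(u = 1\) for the 2PL.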
# print item parameter estimates
coef(twopl, simplify = T)
## $items
## a1 d g u
## V1 0.647 2.203 0 1
## V2 0.814 0.277 0 1
## V3 1.154 0.114 0 1
## V4 1.149 -0.502 0 1
## V5 1.355 0.713 0 1
## V6 1.106 0.573 0 1
## V7 1.577 -0.905 0 1
## V8 1.672 -0.913 0 1
## V9 1.151 1.473 0 1
## V10 1.179 -0.114 0 1
## V11 1.126 0.829 0 1
## V12 1.314 0.268 0 1
## V13 1.400 1.012 0 1
## V14 1.254 1.402 0 1
## V15 1.389 0.547 0 1
## V16 1.153 0.524 0 1
## V17 1.228 1.056 0 1
## V18 1.253 -0.577 0 1
## V19 1.186 0.331 0 1
## V20 2.014 -0.563 0 1
## V21 2.252 -1.168 0 1
## V22 2.118 -0.482 0 1
## V23 2.197 -0.712 0 1
## V24 0.606 -0.016 0 1
## V25 0.860 -0.618 0 1
## V26 0.799 -0.350 0 1
## V27 0.794 -0.791 0 1
## V28 0.483 -0.747 0 1
## V29 0.726 -0.173 0 1
## V30 0.824 -1.587 0 1
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
coef(twopl2, simplify = T)
## $items
## a1 d g u
## V1 0.647 2.203 0 1
## V2 0.814 0.277 0 1
## V3 1.154 0.114 0 1
## V4 1.149 -0.502 0 1
## V5 1.355 0.713 0 1
## V6 1.106 0.573 0 1
## V7 1.577 -0.905 0 1
## V8 1.672 -0.913 0 1
## V9 1.151 1.473 0 1
## V10 1.179 -0.114 0 1
## V11 1.126 0.829 0 1
## V12 1.314 0.268 0 1
## V13 1.400 1.012 0 1
## V14 1.254 1.402 0 1
## V15 1.389 0.547 0 1
## V16 1.153 0.524 0 1
## V17 1.228 1.056 0 1
## V18 1.253 -0.577 0 1
## V19 1.186 0.331 0 1
## V20 2.014 -0.563 0 1
## V21 2.252 -1.168 0 1
## V22 2.118 -0.482 0 1
## V23 2.197 -0.712 0 1
## V24 0.606 -0.016 0 1
## V25 0.860 -0.618 0 1
## V26 0.799 -0.350 0 1
## V27 0.794 -0.791 0 1
## V28 0.483 -0.747 0 1
## V29 0.726 -0.173 0 1
## V30 0.824 -1.587 0 1
##
## $means
## F
## 0
##
## $cov
## F
## F 1
## coef(twopl, IRTpars = T, simplify = T) # traditional IRT parameterization
# model fit indices
M2(twopl)
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI
## stats 3233.658 405 0 0.1026375 0.09929226 0.1058587 0.08631215 0.7685396
## CFI
## stats 0.7845024
M2(twopl2)
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI
## stats 3233.658 405 0 0.1026375 0.09929226 0.1058587 0.08631215 0.7685396
## CFI
## stats 0.7845024
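Item-level fit can also be examined (a sketch; output omitted here):

itemfit(twopl) # S_X2 item fit statistics (the default)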
To fit other types of IRT models, we can use the itemtype argument to specify the model we want to fit. See help(mirt) for more details.
# Rasch model
Rasch <- mirt(data = math, model = 1, itemtype = "Rasch", verbose = F)
coef(Rasch, simplify = T)
## $items
## a1 d g u
## V1 1 2.511 0 1
## V2 1 0.307 0 1
## V3 1 0.122 0 1
## V4 1 -0.500 0 1
## V5 1 0.688 0 1
## V6 1 0.590 0 1
## V7 1 -0.789 0 1
## V8 1 -0.773 0 1
## V9 1 1.485 0 1
## V10 1 -0.106 0 1
## V11 1 0.846 0 1
## V12 1 0.268 0 1
## V13 1 0.957 0 1
## V14 1 1.373 0 1
## V15 1 0.526 0 1
## V16 1 0.534 0 1
## V17 1 1.045 0 1
## V18 1 -0.555 0 1
## V19 1 0.338 0 1
## V20 1 -0.414 0 1
## V21 1 -0.830 0 1
## V22 1 -0.336 0 1
## V23 1 -0.500 0 1
## V24 1 -0.023 0 1
## V25 1 -0.675 0 1
## V26 1 -0.391 0 1
## V27 1 -0.881 0 1
## V28 1 -0.898 0 1
## V29 1 -0.198 0 1
## V30 1 -1.744 0 1
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1.347
anova(Rasch, twopl)
## AIC SABIC HQ BIC logLik X2 df p
## Rasch 22791.88 22832.90 22845.91 22931.33 -11364.94
## twopl 22630.32 22709.72 22734.91 22900.22 -11255.16 219.555 29 0
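The likelihood-ratio test (X2 = 219.56, df = 29, p < .001) and the lower AIC and BIC favor the 2PL model over the Rasch model. As another example of the itemtype argument, here is a minimal sketch of a 3PL model, which adds a freely estimated lower-asymptote (guessing) parameter (output not shown):

# 3PL model
threepl <- mirt(data = math, model = 1, itemtype = "3PL", verbose = F)
coef(threepl, simplify = T) # the g column is now freely estimated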
We can also estimate standard errors of the parameter estimates by including SE = T.
twopl.se <- mirt(data = math, model = 1, SE = T, verbose = F)
coef(twopl.se, printSE = T)
## $V1
## a1 d logit(g) logit(u)
## par 0.647 2.203 -999 999
## SE 0.136 0.142 NA NA
##
## $V2
## a1 d logit(g) logit(u)
## par 0.814 0.277 -999 999
## SE 0.107 0.090 NA NA
##
## $V3
## a1 d logit(g) logit(u)
## par 1.154 0.114 -999 999
## SE 0.128 0.098 NA NA
##
## $V4
## a1 d logit(g) logit(u)
## par 1.149 -0.502 -999 999
## SE 0.131 0.101 NA NA
##
## $V5
## a1 d logit(g) logit(u)
## par 1.355 0.713 -999 999
## SE 0.145 0.110 NA NA
##
## $V6
## a1 d logit(g) logit(u)
## par 1.106 0.573 -999 999
## SE 0.127 0.100 NA NA
##
## $V7
## a1 d logit(g) logit(u)
## par 1.577 -0.905 -999 999
## SE 0.168 0.123 NA NA
##
## $V8
## a1 d logit(g) logit(u)
## par 1.672 -0.913 -999 999
## SE 0.178 0.127 NA NA
##
## $V9
## a1 d logit(g) logit(u)
## par 1.151 1.473 -999 999
## SE 0.135 0.124 NA NA
##
## $V10
## a1 d logit(g) logit(u)
## par 1.179 -0.114 -999 999
## SE 0.130 0.099 NA NA
##
## $V11
## a1 d logit(g) logit(u)
## par 1.126 0.829 -999 999
## SE 0.127 0.105 NA NA
##
## $V12
## a1 d logit(g) logit(u)
## par 1.314 0.268 -999 999
## SE 0.139 0.104 NA NA
##
## $V13
## a1 d logit(g) logit(u)
## par 1.400 1.012 -999 999
## SE 0.149 0.118 NA NA
##
## $V14
## a1 d logit(g) logit(u)
## par 1.254 1.402 -999 999
## SE 0.143 0.125 NA NA
##
## $V15
## a1 d logit(g) logit(u)
## par 1.389 0.547 -999 999
## SE 0.146 0.109 NA NA
##
## $V16
## a1 d logit(g) logit(u)
## par 1.153 0.524 -999 999
## SE 0.128 0.101 NA NA
##
## $V17
## a1 d logit(g) logit(u)
## par 1.228 1.056 -999 999
## SE 0.135 0.113 NA NA
##
## $V18
## a1 d logit(g) logit(u)
## par 1.253 -0.577 -999 999
## SE 0.138 0.106 NA NA
##
## $V19
## a1 d logit(g) logit(u)
## par 1.186 0.331 -999 999
## SE 0.129 0.100 NA NA
##
## $V20
## a1 d logit(g) logit(u)
## par 2.014 -0.563 -999 999
## SE 0.224 0.134 NA NA
##
## $V21
## a1 d logit(g) logit(u)
## par 2.252 -1.168 -999 999
## SE 0.259 0.162 NA NA
##
## $V22
## a1 d logit(g) logit(u)
## par 2.118 -0.482 -999 999
## SE 0.235 0.136 NA NA
##
## $V23
## a1 d logit(g) logit(u)
## par 2.197 -0.712 -999 999
## SE 0.249 0.145 NA NA
##
## $V24
## a1 d logit(g) logit(u)
## par 0.606 -0.016 -999 999
## SE 0.097 0.084 NA NA
##
## $V25
## a1 d logit(g) logit(u)
## par 0.860 -0.618 -999 999
## SE 0.114 0.095 NA NA
##
## $V26
## a1 d logit(g) logit(u)
## par 0.799 -0.35 -999 999
## SE 0.108 0.09 NA NA
##
## $V27
## a1 d logit(g) logit(u)
## par 0.794 -0.791 -999 999
## SE 0.113 0.096 NA NA
##
## $V28
## a1 d logit(g) logit(u)
## par 0.483 -0.747 -999 999
## SE 0.098 0.088 NA NA
##
## $V29
## a1 d logit(g) logit(u)
## par 0.726 -0.173 -999 999
## SE 0.103 0.087 NA NA
##
## $V30
## a1 d logit(g) logit(u)
## par 0.824 -1.587 -999 999
## SE 0.130 0.119 NA NA
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
## SE NA NA
Items 1-16, Items 17-23, and Items 24-30 each measure a different domain of mathematics. Based on this information, we will specify the loading pattern and fit a simple-structure MIRT model.
ss.mod <- mirt.model('F1 = 1-16
F2 = 17-23
F3 = 24-30
COV = F1*F2*F3') # allow all possible correlations among the three factors
ss.fit <- mirt(math, ss.mod, method = "MHRM", verbose = F) # Metropolis-Hastings Robbins-Monro (MH-RM) algorithm can effectively handle high-dimensional estimation
As demonstrated above, mirt.model() allows us to define the relationship between items and factors. We can also determine which covariance parameters to estimate by specifying COV, and we can add equality constraints, priors, and so on. For example:
model <- mirt.model(' F1 = 1-5, 7
F2 = 6, 8-10
F3 = 11-15
COV = F1*F3, F2*F3
CONSTRAIN = (1-5, 7, a1), (6, 8-10, d)
PRIOR = (1-3, d, norm, 0, 1)')
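Here, CONSTRAIN = (1-5, 7, a1) constrains the slopes of Items 1-5 and 7 to be equal, CONSTRAIN = (6, 8-10, d) does the same for the intercepts of Items 6 and 8-10, and PRIOR = (1-3, d, norm, 0, 1) places a normal(0, 1) prior on the intercepts of Items 1-3.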
To examine the item parameter estimates and fit of the model:
coef(ss.fit, simplify=T)
## $items
## a1 a2 a3 d g u
## V1 0.751 0.000 0.000 2.250 0 1
## V2 0.976 0.000 0.000 0.281 0 1
## V3 1.237 0.000 0.000 0.107 0 1
## V4 1.292 0.000 0.000 -0.538 0 1
## V5 1.714 0.000 0.000 0.776 0 1
## V6 1.462 0.000 0.000 0.620 0 1
## V7 1.873 0.000 0.000 -1.012 0 1
## V8 2.196 0.000 0.000 -1.097 0 1
## V9 1.184 0.000 0.000 1.481 0 1
## V10 1.365 0.000 0.000 -0.135 0 1
## V11 1.226 0.000 0.000 0.845 0 1
## V12 1.433 0.000 0.000 0.266 0 1
## V13 1.707 0.000 0.000 1.094 0 1
## V14 1.495 0.000 0.000 1.495 0 1
## V15 1.598 0.000 0.000 0.568 0 1
## V16 1.203 0.000 0.000 0.524 0 1
## V17 0.000 1.130 0.000 1.024 0 1
## V18 0.000 0.984 0.000 -0.533 0 1
## V19 0.000 1.012 0.000 0.314 0 1
## V20 0.000 4.467 0.000 -1.150 0 1
## V21 0.000 4.686 0.000 -2.160 0 1
## V22 0.000 4.793 0.000 -1.045 0 1
## V23 0.000 5.082 0.000 -1.518 0 1
## V24 0.000 0.000 1.831 -0.046 0 1
## V25 0.000 0.000 1.978 -0.892 0 1
## V26 0.000 0.000 2.195 -0.562 0 1
## V27 0.000 0.000 2.019 -1.171 0 1
## V28 0.000 0.000 0.913 -0.842 0 1
## V29 0.000 0.000 1.172 -0.208 0 1
## V30 0.000 0.000 1.289 -1.828 0 1
##
## $means
## F1 F2 F3
## 0 0 0
##
## $cov
## F1 F2 F3
## F1 1.000 0.632 0.418
## F2 0.632 1.000 0.423
## F3 0.418 0.423 1.000
M2(ss.fit)
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI
## stats 1369.074 402 0 0.06023654 0.0567092 0.06370648 0.06387498 0.9202768
## CFI
## stats 0.9263247
anova(twopl, ss.fit) # compare model fit with 2PL model
## AIC SABIC HQ BIC logLik X2 df p
## twopl 22630.32 22709.72 22734.91 22900.22 -11255.16
## ss.fit 21631.57 21714.93 21741.38 21914.96 -10752.78 1004.757 3 0
The simple-structure model fits substantially better than the unidimensional 2PL (X2 = 1004.76, df = 3, p < .001). The high correlations among the three factors indicate that there might be a general factor influencing all items. Here we will fit a bifactor MIRT model that incorporates a general factor as well as the specific factors.
bif.mod <- mirt.model('F0 = 1-30
F1 = 1-16
F2 = 17-23
F3 = 24-30') # no correlation among factors
bif.fit <- mirt(math, bif.mod, method = "MHRM", verbose = F)
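As a side note, the same bifactor structure can be fit with mirt's bfactor() convenience function, which uses a dimension-reduced EM algorithm by default (a sketch; estimates may differ slightly from the MH-RM fit above):

spec <- c(rep(1, 16), rep(2, 7), rep(3, 7)) # specific-factor membership for each item
bif.fit.em <- bfactor(math, spec)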
coef(bif.fit, simplify=T)
## $items
## a1 a2 a3 a4 d g u
## V1 0.676 0.446 0.000 0.000 2.296 0 1
## V2 0.867 0.470 0.000 0.000 0.298 0 1
## V3 1.128 0.494 0.000 0.000 0.128 0 1
## V4 1.190 0.772 0.000 0.000 -0.546 0 1
## V5 1.874 1.387 0.000 0.000 0.972 0 1
## V6 1.584 1.450 0.000 0.000 0.795 0 1
## V7 1.956 1.315 0.000 0.000 -1.160 0 1
## V8 2.314 1.382 0.000 0.000 -1.253 0 1
## V9 1.181 -0.025 0.000 0.000 1.510 0 1
## V10 1.370 0.024 0.000 0.000 -0.112 0 1
## V11 1.189 0.047 0.000 0.000 0.863 0 1
## V12 1.526 -0.198 0.000 0.000 0.303 0 1
## V13 2.192 -0.735 0.000 0.000 1.350 0 1
## V14 2.350 -1.357 0.000 0.000 2.177 0 1
## V15 2.268 -0.920 0.000 0.000 0.763 0 1
## V16 1.425 -0.527 0.000 0.000 0.605 0 1
## V17 1.171 0.000 0.411 0.000 1.079 0 1
## V18 1.236 0.000 0.286 0.000 -0.572 0 1
## V19 1.085 0.000 0.343 0.000 0.341 0 1
## V20 2.306 0.000 3.864 0.000 -1.115 0 1
## V21 2.488 0.000 3.657 0.000 -2.059 0 1
## V22 2.486 0.000 4.065 0.000 -0.984 0 1
## V23 2.686 0.000 4.358 0.000 -1.496 0 1
## V24 0.501 0.000 0.000 2.116 -0.046 0 1
## V25 0.921 0.000 0.000 1.705 -0.861 0 1
## V26 0.856 0.000 0.000 2.015 -0.543 0 1
## V27 0.841 0.000 0.000 1.885 -1.174 0 1
## V28 0.419 0.000 0.000 0.734 -0.817 0 1
## V29 0.691 0.000 0.000 0.872 -0.187 0 1
## V30 0.699 0.000 0.000 0.978 -1.773 0 1
##
## $means
## F0 F1 F2 F3
## 0 0 0 0
##
## $cov
## F0 F1 F2 F3
## F0 1 0 0 0
## F1 0 1 0 0
## F2 0 0 1 0
## F3 0 0 0 1
M2(bif.fit, QMC = T) # QMC=T is useful for high dimensional models
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI
## stats 712.9979 375 0 0.03687097 0.03271299 0.04094008 0.04450582 0.9701301
## CFI
## stats 0.9742501
anova(ss.fit, bif.fit) # compare model fit with SS model; note that the bifactor and SS models may not be nested
## AIC SABIC HQ BIC logLik X2 df p
## ss.fit 21631.57 21714.93 21741.38 21914.96 -10752.78
## bif.fit 21262.28 21381.37 21419.15 21667.12 -10541.14 423.288 27 0
# person trait estimates
scores <- fscores(bif.fit, QMC = T) # EAP estimate (default)
head(scores)
## F0 F1 F2 F3
## [1,] 0.6854651 -0.6170275 -0.057392681 0.22762860
## [2,] 1.9940210 0.3673618 0.589653492 0.54422506
## [3,] 0.3322499 -0.3040698 1.140201671 -0.02785661
## [4,] 0.5497792 0.7136597 -0.008131884 0.75932158
## [5,] -0.8706914 -0.9228804 -0.709613945 -0.94009512
## [6,] -1.7023432 0.6250598 -0.467668323 0.45828490
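If the precision of the trait estimates is of interest, fscores() can also return standard errors via the full.scores.SE argument (a sketch; output not shown):

fscores(bif.fit, QMC = T, full.scores.SE = T) # appends an SE column for each factor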
The data is a modified version of the ‘SAT12’ dataset from the mirt package (Chalmers, 2012). It contains binary item responses from 600 examinees to 32 items of a grade 12 science assessment test (SAT). The items form three clusters (Items 1-13, Items 14-20, and Items 21-32) measuring different topics in science. The original item responses were scored using the key [1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,3].
SAT <- read.table("https://raw.github.com/nkims/Classdata/SATdata/SAT.csv", sep=",")
1. Fit a simple-structure MIRT model based on the domain-related classification of the items
1-1. Evaluate the model fit statistics
1-2. Obtain the item parameter estimates
2. Fit a bifactor MIRT model
2-1. Evaluate the model fit statistics
2-2. Compare the model fit to that of the simple-structure model
2-3. Obtain the item parameter estimates
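A possible starting point for Exercise 1, assuming the three item clusters define the factors (fit evaluation and model comparison are left as the exercise):

sat.mod <- mirt.model('F1 = 1-13
F2 = 14-20
F3 = 21-32
COV = F1*F2*F3')
sat.fit <- mirt(SAT, sat.mod, method = "MHRM", verbose = F)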
The data we will use is a subset of the survey data collected in the Trends in International Mathematics and Science Study (TIMSS) 2015 (Grade 8). The dataset contains item responses from 1,000 randomly sampled respondents in the U.S. administration of the nine items of the Mathematics Self-Efficacy scale. Each item was rated on four response categories: disagree a lot, disagree a little, agree a little, and agree a lot.
mathse <- read.table("https://raw.github.com/nkims/Classdata/mathse/mathse.csv", sep=",")
head(mathse)
## V1 V2 V3 V4 V5 V6 V7 V8 V9
## 1 4 1 4 4 4 2 4 4 4
## 2 4 4 4 4 4 4 2 4 3
## 3 4 4 3 4 3 4 3 3 1
## 4 4 4 3 4 2 4 4 4 4
## 5 1 1 1 1 1 1 1 1 1
## 6 4 4 2 3 2 2 2 2 2
gpcm.fit <- mirt(mathse, 1, itemtype = "gpcm", verbose = F)
coef(gpcm.fit, simplify = T)
## $items
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## V1 2.188 0 1 2 3 0 3.686 6.238 5.929
## V2 1.562 0 1 2 3 0 1.778 2.511 2.029
## V3 2.881 0 1 2 3 0 2.168 2.785 1.344
## V4 2.038 0 1 2 3 0 2.928 4.096 2.997
## V5 0.828 0 1 2 3 0 1.405 1.721 1.583
## V6 1.888 0 1 2 3 0 1.949 2.549 0.781
## V7 0.766 0 1 2 3 0 0.811 1.082 0.364
## V8 2.519 0 1 2 3 0 1.941 2.934 2.678
## V9 1.736 0 1 2 3 0 1.730 1.844 0.825
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
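Category response curves for the fitted GPCM can be visualized with (a sketch):

plot(gpcm.fit, type = "trace") # option characteristic curves for all items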
gpcm.mod <- mirt.model('SE = 1-9')
gpcm.fit2 <- mirt(mathse, gpcm.mod, itemtype = "gpcm", verbose = F)
print(gpcm.fit2)
##
## Call:
## mirt(data = mathse, model = gpcm.mod, itemtype = "gpcm", verbose = F)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 79 EM iterations.
## mirt version: 1.37.1
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -9584.187
## Estimated parameters: 36
## AIC = 19240.37
## BIC = 19417.05; SABIC = 19302.71
## G2 (262107) = 6529.66, p = 1
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(gpcm.fit2, simplify = T)
## $items
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## V1 2.188 0 1 2 3 0 3.686 6.238 5.929
## V2 1.562 0 1 2 3 0 1.778 2.511 2.029
## V3 2.881 0 1 2 3 0 2.168 2.785 1.344
## V4 2.038 0 1 2 3 0 2.928 4.096 2.997
## V5 0.828 0 1 2 3 0 1.405 1.721 1.583
## V6 1.888 0 1 2 3 0 1.949 2.549 0.781
## V7 0.766 0 1 2 3 0 0.811 1.082 0.364
## V8 2.519 0 1 2 3 0 1.941 2.934 2.678
## V9 1.736 0 1 2 3 0 1.730 1.844 0.825
##
## $means
## SE
## 0
##
## $cov
## SE
## SE 1
If the items measure different dimensions, we can include additional factors. Suppose we have another dataset with responses to 25 personality self-report items measuring five subdimensions: Agreeableness (Items 1-5), Conscientiousness (Items 6-10), Extraversion (Items 11-15), Neuroticism (Items 16-20), and Openness (Items 21-25). In that case, we can add more dimensions and specify the factor loadings as we did for the simple-structure MIRT model example above.
ss.bfi <- mirt.model('A = 1-5
C = 6-10
E = 11-15
N = 16-20
O = 21-25
COV = A*C*E*N*O')
# 'bfi' stands for the 25-item personality data; for instance, it could be
# created from the psych package: bfi <- na.omit(psych::bfi[, 1:25])
bfi.fit <- mirt(bfi, ss.bfi, itemtype = "gpcm", method = "QMCEM")
The nominal response model is more flexible than the GPCM because we can specify the scoring coefficients (i.e., the relationship between an item and a factor). If the scoring function is specified as \((0, 1, 2, ..., K-1)\), where \(K\) is the number of response categories, the multidimensional nominal response model (MNRM) is identical to the GPCM. If the factors/traits need to be scored differently for each category, the scoring function should be specified using the gpcm_mats argument.
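As a quick sanity check (a sketch, not part of the original output), supplying the default \((0, 1, 2, 3)\) scoring function through gpcm_mats should reproduce the GPCM fit above:

default.sf <- lapply(1:9, function(i) matrix(0:3, 4, 1)) # one K x 1 scoring matrix per item
gpcm.check <- mirt(mathse, 1, itemtype = "gpcm", gpcm_mats = default.sf, verbose = F)
anova(gpcm.fit, gpcm.check) # log-likelihoods should be (essentially) identical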
Let’s say we hypothesize that both the math self-efficacy trait and the extreme response style (ERS) can influence the item responses. Then we need to model both dimensions (self-efficacy and ERS) by specifying the scoring functions for each dimension because the two factors have different relationships with item categories.
# create a list of scoring functions (one K by D matrix per item)
ers.sf <- list() # a list of matrices
for (i in 1:9){
ers.sf[[i]] <- matrix(c(0, 1, 2, 3, # column 1: SE scoring (0 to K-1, as in the GPCM)
1, 0, 0, 1), 4, 2) # column 2: ERS scoring (extreme categories score 1); K by D matrix
}
ers.mod <- mirt.model('SE = 1-9
ERS = 1-9
COV = SE*ERS')
ers.fit <- mirt(mathse, ers.mod, itemtype = "gpcm", gpcm_mats = ers.sf, verbose = F)
print(ers.fit)
##
## Call:
## mirt(data = mathse, model = ers.mod, itemtype = "gpcm", gpcm_mats = ers.sf,
## verbose = F)
##
## Full-information item factor analysis with 2 factor(s).
## Converged within 1e-04 tolerance after 162 EM iterations.
## mirt version: 1.37.1
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 31
## Latent density type: Gaussian
##
## Log-likelihood = -9174.408
## Estimated parameters: 46
## AIC = 18440.82
## BIC = 18666.57; SABIC = 18520.47
## G2 (262097) = 5710.11, p = 1
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(ers.fit, simplify = T)
## $items
## a1 a2 ak0_1 ak1_1 ak2_1 ak3_1 ak0_2 ak1_2 ak2_2 ak3_2 d0 d1 d2
## V1 2.161 1.150 0 1 2 3 1 0 0 1 0 3.961 6.528
## V2 1.485 1.459 0 1 2 3 1 0 0 1 0 1.987 2.641
## V3 2.536 2.038 0 1 2 3 1 0 0 1 0 2.232 2.707
## V4 1.991 2.322 0 1 2 3 1 0 0 1 0 3.694 4.782
## V5 0.780 1.077 0 1 2 3 1 0 0 1 0 1.523 1.786
## V6 1.654 2.196 0 1 2 3 1 0 0 1 0 2.306 2.813
## V7 0.711 1.197 0 1 2 3 1 0 0 1 0 0.929 1.167
## V8 2.389 2.097 0 1 2 3 1 0 0 1 0 2.053 2.937
## V9 1.552 1.773 0 1 2 3 1 0 0 1 0 1.914 1.916
## d3
## V1 6.165
## V2 2.185
## V3 1.401
## V4 3.501
## V5 1.659
## V6 0.951
## V7 0.421
## V8 2.867
## V9 0.963
##
## $means
## SE ERS
## 0 0
##
## $cov
## SE ERS
## SE 1.00 -0.07
## ERS -0.07 1.00
The estimated correlation between SE and ERS is close to zero (-0.07). We will now discuss how we can fit an IRTree model to the data. Let's say our hypothesized tree structure looks like the figure below.
To fit an IRTree model, we first need to recode the item responses based on the tree structure (i.e., create pseudo-item decisions).
# import 'dendrify2' function from 'flirt' package (Jeon et al., 2014)
source("https://raw.github.com/nkims/Classdata/flirt/dendrify2.R")
# rows = the four response categories (1 = disagree a lot, ..., 4 = agree a lot)
# columns = tree nodes (node 1: disagree vs. agree; nodes 2 and 3: mild vs. extreme)
mapping <- matrix(c(0, 0, 1, 1,
1, 0, NA, NA,
NA, NA, 0, 1), 4, 3)
mapping
## [,1] [,2] [,3]
## [1,] 0 1 NA
## [2,] 0 0 NA
## [3,] 1 NA 0
## [4,] 1 NA 1
pseudo <- dendrify2(mathse, mapping, wide=T)[-1]
head(pseudo)
## value.i1:node1 value.i2:node1 value.i3:node1 value.i4:node1 value.i5:node1
## 1 1 0 1 1 1
## 2 1 1 1 1 1
## 3 1 1 1 1 1
## 4 1 1 1 1 0
## 5 0 0 0 0 0
## 6 1 1 0 1 0
## value.i6:node1 value.i7:node1 value.i8:node1 value.i9:node1 value.i1:node2
## 1 0 1 1 1 NA
## 2 1 0 1 1 NA
## 3 1 1 1 0 NA
## 4 1 1 1 1 NA
## 5 0 0 0 0 1
## 6 0 0 0 0 NA
## value.i2:node2 value.i3:node2 value.i4:node2 value.i5:node2 value.i6:node2
## 1 1 NA NA NA 0
## 2 NA NA NA NA NA
## 3 NA NA NA NA NA
## 4 NA NA NA 0 NA
## 5 1 1 1 1 1
## 6 NA 0 NA 0 0
## value.i7:node2 value.i8:node2 value.i9:node2 value.i1:node3 value.i2:node3
## 1 NA NA NA 1 NA
## 2 0 NA NA 1 1
## 3 NA NA 1 1 1
## 4 NA NA NA 1 1
## 5 1 1 1 NA NA
## 6 0 0 0 1 1
## value.i3:node3 value.i4:node3 value.i5:node3 value.i6:node3 value.i7:node3
## 1 1 1 1 NA 1
## 2 1 1 1 1 NA
## 3 0 1 0 1 0
## 4 0 1 NA 1 1
## 5 NA NA NA NA NA
## 6 NA 0 NA NA NA
## value.i8:node3 value.i9:node3
## 1 1 1
## 2 1 0
## 3 0 NA
## 4 1 1
## 5 NA NA
## 6 NA NA
How would you recode the item responses based on the tree structure shown below?
erstree <- mirt.model('F1 = 1-9
F2 = 10-18
F3 = 19-27
COV = F2*F3')
erstree.fit <- mirt(pseudo, erstree, itemtype = "2PL", method = "MHRM", verbose = F)
print(erstree.fit)
##
## Call:
## mirt(data = pseudo, model = erstree, itemtype = "2PL", method = "MHRM",
## verbose = F)
##
## Full-information item factor analysis with 3 factor(s).
## Converged within 0.001 tolerance after 267 MHRM iterations.
## mirt version: 1.37.1
## M-step optimizer: NR1
## Latent density type: Gaussian
## Average MH acceptance ratio(s): 0.344
##
## Log-likelihood = -9433.887, SE = 0.04
## Estimated parameters: 55
## AIC = 18977.77
## BIC = 19247.7; SABIC = 19073.02
coef(erstree.fit, simplify = T)
## $items
## a1 a2 a3 d g u
## value.i1:node1 2.658 0.000 0.000 2.979 0 1
## value.i2:node1 2.233 0.000 0.000 1.022 0 1
## value.i3:node1 3.312 0.000 0.000 0.581 0 1
## value.i4:node1 2.735 0.000 0.000 1.488 0 1
## value.i5:node1 1.409 0.000 0.000 0.736 0 1
## value.i6:node1 2.528 0.000 0.000 0.601 0 1
## value.i7:node1 1.272 0.000 0.000 0.291 0 1
## value.i8:node1 3.549 0.000 0.000 1.339 0 1
## value.i9:node1 2.397 0.000 0.000 0.218 0 1
## value.i1:node2 0.000 1.454 0.000 -1.759 0 1
## value.i2:node2 0.000 2.153 0.000 -0.722 0 1
## value.i3:node2 0.000 2.460 0.000 0.046 0 1
## value.i4:node2 0.000 3.119 0.000 -2.092 0 1
## value.i5:node2 0.000 1.213 0.000 -0.970 0 1
## value.i6:node2 0.000 2.455 0.000 -0.694 0 1
## value.i7:node2 0.000 1.218 0.000 -0.411 0 1
## value.i8:node2 0.000 2.834 0.000 0.171 0 1
## value.i9:node2 0.000 3.490 0.000 -0.812 0 1
## value.i1:node3 0.000 0.000 1.916 0.229 0 1
## value.i2:node3 0.000 0.000 1.988 0.002 0 1
## value.i3:node3 0.000 0.000 3.467 -0.349 0 1
## value.i4:node3 0.000 0.000 2.738 -0.652 0 1
## value.i5:node3 0.000 0.000 1.204 0.085 0 1
## value.i6:node3 0.000 0.000 2.518 -1.504 0 1
## value.i7:node3 0.000 0.000 1.387 -0.672 0 1
## value.i8:node3 0.000 0.000 3.737 1.115 0 1
## value.i9:node3 0.000 0.000 1.787 -0.340 0 1
##
## $means
## F1 F2 F3
## 0 0 0
##
## $cov
## F1 F2 F3
## F1 1 0.000 0.000
## F2 0 1.000 0.672
## F3 0 0.672 1.000
# Assume that the factors underlying the second and third nodes are the same
erstree2 <- mirt.model('SE = 1-9
ERS = 10-27')
erstree2.fit <- mirt(pseudo, erstree2, itemtype = "2PL", method = "MHRM", verbose = F)
print(erstree2.fit)
##
## Call:
## mirt(data = pseudo, model = erstree2, itemtype = "2PL", method = "MHRM",
## verbose = F)
##
## Full-information item factor analysis with 2 factor(s).
## Converged within 0.001 tolerance after 203 MHRM iterations.
## mirt version: 1.37.1
## M-step optimizer: NR1
## Latent density type: Gaussian
## Average MH acceptance ratio(s): 0.336
##
## Log-likelihood = -9485.45, SE = 0.04
## Estimated parameters: 54
## AIC = 19078.9
## BIC = 19343.92; SABIC = 19172.41
coef(erstree2.fit, simplify = T)
## $items
## a1 a2 d g u
## value.i1:node1 2.659 0.000 2.970 0 1
## value.i2:node1 2.203 0.000 1.003 0 1
## value.i3:node1 3.316 0.000 0.567 0 1
## value.i4:node1 2.753 0.000 1.484 0 1
## value.i5:node1 1.397 0.000 0.726 0 1
## value.i6:node1 2.539 0.000 0.591 0 1
## value.i7:node1 1.260 0.000 0.284 0 1
## value.i8:node1 3.479 0.000 1.302 0 1
## value.i9:node1 2.376 0.000 0.205 0 1
## value.i1:node2 0.000 1.388 -1.535 0 1
## value.i2:node2 0.000 1.944 -0.420 0 1
## value.i3:node2 0.000 2.149 0.307 0 1
## value.i4:node2 0.000 2.873 -1.538 0 1
## value.i5:node2 0.000 1.075 -0.847 0 1
## value.i6:node2 0.000 2.095 -0.334 0 1
## value.i7:node2 0.000 1.065 -0.311 0 1
## value.i8:node2 0.000 2.503 0.488 0 1
## value.i9:node2 0.000 2.403 -0.424 0 1
## value.i1:node3 0.000 1.559 0.227 0 1
## value.i2:node3 0.000 1.736 0.084 0 1
## value.i3:node3 0.000 3.274 -0.137 0 1
## value.i4:node3 0.000 2.570 -0.487 0 1
## value.i5:node3 0.000 1.116 0.111 0 1
## value.i6:node3 0.000 2.440 -1.337 0 1
## value.i7:node3 0.000 1.324 -0.603 0 1
## value.i8:node3 0.000 3.377 1.209 0 1
## value.i9:node3 0.000 1.712 -0.246 0 1
##
## $means
## SE ERS
## 0 0
##
## $cov
## SE ERS
## SE 1 0
## ERS 0 1
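To compare the two IRTree specifications (a sketch; note that the single-ERS-factor model corresponds to fixing the F2-F3 correlation at 1, a boundary constraint, so the likelihood-ratio test should be interpreted with caution):

anova(erstree2.fit, erstree.fit)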