Stimulus-Based Assessment in mstATA: Conditional Item Selection

Modeling Strategy: Conditional Item Selection

Decision variables

A stimulus-based model typically includes:

Item selection variables, e.g., \(x_{i,m}\) (item i selected in module m)
Stimulus selection variables, e.g., \(z_{s,m}\) (stimulus s selected in module m)

The item-stimulus conditional inclusion is: If item i belongs to stimulus s, then selecting item i implies selecting stimulus s. In other words, for each module m and each item i associated with stimulus s, \(x_{i,m} \le z_{s,m}\). This ensures the model never selects an item “without its stimulus”.
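The linking inequality can be illustrated with a small base-R sketch. The toy pool, the object names, and the single-module setup below are assumptions made for illustration; this is not how mstATA builds its constraint matrix internally.

```r
# Toy pool: five items, two stimuli, one module (all names are illustrative).
item_stim <- c(i1 = "s1", i2 = "s1", i3 = "s1", i4 = "s2", i5 = "s2")
stimuli <- unique(item_stim)
n_item <- length(item_stim)
n_stim <- length(stimuli)

# One row per item encodes x_i - z_s <= 0.
# Columns are ordered (x_1, ..., x_5, z_1, z_2).
A <- matrix(0, nrow = n_item, ncol = n_item + n_stim,
            dimnames = list(names(item_stim), c(names(item_stim), stimuli)))
A[cbind(seq_len(n_item), seq_len(n_item))] <- 1
A[cbind(seq_len(n_item), n_item + match(item_stim, stimuli))] <- -1

# TRUE when a 0/1 selection vector satisfies every linking row.
is_consistent <- function(sel) all(A %*% sel <= 0)

ok  <- c(1, 1, 0, 0, 0, 1, 0)  # items i1 and i2 together with stimulus s1
bad <- c(0, 0, 0, 1, 0, 1, 0)  # item i4 selected without its stimulus s2
is_consistent(ok)
is_consistent(bad)
```

Each item contributes exactly one row, so the number of linking rows grows with the number of item-module pairs.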

The pivot-item method (van der Linden, 2000) provides a more efficient and elegant way to represent the dependency between items and their associated stimuli. Formally, a pivot item is defined as an item that is selected if and only if its corresponding stimulus is selected. In practice, the pivot item is typically chosen as the item within a stimulus that best represents the stimulus—identified by content experts as having the most representative content or desirable psychometric properties.

Each stimulus-based set includes exactly one pivot item. The binary decision variable associated with this pivot item therefore serves as a carrier for the selection of both the item and its associated stimulus. Consequently, the pivot-item formulation allows mstATA to accommodate item pools consisting solely of discrete items, solely of stimulus-based items, or a mixture of both. In all cases, the same set of decision variables is used to represent item–module selection, thereby ensuring a unified modeling structure regardless of item type.

Logical constraints and linking

Maintaining consistency between item and stimulus selection requires logical constraints governing item–stimulus conditional selection. van der Linden (2005, pp. 165-170) identifies three classes of such constraints: (1) limits on the number of items associated with a selected stimulus, (2) category-specific item limits within a stimulus, and (3) bounds on the sum of item-level quantitative attributes associated with a stimulus.

First, limits on the number of items associated with a selected stimulus allow two selection regimes: all-in/all-out selection and partial selection. Under all-in/all-out selection, selecting a stimulus implies that all associated items are selected, whereas under partial selection, only a subset of associated items may be selected. In mstATA, when only a minimum number of items associated with a selected stimulus is specified, an upper bound equal to the total number of items associated with that stimulus is automatically added. This upper bound provides a safe gating mechanism to preserve consistency between item and stimulus selection.
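One common way to write the item-count limits, using the stimulus variables introduced earlier, is sketched below. The symbols \(V_s\) (the set of items linked to stimulus s), \(n_s^{\min}\), and \(n_s^{\max}\) are notation assumed here for illustration; the exact form used inside mstATA may differ.

```latex
\[
n_s^{\min}\, z_{s,m} \;\le\; \sum_{i \in V_s} x_{i,m} \;\le\; n_s^{\max}\, z_{s,m}
\qquad \text{for each stimulus } s \text{ and module } m .
\]
```

When \(z_{s,m} = 0\), the upper bound forces every \(x_{i,m}\) with \(i \in V_s\) to zero, which is the gating role described above. The lower bound alone would leave those items free, which is why an upper bound of \(|V_s|\) is needed when only a minimum is given. All-in/all-out selection corresponds to \(n_s^{\min} = n_s^{\max} = |V_s|\). Under the pivot-item formulation, \(z_{s,m}\) would be replaced by the decision variable of the pivot item of stimulus s.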

Second, category-specific item limits within a stimulus restrict the number of items from a particular category that may be selected within a selected stimulus. Similarly, in mstATA, if only a minimum category-specific requirement is specified, an upper bound equal to the total number of items in that category associated with the stimulus is automatically added to ensure consistency.

Third, bounds on the sum of item-level quantitative attributes associated with a stimulus impose lower and/or upper limits on aggregated item attributes, such as total difficulty values or word counts. However, even when both lower and upper bounds are imposed, consistency between item and stimulus selection may not be guaranteed unless all item-level quantitative attributes are strictly positive. To prevent users from inadvertently applying this specification without verifying the strict positivity assumption, mstATA detects whether this constraint type is the sole logical constraint governing item–stimulus conditional selection—that is, when neither limits on the number of associated items nor category-specific limits are specified. In such cases, mstATA automatically applies a partial‑selection rule—setting the minimum to the smallest number of items linked to any stimulus and the maximum to the largest—to ensure consistent and feasible item selection.
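The role of strict positivity can be seen in a small enumeration. The sketch below assumes the bounds are gated by the stimulus variable, so that an unselected stimulus (z = 0) requires the attribute sum to equal zero; the attribute values and the helper name `leaks` are made up for illustration.

```r
# All 0/1 selections of three items linked to one stimulus.
selections <- as.matrix(expand.grid(x1 = 0:1, x2 = 0:1, x3 = 0:1))

# Non-empty selections whose attribute sum lies in [lower * z, upper * z].
# With z = 0 these are items that slip in although the stimulus is not selected.
leaks <- function(q, lower, upper, z = 0) {
  total <- as.vector(selections %*% q)
  keep <- total >= lower * z & total <= upper * z & rowSums(selections) > 0
  selections[keep, , drop = FALSE]
}

difficulty <- c(-1.5, 0.5, 1.0)  # mixed signs: values can cancel to zero
word_like  <- c(40, 55, 60)      # strictly positive: no cancellation possible

leaks(difficulty, lower = -1, upper = 1)    # one offending selection (all three items)
leaks(word_like, lower = 90, upper = 150)   # no offending selection
```

With mixed-sign attributes such as difficulty, the three items sum to zero and satisfy the gated bounds without their stimulus. With strictly positive attributes, only the empty selection does.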

Functions

Stimulus-related constraints:

Stimulus-level

stimcat_con(): constrain whether a stimulus must or must not be selected.
stimquant_con(): constrain the quantitative attribute values of a stimulus to be selected.

Itemset-level (items linked to the same stimulus)

stim_itemcount_con(): constrain the min/exact/max number of items selected conditional on the selection of a stimulus.
stim_itemcat_con(): constrain the min/exact/max number of items selected from category c conditional on the selection of a stimulus.
stim_itemquant_con(): constrain the min/exact/max values for the sum of item quantitative attribute values within a selected stimulus.

Module-/Pathway-level

test_stimcount_con(): constrain the min/equal/max number of stimuli in a module or a pathway.
test_stimcat_con(): constrain the min/equal/max number of stimuli from specific categories in a module or pathway.
test_stimquant_con(): constrain the min/equal/max for the sum of the stimulus quantitative attribute in a module or pathway.

Panel-level

panel_stimcat_con(): constrain the min/equal/max number of stimuli from specific categories within a panel.

Solution-level

solution_stimcount_con(): constrain the min/equal/max number of unique stimuli across multiple panels.
solution_stimcat_con(): constrain the min/equal/max number of unique stimuli from specific categories across multiple panels.

Worked Example

Item pool data

A simulated item pool is used for analysis and can be accessed via data("reading_itempool"). The pool contains 500 items nested within 64 passages, comprising 407 multiple‑choice (MC) items and 93 technology‑enhanced items (TEIs). The distribution of items across the four content domains is 120, 128, 135, and 117 items, respectively. Of the 64 passages, 36 belong to the history domain and 28 to social studies. The pool includes two enemy‑item sets and one enemy‑stimulus set. Summary descriptive statistics for item‑ and stimulus‑level quantitative attributes are presented below.

Descriptive statistics for item and stimulus quantitative attributes.
Attribute	Level	Mean	SD	Min	Max
Discrimination	Item	0.92	0.19	0.51	1.59
Difficulty	Item	-0.01	0.97	-3.24	2.14
Guessing	Item	0.10	0.04	0.01	0.26
Response Time	Item	120.11	35.01	60.00	180.00
Word Counts	Stimulus	123.47	35.01	52.00	199.00

Specifications

An MST panel with a 1-2-3 design is assembled using the pivot-item method to jointly select items and stimuli. Within each stimulus, the pivot item is defined as the item with the highest discrimination parameter.
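The pivot rule used in this example can be sketched in base R as follows. The toy data frame and the logical column `is_pivot` are hypothetical; the format that mstATA expects for its pivot indicator is not shown here.

```r
# Toy pool: six items nested in two stimuli (values are made up).
pool <- data.frame(
  item_id = paste0("item", 1:6),
  stimulus = rep(c("stim1", "stim2"), each = 3),
  discrimination = c(0.8, 1.2, 0.9, 1.1, 0.7, 1.4)
)

# Within each stimulus, flag the first item with the largest discrimination.
# which.max() breaks ties by position, so each stimulus gets exactly one pivot.
flag <- ave(pool$discrimination, pool$stimulus,
            FUN = function(a) seq_along(a) == which.max(a))
pool$is_pivot <- flag == 1

pool$item_id[pool$is_pivot]
```

Breaking ties by position is a design choice of this sketch; content experts might prefer another tie-breaker.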

Specifications include:

3 stages, 6 modules, 4 pathways (S1R-S2E-S3H and S1R-S2H-S3E are not allowed).
Modules in stages 1, 2, and 3 each contain 12 items.
Routing decision points: 0 for routing to stage 2; -0.43 and 0.43 for routing to stage 3.
Unique items are used across modules in the panel.
Min and Max number of items in content 1-4 per pathway: 7-11 for each content.
Exact number of TEIs per module: 2.
Average response time per module: 110-130 seconds per item.
Number of passages and passage types: one history passage, one social studies passage in each module.
Items in the same enemy item set cannot appear in the same pathway.
Stimuli in the same enemy stimulus set cannot appear in the same pathway.
If a stimulus is selected in a module, at least 4 items, at most 8 items from that stimulus are selected.
The selected stimulus must have at least 90, at most 150 words.
Maximize TIF values at \(\theta = c(-1.39, -0.97, -0.68)\) for the S1R-S2E-S3E pathway, \(\theta = c(-0.43,-0.21, 0)\) for the S1R-S2E-S3M pathway, \(\theta = c(0, 0.21, 0.43)\) for the S1R-S2H-S3M pathway, and \(\theta = c(0.68, 0.97, 1.39)\) for the S1R-S2H-S3H pathway. For each pathway, the middle target theta point is prioritized over the other target theta points. This priority is operationalized by requiring the TIF value at the middle theta point to be 1.5 times the TIF values at the remaining target theta points.

Code

Step 1: Prepare the Item Pool

data("reading_itempool")
# Target theta points for each of the four allowed pathways
REE<-c(-1.39,-0.97,-0.68)
REM<-c(-0.43,-0.21,0)
RHM<-c(0,0.21,0.43)
RHH<-c(0.68,0.97,1.39)
theta_values<-unique(c(REE,REM,RHM,RHH))
# Item information at every distinct target theta (3PL, D = 1.7)
item_par_cols<-list("3PL"=c("discrimination","difficulty","guessing"))
theta_iif<-compute_iif(reading_itempool,
                       item_par_cols = item_par_cols,
                       theta = theta_values,model_col = "model",
                       D = 1.7)
# Store the information values as item-level attributes
reading_itempool[,paste0("iif(theta=",theta_values,")")]<-theta_iif
# Build the enemy item sets and the enemy stimulus sets
enemyitem_set<-create_enemy_sets(reading_itempool$item_id,
                                 reading_itempool$enemy_item)
enemystim_set<-create_enemy_sets(reading_itempool$stimulus,
                                 reading_itempool$enemy_stimulus)
# Link each stimulus to its pivot item
pivot_stim_map<-create_pivot_stimulus_map(reading_itempool,
                                          item_id_col = "item_id",
                                          stimulus = "stimulus",
                                          pivot_item = "pivot_item")
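For readers who want to see what an item information value at a target theta looks like, the following base-R sketch uses the standard 3PL information formula with D = 1.7. Whether compute_iif() uses exactly this expression is an assumption; the function name `iif_3pl` and the two toy items are hypothetical.

```r
# Standard 3PL item information; a = discrimination, b = difficulty,
# g = guessing (named g to avoid masking base::c inside the function).
iif_3pl <- function(theta, a, b, g, D = 1.7) {
  p <- g + (1 - g) / (1 + exp(-D * a * (theta - b)))
  (D * a)^2 * ((1 - p) / p) * ((p - g) / (1 - g))^2
}

# Two made-up items evaluated at three theta points.
items <- data.frame(discrimination = c(0.9, 1.2),
                    difficulty = c(-0.5, 0.4),
                    guessing = c(0.10, 0.15))
theta_values <- c(-0.43, 0, 0.43)

# Rows are items, columns are theta points.
info <- sapply(theta_values, function(th) {
  iif_3pl(th, items$discrimination, items$difficulty, items$guessing)
})
colnames(info) <- paste0("iif(theta=", theta_values, ")")
info
```

The column names follow the same "iif(theta=...)" pattern that the vignette code uses when it stores the information values in the item pool.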

Step 2: Specify the MST Structure

mst_123 <- mst_design(itempool = reading_itempool,item_id_col = "item_id",
                      design = "1-2-3",rdps = list(c(0),c(-0.43,0.43)),
                      exclude_pathways = c("1-1-3","1-2-1"),
                      module_length = c(12,12,12,12,12,12),
                      enemyitem_set = enemyitem_set,
                      enemystim_set = enemystim_set,
                      pivot_stim_map = pivot_stim_map)
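The pathway count follows from the design: a 1-2-3 layout has 1 x 2 x 3 = 6 possible pathways, and excluding two leaves four. A base-R sketch of that enumeration is given below; the "stage1-stage2-stage3" labels mirror the strings passed to exclude_pathways, but how mst_design() enumerates pathways internally is an assumption.

```r
# Number of modules per stage in a 1-2-3 design.
n_modules <- c(1, 2, 3)

# Every combination of one module per stage, labelled by within-stage index.
grid <- expand.grid(stage1 = seq_len(n_modules[1]),
                    stage2 = seq_len(n_modules[2]),
                    stage3 = seq_len(n_modules[3]))
all_paths <- sort(paste(grid$stage1, grid$stage2, grid$stage3, sep = "-"))

# Drop the two pathways that jump between easy and hard modules.
excluded <- c("1-1-3", "1-2-1")
allowed <- setdiff(all_paths, excluded)
allowed
```

The four remaining labels correspond, in order, to S1R-S2E-S3E, S1R-S2E-S3M, S1R-S2H-S3M, and S1R-S2H-S3H.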

Step 3: Identify hierarchical requirements

mst structure: MST 1-2-3 (S1R-S2E-S3H and S1R-S2H-S3E are not allowed). Modules in stages 1, 2, and 3 each contain 12 items. Routing decision points: 0 for routing to stage 2; -0.43 and 0.43 for routing to stage 3.
panel-level item reuse: Unique items are used across modules in the panel.
pathway-level: no enemy item pairs, no enemy stimulus pairs, 7-11 items for each content.
module-level: 2 TEI items per module; one history passage and one social studies passage in each module; average response time of 110-130 seconds per item.
itemset-level: If a stimulus is selected in a module, at least 4 items, at most 8 items from that stimulus are selected.
stim-level: The selected stimulus must have at least 90, at most 150 words.
objective: maximize TIF values at \(\theta = c(-1.39, -0.97, -0.68)\) for S1R-S2E-S3E pathway, \(\theta = c(-0.43,-0.21, 0)\) for S1R-S2E-S3M pathway, \(\theta = c(0, 0.21, 0.43)\) for S1R-S2H-S3M pathway, \(\theta = c(0.68, 0.97, 1.39)\) for S1R-S2H-S3H pathway.

Step 4: Translate specifications

mst_structure<-mst_structure_con(x = mst_123,info_tol = 0.1)
#> 'ModuleLength' present in design; ignoring stage-level bounds for length.
mst_noreuse<-panel_itemreuse_con(x = mst_123,overlap = FALSE)
mst_noenemyitem<-enemyitem_exclu_con(x = mst_123)
mst_noenemystim<-enemystim_exclu_con(x = mst_123)
mst_content<-test_itemcat_range_con(x = mst_123,attribute = "content",
                                    cat_levels = paste0("content",1:4),
                                    min = 7,max = 11,
                                    which_pathway = 1:4)
mst_tei<-test_itemcat_con(x = mst_123,attribute = "itemtype",
                          cat_levels = "TEI",
                          operator = "=",target_num = 2,
                          which_module = 1:6)
mst_passtype<-test_stimcat_con(x = mst_123,attribute = "stimulus_type",
                               cat_levels = c("history","social studies"),
                               operator = "=",target_num = 1,
                               which_module = 1:6)
mst_time<-test_itemquant_range_con(x = mst_123,attribute = "time",
                                   min = 110*12,max = 130*12,
                                   which_module = 1:6)
mst_stimitem<-stim_itemcount_con(x = mst_123,min = 4,max = 8,
                                 which_module = 1:6)
mst_stimquant<-stimquant_con(x = mst_123,attribute = "stimulus_words",
                             min = 90,max = 150,
                             which_module = 1:6)
obj1<-objective_term(x = mst_123,attribute = "iif(theta=-1.39)",
                     applied_level = "Pathway-level",
                     which_pathway = 1,sense = "max")
obj2<-objective_term(x = mst_123,attribute = "iif(theta=-0.97)",
                     applied_level = "Pathway-level",
                     which_pathway = 1,sense = "max")
obj3<-objective_term(x = mst_123,attribute = "iif(theta=-0.68)",
                     applied_level = "Pathway-level",
                     which_pathway = 1,sense = "max")
obj4<-objective_term(x = mst_123,attribute = "iif(theta=-0.43)",
                     applied_level = "Pathway-level",
                     which_pathway = 2,sense = "max")
obj5<-objective_term(x = mst_123,attribute = "iif(theta=-0.21)",
                     applied_level = "Pathway-level",
                     which_pathway = 2,sense = "max")
obj6<-objective_term(x = mst_123,attribute = "iif(theta=0)",
                     applied_level = "Pathway-level",
                     which_pathway = 2,sense = "max")
obj7<-objective_term(x = mst_123,attribute = "iif(theta=0)",
                     applied_level = "Pathway-level",
                     which_pathway = 3,sense = "max")
obj8<-objective_term(x = mst_123,attribute = "iif(theta=0.21)",
                     applied_level = "Pathway-level",
                     which_pathway = 3,sense = "max")
obj9<-objective_term(x = mst_123,attribute = "iif(theta=0.43)",
                     applied_level = "Pathway-level",
                     which_pathway = 3,sense = "max")
obj10<-objective_term(x = mst_123,attribute = "iif(theta=0.68)",
                      applied_level = "Pathway-level",
                      which_pathway = 4,sense = "max")
obj11<-objective_term(x = mst_123,attribute = "iif(theta=0.97)",
                      applied_level = "Pathway-level",
                      which_pathway = 4,sense = "max")
obj12<-objective_term(x = mst_123,attribute = "iif(theta=1.39)",
                      applied_level = "Pathway-level",
                      which_pathway = 4,sense = "max")
mst_obj<-capped_maximin_obj(x = mst_123,
                            multiple_terms = list(obj1,obj2,obj3,
                                                  obj4,obj5,obj6,
                                                  obj7,obj8,obj9,
                                                  obj10,obj11,obj12),
                            strategy_args = list(proportions = rep(c(1,1.5,1),4)))
mst_model<-onepanel_spec(x = mst_123,
                         constraints = list(mst_structure,mst_noreuse,
                                            mst_content,mst_noenemyitem,mst_noenemystim,
                                            mst_tei,mst_passtype,mst_time,
                                            mst_stimitem,
                                            mst_stimquant),
                         objective = mst_obj)
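The twelve objective terms above follow a regular pattern: three theta points per pathway, with the middle point weighted 1.5. The base-R sketch below tabulates that pattern so it can be checked at a glance. The data frame `obj_spec` is a hypothetical helper and is not part of mstATA.

```r
# Target theta points per pathway, in pathway order 1 to 4.
targets <- list(REE = c(-1.39, -0.97, -0.68),
                REM = c(-0.43, -0.21, 0),
                RHM = c(0, 0.21, 0.43),
                RHH = c(0.68, 0.97, 1.39))

# One row per objective term: attribute column name, pathway index,
# and the relative proportion passed to the objective strategy.
obj_spec <- data.frame(
  attribute = paste0("iif(theta=", unlist(targets), ")"),
  which_pathway = rep(seq_along(targets), each = 3),
  proportion = rep(c(1, 1.5, 1), times = length(targets))
)
obj_spec
```

Theta 0 appears twice, once for pathway 2 and once for pathway 3, which matches obj6 and obj7 sharing the attribute "iif(theta=0)".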

Step 5: Execute assembly via solver

The model contains 2,146 linear constraints. Using HiGHS as the solver, an optimal solution is obtained within 2 minutes.

# It is not executed in the vignette to avoid long build times.
# \dontrun{
# mst_result<-solve_model(model_spec = mst_model,solver = "HiGHS",time_limit = 5*60)
# reading_panel<-assembled_panel(x = mst_123,result = mst_result)
# }

Step 6: Diagnose infeasible model

An optimal solution was found, so this step is skipped.

Step 7: Evaluate panel

The assembled panel is saved as data("reading_panel").

S1R contains: stim 12 (139 words, social studies, 7 MC items) and stim 51 (132 words, history, 3 MC items and 2 TEI items)
S2E contains: stim 1 (117 words, history, 7 MC items and 1 TEI item) and stim 8 (101 words, social studies, 3 MC items and 1 TEI item)
S2H contains: stim 38 (98 words, history, 7 MC items and 1 TEI item) and stim 53 (116 words, social studies, 3 MC items and 1 TEI item)
S3E contains: stim 37 (96 words, social studies, 5 MC items and 1 TEI item) and stim 48 (122 words, history, 5 MC items and 1 TEI item)
S3M contains: stim 4 (124 words, social studies, 3 MC items and 1 TEI item) and stim 45 (124 words, history, 7 MC items and 1 TEI item)
S3H contains: stim 30 (99 words, social studies, 4 MC items and 1 TEI item) and stim 36 (148 words, history, 6 MC items and 1 TEI item)
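The module composition reported above can be checked against the specifications with a few lines of base R. The data frame below is transcribed by hand from that listing, and its column names are chosen for illustration; in practice the same checks would be run on the assembled panel object.

```r
# Stimuli per module, transcribed from the reported panel composition.
panel <- data.frame(
  module = rep(c("S1R", "S2E", "S2H", "S3E", "S3M", "S3H"), each = 2),
  stim = c(12, 51, 1, 8, 38, 53, 37, 48, 4, 45, 30, 36),
  words = c(139, 132, 117, 101, 98, 116, 96, 122, 124, 124, 99, 148),
  type = c("social studies", "history", "history", "social studies",
           "history", "social studies", "social studies", "history",
           "social studies", "history", "social studies", "history"),
  mc = c(7, 3, 7, 3, 7, 3, 5, 5, 3, 7, 4, 6),
  tei = c(0, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
)
panel$n_items <- panel$mc + panel$tei

checks <- c(
  module_length = all(tapply(panel$n_items, panel$module, sum) == 12),
  tei_per_module = all(tapply(panel$tei, panel$module, sum) == 2),
  items_per_stim = all(panel$n_items >= 4 & panel$n_items <= 8),
  stim_words = all(panel$words >= 90 & panel$words <= 150),
  one_of_each_type = all(table(panel$module, panel$type) == 1)
)
checks
```

Every entry of `checks` should be TRUE if the listing satisfies the module-length, TEI, itemset-size, word-count, and passage-type requirements.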

No enemy item pairs or enemy stimulus pairs appear together in any pathway.

Routing decision points information check
panel_id	theta	information	module_id
Panel_1	0.00	3.57	S2E
Panel_1	0.00	3.53	S2H
Panel_1	-0.43	4.19	S3E
Panel_1	-0.43	4.20	S3M
Panel_1	0.43	5.33	S3M
Panel_1	0.43	5.23	S3H

Number of items per content check
panel_id	pathway_id	content1	content2	content3	content4
Panel_1	M-E-E	8	10	11	7
Panel_1	M-E-M	7	10	10	9
Panel_1	M-H-M	8	10	7	11
Panel_1	M-H-H	11	9	8	8

Average response time per item check
panel_id	module_id	attribute	average
Panel_1	S1M	time	120.3333
Panel_1	S2E	time	118.6667
Panel_1	S2H	time	117.0000
Panel_1	S3E	time	122.8333
Panel_1	S3M	time	118.0833
Panel_1	S3H	time	121.2500

Pathway-level information requirements and realized information at selected ability levels.
theta	pathway_id	must_greater_than	realized_information	must_lower_than
-1.39	M-E-E	8.941	11.640	13.728
-0.97	M-E-E	13.412	13.412	18.199
-0.68	M-E-E	8.941	13.728	13.728
-0.43	M-E-M	8.941	13.244	13.728
-0.21	M-E-M	13.412	13.489	18.199
0.00	M-E-M	8.941	13.294	13.728
0.00	M-H-M	8.941	13.252	13.728
0.21	M-H-M	13.412	13.467	18.199
0.43	M-H-M	8.941	13.357	13.728
0.68	M-H-H	8.941	13.726	13.728
0.97	M-H-H	13.412	13.473	18.199
1.39	M-H-H	8.941	11.213	13.728
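The information table can be verified in the same spirit. The sketch below re-enters the reported values by hand and checks three things: every realized value lies within its bounds, the lower bound at each middle theta point is 1.5 times the lower bound at the outer points (up to rounding), and the gap between upper and lower bound is the same in every row. Reading that constant gap as a cap on the maximin objective is an interpretation of the table, not a statement about how capped_maximin_obj() is implemented.

```r
# Values transcribed from the pathway-level information table.
tif <- data.frame(
  theta = c(-1.39, -0.97, -0.68, -0.43, -0.21, 0, 0, 0.21, 0.43, 0.68, 0.97, 1.39),
  pathway = rep(c("M-E-E", "M-E-M", "M-H-M", "M-H-H"), each = 3),
  lower = rep(c(8.941, 13.412, 8.941), times = 4),
  realized = c(11.640, 13.412, 13.728, 13.244, 13.489, 13.294,
               13.252, 13.467, 13.357, 13.726, 13.473, 11.213),
  upper = rep(c(13.728, 18.199, 13.728), times = 4)
)

# Realized information must sit inside [lower, upper] in every row.
within_bounds <- tif$realized >= tif$lower & tif$realized <= tif$upper

# Ratio of each lower bound to the smallest lower bound (1 or 1.5 expected).
lower_ratio <- round(tif$lower / min(tif$lower), 2)

# Width of the allowed band, expected to be constant across rows.
band_width <- round(tif$upper - tif$lower, 3)

all(within_bounds)
unique(lower_ratio)
unique(band_width)
```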