mfrmr Workflow

For a plot-first companion guide, see the separate mfrmr-visual-diagnostics vignette.

MML and Diagnostic Modes

mfrmr treats MML and JML differently on purpose.

MML integrates over the person distribution with Gauss-Hermite quadrature.
The current MML branch optimizes the quadrature-based marginal log-likelihood directly; it is not an EM implementation.
JML is often useful for quick exploratory passes, but MML remains the preferred route for final estimation and fixed-calibration follow-up.

For RSM and PCM, diagnostics now expose two distinct evidence paths:

diagnostic_mode = "legacy" keeps the residual/EAP-based stack.
diagnostic_mode = "marginal_fit" adds the strict latent-integrated screen.
diagnostic_mode = "both" is the safest default when you want to inspect both views side by side.

The strict marginal branch is still screening-oriented in the current release. Use summary(diag)$diagnostic_basis to separate the legacy residual evidence from the strict marginal evidence rather than pooling them into one decision.

Load Data

library(mfrmr)

list_mfrmr_data()

data("ej2021_study1", package = "mfrmr")
head(ej2021_study1)

study1_alt <- load_mfrmr_data("study1")
identical(names(ej2021_study1), names(study1_alt))

Minimal Runnable Example

Start with the packaged example_core dataset. It is intentionally compact, category-complete, and generated from a single latent trait plus facet main effects so that help-page examples stay fast without relying on undersized toy data. The same object is also available via data("mfrmr_example_core", package = "mfrmr"):

data("mfrmr_example_core", package = "mfrmr")
toy <- mfrmr_example_core

fit_toy <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "JML",
  model = "RSM",
  maxit = 30
)
diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none")

summary(fit_toy)$overview
summary(diag_toy)$overview
names(plot(fit_toy, draw = FALSE))

Diagnostics and Reporting

t4_toy <- unexpected_response_table(
  fit_toy,
  diagnostics = diag_toy,
  abs_z_min = 1.5,
  prob_max = 0.4,
  top_n = 10
)
t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy)
t13_toy <- bias_interaction_report(
  estimate_bias(fit_toy, diag_toy,
                facet_a = "Rater", facet_b = "Criterion",
                max_iter = 2),
  top_n = 10
)

class(summary(t4_toy))
class(summary(t12_toy))
class(summary(t13_toy))

names(plot(t4_toy, draw = FALSE))
names(plot(t12_toy, draw = FALSE))
names(plot(t13_toy, draw = FALSE))

chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy)
subset(
  chk_toy$checklist,
  Section == "Visual Displays",
  c("Item", "DraftReady", "NextAction")
)

Fit and Diagnose with Full Data

For a realistic analysis, use the packaged Study 1 dataset:

fit <- fit_mfrm(
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)

diag <- diagnose_mfrm(
  fit,
  residual_pca = "none"
)

summary(fit)
summary(diag)

If you need residual-structure evidence for a final report, you can add residual PCA after the initial diagnostic pass. Treat this as an exploratory screen, not as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In MFRM reporting, a cautious claim should combine global residual fit, element-level fit, residual PCA, and local-dependence screens, for example: “evidence consistent with essential unidimensionality under the specified facet structure.”

diag_pca <- diagnose_mfrm(
  fit,
  residual_pca = "both",
  pca_max_factors = 6
)

summary(diag_pca)

Strict Diagnostics for RSM and PCM

For RSM and PCM, the package can now keep the legacy residual path and the strict marginal path side by side:

fit_rsm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7,
  maxit = 30
)
diag_rsm_strict <- diagnose_mfrm(
  fit_rsm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

fit_pcm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "PCM",
  step_facet = "Criterion",
  quad_points = 7,
  maxit = 30
)
diag_pcm_strict <- diagnose_mfrm(
  fit_pcm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]

When you want a compact simulation-based screening check for the strict branch, use evaluate_mfrm_diagnostic_screening() on a small design:

screen_rsm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "local_dependence"),
  model = "RSM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)
screen_pcm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "step_structure_misspecification"),
  model = "PCM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)

screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")]
screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")]

The same strict branch is now reflected in the reporting router:

chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict)
subset(
  chk_rsm_strict$checklist,
  Section == "Visual Displays" &
    Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"),
  c("Item", "Available", "DraftReady", "NextAction")
)

Residual PCA and Reporting

pca <- analyze_residual_pca(diag_pca, mode = "both")
plot_residual_pca(pca, mode = "overall", plot_type = "scree")

data("mfrmr_example_bias", package = "mfrmr")
bias_df <- mfrmr_example_bias
fit_bias <- fit_mfrm(
  bias_df,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)
diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none")
bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion")
fixed <- build_fixed_reports(bias)
apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias)

mfrm_threshold_profiles()
vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard")
vis$warning_map$residual_pca_overall

The same example_bias dataset also carries a Group variable so DIF-oriented examples can show a non-null pattern instead of a fully clean result. It can be loaded either with load_mfrmr_data("example_bias") or data("mfrmr_example_bias", package = "mfrmr").

Human-Readable Reporting API

spec <- specifications_report(fit, title = "Study run")
data_qc <- data_quality_report(
  fit,
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score"
)
iter <- estimation_iteration_report(fit, max_iter = 8)
subset_rep <- subset_connectivity_report(fit, diagnostics = diag)
facet_stats <- facet_statistics_report(fit, diagnostics = diag)
cat_structure <- category_structure_report(fit, diagnostics = diag)
cat_curves <- category_curves_report(fit, theta_points = 101)
bias_rep <- bias_interaction_report(bias, top_n = 20)
plot_bias_interaction(bias_rep, plot = "scatter")

Design Simulation and Prediction

The package also supports a separate simulation/prediction layer. The key distinction is:

evaluate_mfrm_recovery() checks whether known generating parameters are recovered under a stated simulation design. It is the first simulation check to run when you are validating a model specification or a planned design.
evaluate_mfrm_design() and predict_mfrm_population() are design-level helpers that summarize expected operating characteristics under an explicit simulation specification.
predict_mfrm_units() and sample_mfrm_plausible_values() score future or partially observed persons under a fixed MML calibration.

sim_spec <- build_mfrm_sim_spec(
  n_person = 30,
  n_rater = 4,
  n_criterion = 4,
  raters_per_person = 2,
  assignment = "rotating"
)

recovery <- suppressWarnings(
  evaluate_mfrm_recovery(
    sim_spec = sim_spec,
    reps = 2,
    maxit = 30,
    include_diagnostics = TRUE,
    diagnostic_fit_df_method = "both",
    seed = 2
  )
)

summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")]
plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table

recovery_review <- assess_mfrm_recovery(
  recovery,
  min_reps = 2,
  min_se_available = NULL,
  max_mcse_rmse_ratio = NULL,
  max_rmse = c(facet = 1, step = 1, default = 1.5),
  max_abs_bias = c(default = 0.75)
)

summary(recovery_review)$checklist[, c("Section", "Item", "Status")]
summary(recovery_review)$reading_order
summary(recovery_review)$condition_reporting_notes
summary(recovery_review)$condition_review
summary(recovery_review)$diagnostic_reporting_notes
summary(recovery_review)$diagnostic_review

status_plot <- plot(recovery_review, type = "status", draw = FALSE)
status_plot$data$section_status
status_plot$data$reading_order

metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE)
metric_plot$data$plot_table
metric_plot$data$guidance

recovery_bundle <- build_summary_table_bundle(
  recovery_review,
  appendix_preset = "recommended"
)
recovery_bundle$table_index[, c("Table", "Rows", "Role")]

# For a release-scale check, use the optional validation protocol:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation")
# summary(validation)
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$release_decision_table
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# validation_summary$domain_decision_table
# validation_bundle <- build_summary_table_bundle(validation_summary)
# validation_bundle$tables$reading_order
# validation_bundle$tables$topline_release_decision
# validation_bundle$tables$condition_reporting_notes
# validation_bundle$tables$condition_summary
# validation_bundle$tables$diagnostic_reporting_notes
# The top-line table is the release-recovery conclusion. It uses recovery
# metrics, convergence, and Monte Carlo precision from core cases as the
# primary evidence, while ExtendedSensitivityStatus reports non-core stress
# cases separately. The condition notes and condition table keep generator
# stress and score support separate from recovery metrics, so OverallStatus = "review" is not
# read as a recovery-metric failure by itself. The diagnostic reporting notes
# turn fit/separation operating characteristics into report caveats rather
# than release gates.

pred_pop <- predict_mfrm_population(
  sim_spec = sim_spec,
  reps = 2,
  maxit = 30,
  seed = 1
)

summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")]

keep_people <- unique(toy$Person)[1:18]
toy_mml <- suppressWarnings(
  fit_mfrm(
    toy[toy$Person %in% keep_people, , drop = FALSE],
    person = "Person",
    facets = c("Rater", "Criterion"),
    score = "Score",
    method = "MML",
    quad_points = 5,
    maxit = 30
  )
)

new_units <- data.frame(
  Person = c("NEW01", "NEW01"),
  Rater = unique(toy$Rater)[1],
  Criterion = unique(toy$Criterion)[1:2],
  Score = c(2, 3)
)

pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0)
pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1)

summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")]
summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")]

For a report or appendix handoff, pass the recovery objects through the same summary-table export route used by the rest of the package:

export_summary_appendix(
  list(recovery = recovery, recovery_review = recovery_review),
  output_dir = tempdir(),
  prefix = "mfrmr_recovery_appendix",
  preset = "recommended",
  include_html = FALSE,
  overwrite = TRUE
)

For a quick smoke test, reps = 2 or another very small value is useful only to check that the data-generating setup and refit path run. For a study report, increase reps, keep the ADEMP-style metadata in the exported tables, and set substantive RMSE/Bias thresholds so that assess_mfrm_recovery() can mark those rows as ok, review, or concern rather than not_assessed.