| Title: | Antedependence Models for Longitudinal Data |
| Version: | 0.1.0 |
| Author: | Chenyang Li [aut, cre], Dale Zimmerman [aut, ctb] |
| Maintainer: | Chenyang Li <chenyang-li@uiowa.edu> |
| Description: | Fitting, simulation, and inference for antedependence models for longitudinal data, as described in Zimmerman and Nunez-Anton (2009, ISBN:9781420011074). Supports integer-valued antedependence (INAD) models for count data with thinning operators (binomial, Poisson, negative binomial) and flexible innovation distributions (Poisson, Bell, negative binomial), categorical antedependence models for discrete-state longitudinal outcomes, and Gaussian antedependence (AD) models for continuous data. Implements maximum likelihood estimation via time-separable optimization and block coordinate descent, with confidence intervals based on Louis' identity and profile likelihood. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Language: | en-US |
| RoxygenNote: | 7.3.3 |
| URL: | https://tanchyking.github.io/antedep/, https://github.com/TanchyKing/antedep |
| BugReports: | https://github.com/TanchyKing/antedep/issues |
| Imports: | nloptr (≥ 1.2.0), stats, ggplot2, MASS |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 3.5) |
| LazyData: | true |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-04-22 16:01:40 UTC; Tanch |
| Repository: | CRAN |
| Date/Publication: | 2026-04-24 18:30:13 UTC |
Convert Bell parameter to Bell mean
Description
Convert Bell parameter to Bell mean
Usage
.bell_mean_from_theta(theta)
Convert Bell mean to Bell parameter
Description
Convert Bell mean to Bell parameter
Usage
.bell_theta_from_mean(mu)
Blend old/new CAT parameters and renormalize probabilities
Description
Blend old/new CAT parameters and renormalize probabilities
Usage
.blend_cat_params_em(
old_marginal,
old_transition,
new_marginal,
new_transition,
step,
order,
n_time,
n_blocks,
homogeneous
)
Arguments
old_marginal |
Previous marginal parameters. |
old_transition |
Previous transition parameters. |
new_marginal |
Proposed marginal parameters. |
new_transition |
Proposed transition parameters. |
step |
Step size in (0, 1]. |
order |
Model order. |
n_time |
Number of time points. |
n_blocks |
Number of blocks. |
homogeneous |
Whether parameters are shared across blocks. |
Value
List with blended marginal and transition.
Build AD covariance matrix from parameters
Description
Build AD covariance matrix from parameters
Usage
.build_gau_covariance(order, phi, sigma, n_time)
Arguments
order |
Antedependence order |
phi |
Dependence parameters |
sigma |
Innovation standard deviations |
n_time |
Number of time points |
Value
Covariance matrix (n_time x n_time)
Convert probability row to unconstrained logits
Description
Convert probability row to unconstrained logits
Usage
.cat_prob_to_theta(prob, c, eps = 1e-08)
Arguments
prob |
Probability vector of length c |
c |
Number of categories |
eps |
Lower bound for numerical stability |
Value
Numeric vector of length c-1
Convert unconstrained logits to probability row
Description
Convert unconstrained logits to probability row
Usage
.cat_theta_to_prob(theta_row, c)
Arguments
theta_row |
Numeric vector of length c-1 |
c |
Number of categories |
Value
Probability vector of length c
Compute alpha and theta CIs using Louis' identity (general version)
Description
Compute alpha and theta CIs using Louis' identity (general version)
Usage
.ci_alpha_theta_louis_fe(
y,
fit,
blocks,
level,
idx_time,
ridge,
thinning,
innovation
)
Wald CI at time i using Louis' identity
Description
Wald CI at time i using Louis' identity
Usage
.ci_wald_i_louis_fe(
y,
i,
alpha_hat,
theta_hat,
tau_hat,
blocks,
thinning,
innovation,
nb_inno_size = NULL,
level = 0.95,
ridge = 0
)
Compute CIs for marginal parameters
Description
Compute CIs for marginal parameters
Usage
.compute_marginal_ci(marginal, cell_counts, p, c, n_time, n_subjects, z, level)
Arguments
marginal |
Marginal parameter list from fit |
cell_counts |
Cell counts from fit |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
n_subjects |
Number of subjects |
z |
Z critical value |
level |
Confidence level |
Value
Data frame with CIs
Compute CIs for transition parameters
Description
Compute CIs for transition parameters
Usage
.compute_transition_ci(transition, cell_counts, p, c, n_time, z, level)
Arguments
transition |
Transition parameter list from fit |
cell_counts |
Cell counts from fit |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
z |
Z critical value |
level |
Confidence level |
Value
List of data frames with CIs for each time point
Construct AD covariance matrix from fitted model
Description
Construct AD covariance matrix from fitted model
Usage
.construct_gau_covariance(fit, n_time)
Arguments
fit |
Fitted AD model from fit_gau(). |
n_time |
Number of time points. |
Value
Covariance matrix.
Count cells for contingency table
Description
Counts occurrences of each combination of categories at specified time indices.
Usage
.count_cells_cat(y, time_indices, n_categories, subject_mask = NULL)
Arguments
y |
Data matrix (n_subjects x n_time) |
time_indices |
Integer vector of time indices to count |
n_categories |
Number of categories |
subject_mask |
Logical vector indicating which subjects to include (NULL = all) |
Value
Array of counts with dimensions c x c x ... (length = length(time_indices))
Count cells efficiently using table()
Description
Alternative implementation using R's table function.
Usage
.count_cells_table_cat(y, time_indices, n_categories, subject_mask = NULL)
Arguments
y |
Data matrix (n_subjects x n_time) |
time_indices |
Integer vector of time indices |
n_categories |
Number of categories |
subject_mask |
Logical vector for subject selection |
Value
Array of counts
Count free parameters for AD(p) categorical model
Description
Count free parameters for AD(p) categorical model
Usage
.count_params_cat(
order,
n_categories,
n_time,
n_blocks = 1,
homogeneous = TRUE
)
Arguments
order |
AD order p |
n_categories |
Number of categories c |
n_time |
Number of time points n |
n_blocks |
Number of blocks/groups |
homogeneous |
Whether parameters are shared across blocks |
Value
Number of free parameters
Count parameters for AD model
Description
Count parameters for AD model
Usage
.count_params_gau(order, n_time, n_blocks = 1L, estimate_mu = FALSE)
Arguments
order |
Model order. |
n_time |
Number of time points. |
n_blocks |
Number of blocks (1 if no blocks). |
estimate_mu |
Whether mu is estimated. |
Value
Number of free parameters.
Count parameters for homogeneity test
Description
Count parameters for homogeneity test
Usage
.count_params_homogeneity(
fit,
test,
n_blocks,
order,
n_time,
thinning,
innovation,
is_null
)
Arguments
fit |
Fitted model |
test |
Test type |
n_blocks |
Number of groups |
order |
AD order |
n_time |
Number of time points |
thinning |
Thinning type |
innovation |
Innovation type |
is_null |
Whether this is the null model |
Value
Number of free parameters
Convert counts to probabilities (with safe division)
Description
Convert counts to probabilities (with safe division)
Usage
.counts_to_probs_cat(counts, margin = NULL)
Arguments
counts |
Array of counts |
margin |
Which margins to condition on (sum over). NULL = no conditioning (joint). |
Value
Array of probabilities
Compute transition probabilities from counts (simpler version)
Description
Given counts array with last dimension being Y_k and earlier dimensions being conditioning variables, compute P(Y_k | conditioning).
Usage
.counts_to_transition_probs(counts)
Arguments
counts |
Array of counts, last dimension is response |
Value
Array of transition probabilities (same dimensions)
Create default marginal parameters (uniform)
Description
Create default marginal parameters (uniform)
Usage
.default_marginal_cat(p, c, n_time, n_blocks, homogeneous)
Arguments
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
n_blocks |
Number of groups |
homogeneous |
Whether to share parameters |
Value
List of marginal parameters
Create default transition parameters (uniform)
Description
Create default transition parameters (uniform)
Usage
.default_transition_cat(p, c, n_time, n_blocks, homogeneous)
Arguments
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
n_blocks |
Number of groups |
homogeneous |
Whether to share parameters |
Value
List of transition parameters
E-step: Compute expected sufficient statistics
Description
Computes conditional expectations for missing values and second moments.
Usage
.em_e_step_gau(y, order, mu, phi, sigma, blocks, tau)
M-step: Update parameters from sufficient statistics
Description
M-step: Update parameters from sufficient statistics
Usage
.em_m_step_gau(suff_stats, order, blocks, estimate_mu, n_subjects, n_time)
Convert EM expected counts to fit_cat-style cell_counts
Description
Convert EM expected counts to fit_cat-style cell_counts
Usage
.expected_counts_to_cell_counts_cat_em(
counts,
order,
n_time,
n_blocks,
homogeneous
)
Arguments
counts |
Expected counts from the E-step. |
order |
Model order. |
n_time |
Number of time points. |
n_blocks |
Number of blocks. |
homogeneous |
Logical; whether parameters are shared across blocks. |
Value
A list shaped like fit_cat(...) cell_counts.
Extract conditional probabilities from transition array
Description
Extract conditional probabilities from transition array
Usage
.extract_conditional_probs(trans_array, history)
Arguments
trans_array |
Transition array with last dimension being outcome |
history |
Vector of conditioning values |
Value
Vector of probabilities for current time point
Fit CAT model for a single population
Description
Fit CAT model for a single population
Usage
.fit_cat_single_pop(y, p, c, n_time, subject_mask = NULL)
Arguments
y |
Data matrix |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
subject_mask |
Logical vector for subject selection (NULL = all) |
Value
List with marginal, transition, cell_counts, log_l
Fit CAT model with missing data via observed-data likelihood optimization
Description
Fit CAT model with missing data via observed-data likelihood optimization
Usage
.fit_cat_single_pop_marginalize(y, p, c, n_time)
Arguments
y |
Data matrix for one population (may contain NA) |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
Value
List with marginal, transition, cell_counts, log_l, convergence
Fit stationary categorical AD model
Description
Internal function to fit a model where both marginal and transition probabilities are constant over time.
Usage
.fit_cat_stationary(y, p, c, blocks = NULL, homogeneous = TRUE)
Arguments
y |
Data matrix |
p |
Order |
c |
Number of categories |
blocks |
Block vector (or NULL) |
homogeneous |
Whether to pool across blocks |
Value
A cat_fit-like object with stationary constraints
Fit stationary model for single population
Description
Fit stationary model for single population
Usage
.fit_cat_stationary_single(y, p, c, n_time, subject_mask = NULL)
Arguments
y |
Data matrix |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
subject_mask |
Logical mask for subjects (NULL = all) |
Value
List with marginal, transition, log_l, n_params
Fit time-invariant categorical AD model
Description
Internal function to fit a model where transition probabilities are constant across time (for k > p).
Usage
.fit_cat_timeinvariant(y, p, c, blocks = NULL, homogeneous = TRUE)
Arguments
y |
Data matrix |
p |
Order |
c |
Number of categories |
blocks |
Block vector (or NULL) |
homogeneous |
Whether to pool across blocks |
Value
A cat_fit-like object with time-invariant transitions
Fit time-invariant model for single population
Description
Fit time-invariant model for single population
Usage
.fit_cat_timeinvariant_single(y, p, c, n_time, subject_mask = NULL)
Arguments
y |
Data matrix |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
subject_mask |
Logical mask for subjects (NULL = all) |
Value
List with marginal, transition, log_l, n_params
EM algorithm for AD with missing data
Description
Implements the EM algorithm for fitting AD(p) models with missing data under MAR (Missing At Random) assumption.
Usage
.fit_gau_em(
y,
order,
blocks = NULL,
estimate_mu = TRUE,
max_iter = 100,
tol = 1e-06,
verbose = FALSE,
...
)
Arguments
y |
Data matrix (n_subjects x n_time), may contain NA |
order |
Antedependence order (0, 1, or 2) |
blocks |
Block membership vector (optional) |
estimate_mu |
Logical, whether to estimate mu (default TRUE) |
max_iter |
Maximum number of EM iterations (default 100) |
tol |
Convergence tolerance on log-likelihood (default 1e-6) |
verbose |
Logical, print iteration info (default FALSE) |
Value
List with fitted model (same structure as fit_gau, plus EM info)
Fit fully heterogeneous INAD model
Description
Internal function that fits separate INAD models for each group and combines the log-likelihoods.
Usage
.fit_inad_heterogeneous(
y,
blocks,
n_blocks,
group_indices,
order,
thinning,
innovation,
...
)
Arguments
y |
Data matrix |
blocks |
Normalized block vector |
n_blocks |
Number of groups |
group_indices |
List of subject indices per group |
order |
AD order |
thinning |
Thinning type |
innovation |
Innovation type |
... |
Additional arguments passed to fit_inad |
Value
A list mimicking fit_inad output with combined log-likelihood
Get all category combinations of given length
Description
Get all category combinations of given length
Usage
.get_combinations_cat(n_categories, length)
Arguments
n_categories |
Number of categories c |
length |
Number of positions |
Value
Matrix where each row is a combination (c^length rows, length columns)
Extract history for a subject at time k
Description
Extract history for a subject at time k
Usage
.get_history(y_row, k, order)
Arguments
y_row |
Single row of data (one subject) |
k |
Current time index |
order |
AD order p |
Value
Integer vector of length min(k-1, order) with category history
Compute group-specific means for two-sample case
Description
Compute group-specific means for two-sample case
Usage
.group_means_gau(y, blocks)
Arguments
y |
Numeric matrix n_subjects by n_time. |
blocks |
Integer vector indicating group membership. |
Value
List with mu1 and mu2 vectors.
Convert history to array index
Description
Convert history to array index
Usage
.history_to_index(history, n_categories)
Arguments
history |
Integer vector of category values |
n_categories |
Number of categories |
Value
Integer index (1-based) into flattened array
Build transition objects for truncated-state missing-data likelihood
Description
Build transition objects for truncated-state missing-data likelihood
Usage
.inad_build_transitions(
order,
N,
K,
alpha1,
alpha2,
thinning,
innovation,
lambda,
nb_inno_size
)
Truncated discrete convolution
Description
Truncated discrete convolution
Usage
.inad_conv_trunc(p, q, K)
Effective innovation mean under block effects
Description
Effective innovation mean under block effects
Usage
.inad_effective_innovation_mean(theta, tau, innovation)
Effective innovation distribution parameter under block effects
Description
For Poisson and negative binomial innovations, block effects are additive on
the mean. For Bell innovations, block effects are additive on the innovation
mean theta * exp(theta), then mapped back to the Bell parameter.
Usage
.inad_effective_innovation_param(theta, tau, innovation)
Build truncated thinning pmf for one previous count
Description
Build truncated thinning pmf for one previous count
Usage
.inad_make_thin_pmf(prev, alpha, thinning, K)
Retrieve/calculate order-2 transition distribution for one previous state pair
Description
Retrieve/calculate order-2 transition distribution for one previous state pair
Usage
.inad_order2_dist(prev2, prev1, info, K)
Heuristic state-space bound for INAD missing-data recursion
Description
Heuristic state-space bound for INAD missing-data recursion
Usage
.inad_state_max(y, order, alpha1, alpha2, theta, tau, innovation = "pois")
Subject-level observed-data likelihood for INAD(0)
Description
Subject-level observed-data likelihood for INAD(0)
Usage
.inad_subject_ll_order0(y_s, innov_list, K)
Subject-level observed-data likelihood for INAD(1)
Description
Subject-level observed-data likelihood for INAD(1)
Usage
.inad_subject_ll_order1(y_s, init_pmf, trans1, K)
Subject-level observed-data likelihood for INAD(2)
Description
Subject-level observed-data likelihood for INAD(2)
Usage
.inad_subject_ll_order2(y_s, init_pmf, trans_t2, order2_info, K)
Initialize parameters for AD EM algorithm
Description
Initialize parameters for AD EM algorithm
Usage
.initialize_gau_em(y, order, blocks, estimate_mu)
Initialize from marginal statistics (ignoring dependence)
Description
Initialize from marginal statistics (ignoring dependence)
Usage
.initialize_gau_marginal(y, order, blocks)
Innovation probability vector
Description
Computes the probability mass for innovation counts.
Usage
.innov_vec(u_vals, lam, innovation, sz)
Arguments
u_vals |
Integer vector of possible innovation values. |
lam |
Innovation parameter used by the PMF. This is the mean for
|
innovation |
One of "pois", "bell", "nbinom". |
sz |
Size parameter for negative binomial innovation (ignored otherwise). |
Value
Numeric vector of probabilities.
Compute innovation variance estimates under AD(p)
Description
The MLE of the innovation variance at time i is RSS_i(p) / N.
Usage
.innovation_var_gau(y, p, mu = NULL)
Arguments
y |
Numeric matrix n_subjects by n_time. |
p |
Antedependence order. |
mu |
Optional mean vector for centering. |
Value
Numeric vector of innovation variance estimates.
Compute log-likelihood from cell counts (faster for large datasets)
Description
Instead of looping over subjects, this version uses aggregated cell counts. Equivalent to logL_cat but more efficient.
Usage
.logL_from_counts_cat(y, order, marginal, transition, n_categories)
Arguments
y |
Data matrix |
order |
AD order p |
marginal |
Marginal parameters |
transition |
Transition parameters |
n_categories |
Number of categories |
Value
Scalar log-likelihood
Compute observed-data log-likelihood for AD with missing values
Description
Uses multivariate normal marginalization to compute the likelihood of observed values, marginalizing over missing values.
Usage
.logL_gau_missing(y, order, mu, phi, sigma, blocks = NULL, tau = 0)
Arguments
y |
Data matrix (n_subjects x n_time), may contain NA |
order |
Antedependence order (0, 1, or 2) |
mu |
Mean vector (length n_time) |
phi |
Dependence parameter(s) |
sigma |
Innovation standard deviations (length n_time) |
blocks |
Block membership vector (optional) |
tau |
Block effects, first element constrained to zero |
Value
Observed-data log-likelihood (scalar)
Observed-data INAD likelihood with missing values under MAR
Description
Uses forward recursion over a truncated count state space.
Usage
.logL_inad_missing(
y,
order,
thinning,
innovation,
alpha,
theta,
nb_inno_size = NULL,
blocks = NULL,
tau = 0
)
Compute log-likelihood contribution from one subject
Description
Compute log-likelihood contribution from one subject
Usage
.logL_subject_cat(y_s, p, c, marginal, transition)
Arguments
y_s |
Integer vector of length n_time (one subject's data) |
p |
Order |
c |
Number of categories |
marginal |
Marginal parameters for this population |
transition |
Transition parameters for this population |
Value
Scalar log-likelihood contribution
Observed-data log-likelihood for order-1 CAT model
Description
Observed-data log-likelihood for order-1 CAT model
Usage
.logL_subject_cat_marginalize_p1(y_s, c, marginal, transition)
Arguments
y_s |
Subject row (may contain NA) |
c |
Number of categories |
marginal |
Marginal list |
transition |
Transition list |
Value
Scalar log-likelihood contribution
Observed-data log-likelihood for order-2 CAT model
Description
Observed-data log-likelihood for order-2 CAT model
Usage
.logL_subject_cat_marginalize_p2(y_s, c, marginal, transition)
Arguments
y_s |
Subject row (may contain NA) |
c |
Number of categories |
marginal |
Marginal list |
transition |
Transition list |
Value
Scalar log-likelihood contribution
Compute observed-data log-likelihood contribution from one subject
Description
Compute observed-data log-likelihood contribution from one subject
Usage
.logL_subject_cat_observed(y_s, p, c, marginal, transition, na_action)
Arguments
y_s |
Integer vector of length n_time (one subject's data; may contain NA) |
p |
Order |
c |
Number of categories |
marginal |
Marginal parameters for this population |
transition |
Transition parameters for this population |
na_action |
Missing-data handling mode |
Value
Scalar log-likelihood contribution
Log-sum-exp for numerical stability
Description
Log-sum-exp for numerical stability
Usage
.log_sum_exp(x)
Compute log-likelihood contribution from counts and probabilities
Description
Computes sum of N * log(pi) over all cells.
Usage
.loglik_contribution(counts, probs)
Arguments
counts |
Array of counts |
probs |
Array of probabilities (same dimensions as counts) |
Value
Scalar log-likelihood contribution
Stable log-sum-exp
Description
Stable log-sum-exp
Usage
.logsumexp_cat(x)
Arguments
x |
Numeric vector |
Value
Scalar log(sum(exp(x)))
Louis' identity observed information at time i
Description
Louis' identity observed information at time i
Usage
.louis_info_i_fe(
y,
i,
alpha_hat,
theta_hat,
tau_hat,
blocks,
thinning,
innovation,
nb_inno_size = NULL
)
Compute MLE of mean vector under AD(p)
Description
Under AD(p), the MLE of mu is the sample mean vector when there are no covariates.
Usage
.mle_mean_gau(y)
Arguments
y |
Numeric matrix n_subjects by n_time. |
Value
Numeric vector of mean estimates.
Pack CAT parameters into unconstrained vector
Description
Pack CAT parameters into unconstrained vector
Usage
.pack_cat_params(marginal, transition, p, c, n_time)
Arguments
marginal |
Marginal parameter list |
transition |
Transition parameter list |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
Value
Numeric parameter vector
Compute intervenor-adjusted sample partial correlation
Description
Computes the sample partial correlation between Y_i and Y_j, adjusted for the intervenors Y_(j+1), ..., Y_(i-1), from the residuals of the fitted mean structure.
Usage
.partial_corr_gau(y, i, j, mu = NULL)
Arguments
y |
Numeric matrix n_subjects by n_time. |
i |
Row/column index (larger time index). |
j |
Row/column index (smaller time index), j < i. |
mu |
Optional mean vector for centering. |
Details
The intervenor-adjusted partial correlation r_(i,j|(j+1:i-1)) is computed from residuals of regressing Y_i and Y_j on the intervenors.
Under AD(p), the partial correlation r_(i,i-k|(i-k+1:i-1)) should be approximately zero for k > p.
Value
Sample partial correlation (scalar).
Compute matrix of intervenor-adjusted partial correlations
Description
Compute matrix of intervenor-adjusted partial correlations
Usage
.partial_corr_matrix_gau(y, mu = NULL)
Arguments
y |
Numeric matrix n_subjects by n_time. |
mu |
Optional mean vector for centering. |
Value
Matrix where entry (i,j) for i > j is r_(i,j|(j+1:i-1)).
Posterior computations for all thinning-innovation combinations
Description
Computes conditional expectations and variances needed for Louis' identity. See Appendix C of the paper for the derivation.
Usage
.poster_general(m, n, alpha, lambda, thinning, innovation, nb_inno_size = NULL)
Compute partial residuals for PRISM plot
Description
Compute partial residuals for PRISM plot
Usage
.prism_residuals(y, i, j)
Arguments
y |
Data matrix (n_subjects x n_time). |
i |
First time index. |
j |
Second time index (j > i + 1). |
Value
List with resid_i and resid_j (partial residuals).
Modified psi function for test statistic correction
Description
Computes the psi function used in the modified likelihood ratio test (Kenward, 1987) for better small-sample approximation.
Usage
.psi_kenward(x, y)
Arguments
x |
First argument. |
y |
Second argument. |
Details
psi(x, y) is defined as:
0 if x = 0
(2y - 1) / (2y(y - 1)) if x = 1
2 * sum from l=1 to x/2 of (y + 2l - 2)^(-1) if x > 0 and even
psi(1, y) + psi(x - 1, y + 1) if x > 1 and odd
Value
Value of psi(x, y).
Compute residual sum of squares from AD regression
Description
For time point i, regress Y_i on its p_i predecessors (Y_(i-1), ..., Y_(i-p_i)) and return the residual sum of squares.
Usage
.rss_gau(y, i, p, mu = NULL, include_intercept = TRUE)
Arguments
y |
Numeric matrix n_subjects by n_time. |
i |
Time index (1-based). |
p |
Number of predecessors to include in regression (can be 0). |
mu |
Optional mean vector. If provided, data is centered by mu before regression. If NULL, regression includes an intercept. |
include_intercept |
Logical. If TRUE and mu is NULL, include intercept in regression. |
Details
When mu is provided, the regression is:
(Y_i - mu_i) ~ (Y_(i-1) - mu_(i-1)) + ... + (Y_(i-p) - mu_(i-p))
with no intercept.
When mu is NULL and include_intercept is TRUE:
Y_i ~ 1 + Y_(i-1) + ... + Y_(i-p)
For i = 1 or p = 0, the RSS is simply the sum of squared deviations from the mean (or from mu_1 if provided).
Value
Residual sum of squares (scalar).
Compute RSS for two-sample case (pooled and separate)
Description
Compute RSS for two-sample case (pooled and separate)
Usage
.rss_two_sample_gau(y, blocks, i, p, pooled = TRUE)
Arguments
y |
Numeric matrix n_subjects by n_time. |
blocks |
Integer vector indicating group membership (1 or 2). |
i |
Time index. |
p |
Antedependence order. |
pooled |
Logical. If TRUE, compute pooled RSS; if FALSE, within-group RSS. |
Value
RSS value.
Compute RSS vector for all time points under AD(p)
Description
Compute RSS vector for all time points under AD(p)
Usage
.rss_vector_gau(y, p, mu = NULL)
Arguments
y |
Numeric matrix n_subjects by n_time. |
p |
Antedependence order. |
mu |
Optional mean vector for centering. |
Value
Numeric vector of length n_time with RSS for each time point.
Safe log function (0 * log(0) = 0)
Description
Safe log function (0 * log(0) = 0)
Usage
.safe_log(x)
Arguments
x |
Numeric vector or array |
Value
Log of x, with 0 for x = 0
Safeguard M-step update via step-halving
Description
Safeguard M-step update via step-halving
Usage
.safeguard_update_cat_em(
y,
order,
old_marginal,
old_transition,
new_marginal,
new_transition,
blocks_id,
homogeneous,
c,
logL_old,
n_time,
n_blocks
)
Arguments
y |
Data matrix. |
order |
Model order. |
old_marginal |
Previous marginal parameters. |
old_transition |
Previous transition parameters. |
new_marginal |
Proposed marginal parameters from M-step. |
new_transition |
Proposed transition parameters from M-step. |
blocks_id |
Normalized block ids. |
homogeneous |
Whether parameters are shared across blocks. |
c |
Number of categories. |
logL_old |
Observed-data log-likelihood at old parameters. |
n_time |
Number of time points. |
n_blocks |
Number of blocks. |
Value
List with safeguarded parameters, acceptance flag, and step size.
Simulate one subject's trajectory
Description
Simulate one subject's trajectory
Usage
.simulate_subject_cat(n_time, p, c, marginal, transition)
Arguments
n_time |
Number of time points |
p |
Order |
c |
Number of categories |
marginal |
Marginal parameters |
transition |
Transition parameters |
Value
Integer vector of length n_time
Thinning probability vector
Description
Computes the probability mass for thinned counts.
Usage
.thin_vec(k_vals, yprev, a, thinning)
Arguments
k_vals |
Integer vector of possible thinned values. |
yprev |
Previous count (size parameter for thinning). |
a |
Thinning parameter alpha. |
thinning |
One of "binom", "pois", "nbinom". |
Value
Numeric vector of probabilities.
Build uniform CAT parameter values
Description
Build uniform CAT parameter values
Usage
.uniform_cat_params(p, c, n_time)
Arguments
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
Value
List with marginal and transition
Unpack alpha parameters
Description
Converts alpha from various input formats to a standardized list.
Usage
.unpack_alpha(order, alpha, N)
Arguments
order |
Model order (0, 1, or 2). |
alpha |
Alpha parameter in various formats. |
N |
Number of time points. |
Value
List with elements a1 and a2.
Unpack unconstrained CAT parameter vector
Description
Unpack unconstrained CAT parameter vector
Usage
.unpack_cat_params(theta, p, c, n_time)
Arguments
theta |
Numeric parameter vector |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
Value
List with marginal and transition parameter lists
Validate blocks parameter
Description
Validate blocks parameter
Usage
.validate_blocks_cat(blocks, n_subjects)
Arguments
blocks |
Block membership vector |
n_subjects |
Number of subjects |
Value
Validated blocks vector (integer, 1-indexed)
Validate transition probability parameters
Description
Checks that transition probabilities are valid (non-negative, sum to 1).
Usage
.validate_params_cat(marginal, transition, p, c, n_time, tol = 1e-08)
Arguments
marginal |
Marginal parameters |
transition |
Transition parameters |
p |
Order |
c |
Number of categories |
n_time |
Number of time points |
tol |
Tolerance for checking sums |
Value
TRUE if valid, otherwise throws error
Validate categorical data matrix
Description
Validate categorical data matrix
Usage
.validate_y_cat(y, n_categories = NULL, allow_na = FALSE)
Arguments
y |
Data matrix |
n_categories |
Number of categories (NULL to infer) |
allow_na |
Logical; if TRUE, allow missing values. |
Value
List with validated y and n_categories
The Bell distribution
Description
Density, distribution function, quantile function and random generation
for the Bell distribution with parameter theta.
Usage
dbell(x, theta, log = FALSE)
pbell(x, theta)
rbell(n, theta, max_z = 100L)
qbell(p, theta, max_z = 100L)
Arguments
x |
vector of nonnegative integers (for |
theta |
scalar nonnegative Bell parameter. |
log |
logical; if TRUE, probabilities p are given as log(p). |
n |
number of observations to generate (for |
max_z |
maximum support value used for approximation in
|
p |
numeric vector of probabilities between 0 and 1 inclusive (for |
Details
Let B_x denote the xth Bell number. The Bell distribution has
probability mass function
P(X = x) = \theta^x \exp(-\exp(\theta) + 1) \frac{B_x}{x!},
for nonnegative integers x and \theta \ge 0.
For \theta > 0, the Bell mean is E[X] = \theta e^\theta.
At \theta = 0, the distribution is degenerate at 0.
The functions follow the standard naming used in base R:
dbell for the density, pbell for the distribution function,
qbell for the quantile function and rbell for random
generation.
Value
For dbell, a numeric vector of probabilities.
For pbell, a numeric vector of cumulative probabilities.
For qbell, an integer vector of quantiles.
For rbell, an integer vector of random values.
Examples
dbell(0:5, theta = 1)
pbell(0:5, theta = 1)
qbell(c(0.25, 0.5, 0.9), theta = 1)
set.seed(1)
rbell(10, theta = 1)
Bell numbers
Description
Compute Bell numbers up to order z using a truncated series representation.
Usage
Bz(z)
Arguments
z |
nonnegative integer giving the maximum order. |
Details
This function is for internal use. The results are stored in .Bell_num
and used by the Bell distribution functions.
Value
numeric vector of length z + 1 containing B_0, ..., B_z.
Akaike information criterion for fitted categorical AD models
Description
Computes AIC using the fitted log likelihood and a parameter count for categorical antedependence parameters.
Usage
aic_cat(fit)
Arguments
fit |
A fitted model object of class |
Details
The AIC is computed as:
AIC = -2 \times \ell + 2k
where \ell is the log-likelihood and k is the number of free
parameters.
Value
A numeric scalar AIC value.
Examples
set.seed(1)
y <- simulate_cat(40, 5, order = 1, n_categories = 2)
fit <- fit_cat(y, order = 1)
aic_cat(fit)
Akaike information criterion for fitted Gaussian AD models
Description
Computes AIC using the fitted log likelihood and a parameter count that respects identifiability constraints for the Gaussian antedependence parameters.
Usage
aic_gau(fit)
Arguments
fit |
A fitted model object returned by |
Details
The AIC is computed as:
AIC = -2 \times \ell + 2k
where \ell is the log-likelihood and k is the number of free
parameters.
This function applies to Gaussian AD fits from fit_gau.
Value
A numeric scalar AIC value.
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 30, n_time = 5, order = 1, phi = 0.3)
fit <- fit_gau(y, order = 1)
aic_gau(fit)
Akaike information criterion for fitted INAD models
Description
Computes AIC using the fitted log likelihood and a parameter count that respects structural zeros and identifiability constraints.
Usage
aic_inad(fit)
Arguments
fit |
A fitted model object returned by |
Details
The AIC is computed as:
AIC = -2 \times \ell + 2k
where \ell is the log-likelihood and k is the number of free
parameters.
Value
A numeric scalar AIC value.
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 40,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20)
aic_inad(fit)
Bayesian information criterion for fitted categorical AD models
Description
Computes BIC using the fitted log likelihood and a parameter count for categorical antedependence parameters.
Usage
bic_cat(fit, n_subjects = NULL)
Arguments
fit |
A fitted model object of class |
n_subjects |
Number of subjects. If NULL, extracted from fit. |
Details
The BIC is computed as:
BIC = -2 \times \ell + k \times \log(N)
where \ell is the log-likelihood, k is the number of free parameters,
and N is the number of subjects.
Value
A numeric scalar BIC value.
Examples
set.seed(1)
y <- simulate_cat(40, 5, order = 1, n_categories = 2)
# Fit models of different orders
fit0 <- fit_cat(y, order = 0)
fit1 <- fit_cat(y, order = 1)
fit2 <- fit_cat(y, order = 2)
# Compare BIC
c(BIC_0 = bic_cat(fit0), BIC_1 = bic_cat(fit1), BIC_2 = bic_cat(fit2))
Bayesian information criterion for fitted Gaussian AD models
Description
Computes BIC using the fitted log likelihood and a parameter count that respects identifiability constraints for the Gaussian antedependence parameters.
Usage
bic_gau(fit, n_subjects = NULL)
Arguments
fit |
A fitted model object returned by |
n_subjects |
Number of subjects, typically |
Details
The BIC is computed as:
BIC = -2 \times \ell + k \times \log(N)
where \ell is the log-likelihood, k is the number of free parameters,
and N is the number of subjects.
This function applies to Gaussian AD fits from fit_gau.
For categorical and INAD models, use bic_cat and
bic_inad.
Value
A numeric scalar BIC value.
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 30, n_time = 5, order = 1, phi = 0.3)
fit <- fit_gau(y, order = 1)
bic_gau(fit, n_subjects = nrow(y))
Bayesian information criterion for fitted INAD models
Description
Computes BIC using the fitted log likelihood and a parameter count that respects structural zeros and identifiability constraints.
Usage
bic_inad(fit, n_subjects = NULL)
Arguments
fit |
A fitted model object returned by |
n_subjects |
Number of subjects, typically |
Details
The BIC is computed as:
BIC = -2 \times \ell + k \times \log(N)
where \ell is the log-likelihood, k is the number of free parameters,
and N is the number of subjects.
Value
A numeric scalar BIC value.
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 40,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20)
bic_inad(fit, n_subjects = nrow(y))
BIC-based order selection for categorical AD models
Description
Fits AD models of increasing orders and selects the best by BIC.
Usage
bic_order_cat(
y,
max_order = 2,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
criterion = "bic"
)
Arguments
y |
Integer matrix of categorical data (n_subjects x n_time). |
max_order |
Maximum order to consider. Default is 2. |
blocks |
Optional block membership vector. |
homogeneous |
Whether to use homogeneous parameters across blocks. |
n_categories |
Number of categories (inferred if NULL). |
criterion |
Which criterion to use: "bic" (default) or "aic". |
Value
A list containing:
table |
Data frame with order, log_l, n_params, aic, bic |
bic |
Named numeric vector of BIC values by order |
best_order |
Order with lowest criterion value |
criterion |
Criterion used for order selection ( |
fits |
List of fitted models |
Examples
y <- simulate_cat(100, 5, order = 1, n_categories = 2)
result <- bic_order_cat(y, max_order = 2)
print(result$table)
print(result$best_order)
BIC-based order selection for Gaussian AD models
Description
Fits AD models of increasing orders and selects the best by BIC.
Usage
bic_order_gau(y, max_order = 2L, ...)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
max_order |
Maximum order to consider. |
... |
Additional arguments passed to |
Value
A list with class gau_bic_order containing:
- fits
List of fitted models
- bic
BIC values for each order
- best_order
Order with lowest BIC
- table
Summary table
See Also
bic_order_cat, bic_order_inad,
fit_gau
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 80, n_time = 6, order = 1, phi = 0.4)
ord <- bic_order_gau(y, max_order = 2)
ord$best_order
ord$table
BIC-based order selection for INAD models
Description
Fits INAD models across candidate orders and reports BIC-based selection.
Usage
bic_order_inad(
y,
max_order = 2,
thinning = "binom",
innovation = "pois",
blocks = NULL,
...
)
Arguments
y |
Integer matrix. |
max_order |
Maximum order (1 or 2). |
thinning |
Thinning operator. |
innovation |
Innovation distribution. |
blocks |
Optional block assignments. |
... |
Additional arguments. |
Value
A list with class "bic_order_inad" containing:
- fits
List of fitted INAD models by candidate order
- bic
Named numeric vector of BIC values by order
- best_order
Order with minimum BIC
- table
Data frame with order, logLik, n_params, and BIC
- settings
Input and derived settings used for selection
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 30,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
ord <- bic_order_inad(y, max_order = 2, thinning = "binom", innovation = "pois", max_iter = 10)
ord$best_order
Morphine bolus analgesia counts
Description
Morphine bolus self administration counts for two treatment groups recorded at 12 four hour time points. The data are stored in matrix form to facilitate use with antedependence models.
Usage
bolus_inad
Format
A list with four components:
- y
integer matrix of dimension N by n_time containing all subjects and time points
- y_2mg
integer matrix with rows corresponding to the 2 mg treatment group
- y_1mg
integer matrix with rows corresponding to the 1 mg treatment group
- blocks
integer vector of length N giving the block or treatment group indicator, 1 for 2 mg and 2 for 1 mg
Source
Dataset bolus from the cold package, converted to
matrix form and grouped by treatment.
Cattle growth data (Treatments A and B)
Description
Longitudinal cattle growth measurements for two treatment groups from Zimmerman and Nunez-Anton antedependence book companion data. This dataset is continuous-response data suitable for Gaussian AD modeling.
Usage
cattle_growth
Format
A list with five components:
- y
numeric matrix of dimension N by n_time containing all subjects
- y_A
numeric matrix for Treatment A subjects
- y_B
numeric matrix for Treatment B subjects
- blocks
integer vector of length N (1 = Treatment A, 2 = Treatment B)
- time
integer vector of measurement occasions
Source
https://homepage.divms.uiowa.edu/~dzimmer/Data-for-AD/cattle_growth_data_Treatment%20A.txt and https://homepage.divms.uiowa.edu/~dzimmer/Data-for-AD/cattle_growth_data_Treatment%20B.txt
Confidence intervals for fitted categorical AD models
Description
Computes Wald-based confidence intervals for the transition probability parameters of a fitted categorical antedependence model.
Usage
ci_cat(fit, y = NULL, level = 0.95, parameters = "all")
Arguments
fit |
A fitted model object of class |
y |
Optional data matrix. If NULL, |
level |
Confidence level (default 0.95). |
parameters |
Which parameters to compute CIs for: "all" (default), "marginal", or "transition". |
Details
Confidence intervals are computed using the Wald method based on the asymptotic normality of maximum likelihood estimators.
For a probability estimate \hat{\pi} based on count N, the standard error is:
SE(\hat{\pi}) = \sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{N}}
For conditional probabilities \hat{\pi}_{j|i} based on conditioning count
N_i, the standard error is:
SE(\hat{\pi}_{j|i}) = \sqrt{\frac{\hat{\pi}_{j|i}(1-\hat{\pi}_{j|i})}{N_i}}
The confidence interval is then:
\hat{\pi} \pm z_{\alpha/2} \times SE(\hat{\pi})
Note: CIs are truncated to the interval from 0 to 1 when they exceed these bounds.
Missing-data fits with na_action = "marginalize" are not currently
supported because observed cell counts are not stored for that path.
Value
A list of class "cat_ci" containing:
marginal |
Data frame of CIs for marginal parameters (if requested) |
transition |
List of data frames of CIs for transition parameters (if requested) |
level |
Confidence level used |
settings |
Model settings from fit |
References
Agresti, A. (2013). Categorical Data Analysis (3rd ed.). Wiley.
See Also
Examples
# Fit a model
set.seed(123)
y <- simulate_cat(200, 5, order = 1, n_categories = 2)
fit <- fit_cat(y, order = 1)
# Compute confidence intervals
ci <- ci_cat(fit)
print(ci)
# Just marginal CIs
ci_marg <- ci_cat(fit, parameters = "marginal")
Confidence intervals for fitted Gaussian AD models
Description
Computes approximate Wald confidence intervals for selected parameters from a fitted Gaussian AD model.
Usage
ci_gau(fit, level = 0.95, parameters = "all")
Arguments
fit |
A fitted model object returned by |
level |
Confidence level between 0 and 1. |
parameters |
Which parameters to include: |
Details
This helper currently supports complete-data Gaussian AD fits.
Standard errors are based on large-sample approximations:
-
SE(\hat{\mu}_t) \approx \hat{\sigma}_t / \sqrt{n} -
SE(\hat{\sigma}_t) \approx \hat{\sigma}_t / \sqrt{2n} -
SE(\hat{\phi}) \approx \sqrt{(1-\hat{\phi}^2)/n}for free\phientries
Value
An object of class gau_ci, a list with elements
settings, level, mu, phi, and sigma.
Each non-NULL element is a data frame with columns param,
est, se, lower, upper, and level.
See Also
Examples
y <- simulate_gau(n_subjects = 80, n_time = 6, order = 1, phi = 0.4)
fit <- fit_gau(y, order = 1)
ci <- ci_gau(fit)
ci$mu
ci$phi
ci$sigma
Confidence intervals for fitted INAD models
Description
Computes confidence intervals for selected parameters from a fitted INAD model. For the fixed effect case, Wald intervals for time varying alpha and theta are computed via Louis identity for supported thinning-innovation combinations. For block effects tau, profile likelihood intervals are computed by fixing one component of tau and re maximizing the log likelihood over nuisance parameters. For negative binomial innovations, Wald intervals for the innovation size parameter are computed using a one dimensional observed information approximation per time point, holding other parameters fixed at their fitted values.
Usage
ci_inad(
y,
fit,
blocks = NULL,
level = 0.95,
idx_time = NULL,
ridge = 0,
profile_maxeval = 2500,
profile_xtol_rel = 1e-06
)
Arguments
y |
Integer matrix with |
fit |
A fitted model object returned by |
blocks |
Optional integer vector of length |
level |
Confidence level between 0 and 1. |
idx_time |
Optional integer vector of time indices for which to compute intervals. Default is all time points. |
ridge |
Nonnegative ridge value added to the observed information matrix used for Louis based Wald intervals. |
profile_maxeval |
Maximum number of function evaluations used in the profile likelihood refits. |
profile_xtol_rel |
Relative tolerance used in the profile likelihood refits. |
Value
An object of class inad_ci, a list with elements
settings, level, alpha, theta, nb_inno_size,
and tau. Each non NULL interval element is a data frame with columns
param, est, lower, upper, and possibly se
and width.
Examples
data("bolus_inad", package = "antedep")
y <- bolus_inad$y
blocks <- bolus_inad$blocks
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "bell", blocks = blocks)
ci <- ci_inad(y, fit, blocks = blocks)
ci$alpha
ci$theta
ci$tau
Cochlear implant speech recognition data
Description
Longitudinal speech recognition outcomes for two groups (A/B), including incomplete records, from Zimmerman and Nunez-Anton antedependence book companion data. This dataset is continuous-response data suitable for Gaussian AD modeling.
Usage
cochlear_implant
Format
A list with six components:
- y
numeric matrix of dimension N by n_time containing all subjects
- y_A
numeric matrix for Group A subjects
- y_B
numeric matrix for Group B subjects
- blocks
integer vector of length N (1 = Group A, 2 = Group B)
- group
character vector of group labels ("A"/"B")
- time
integer vector of measurement occasions
Source
https://homepage.divms.uiowa.edu/~dzimmer/Data-for-AD/speech_recognition_data.txt
EM algorithm for categorical AD model estimation
Description
Fits categorical antedependence models with missing outcomes using the Expectation-Maximization (EM) algorithm for orders 0 and 1.
Usage
em_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
max_iter = 100,
tol = 1e-06,
epsilon = 1e-08,
safeguard = TRUE,
verbose = FALSE
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Values are
category codes in |
order |
Antedependence order. Supported values are |
blocks |
Optional block/group vector of length |
homogeneous |
Logical. If |
n_categories |
Number of categories. If |
max_iter |
Maximum number of EM iterations. |
tol |
Convergence tolerance on absolute log-likelihood change. |
epsilon |
Small positive constant used for smoothing and numerical stability. |
safeguard |
Logical; if |
verbose |
Logical; if |
Details
For complete data (no missing values), this function defers to
fit_cat with closed-form MLEs.
For missing data and orders 0/1, each EM iteration computes expected
sufficient statistics with a forward-backward E-step, then updates
probabilities by normalized expected counts in the M-step.
If safeguard = TRUE, a step-halving line search is applied to the
M-step update whenever the observed-data likelihood decreases.
A final E-step is run before returning so that log_l/AIC/BIC and
expected cell counts correspond exactly to the returned parameter values.
Value
A cat_fit object with fields matching fit_cat.
In EM mode, cell_counts stores expected counts from the final
E-step, with settings$cell_counts_type = "expected".
See Also
Examples
set.seed(1)
y <- simulate_cat(n_subjects = 40, n_time = 5, order = 1, n_categories = 3)
y[sample(length(y), 10)] <- NA
fit <- em_cat(y, order = 1, n_categories = 3, max_iter = 20, tol = 1e-5)
fit$settings$na_action
EM algorithm for Gaussian AD model estimation
Description
Convenience wrapper around fit_gau with
na_action = "em" to provide a parallel entry point to
em_inad.
Usage
em_gau(
y,
order = 1,
blocks = NULL,
estimate_mu = TRUE,
max_iter = 100,
tol = 1e-06,
verbose = FALSE,
...
)
Arguments
y |
Numeric matrix (n_subjects x n_time), may contain NA. |
order |
Integer 0, 1, or 2. |
blocks |
Optional vector of block membership (length n_subjects). |
estimate_mu |
Logical, whether to estimate mu (default TRUE). |
max_iter |
Maximum EM iterations. |
tol |
EM convergence tolerance. |
verbose |
Logical, print EM progress. |
... |
Additional arguments passed to |
Details
This is an alias-style helper for users who prefer explicit em_*
entry points across model families.
Value
An gau_fit object as returned by fit_gau.
See Also
fit_gau, em_inad, em_cat,
fit_cat
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 35, n_time = 5, order = 1, phi = 0.3)
y[sample(length(y), 8)] <- NA
fit <- em_gau(y, order = 1, max_iter = 20, tol = 1e-5)
fit$settings$na_action
EM algorithm for INAD model estimation
Description
Fits INAD models using the Expectation-Maximization algorithm. This is an alternative to direct likelihood optimization.
Usage
em_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
blocks = NULL,
max_iter = 200,
tol = 1e-07,
alpha_init = NULL,
theta_init = NULL,
tau_init = NULL,
nb_inno_size = NULL,
safeguard = TRUE,
verbose = FALSE
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. |
order |
Model order (1 or 2). Order 0 does not require EM. |
thinning |
Thinning operator: "binom", "pois", or "nbinom". |
innovation |
Innovation distribution: "pois", "bell", or "nbinom". |
blocks |
Optional integer vector of length n_subjects for block effects. |
max_iter |
Maximum number of EM iterations. |
tol |
Convergence tolerance for log-likelihood change. |
alpha_init |
Optional initial values for alpha parameters. |
theta_init |
Optional initial values for theta parameters. |
tau_init |
Optional initial values for tau parameters. |
nb_inno_size |
Size parameter for negative binomial innovation (if used). |
safeguard |
Logical; if TRUE, use step-halving when likelihood decreases. |
verbose |
Logical; if TRUE, print iteration progress. |
Details
For Gaussian and CAT EM entry points, see em_gau and
em_cat. For CAT specifically, fit_cat() supports
na_action = "em" for orders 0/1 and na_action = "marginalize"
for order 2 missing-data fits.
Value
A list with class "inad_fit" containing estimated parameters.
See Also
em_gau, em_cat, fit_inad,
fit_cat
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 50,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.25,
theta = 2
)
fit <- em_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20, tol = 1e-6)
fit$log_l
Fit categorical antedependence model by maximum likelihood
Description
Computes maximum likelihood estimates for the parameters of an AD(p) model for categorical longitudinal data. The model is parameterized by transition probabilities, and MLEs are obtained in closed form.
Usage
fit_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
na_action = c("fail", "complete", "marginalize", "em"),
em_max_iter = 100,
em_tol = 1e-06,
em_epsilon = 1e-08,
em_safeguard = TRUE,
em_verbose = FALSE
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c, where c is the number of categories. |
order |
Antedependence order p. Must be 0, 1, or 2. Default is 1. |
blocks |
Optional integer vector of length n_subjects specifying group membership. If NULL, all subjects are treated as one group. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups (blocks are ignored for estimation). If FALSE, separate transition probabilities are estimated for each group. |
n_categories |
Number of categories. If NULL (default), inferred from the maximum value in y. |
na_action |
Handling of missing values in |
em_max_iter |
Maximum EM iterations used when |
em_tol |
EM convergence tolerance used when |
em_epsilon |
Numerical smoothing constant used when |
em_safeguard |
Logical; if |
em_verbose |
Logical; print EM progress when |
Details
For AD(p), the model decomposes as:
P(Y_1, \ldots, Y_n) = P(Y_1, \ldots, Y_p) \times \prod_{k=p+1}^{n} P(Y_k | Y_{k-p}, \ldots, Y_{k-1})
MLEs are computed as empirical proportions:
Marginal/joint probabilities: count / N
Transition probabilities: conditional count / marginal count
Empty cells receive probability 0 (if denominator is also 0).
When na_action = "em", fit_cat() dispatches to
em_cat. In that case, em_safeguard controls step-halving
protection against likelihood-decreasing updates, and returned
log_l/AIC/BIC/cell_counts are synchronized via a final E-step
under the returned parameters.
For order = 2, na_action = "em" is not available and errors
explicitly; use na_action = "marginalize".
Value
A list of class "cat_fit" containing:
marginal |
List of marginal/joint probabilities for initial time points |
transition |
List of transition probability arrays for k = p+1 to n |
log_l |
Log-likelihood at MLE |
aic |
Akaike Information Criterion |
bic |
Bayesian Information Criterion |
n_params |
Number of free parameters |
cell_counts |
List of cell counts: observed counts for closed-form fits
( |
convergence |
Optimizer convergence code (0 for closed-form solutions) |
settings |
List of model settings |
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
Examples
# Simulate binary AD(1) data
set.seed(123)
y <- simulate_cat(n_subjects = 100, n_time = 5, order = 1, n_categories = 2)
# Fit model
fit <- fit_cat(y, order = 1)
print(fit)
# Compare orders
fit0 <- fit_cat(y, order = 0)
fit1 <- fit_cat(y, order = 1)
fit2 <- fit_cat(y, order = 2)
c(AIC_0 = fit0$aic, AIC_1 = fit1$aic, AIC_2 = fit2$aic)
# EM fit with missing data
y_miss <- y
y_miss[sample(length(y_miss), size = round(0.15 * length(y_miss)))] <- NA
fit_em <- fit_cat(
y_miss,
order = 1,
na_action = "em",
em_max_iter = 80,
em_tol = 1e-6
)
fit_em$settings$n_iter
fit_em$settings$cell_counts_type
Fit Gaussian antedependence model by maximum likelihood
Description
Fits an AD(0), AD(1), or AD(2) model for Gaussian longitudinal data
by maximum likelihood. Missing values can be handled by complete-case
deletion or by EM (see em_gau for an explicit EM wrapper).
Usage
fit_gau(
y,
order = 1,
blocks = NULL,
na_action = c("fail", "complete", "em"),
estimate_mu = TRUE,
em_max_iter = 100,
em_tol = 1e-06,
em_verbose = FALSE,
...
)
Arguments
y |
Numeric matrix (n_subjects x n_time). May contain NA. |
order |
Integer 0, 1, or 2. |
blocks |
Optional vector of block membership (length n_subjects). |
na_action |
One of "fail", "complete", or "em". |
estimate_mu |
Logical, whether to estimate mu (default TRUE). |
em_max_iter |
Maximum EM iterations (only used when na_action = "em"). |
em_tol |
EM convergence tolerance (only used when na_action = "em"). |
em_verbose |
Logical, print EM progress (only used when na_action = "em"). |
... |
Passed through to the EM fitter. |
Details
For missing data with na_action = "em", AD orders 0 and 1 are the
primary production path. AD order 2 is available, but the current EM
implementation uses simplified second-order updates and should be treated as
provisional for high-stakes inference.
For observed-data likelihood evaluation under MAR without fitting, use
logL_gau with na_action = "marginalize". In contrast,
fit_gau handles missingness via complete-case fitting
(na_action = "complete") or EM (na_action = "em").
Value
A list with components including mu, phi, sigma, tau, log_l, n_obs, n_missing.
See Also
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 30, n_time = 5, order = 1, phi = 0.3)
fit <- fit_gau(y, order = 1)
fit$log_l
y_miss <- y
y_miss[1, 2] <- NA
fit_em <- fit_gau(y_miss, order = 1, na_action = "em", em_max_iter = 20)
fit_em$settings$na_action
Fit INAD antedependence model by maximum likelihood
Description
Fits INAD models by maximum likelihood.
Usage
fit_inad(
y,
order = 1,
thinning = c("binom", "pois", "nbinom"),
innovation = c("pois", "bell", "nbinom"),
blocks = NULL,
max_iter = 50,
tol = 1e-06,
verbose = FALSE,
init_alpha = NULL,
init_theta = NULL,
init_tau = 0.4,
init_nb_inno_size = 1,
nb_inno_size_ub = 50,
na_action = c("fail", "complete", "marginalize")
)
Arguments
y |
Integer matrix n_sub by n_time. |
order |
Integer in {0, 1, 2}. |
thinning |
One of "binom", "pois", "nbinom". |
innovation |
One of "pois", "bell", "nbinom". |
blocks |
Optional integer vector length n_sub. Default NULL. |
max_iter |
Max iterations for FE coordinate descent. |
tol |
Tolerance for FE log likelihood stopping. |
verbose |
Logical. |
init_alpha |
Optional initial alpha. For order 1 numeric length 1 or n_time. For order 2 matrix n_time by 2 or list(alpha1, alpha2). |
init_theta |
Optional initial theta numeric length 1 or n_time. |
init_tau |
Optional initial tau. Scalar expands to c(0, x, ..., x). Vector forces first to 0. |
init_nb_inno_size |
Optional initial size for innovation nbinom, length 1 or n_time. |
nb_inno_size_ub |
Upper bound for innovation negative binomial size
parameter when |
na_action |
How to handle missing values:
|
Details
No fixed effect: time separable optimization using logL_inad_i with theta eliminated by moment equations for order 1 and 2.
Fixed effect: block coordinate descent using nloptr BOBYQA, updating tau, alpha, theta, and nb_inno_size if needed.
Value
A list of class "inad_fit" containing:
- alpha
Estimated antedependence parameter(s)
- theta
Estimated innovation parameter(s)
- tau
Estimated block effects (if applicable)
- nb_inno_size
Estimated innovation NB size parameter(s), when
innovation = "nbinom"- log_l
Maximized log-likelihood
- aic
Akaike information criterion
- bic
Bayesian information criterion
- n_params
Number of free parameters
- convergence
Convergence code
- settings
Model and fitting settings
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 60,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20)
fit$log_l
Labor force longitudinal categorical data (Table 1)
Description
Five-year employment-status sequences reconstructed from Table 1 in the labor-force example used in Xie and Zimmerman score/Wald antedependence testing work. Category coding is 1 = employed, 2 = unemployed.
Usage
labor_force_cat
Format
A list with five components:
- y
integer matrix of dimension N by 5 containing expanded subject-level sequences
- counts
data frame with columns Y1, Y2, Y3, Y4, Y5, Count
- n_categories
number of categories (2)
- time
integer vector of calendar years (1967:1971)
- status_labels
character vector c("employed", "unemployed")
Source
Table 1 (labor-force example) from: Xie, Y. and Zimmerman, D. L. (2013). Score and Wald tests for antedependence in categorical longitudinal data.
Log-likelihood for categorical AD models (with missing data support)
Description
Evaluates the log-likelihood of an AD(p) model for categorical longitudinal data at given parameter values.
Usage
logL_cat(
y,
order,
marginal,
transition = NULL,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
na_action = c("fail", "complete", "marginalize")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. |
order |
Antedependence order p. Must be 0, 1, or 2. |
marginal |
List of marginal/joint probabilities for initial time points. Structure depends on order (see Details). |
transition |
List of transition probability arrays for time points k = p+1 to n. Each element should be an array of dimension c^p x c where the last dimension corresponds to the current time point. |
blocks |
Optional integer vector of length n_subjects specifying group membership. Required if homogeneous = FALSE. |
homogeneous |
Logical. If TRUE (default), same parameters used for all subjects. If FALSE, marginal and transition should be lists indexed by block. |
n_categories |
Number of categories. If NULL, inferred from data. |
na_action |
Handling of missing values in |
Details
The log-likelihood for AD(p) decomposes into contributions from initial time points and transition time points.
For order 0 (independence), the log-likelihood is the sum of log marginal probabilities at each time point.
Parameter structure for marginal:
Order 0: List with elements t1, t2, ..., tn, each a vector of length c
Order 1: List with element t1 (vector of length c)
Order 2: List with t1 (vector), t2_given_1to1 (c x c matrix)
Parameter structure for transition:
Order 0: Not used (NULL or empty list)
Order 1: List with elements t2, t3, ..., tn, each c x c matrix
Order 2: List with elements t3, t4, ..., tn, each c x c x c array
Value
Scalar log-likelihood value.
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
Examples
set.seed(1)
y <- simulate_cat(n_subjects = 40, n_time = 5, order = 1, n_categories = 3)
fit <- fit_cat(y, order = 1, n_categories = 3)
logL_cat(
y = y,
order = 1,
marginal = fit$marginal,
transition = fit$transition,
n_categories = 3
)
Log-likelihood for Gaussian AD models (with missing data support)
Description
Computes the log-likelihood for Gaussian antedependence models of order 0, 1, or 2. Supports missing data under MAR assumption via na_action parameter.
Usage
logL_gau(
y,
order,
mu = NULL,
phi = NULL,
sigma = NULL,
blocks = NULL,
tau = 0,
na_action = c("fail", "complete", "marginalize")
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. May contain NA. |
order |
Antedependence order, one of 0, 1, or 2. |
mu |
Mean vector (length n_time). |
phi |
Dependence coefficient(s). For order 1: vector of length n_time-1. For order 2: matrix with 2 columns or vector of length 2*(n_time-2). |
sigma |
Innovation standard deviations (length n_time). |
blocks |
Integer vector of block membership (length n_subjects), or NULL. |
tau |
Block effects, first element constrained to zero |
na_action |
How to handle missing values:
|
Details
For complete data (no NA), all three na_action options give the same result.
For missing data:
marginalize: Uses MVN marginalization to compute P(Y_obs). This is the correct observed-data likelihood for MAR missing data.
complete: Removes subjects with any missing values. May lose information.
fail: Stops with error. Useful to ensure no missing data present.
Value
Scalar log-likelihood value.
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 30, n_time = 5, order = 1, phi = 0.3)
fit <- fit_gau(y, order = 1)
logL_gau(y, order = 1, mu = fit$mu, phi = fit$phi, sigma = fit$sigma)
Log-likelihood for INAD models (with missing data support)
Description
If blocks is NULL, this computes the log likelihood as the sum of per time contributions from logL_inad_i for computational convenience.
Usage
logL_inad(
y,
order = 1,
thinning = c("binom", "pois", "nbinom"),
innovation = c("pois", "bell", "nbinom"),
alpha,
theta,
nb_inno_size = NULL,
blocks = NULL,
tau = 0,
na_action = c("fail", "complete", "marginalize")
)
Arguments
y |
Integer matrix n_sub by n_time. |
order |
Integer in {0, 1, 2}. |
thinning |
One of "binom", "pois", "nbinom". |
innovation |
One of "pois", "bell", "nbinom". |
alpha |
Thinning parameters. For order 1, numeric length 1 or n_time. For order 2, either a matrix n_time by 2 or a list(alpha1, alpha2). |
theta |
Innovation parameter(s). Numeric length 1 or n_time. For
Poisson and negative binomial innovations, this is the innovation mean.
For Bell innovations, this is the Bell rate parameter
(mean |
nb_inno_size |
Size parameter for innovation "nbinom". Numeric length 1 or n_time. |
blocks |
Optional integer vector of length n_sub. If NULL, no fixed effect. |
tau |
Optional numeric vector. Only used if blocks is not NULL. |
na_action |
How to handle missing values:
|
Value
A scalar log likelihood.
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 60,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20)
logL_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = fit$alpha,
theta = fit$theta
)
INAD log likelihood contribution at time i (no fixed effect)
Description
Returns the time i contribution, summed over subjects, for the no fixed effect model.
Usage
logL_inad_i(
y,
i,
order = 1,
thinning = c("binom", "pois", "nbinom"),
innovation = c("pois", "bell", "nbinom"),
alpha,
theta,
nb_inno_size = NULL
)
Arguments
y |
Integer matrix n_sub by n_time. |
i |
Time index in 1..ncol(y). |
order |
Integer in {0, 1, 2}. |
thinning |
One of "binom", "pois", "nbinom". |
innovation |
One of "pois", "bell", "nbinom". |
alpha |
Thinning parameters. For order 1, numeric length 1 or n_time. For order 2, either a matrix n_time by 2 or a list(alpha1, alpha2). |
theta |
Innovation parameter at time i, or a vector length 1 or n_time.
For Poisson and negative binomial innovations, this is the innovation
mean. For Bell innovations, this is the Bell rate parameter
(mean |
nb_inno_size |
Size parameter for innovation "nbinom". Numeric length 1 or n_time. |
Value
A scalar log likelihood contribution for time i.
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 50,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
fit <- fit_inad(y, order = 1, thinning = "binom", innovation = "pois", max_iter = 20)
logL_inad_i(
y = y,
i = 3,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = fit$alpha,
theta = fit$theta
)
Compute intervenor-adjusted partial correlation matrix
Description
Computes the partial correlation between Y[i] and Y[j] adjusting for the
"intervenor" variables Y[i+1], ..., Y[j-1]. Under an antedependence model
of order p, partial correlations for |i-j| > p should be approximately zero.
Usage
partial_corr(y, test = FALSE, n_digits = 3)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
test |
Logical; if TRUE, returns significance flags based on approximate threshold 2/sqrt(n_eff) where n_eff = n_subjects - (lag - 1). Default FALSE. |
n_digits |
Integer; number of decimal places for rounding. Default 3. |
Details
The intervenor-adjusted partial correlation between Y[i] and Y[j] (i < j) is
computed as the correlation between the residuals from regressing Y[i] and Y[j]
on the intervenor set Y[i+1], ..., Y[j-1].
For adjacent time points (|i-j| = 1), the partial correlation equals the ordinary correlation since there are no intervenors.
The diagonal of both returned matrices contains variances (not correlations). This keeps scale information available alongside correlation structure.
The significance test uses an approximate threshold of 2/sqrt(n_eff), which corresponds roughly to a 95% confidence bound under normality. This is a rough screening tool, not a formal hypothesis test.
Value
A list with components:
correlation |
Matrix with correlations (upper triangle) and variances (diagonal) |
partial_correlation |
Matrix with partial correlations (lower triangle) and variances (diagonal) |
significant |
(If test=TRUE) Matrix flagging significant partial correlations (1 = significant) |
n_subjects |
Number of subjects |
n_time |
Number of time points |
References
Zimmerman, D. L. and Nunez-Anton, V. (2009). Antedependence Models for Longitudinal Data. CRC Press.
See Also
plot_prism for visual diagnostics
Examples
data("bolus_inad")
pc <- partial_corr(bolus_inad$y, test = TRUE)
# View partial correlations (lower triangle)
pc$partial_correlation
# Extract variances from the diagonal
variances <- diag(pc$partial_correlation)
# Check which are "significant" (rough screen for AD order)
pc$significant
PRISM plot (Partial Residual Intervenor Scatterplot Matrix)
Description
Creates a matrix of scatterplots for diagnosing antedependence structure.
The upper triangle shows ordinary scatterplots of Y[i] vs Y[j].
The lower triangle shows PRISM plots: residuals from regressing Y[i] and Y[j]
on the intervenor variables Y[i+1], ..., Y[j-1].
Usage
plot_prism(
y,
time_labels = NULL,
pch = 20,
cex = 0.6,
col_upper = "steelblue",
col_lower = "firebrick",
main = "PRISM Diagnostic Plot"
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
time_labels |
Optional character vector of time point labels. Default uses column names or "T1", "T2", etc. |
pch |
Point character for scatterplots. Default 20 (filled circle). |
cex |
Point size. Default 0.6. |
col_upper |
Color for upper triangle plots. Default "steelblue". |
col_lower |
Color for lower triangle (PRISM) plots. Default "firebrick". |
main |
Overall title. Default "PRISM Diagnostic Plot". |
Details
Under an antedependence model of order p, the partial correlation between
Y[i] and Y[j] given the intervenors should be zero when |i-j| > p.
This means PRISM plots in the lower triangle should show no association
for lags greater than p.
Interpretation:
Upper triangle: Shows marginal associations between time points
Lower triangle (PRISM): Shows conditional associations after removing effects of intervenor variables
If AD(1) holds: Only the first sub-diagonal of lower triangle should show association
If AD(2) holds: First two sub-diagonals should show association
Value
Invisibly returns NULL. Called for side effect (plotting).
References
Zimmerman, D. L. and Nunez-Anton, V. (2009). Antedependence Models for Longitudinal Data. CRC Press. Chapter 2.
See Also
partial_corr for numerical partial correlations
Examples
data("bolus_inad")
plot_prism(bolus_inad$y)
# With custom labels
plot_prism(bolus_inad$y, time_labels = paste0("Hour ", seq(0, 44, by = 4)))
Profile plot (spaghetti plot) for longitudinal data
Description
Creates a profile plot showing individual subject trajectories with overlaid mean trajectory and standard deviation bands.
Usage
plot_profile(
y,
time_points = NULL,
blocks = NULL,
block_labels = NULL,
title = "Profile Plot",
xlab = "Time",
ylab = "Measurement",
ylim = NULL,
show_sd = TRUE,
individual_alpha = 0.3,
individual_color = "grey50",
mean_color = "blue",
sd_color = "red",
mean_lwd = 2
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns, or a data frame with measurements. |
time_points |
Optional numeric vector of time points for x-axis. Default uses 1:n_time or attempts to extract from column names. |
blocks |
Optional integer vector of block memberships for stratified plotting. If provided, creates separate panels for each block. |
block_labels |
Optional character vector of labels for blocks. |
title |
Plot title. Default "Profile Plot". |
xlab |
X-axis label. Default "Time". |
ylab |
Y-axis label. Default "Measurement". |
ylim |
Optional y-axis limits as c(min, max). |
show_sd |
Logical; if TRUE (default), show +/- 1 SD error bars. |
individual_alpha |
Alpha (transparency) for individual trajectories. Default 0.3. |
individual_color |
Color for individual trajectories. Default "grey50". |
mean_color |
Color for mean trajectory. Default "blue". |
sd_color |
Color for SD error bars. Default "red". |
mean_lwd |
Line width for mean trajectory. Default 2. |
Details
This function provides a quick visual summary of longitudinal data showing:
Individual subject trajectories (light grey lines)
Mean trajectory across subjects (bold colored line)
+/- 1 standard deviation bands (error bars)
When blocks is provided, the plot is faceted by block membership,
allowing comparison of trajectories across treatment groups or other strata.
Value
A ggplot2 object (invisibly). Called primarily for side effect (plotting).
Examples
data("bolus_inad")
# Basic profile plot
plot_profile(bolus_inad$y)
# With block stratification
plot_profile(bolus_inad$y, blocks = bolus_inad$blocks,
block_labels = c("2mg", "1mg"))
# Customized
plot_profile(bolus_inad$y,
time_points = seq(0, 44, by = 4),
title = "Bolus Counts Over Time",
xlab = "Hours", ylab = "Count")
Print method for cat_ci objects
Description
Print method for cat_ci objects
Usage
## S3 method for class 'cat_ci'
print(x, ...)
Arguments
x |
A cat_ci object |
... |
Additional arguments (ignored) |
Value
Invisibly returns x.
Print method for cat_fit objects
Description
Print method for cat_fit objects
Usage
## S3 method for class 'cat_fit'
print(x, ...)
Arguments
x |
A cat_fit object |
... |
Additional arguments (ignored) |
Value
Invisibly returns x.
Print method for cat_lrt objects
Description
Print method for cat_lrt objects
Usage
## S3 method for class 'cat_lrt'
print(x, ...)
Arguments
x |
A cat_lrt object |
... |
Additional arguments (ignored) |
Value
Invisibly returns x.
Print method for BIC order selection
Description
Print method for BIC order selection
Usage
## S3 method for class 'gau_bic_order'
print(x, ...)
Arguments
x |
Object of class |
... |
Unused. |
Value
Invisibly returns x.
Print method for AD confidence intervals
Description
Print method for AD confidence intervals
Usage
## S3 method for class 'gau_ci'
print(x, ...)
Arguments
x |
An object of class |
... |
Unused. |
Value
The input object, invisibly.
Print method for AD contrast test
Description
Print method for AD contrast test
Usage
## S3 method for class 'gau_contrast_test'
print(x, ...)
Arguments
x |
Object of class |
... |
Unused. |
Value
Invisibly returns x.
Print method for gau_fit objects
Description
Print method for gau_fit objects.
Usage
## S3 method for class 'gau_fit'
print(x, ...)
Arguments
x |
A |
... |
Additional arguments (ignored). |
Value
The input object, invisibly.
Print method for AD homogeneity test
Description
Print method for AD homogeneity test
Usage
## S3 method for class 'gau_homogeneity_test'
print(x, ...)
Arguments
x |
Object of class |
... |
Unused. |
Value
Invisibly returns x.
Print method for AD mean test
Description
Print method for AD mean test
Usage
## S3 method for class 'gau_mean_test'
print(x, ...)
Arguments
x |
Object of class |
... |
Unused. |
Value
Invisibly returns x.
Print method for AD order test
Description
Print method for AD order test
Usage
## S3 method for class 'gau_order_test'
print(x, ...)
Arguments
x |
Object of class |
... |
Unused. |
Value
Invisibly returns x.
Print method for homogeneity_tests_inad
Description
Print method for homogeneity_tests_inad
Usage
## S3 method for class 'homogeneity_tests_inad'
print(x, digits = 4, ...)
Arguments
x |
Object of class |
digits |
Number of digits for printing |
... |
Unused |
Value
Invisibly returns x.
Print method for INAD confidence intervals
Description
Print method for INAD confidence intervals
Usage
## S3 method for class 'inad_ci'
print(x, ...)
Arguments
x |
An object of class |
... |
Unused. |
Value
The input object, invisibly.
Print method for INAD model fits
Description
Print method for INAD model fits
Usage
## S3 method for class 'inad_fit'
print(x, digits = 4, ...)
Arguments
x |
An object of class |
digits |
Number of digits to print. |
... |
Unused. |
Value
The input object, invisibly.
Print method for test_homogeneity_inad
Description
Print method for test_homogeneity_inad
Usage
## S3 method for class 'test_homogeneity_inad'
print(x, digits = 4, ...)
Arguments
x |
Object of class |
digits |
Number of digits for printing |
... |
Unused |
Value
Invisibly returns x.
100km race split-time data
Description
Split times (in minutes) from a 100km race example with 10 consecutive sections. This is continuous-response longitudinal data suitable for Gaussian AD modeling.
Usage
race_100km
Format
A list with five components:
- y
numeric matrix of dimension N by 10 containing section split times
- age
numeric vector of subject ages (may include missing values)
- subject
integer subject identifiers
- time
integer vector of section indices (1:10)
- section_labels
character vector of split-time column names
Source
Combined table extracted from textbook race example data, stored in
data-raw/external/100km_race_combined_extracted.csv.
Run all homogeneity tests for INAD
Description
Convenience function to run all three homogeneity tests at once and return a summary.
Usage
run_homogeneity_tests_inad(
y,
blocks,
order = 1,
thinning = "binom",
innovation = "pois",
...
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. |
blocks |
Integer vector of length n_subjects specifying group membership. |
order |
Antedependence order (0, 1, or 2). |
thinning |
Thinning operator: "binom", "pois", or "nbinom". |
innovation |
Innovation distribution: "pois", "bell", or "nbinom". |
... |
Additional arguments passed to |
Value
A list with class "homogeneity_tests_inad" containing results
for all three tests and a summary table.
Examples
data("bolus_inad")
tests <- run_homogeneity_tests_inad(bolus_inad$y, bolus_inad$blocks,
order = 1, thinning = "nbinom",
innovation = "bell")
print(tests)
Run all pairwise order tests
Description
Performs sequential likelihood ratio tests for AD orders 0 vs 1, 1 vs 2, etc.
Usage
run_order_tests_cat(
y,
max_order = 2,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
test = c("lrt", "score", "mlrt", "wald")
)
Arguments
y |
Integer matrix of categorical data (n_subjects x n_time). |
max_order |
Maximum order to test. Default is 2. |
blocks |
Optional block membership vector. |
homogeneous |
Whether to use homogeneous parameters across blocks. |
n_categories |
Number of categories (inferred if NULL). |
test |
Type of test statistic for each pairwise comparison. One of
|
Details
This function performs forward selection: starting from order 0, it tests whether increasing the order provides significant improvement. The selected order is the highest order where the test was significant (at alpha = 0.05).
Value
A list containing:
tests |
List of test_order_cat results for each comparison |
table |
Summary data frame with all comparisons |
fits |
List of all fitted models |
selected_order |
Recommended order based on sequential testing at alpha=0.05 |
Examples
y <- simulate_cat(200, 6, order = 1, n_categories = 2)
result <- run_order_tests_cat(y, max_order = 2)
print(result$table)
cat("Selected order:", result$selected_order, "\n")
Run all stationarity-related tests for categorical AD
Description
Performs tests for time-invariance and stationarity constraints. For
order = 1, the stationarity test corresponds to strict stationarity;
for order > 1, it tests marginal-constancy plus time-invariant
transitions.
Currently supports complete data only.
Usage
run_stationarity_tests_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
test = c("lrt", "score", "mlrt")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. |
order |
Antedependence order p. Default is 1. |
blocks |
Optional integer vector of length n_subjects specifying group membership. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups. |
n_categories |
Number of categories. If NULL, inferred from data. |
test |
Type of test statistic. One of |
Value
A list containing:
- time_invariance
Result of test_timeinvariance_cat
- stationarity
Result of test_stationarity_cat
- table
Summary data frame
Examples
y <- simulate_cat(200, 6, order = 1, n_categories = 2)
result <- run_stationarity_tests_cat(y, order = 1)
print(result$table)
Run all stationarity-related tests for Gaussian AD
Description
Runs a standard set of stationarity constraints for Gaussian AD models.
Usage
run_stationarity_tests_gau(
y,
order = 1L,
blocks = NULL,
verbose = FALSE,
max_iter = 2000L,
rel_tol = 1e-08,
...
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
order |
Antedependence order (0, 1, or 2). |
blocks |
Optional vector of block memberships (length n_subjects). |
verbose |
Logical; if TRUE, prints progress. |
max_iter |
Maximum number of optimization iterations for constrained fits. |
rel_tol |
Relative tolerance for constrained optimization. |
... |
Additional arguments passed to |
Value
A list with class "stationarity_tests_gau" containing:
- fit_unconstrained
Unconstrained Gaussian AD fit
- tests
Named list of
test_stationarity_gauresults- summary
Summary table of all constraints
See Also
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 80, n_time = 6, order = 1, phi = 0.4, sigma = 1)
out <- run_stationarity_tests_gau(y, order = 1, verbose = FALSE)
out$summary
Run all stationarity-related tests for INAD
Description
Run all stationarity-related tests for INAD
Usage
run_stationarity_tests_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
blocks = NULL,
verbose = FALSE,
...
)
Arguments
y |
Integer matrix. |
order |
Model order (1 or 2). |
thinning |
Thinning operator. |
innovation |
Innovation distribution. |
blocks |
Optional block assignments. |
verbose |
Logical. |
... |
Additional arguments. |
Value
A list with class "stationarity_tests_inad".
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 25,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
out <- run_stationarity_tests_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
verbose = FALSE,
max_iter = 15
)
out$summary
Simulate categorical antedependence series
Description
Generate simulated longitudinal categorical data from an AD(p) model with specified transition probabilities.
Usage
simulate_cat(
n_subjects,
n_time,
order = 1,
n_categories = 2,
marginal = NULL,
transition = NULL,
blocks = NULL,
homogeneous = TRUE,
seed = NULL
)
Arguments
n_subjects |
Number of subjects to simulate. |
n_time |
Number of time points. |
order |
Antedependence order p. Must be 0, 1, or 2. Default is 1. |
n_categories |
Number of categories c. Default is 2 (binary). |
marginal |
List of marginal/joint probabilities for initial time points. If NULL, uniform probabilities are used. See Details for structure. |
transition |
List of transition probability arrays for time points k = p+1 to n. If NULL, uniform transitions are used. See Details. |
blocks |
Optional integer vector of length n_subjects specifying group membership. Used with homogeneous = FALSE. |
homogeneous |
Logical. If TRUE (default), same parameters for all subjects. If FALSE, marginal and transition should be lists indexed by block. |
seed |
Optional random seed for reproducibility. |
Details
Data are simulated sequentially:
For k = 1: Draw Y(1) from marginal distribution
For k = 2 to p: Draw Y(k) conditional on Y(1), ..., Y(k-1)
For k = p+1 to n: Draw Y(k) conditional on Y(k-p), ..., Y(k-1)
Parameter structure for marginal:
Order 0: List with elements t1, t2, ..., tn, each a vector of length c summing to 1
Order 1: List with element t1 (vector of length c)
Order 2: List with t1 (vector), t2_given_1to1 (c x c matrix where rows represent conditioning values and columns represent outcomes)
Parameter structure for transition:
Order 0: Not used (NULL)
Order 1: List with elements t2, t3, ..., tn, each c x c matrix where rows are previous values and columns are current values (rows sum to 1)
Order 2: List with elements t3, t4, ..., tn, each c x c x c array where first two indices are conditioning values and third is outcome
Value
Integer matrix with n_subjects rows and n_time columns, where each entry is a category code from 1 to c.
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
Examples
y <- simulate_cat(n_subjects = 30, n_time = 5, order = 1, n_categories = 3, seed = 1)
dim(y)
Simulate Gaussian antedependence series
Description
Generate longitudinal continuous data from a Gaussian antedependence (AD) model of order 0, 1, or 2 using a conditional regression on predecessors.
Usage
simulate_gau(
n_subjects,
n_time,
order = 1L,
mu = NULL,
phi = NULL,
sigma = NULL,
blocks = NULL,
tau = 0,
seed = NULL
)
Arguments
n_subjects |
number of subjects |
n_time |
number of time points |
order |
antedependence order, 0, 1 or 2 |
mu |
mean parameter; |
phi |
dependence parameter; ignored when |
sigma |
innovation standard deviation; |
blocks |
integer vector of length |
tau |
group effect vector indexed by block; |
seed |
optional random seed for reproducibility |
Details
For order = 0, each time point is generated independently as
Y[, t] = mu[t] + tau[block] + eps, with eps ~ N(0, sigma[t]^2).
For order = 1, for t >= 2:
Y[, t] = m_t + phi[t] * (Y[, t - 1] - m_{t-1}) + eps_t,
where m_t = mu[t] + tau[block] and eps_t ~ N(0, sigma[t]^2).
For order = 2, for t >= 3:
Y[, t] = m_t + phi[1, t] * (Y[, t - 1] - m_{t-1}) + phi[2, t] * (Y[, t - 2] - m_{t-2}) + eps_t.
If blocks is provided, each subject s belongs to a block and receives a
mean shift tau[blocks[s]]. tau[1] is forced to 0.
Value
numeric matrix with dimension n_subjects by n_time
Examples
y <- simulate_gau(
n_subjects = 20,
n_time = 6,
order = 1,
phi = 0.4,
seed = 42
)
dim(y)
Simulate INAD antedependence series
Description
Generate longitudinal count data from an INAD model using a thinning operator and an innovation distribution.
Usage
simulate_inad(
n_subjects,
n_time,
order = 1L,
thinning = c("binom", "pois", "nbinom"),
innovation = c("pois", "bell", "nbinom"),
alpha = NULL,
theta = NULL,
nb_inno_size = NULL,
blocks = NULL,
tau = 0,
seed = NULL
)
Arguments
n_subjects |
number of subjects |
n_time |
number of time points |
order |
antedependence order, 0, 1 or 2 |
thinning |
thinning operator, one of |
innovation |
innovation distribution, one of |
alpha |
thinning parameter or vector or matrix; if |
theta |
innovation parameter or vector; if |
nb_inno_size |
size (dispersion) parameter for negative binomial innovations
when |
blocks |
integer vector of length |
tau |
group effect vector indexed by block; |
seed |
optional random seed for reproducibility |
Details
Time 1 observations are generated from the innovation distribution alone.
For times 2 to n_time, counts are generated as thinning of previous
counts plus independent innovations. When order = 0, all time points
are generated from the innovation distribution and the thinning operator
and alpha are ignored.
If blocks is provided, innovations include a block effect. For Poisson and
negative binomial innovations, the innovation mean is theta[t] + tau[blocks[i]].
For Bell innovations, the innovation mean is theta[t] * exp(theta[t]) + tau[blocks[i]].
Value
integer matrix of counts with dimension n_subjects by n_time
Examples
y <- simulate_inad(
n_subjects = 20,
n_time = 6,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2,
seed = 42
)
dim(y)
Summary method for cat_ci objects
Description
Summary method for cat_ci objects
Usage
## S3 method for class 'cat_ci'
summary(object, ...)
Arguments
object |
A cat_ci object |
... |
Additional arguments (ignored) |
Value
A data frame summarizing all CIs
Summary method for gau_ci objects
Description
Summary method for gau_ci objects
Usage
## S3 method for class 'gau_ci'
summary(object, ...)
Arguments
object |
A |
... |
Unused. |
Value
A data frame stacking available confidence intervals with columns
param, component, est, se, lower,
upper, and level. Returns NULL if no intervals are available.
Summary method for inad_ci objects
Description
Summary method for inad_ci objects
Usage
## S3 method for class 'inad_ci'
summary(object, ...)
Arguments
object |
An |
... |
Unused. |
Value
A data frame stacking available confidence intervals with columns
param, component, est, se, lower,
upper, and level. Returns NULL if no intervals are available.
Test linear hypotheses on the mean under antedependence
Description
Tests the null hypothesis C * mu = c for a specified contrast matrix C and vector c, under an AD(p) covariance structure. This implements Theorem 7.2 of Zimmerman & Núñez-Antón (2009).
Usage
test_contrast_gau(y, C, c = NULL, p = 1L)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
C |
Contrast matrix with c rows and n_time columns, where c is the number of contrasts being tested. Rows must be linearly independent. |
c |
Right-hand side vector of length equal to nrow(C). Default is a vector of zeros. |
p |
Antedependence order of the covariance structure. This is the same
order parameter named |
Details
The Wald test statistic (Theorem 7.2) is:
(C\bar{Y} - c)^T (C \hat{\Sigma} C^T)^{-1} (C\bar{Y} - c)
where \hat{\Sigma} is the REML estimator of the covariance matrix
under the AD(p) model.
Common examples include:
Testing if mean is constant: C is the first-difference matrix
Testing for linear trend: C tests deviations from linearity
Value
A list with class gau_contrast_test containing:
- method
Inference method used (
"wald").- C
Contrast matrix
- c
Right-hand side vector
- mu_hat
Estimated mean vector
- contrast_est
Estimated value of C * mu
- statistic
Wald test statistic
- df
Degrees of freedom (number of contrasts)
- p_value
P-value from chi-square distribution
References
Zimmerman, D.L. and Núñez-Antón, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Chapter 7.
Examples
y <- simulate_gau(n_subjects = 50, n_time = 5, order = 1)
# Test if mean is constant (all differences = 0)
# C is 4x5 matrix of first differences
C <- matrix(0, nrow = 4, ncol = 5)
for (i in 1:4) {
C[i, i] <- 1
C[i, i+1] <- -1
}
test <- test_contrast_gau(y, C = C, p = 1)
print(test)
Likelihood ratio test for homogeneity across groups (categorical AD data)
Description
Tests whether multiple groups share the same transition probability parameters in a categorical antedependence model.
Usage
test_homogeneity_cat(
y = NULL,
blocks = NULL,
order = 1,
n_categories = NULL,
fit_null = NULL,
fit_alt = NULL,
test = c("lrt", "score", "mlrt")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. Can be NULL if both fit_null and fit_alt are provided. |
blocks |
Integer vector of length n_subjects specifying group membership. Required unless pre-fitted models are provided. |
order |
Antedependence order p. Default is 1. |
n_categories |
Number of categories. If NULL, inferred from data. |
fit_null |
Optional pre-fitted homogeneous model (class "cat_fit" with homogeneous = TRUE). If provided, y is not required for fitting under H0. |
fit_alt |
Optional pre-fitted heterogeneous model (class "cat_fit" with homogeneous = FALSE). If provided, y is not required for fitting under H1. |
test |
Type of test statistic. One of |
Details
The null hypothesis is that all G groups share the same transition probability parameters:
H_0: \pi^{(1)} = \pi^{(2)} = \ldots = \pi^{(G)}
The alternative hypothesis allows each group to have its own parameters.
The degrees of freedom are:
df = (G-1) \times k
where G is the number of groups and k is the number of free parameters per population.
Value
A list of class "cat_lrt" containing:
- method
Inference method used: one of
"lrt","score","mlrt", or"wald".- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- fit_null
Fitted homogeneous model (H0)
- fit_alt
Fitted heterogeneous model (H1)
- n_groups
Number of groups
- table
Summary data frame
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
See Also
Examples
# Simulate data with different transition probabilities for two groups
set.seed(123)
marg1 <- list(t1 = c(0.7, 0.3))
marg2 <- list(t1 = c(0.4, 0.6))
trans1 <- list(t2 = matrix(c(0.9, 0.1, 0.2, 0.8), 2, byrow = TRUE),
t3 = matrix(c(0.9, 0.1, 0.2, 0.8), 2, byrow = TRUE))
trans2 <- list(t2 = matrix(c(0.5, 0.5, 0.5, 0.5), 2, byrow = TRUE),
t3 = matrix(c(0.5, 0.5, 0.5, 0.5), 2, byrow = TRUE))
y1 <- simulate_cat(100, 3, order = 1, n_categories = 2,
marginal = marg1, transition = trans1)
y2 <- simulate_cat(100, 3, order = 1, n_categories = 2,
marginal = marg2, transition = trans2)
y <- rbind(y1, y2)
blocks <- c(rep(1, 100), rep(2, 100))
# Test homogeneity
test <- test_homogeneity_cat(y, blocks, order = 1)
print(test)
Likelihood ratio test for homogeneity across groups (Gaussian AD data)
Description
Tests the null hypothesis that G groups have the same AD(p) covariance structure against the alternative that they have different AD(p) structures. This implements Theorem 6.6 of Zimmerman & Núñez-Antón (2009).
Usage
test_homogeneity_gau(y, blocks, p = 1L, use_modified = TRUE)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
blocks |
Integer vector of length n_subjects indicating group membership. |
p |
Antedependence order. This is the same order parameter named
|
use_modified |
Logical. If TRUE (default), use modified test statistic for better small-sample approximation. |
Details
The test compares:
H0: All G groups have the same AD(p) covariance matrix Sigma(theta)
H1: Groups have different AD(p) covariance matrices Sigma(theta_g)
The likelihood ratio test statistic (Theorem 6.6) involves comparing pooled and within-group RSS values. The degrees of freedom are (G-1)(2n - p)(p + 1)/2.
This test is useful for determining whether a common covariance structure can be assumed across treatment groups before performing mean comparisons.
Value
A list with class gau_homogeneity_test containing:
- method
Inference method used (
"lrt").- statistic
Test statistic value
- statistic_modified
Modified test statistic (if use_modified = TRUE)
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- p_value_modified
P-value from modified test
- G
Number of groups
- group_sizes
Sample sizes for each group
- order
Antedependence order
References
Zimmerman, D.L. and Núñez-Antón, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Section 6.4.
Kenward, M.G. (1987). A method for comparing profiles of repeated measurements. Applied Statistics, 36, 296-308.
See Also
test_order_gau, test_two_sample_gau
Examples
# Simulate data from two groups with same covariance
n1 <- 30
n2 <- 35
y1 <- simulate_gau(n1, n_time = 6, order = 1, phi = 0.5, sigma = 1)
y2 <- simulate_gau(n2, n_time = 6, order = 1, phi = 0.5, sigma = 1)
y <- rbind(y1, y2)
blocks <- c(rep(1, n1), rep(2, n2))
# Test homogeneity
test <- test_homogeneity_gau(y, blocks, p = 1)
print(test)
Likelihood ratio test for homogeneity across groups (INAD data)
Description
Tests hypotheses about parameter equality across treatment or grouping factors in integer-valued antedependence models. Implements the homogeneity testing framework from Section 3.7 of Li & Zimmerman (2026).
Usage
test_homogeneity_inad(
y,
blocks,
order = 1,
thinning = "binom",
innovation = "pois",
test = c("all", "mean", "dependence"),
fit_pooled = NULL,
fit_inadfe = NULL,
fit_hetero = NULL,
...
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. |
blocks |
Integer vector of length n_subjects specifying group membership. |
order |
Antedependence order (0, 1, or 2). |
thinning |
Thinning operator: "binom", "pois", or "nbinom". |
innovation |
Innovation distribution: "pois", "bell", or "nbinom". |
test |
Type of homogeneity test:
|
fit_pooled |
Optional pre-computed pooled fit (M1). |
fit_inadfe |
Optional pre-computed INADFE fit (M2). |
fit_hetero |
Optional pre-computed heterogeneous fit (M3). |
... |
Additional arguments passed to |
Details
The function supports three nested model comparisons as described in the paper:
M1 (Pooled): All parameters are common across groups. This corresponds
to fitting fit_inad(y, blocks = NULL).
M2 (INADFE): The thinning parameters \alpha are shared across
groups, but innovation means differ via block effects \tau. This is the
standard INADFE model fitted via fit_inad(y, blocks = blocks).
M3 (Fully Heterogeneous): Both \alpha and \theta
parameters can differ across groups. This is fitted by running separate
fit_inad calls for each group.
The three test types correspond to:
-
"all": H0: M1 vs H1: M3 (complete homogeneity vs complete heterogeneity) -
"mean": H0: M1 vs H1: M2 (test for group differences in means only) -
"dependence": H0: M2 vs H1: M3 (test for group differences in dependence)
Degrees of freedom are computed as the difference in free parameters between the null and alternative models.
Value
A list with class "test_homogeneity_inad" containing:
- method
Inference method used (
"lrt").- statistic
Test statistic value
- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- test
Type of test performed
- fit_null
Fitted model under H0
- fit_alt
Fitted model under H1
- bic_null
BIC under H0
- bic_alt
BIC under H1
- bic_selected
Which model BIC prefers
- table
Summary data frame
References
Li, C. and Zimmerman, D.L. (2026). Integer-valued antedependence models for longitudinal count data. Biostatistics. Section 3.7.
See Also
fit_inad, test_order_inad,
test_stationarity_inad
Examples
data("bolus_inad")
y <- bolus_inad$y
blocks <- bolus_inad$blocks
# Test for any group differences (M1 vs M3)
test_all <- test_homogeneity_inad(y, blocks, order = 1,
thinning = "nbinom", innovation = "bell",
test = "all")
print(test_all)
# Test only for mean differences (M1 vs M2)
test_mean <- test_homogeneity_inad(y, blocks, order = 1,
thinning = "nbinom", innovation = "bell",
test = "mean")
print(test_mean)
# Test for dependence differences given different means (M2 vs M3)
test_dep <- test_homogeneity_inad(y, blocks, order = 1,
thinning = "nbinom", innovation = "bell",
test = "dependence")
print(test_dep)
One-sample test for mean structure under antedependence
Description
Tests the null hypothesis that the mean vector equals a specified value mu = mu_0 against the alternative mu != mu_0, under an AD(p) covariance structure. This implements Theorem 7.1 of Zimmerman & Núñez-Antón (2009).
Usage
test_one_sample_gau(y, mu0, p = 1L, order = NULL, use_modified = TRUE)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
mu0 |
Hypothesized mean vector under the null (length n_time). |
p |
Antedependence order of the covariance structure. This is the same
order parameter named |
order |
Optional alias for |
use_modified |
Logical. If TRUE (default), use the modified test statistic (formula 7.7) for better small-sample approximation. |
Details
The test exploits the AD structure to gain power over tests that don't assume any covariance structure. The likelihood ratio test statistic (Theorem 7.1) is:
N \sum_{i=1}^{n} [\log RSS_i(\mu_0) - \log RSS_i(\hat{\mu})]
where RSS_i(mu) is the residual sum of squares from the regression of Y_i - mu_i on its p predecessors Y_(i-1) - mu_(i-1), ..., Y_(i-p) - mu_(i-p).
The test has n degrees of freedom (one for each component of mu).
Value
A list with class gau_mean_test containing:
- method
Inference method used (
"lrt").- test_type
"one-sample"
- mu0
Hypothesized mean under null
- mu_hat
MLE of mean (sample mean)
- statistic
Test statistic value
- statistic_modified
Modified test statistic (if use_modified = TRUE)
- df
Degrees of freedom (n_time)
- p_value
P-value from chi-square distribution
- p_value_modified
P-value from modified test
- order
Antedependence order used
References
Zimmerman, D.L. and Núñez-Antón, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Chapter 7.
See Also
test_two_sample_gau, test_order_gau
Examples
# Simulate data with known mean
mu_true <- c(10, 11, 12, 13, 14, 15)
y <- simulate_gau(n_subjects = 50, n_time = 6, order = 1, mu = mu_true)
# Test if mean is zero
test <- test_one_sample_gau(y, mu0 = rep(0, 6), p = 1)
print(test)
# Test if mean equals true value (should not reject)
test2 <- test_one_sample_gau(y, mu0 = mu_true, p = 1)
print(test2)
Likelihood ratio test for antedependence order (categorical AD data)
Description
Tests whether a higher-order AD model provides significantly better fit than a lower-order model for categorical longitudinal data.
Usage
test_order_cat(
y = NULL,
order_null = 0,
order_alt = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
fit_null = NULL,
fit_alt = NULL,
test = c("lrt", "score", "mlrt", "wald")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. Can be NULL if both fit_null and fit_alt are provided. |
order_null |
Order under the null hypothesis (default 0). |
order_alt |
Order under the alternative hypothesis (default 1). Must be greater than order_null. |
blocks |
Optional integer vector of length n_subjects specifying group membership. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups. |
n_categories |
Number of categories. If NULL, inferred from data. |
fit_null |
Optional pre-fitted model under null hypothesis (class "cat_fit"). If provided, y is not required for fitting under H0. |
fit_alt |
Optional pre-fitted model under alternative hypothesis. If provided, y is not required for fitting under H1. |
test |
Type of test statistic. One of |
Details
The likelihood ratio test statistic is:
\lambda = -2[\ell_0 - \ell_1]
where \ell_0 and \ell_1 are the maximized log-likelihoods under
the null and alternative hypotheses.
Under H0, \lambda follows a chi-square distribution with degrees of
freedom equal to the difference in the number of free parameters.
For testing AD(p) vs AD(p+1), the degrees of freedom are:
df = (c-1)^2 \times c^p \times (n - p - 1)
where c is the number of categories and n is the number of time points.
If y contains missing values and models are fit internally, this
function defaults to na_action = "marginalize" for fitting.
Score- and Wald-based variants currently require complete data.
Value
A list of class "cat_lrt" containing:
- method
Inference method used: one of
"lrt","score","mlrt", or"wald".- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- fit_null
Fitted model under H0
- fit_alt
Fitted model under H1
- order_null
Order under null
- order_alt
Order under alternative
- table
Summary data frame
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
See Also
Examples
# Simulate AD(1) data
set.seed(123)
y <- simulate_cat(200, 6, order = 1, n_categories = 2)
# Test AD(0) vs AD(1)
test_01 <- test_order_cat(y, order_null = 0, order_alt = 1)
print(test_01$table)
# Test AD(1) vs AD(2)
test_12 <- test_order_cat(y, order_null = 1, order_alt = 2)
print(test_12$table)
# Using pre-fitted models
fit0 <- fit_cat(y, order = 0)
fit1 <- fit_cat(y, order = 1)
test_prefitted <- test_order_cat(fit_null = fit0, fit_alt = fit1)
Likelihood ratio test for antedependence order (Gaussian AD data)
Description
Tests the null hypothesis that the data follow an AD(p) model against the alternative that they follow an AD(p+q) model, using the likelihood ratio test described in Theorem 6.4 and 6.5 of Zimmerman & Núñez-Antón (2009).
Usage
test_order_gau(
y,
p = 0L,
q = 1L,
mu = NULL,
use_modified = TRUE,
order_null = NULL,
order_alt = NULL
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
p |
Order under the null hypothesis (default 0). This is the same
antedependence order parameter named |
q |
Order increment under the alternative (default 1, so alternative is AD(p+q)). |
mu |
Optional mean vector. If NULL, the saturated mean (sample means) is used. |
use_modified |
Logical. If TRUE (default), use the modified test statistic (formula 6.9) which has better small-sample properties. |
order_null |
Optional alias for |
order_alt |
Optional absolute order under the alternative. When supplied,
|
Details
The test is based on the intervenor-adjusted sample partial correlations. Under the null hypothesis AD(p), the partial correlations r_(i,i-k|(i-k+1:i-1)) should be zero for k > p.
The likelihood ratio test statistic (Theorem 6.4) is:
-N \sum_{j=1}^{q} \sum_{i=p+j+1}^{n} \log(1 - r^2_{i,i-p-j\cdot(i-p-j+1:i-1)})
which is asymptotically chi-square with (2n - 2p - q - 1)(q/2) degrees of freedom.
The modified test (formula 6.9) adjusts for small-sample bias using Kenward's (1987) correction.
Value
A list with class gau_order_test containing:
- method
Inference method used (
"lrt").- p
Order under null hypothesis
- q
Order increment
- statistic
Test statistic value
- statistic_modified
Modified test statistic (if use_modified = TRUE)
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- p_value_modified
P-value from modified test (if use_modified = TRUE)
- n_subjects
Number of subjects
- n_time
Number of time points
References
Zimmerman, D.L. and Núñez-Antón, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Chapter 6.
Kenward, M.G. (1987). A method for comparing profiles of repeated measurements. Applied Statistics, 36, 296-308.
See Also
test_one_sample_gau, test_homogeneity_gau
Examples
# Simulate AD(1) data
y <- simulate_gau(n_subjects = 50, n_time = 6, order = 1, phi = 0.5)
# Test AD(0) vs AD(1)
test01 <- test_order_gau(y, p = 0, q = 1)
print(test01)
# Test AD(1) vs AD(2)
test12 <- test_order_gau(y, p = 1, q = 1)
print(test12)
Likelihood ratio test for antedependence order (INAD data)
Description
Performs a likelihood ratio test comparing INAD models of different orders.
Usage
test_order_inad(
y,
order_null = 0,
order_alt = 1,
thinning = "binom",
innovation = "pois",
blocks = NULL,
use_chibar = TRUE,
weights = NULL,
fit_null = NULL,
fit_alt = NULL,
...
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. |
order_null |
Order under null hypothesis (0 or 1). |
order_alt |
Order under alternative hypothesis (1 or 2). Must be order_null + 1. |
thinning |
Thinning operator: "binom", "pois", or "nbinom". |
innovation |
Innovation distribution: "pois", "bell", or "nbinom". |
blocks |
Optional integer vector for block effects. |
use_chibar |
Logical; if TRUE, use chi-bar-square for boundary test. |
weights |
Optional weights for chi-bar-square mixture. |
fit_null |
Optional pre-computed null fit. |
fit_alt |
Optional pre-computed alternative fit. |
... |
Additional arguments passed to fit_inad. |
Details
The test compares nested INAD models of orders order_null and
order_alt = order_null + 1 using:
\lambda = 2(\ell_{alt} - \ell_{null})
where \ell_{null} and \ell_{alt} are maximized log-likelihoods
under the null and alternative models.
The default p-value uses the chi-square approximation with degrees of freedom
matching the number of additional dependence parameters introduced under the
higher-order model. When use_chibar = TRUE, a chi-bar-square mixture
p-value is also reported for boundary-aware inference.
Missing-data inputs are supported through the same na_action options
available in fit_inad. If y has missing values and
na_action is not supplied via ..., this function defaults to
na_action = "marginalize".
Value
A list with class "test_order_inad" containing:
- method
Inference method used (
"lrt").- fit_null
Fitted model under H0
- fit_alt
Fitted model under H1
- statistic
Test statistic value
- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
Chi-square p-value
- p_value_chibar
Chi-bar-square p-value (if
use_chibar = TRUE)- bic_null
BIC under H0
- bic_alt
BIC under H1
- bic_selected
Which model BIC prefers
- table
Two-row model comparison table
- settings
Input and derived settings for the test
References
Li, C. and Zimmerman, D.L. (2026). Integer-valued antedependence models for longitudinal count data. Biostatistics.
See Also
fit_inad, bic_order_inad,
test_stationarity_inad
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 40,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
out <- test_order_inad(
y,
order_null = 0,
order_alt = 1,
thinning = "binom",
innovation = "pois",
max_iter = 20
)
out$statistic
Likelihood ratio test for stationarity (categorical AD data)
Description
Tests whether a categorical antedependence process satisfies stationarity constraints in the AD parameterization.
Usage
test_stationarity_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
test = c("lrt", "score", "mlrt")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. |
order |
Antedependence order p. Default is 1. |
blocks |
Optional integer vector of length n_subjects specifying group membership. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups. |
n_categories |
Number of categories. If NULL, inferred from data. |
test |
Type of test statistic. One of |
Details
The tested constraints are:
The marginal distribution P(Yk) is constant for all k
The transition probabilities P(Yk | Y(k-p), ..., Y(k-1)) are constant for all k > p
For AD order 1, these two constraints correspond to strict stationarity. For AD order greater than 1, this function should be interpreted as testing marginal-constancy plus time-invariant transitions; these constraints are not, in general, sufficient for full strict stationarity.
This is stronger than time-invariance alone, which only requires condition 2.
This function currently supports complete data only.
The null hypothesis is tested against the general (non-stationary) AD(p) model. The degrees of freedom are computed from the fitted parameter counts:
df = n_{params}(H_1) - n_{params}(H_0)
where H_1 is the unconstrained non-stationary model and H_0 is
the stationary model.
Value
A list of class "cat_lrt" containing:
- method
Inference method used: one of
"lrt","score", or"mlrt".- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- fit_null
Fitted stationary model (H0)
- fit_alt
Fitted non-stationary model (H1)
- table
Summary data frame
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
See Also
test_timeinvariance_cat, test_order_cat
Examples
# Simulate stationary AD(1) data
set.seed(123)
y <- simulate_cat(200, 6, order = 1, n_categories = 2)
# Test stationarity
test <- test_stationarity_cat(y, order = 1)
print(test)
Likelihood ratio test for stationarity (Gaussian AD data)
Description
Tests whether time-varying Gaussian AD covariance parameters can be constrained to be constant over time.
Usage
test_stationarity_gau(
y,
order = 1L,
blocks = NULL,
constrain = "both",
fit_unconstrained = NULL,
verbose = FALSE,
max_iter = 2000L,
rel_tol = 1e-08,
...
)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
order |
Antedependence order (0, 1, or 2). |
blocks |
Optional vector of block memberships (length n_subjects). |
constrain |
Constraint to test: for order 0: "sigma" (or "all"); for order 1: "phi", "sigma", or "both"/"all"; for order 2: "phi1", "phi2", "phi", "sigma", or "all"/"both". |
fit_unconstrained |
Optional pre-computed unconstrained fit from
|
verbose |
Logical; if TRUE, prints fitting progress. |
max_iter |
Maximum number of optimization iterations for constrained fit. |
rel_tol |
Relative tolerance for constrained optimization. |
... |
Additional arguments passed to |
Details
The mean structure is kept unrestricted in both models (time-specific means plus optional block shifts), and the test constrains covariance parameters: innovation standard deviations and/or antedependence coefficients.
The likelihood-ratio statistic is:
\lambda = 2(\ell_{alt} - \ell_{null})
where \ell_{null} and \ell_{alt} are maximized log-likelihoods
under the constrained and unconstrained models, respectively.
Degrees of freedom are computed from the number of constraints implied by
constrain.
Value
A list with class "test_stationarity_gau" containing:
- method
Inference method used (
"lrt").- fit_unconstrained
Unconstrained Gaussian AD fit
- fit_constrained
Constrained Gaussian AD fit
- constraint
Human-readable null constraint description
- lrt_stat
Likelihood-ratio statistic
- df
Degrees of freedom
- p_value
Chi-square p-value
- bic_unconstrained
BIC of unconstrained model
- bic_constrained
BIC of constrained model
- bic_selected
Model selected by BIC
- table
Two-row model summary table
References
Zimmerman, D.L. and Nunez-Anton, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Chapter 6.
See Also
run_stationarity_tests_gau, test_order_gau,
fit_gau
Examples
set.seed(1)
y <- simulate_gau(n_subjects = 80, n_time = 6, order = 1, phi = 0.4, sigma = 1)
# Test jointly constant phi and sigma (order 1)
out <- test_stationarity_gau(y, order = 1, constrain = "both")
out$p_value
Likelihood ratio test for stationarity (INAD data)
Description
Tests whether time-varying INAD parameters can be constrained to be constant over time.
Usage
test_stationarity_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
blocks = NULL,
constrain = "both",
fit_unconstrained = NULL,
verbose = FALSE,
...
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. |
order |
Model order (1 or 2). |
thinning |
Thinning operator: "binom", "pois", or "nbinom". |
innovation |
Innovation distribution: "pois", "bell", or "nbinom". |
blocks |
Optional integer vector for block effects. |
constrain |
Which parameters to constrain: "alpha", "theta", "both" for order 1; "alpha1", "alpha2", "alpha", "theta", "all" for order 2. |
fit_unconstrained |
Optional pre-computed unconstrained fit. |
verbose |
Logical; if TRUE, print progress. |
... |
Additional arguments. |
Details
For order 1, the test can constrain alpha, theta, or both.
For order 2, it can constrain alpha1, alpha2,
alpha (both), theta, or all supported time-varying
parameters.
Degrees of freedom are computed from the number of equality constraints imposed under the null model relative to the unconstrained model.
Missing-data inputs are supported through the same na_action options
available in fit_inad. If y has missing values and
na_action is not supplied via ..., this function defaults to
na_action = "marginalize".
Value
A list with class "test_stationarity_inad" containing:
- method
Inference method used (
"lrt").- fit_unconstrained
Unconstrained INAD fit
- fit_constrained
Constrained INAD fit
- constraint
Human-readable null constraint description
- statistic
Test statistic value
- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
Chi-square p-value
- bic_unconstrained
BIC of unconstrained model
- bic_constrained
BIC of constrained model
- bic_selected
Model selected by BIC
- table
Two-row model summary table
References
Li, C. and Zimmerman, D.L. (2026). Integer-valued antedependence models for longitudinal count data. Biostatistics.
See Also
run_stationarity_tests_inad,
test_order_inad, fit_inad
Examples
set.seed(1)
y <- simulate_inad(
n_subjects = 30,
n_time = 5,
order = 1,
thinning = "binom",
innovation = "pois",
alpha = 0.3,
theta = 2
)
out <- test_stationarity_inad(
y,
order = 1,
thinning = "binom",
innovation = "pois",
constrain = "both",
max_iter = 20
)
out$p_value
Likelihood ratio test for time-invariance (categorical data)
Description
Tests whether transition probabilities are constant over time in a categorical antedependence model.
Usage
test_timeinvariance_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
test = c("lrt", "score", "mlrt")
)
Arguments
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c. |
order |
Antedependence order p. Default is 1. |
blocks |
Optional integer vector of length n_subjects specifying group membership. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups. |
n_categories |
Number of categories. If NULL, inferred from data. |
test |
Type of test statistic. One of |
Details
The null hypothesis is that all transition probabilities (for k > p) are equal across time:
H_0: \pi_{y_k | y_{k-p}, \ldots, y_{k-1}} \text{ is constant for } k = p+1, \ldots, n
This reduces (n-p) separate transition matrices/arrays to a single one.
The degrees of freedom are:
df = (c-1) \times c^p \times (n - p - 1)
This function currently supports complete data only. If y contains
missing values, use model-fitting functions (for example fit_cat)
directly with missing-data handling instead of this test wrapper.
Value
A list of class "cat_lrt" containing:
- method
Inference method used: one of
"lrt","score","mlrt", or"wald".- lrt_stat
Likelihood ratio test statistic
- df
Degrees of freedom
- p_value
P-value from chi-square distribution
- fit_null
Fitted time-invariant model (H0)
- fit_alt
Fitted time-varying model (H1)
- table
Summary data frame
References
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
See Also
Examples
# Simulate data with time-invariant transitions
set.seed(123)
y <- simulate_cat(200, 6, order = 1, n_categories = 2)
# Test time-invariance
test <- test_timeinvariance_cat(y, order = 1)
print(test)
Two-sample test for equality of mean profiles under antedependence
Description
Tests the null hypothesis that two groups have equal mean profiles mu_1 = mu_2 against the alternative mu_1 != mu_2, assuming a common AD(p) covariance structure. This implements Theorem 7.3 of Zimmerman & Núñez-Antón (2009).
Usage
test_two_sample_gau(y, blocks, p = 1L, order = NULL, use_modified = TRUE)
Arguments
y |
Numeric matrix with n_subjects rows and n_time columns. |
blocks |
Integer vector of length n_subjects indicating group membership (must contain exactly two unique values, typically 1 and 2). |
p |
Antedependence order of the common covariance structure. This is the
same order parameter named |
order |
Optional alias for |
use_modified |
Logical. If TRUE (default), use modified test statistic. |
Details
This test is also known as a "profile comparison" test. The likelihood ratio test statistic (Theorem 7.3) compares the pooled RSS (under H0: common mean) to the sum of within-group RSS (under H1: separate means):
N \sum_{i=1}^{n} [\log RSS_i(\mu) - \log RSS_i(\mu_1, \mu_2)]
where RSS_i(mu) uses a common mean and RSS_i(mu_1, mu_2) uses group-specific means.
Value
A list with class gau_mean_test containing:
- method
Inference method used (
"lrt").- test_type
"two-sample"
- mu1_hat
Estimated mean for group 1
- mu2_hat
Estimated mean for group 2
- mu_pooled
Pooled mean estimate under H0
- statistic
Test statistic value
- statistic_modified
Modified test statistic
- df
Degrees of freedom (n_time)
- p_value
P-value from chi-square distribution
- p_value_modified
P-value from modified test
- order
Antedependence order used
References
Zimmerman, D.L. and Núñez-Antón, V. (2009). Antedependence Models for Longitudinal Data. Chapman & Hall/CRC. Chapter 7.
Examples
# Simulate data from two groups with different means
n1 <- 30
n2 <- 35
y1 <- simulate_gau(n1, n_time = 6, order = 1, mu = rep(10, 6))
y2 <- simulate_gau(n2, n_time = 6, order = 1, mu = rep(12, 6))
y <- rbind(y1, y2)
blocks <- c(rep(1, n1), rep(2, n2))
# Test equality of profiles
test <- test_two_sample_gau(y, blocks, p = 1)
print(test)