Package {inough}


Version: 0.1.0
Title: Inattention Detection Pipeline for Psychophysical Tasks
Description: Three-stage pipeline for detecting inattention episodes in long psychophysical tasks (200+ trials). Uses accuracy residuals and response pattern signals to locate, sharpen, and formally test candidate inattention regions at trial-level precision.
License: GPL (≥ 3)
URL: https://github.com/pawlenartowicz/inough
BugReports: https://github.com/pawlenartowicz/inough/issues
Encoding: UTF-8
Language: en-US
Depends: R (≥ 3.5.0)
LazyData: true
RoxygenNote: 7.3.3
Imports: lme4, ggplot2, patchwork, rlang, jsonlite
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-05-08 21:07:34 UTC; plenartowicz
Author: Pawel Lenartowicz [aut, cre], Maja Willard [aut]
Maintainer: Pawel Lenartowicz <pawellenartowicz@europe.com>
Repository: CRAN
Date/Publication: 2026-05-13 19:10:02 UTC

Apply spurious-start accuracy heuristic

Description

For each chunk, checks if the first n trials have suspiciously high accuracy (>= k correct). If so, marks the chunk and shifts the test start past those trials. Chunks with fewer than min_left trials remaining are dropped.

Usage

apply_spurious_heuristic(chunks, trial, heuristics)

Arguments

chunks

Data frame with id, start, end.

trial

Trial data frame (needs id, trial_idx, correct).

heuristics

inough_heuristics object.

Value

Updated chunks data frame with added columns: test_start, spurious_start.


Participant-level bail-out

Description

Flags entire sessions where trial-level analysis is unreliable: response sequence too stereotyped (low LZ) or accuracy indistinguishable from chance.

Usage

bailout(participant, lz_threshold)

Arguments

participant

Data frame with id, lz, n_trials, mean_accuracy.

lz_threshold

LZ threshold (from inough_control).

Value

Data frame with id, reason, lz_value, accuracy for bailed-out participants (zero rows if none).
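A minimal sketch of the two bail-out criteria on a toy participant table (illustrative only; the package internals may differ — here chance-level accuracy is judged with a two-sided binomial test, an assumption):

```r
# Toy participant summary: P02 is stereotyped (low lz),
# P03 is at chance-level accuracy.
participant <- data.frame(
  id            = c("P01", "P02", "P03"),
  lz            = c(0.95, 0.10, 0.90),
  n_trials      = c(200, 200, 200),
  mean_accuracy = c(0.80, 0.75, 0.51)
)
lz_threshold <- 0.2
low_lz <- participant$lz < lz_threshold

# accuracy indistinguishable from chance: two-sided binomial test vs 0.5
p_chance <- mapply(
  function(k, n) binom.test(k, n, p = 0.5)$p.value,
  round(participant$mean_accuracy * participant$n_trials),
  participant$n_trials
)
at_chance <- p_chance > 0.05

participant$id[low_lz | at_chance]  # "P02" "P03"
```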


Extract accuracy residuals via probit GLMM

Description

Fits a probit GLMM (lme4::glmer) or probit GLM (stats::glm) on accuracy and returns Pearson residuals.

Usage

extract_accuracy(df, formula, has_random = TRUE)

Arguments

df

Data frame with correct (0/1), id, and predictor columns. Must already contain n_trial if the formula references it.

formula

Full model formula (built by inough_signals).

has_random

Logical; whether the formula includes random effects.

Value

A list with $residuals (Pearson) and $model.
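The no-random-effects path can be sketched with base R alone (column names and data are illustrative):

```r
# Probit GLM on simulated accuracy; one Pearson residual per trial.
set.seed(1)
df <- data.frame(
  correct = rbinom(100, 1, 0.75),
  n_trial = seq_len(100)
)
fit <- glm(correct ~ n_trial, family = binomial(link = "probit"), data = df)
res <- residuals(fit, type = "pearson")
length(res)  # one residual per trial
```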


Extract response bias signals

Description

Computes lag-1 response repetition indicator (per-trial) and normalized Lempel-Ziv complexity of the response sequence (per-participant).

Usage

extract_bias(df)

Arguments

df

Data frame with columns id and response (0/1).

Details

LZ is normalized by a permutation null distribution: 400 random 50/50 binary sequences at the modal participant length, giving expected value ~1.0 for random sequences regardless of sample size.

Value

A list with:

trial

Data frame with column resp_lag1 (+1 = same as previous, -1 = different, 0 = first trial).

participant

Data frame with columns id and lz (permutation-normalized LZ76 complexity).
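The permutation normalization described in Details can be sketched as follows. A crude run-count statistic stands in for the package's LZ76 measure, and the null is drawn at the sequence's own length rather than the modal participant length (both simplifications for illustration):

```r
# Stand-in raw complexity: number of runs (NOT the package's LZ76).
raw_complexity <- function(x) length(rle(x)$lengths)

# Divide by the mean complexity of random 50/50 binary sequences,
# so random input scores ~1 regardless of length.
normalize_lz <- function(x, n_perm = 400) {
  n <- length(x)
  null_vals <- replicate(n_perm, raw_complexity(rbinom(n, 1, 0.5)))
  raw_complexity(x) / mean(null_vals)
}

set.seed(42)
random_seq  <- rbinom(200, 1, 0.5)
stereotyped <- rep(c(0, 1), each = 100)
nr <- normalize_lz(random_seq)   # ~ 1 for random sequences
ns <- normalize_lz(stereotyped)  # well below 1
```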


Extract flagged trials

Description

Returns a data frame of all trials flagged by the detection pipeline, suitable for downstream filtering via dplyr::anti_join or similar.

Usage

flags(x)

Arguments

x

An inough_detected object.

Value

Data frame with columns id, trial_idx, flag_type ("bailout" or "chunk"), chunk_id (integer, NA for bailout), p_adj (numeric, NA for bailout).
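The same (id, trial_idx) filtering that dplyr::anti_join performs can be sketched in base R on a toy flag table:

```r
# Toy trial data and a hypothetical flags() result for one participant.
trials  <- data.frame(id = rep("P01", 5), trial_idx = 1:5)
flagged <- data.frame(id = "P01", trial_idx = c(2, 3))

# Keep trials whose (id, trial_idx) pair is not flagged.
keep  <- !(paste(trials$id, trials$trial_idx) %in%
           paste(flagged$id, flagged$trial_idx))
clean <- trials[keep, ]
nrow(clean)  # 3
```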


Pipeline tuning parameters

Description

Pipeline tuning parameters

Usage

inough_control(
  lz_threshold = 0.2,
  window_size = 3,
  sd_threshold = 2,
  window_weight = "uniform",
  min_chunk = 6,
  comparison = "clean"
)

Arguments

lz_threshold

LZ complexity below this triggers bail-out (default 0.2).

window_size

Half-width of rolling window; total window = 2*w+1 (default 3, giving 7-trial windows).

sd_threshold

Number of SDs above chance that |roll_resp| must exceed to flag a candidate region. Under H0 of balanced random responding, SD = sqrt(sum(w^2)) / sum(w) where w are the window weights. For uniform weights this simplifies to 1/sqrt(2*window_size+1). Default 2 (about 5% false-positive rate per window under H0).

window_weight

Weighting scheme for the rolling window: "uniform" (default) or "triangular" (center trials weighted more; reduces edge sensitivity). The SD is computed analytically for both.

min_chunk

Minimum chunk length in trials to retain after merging (default 6).

comparison

t-test comparison set: "clean" (chunk vs all non-flagged trials) or "rest" (chunk vs everything except that chunk). Default "clean".

Value

An inough_control object with a pre-computed screening_threshold derived from sd_threshold and the window weights.
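The pre-computed screening threshold follows directly from the documented SD formula. A sketch (the triangular weight shape below is an assumption, not taken from the package):

```r
# Under H0 the weighted rolling mean of the +1/-1 lag-1 signal has
# SD = sqrt(sum(w^2)) / sum(w); the threshold is sd_threshold * SD.
window_size  <- 3
sd_threshold <- 2

w_uniform    <- rep(1, 2 * window_size + 1)
w_triangular <- c(1:(window_size + 1), window_size:1)  # assumed shape

sd0 <- function(w) sqrt(sum(w^2)) / sum(w)
sd0(w_uniform)                    # 1/sqrt(7) ~ 0.378
sd_threshold * sd0(w_uniform)     # uniform screening threshold ~ 0.756
sd_threshold * sd0(w_triangular)  # triangular variant, for comparison
```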


Detect inattention episodes

Description

Three-stage pipeline: (1) candidate screening with chunk filtering, (2) chunk refinement via boundary extension and spurious-start trimming, (3) formal t-test with FDR correction. Participants with extremely stereotyped responses or chance-level accuracy are bailed out first.

Usage

inough_detect(
  signals,
  fdr_alpha = 0.2,
  control = inough_control(),
  heuristics = inough_heuristics()
)

Arguments

signals

An inough_signals object from inough_signals.

fdr_alpha

FDR significance level for BH correction (default 0.2).

control

An inough_control object (see inough_control).

heuristics

An inough_heuristics object (see inough_heuristics). Controls boundary extension mode and spurious-accuracy trimming.

Value

An inough_detected object.


Post-detection heuristics

Description

Configures optional refinements applied after chunk screening: boundary extension mode and spurious-accuracy trimming.

Usage

inough_heuristics(
  boundary_mode = "heuristic",
  spurious = TRUE,
  spurious_n = 6L,
  spurious_k = NULL,
  min_left = 7L
)

Arguments

boundary_mode

How to extend chunk boundaries after detection: "heuristic" (default) walks backwards from the chunk start to find where stereotyped responding actually began; "fixed" extends symmetrically (amount depends on window_weight in inough_control). The end boundary always uses fixed extension.

spurious

Logical; enable spurious-start accuracy trimming (default TRUE). When TRUE, the first spurious_n trials of each chunk are checked: if accuracy is suspiciously high (>= k correct), those trials are excluded from the t-test.

spurious_n

Number of trials at chunk start to inspect (default 6).

spurious_k

Explicit threshold: flag if >= k correct out of spurious_n. When NULL (default), computed as ceiling((0.5 + 1.5 * sqrt(0.25 / n)) * n).

min_left

Minimum trials remaining after spurious trimming (default 7). Chunks shorter than this after trimming are dropped.

Value

An inough_heuristics object.
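The default spurious_k can be reproduced directly from the documented formula:

```r
# Default threshold: >= k correct of the first n trials flags a
# spurious chunk start (from the formula in the spurious_k docs).
spurious_k <- function(n) ceiling((0.5 + 1.5 * sqrt(0.25 / n)) * n)

spurious_k(6)   # 5: flag if >= 5 of the first 6 trials are correct
spurious_k(10)  # 8
```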


Extract inattention signals

Description

Fits a probit GLMM on accuracy and computes response bias indicators. This is the first step in the inough pipeline: call inough_signals, then pass the result to inough_detect.

Usage

inough_signals(
  df,
  correct,
  response,
  id = "ID",
  learning_effect = TRUE,
  participant_effect = TRUE,
  trial_transform = "sqrt"
)

Arguments

df

A data frame with rows ordered by trial within each participant.

correct

Formula. LHS names the accuracy column (0/1 integer). RHS names design predictors that explain correctness (e.g., correct ~ Stim + Weight + Orient + Block).

response

Formula identifying the response column. Use a two-sided formula where the RHS names the column (e.g., response ~ answ). Must have exactly 2 distinct values; auto-encoded to 0/1 by sorted order.

id

String naming the participant identifier column (default "ID").

learning_effect

Logical. If TRUE (default), adds n_trial as fixed effect and random slope per participant.

participant_effect

Logical. If TRUE (default), adds random intercept per participant.

trial_transform

Transformation applied to trial index before rescaling to [-1, 1]. One of "sqrt" (default), "log", "linear", or a function.

Value

An inough_signals object.


Normalized Lempel-Ziv Complexity (LZ76)

Description

Computes the normalized LZ76 complexity of a binary sequence. Higher values indicate more random/complex sequences; lower values indicate more predictable/repetitive patterns.

Usage

lz_complexity(x)

Arguments

x

Integer vector of 0s and 1s.

Value

Numeric scalar in [0, 1]. Normalized complexity where 1 = maximally complex (random) and values near 0 = highly predictable.

Examples

lz_complexity(c(0, 0, 0, 0, 0))       # low
lz_complexity(c(0, 1, 0, 1, 0, 1))     # low-medium
lz_complexity(sample(0:1, 100, TRUE))   # near 1
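The phrase parsing underlying an LZ76 measure like this can be sketched as follows (an illustrative raw phrase counter, not the package's implementation or its normalization):

```r
# LZ76 parsing: scan left to right, extending the current phrase while
# it still occurs in the history; each failure starts a new phrase.
lz76_phrases <- function(x) {
  s <- paste(x, collapse = "")
  n <- nchar(s)
  i <- 1; count <- 0
  while (i <= n) {
    l <- 1
    # extend while s[i..i+l-1] occurs in the history s[1..i+l-2]
    while (i + l - 1 <= n &&
           grepl(substr(s, i, i + l - 1), substr(s, 1, i + l - 2),
                 fixed = TRUE)) {
      l <- l + 1
    }
    count <- count + 1
    i <- i + l
  }
  count
}

lz76_phrases(c(0, 0, 0, 0, 0))     # 2 phrases: "0", "0000"
lz76_phrases(c(0, 1, 0, 1, 0, 1))  # 3 phrases: "0", "1", "0101"
```

Repetitive sequences parse into few phrases; random ones into many, which is what the normalized score reflects.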

Sort and merge overlapping regions

Description

Sort and merge overlapping regions

Usage

merge_regions(regions)

Plot per-participant diagnostic panels

Description

Produces a stacked 4-panel visualization: accuracy strip, accuracy residuals, lag-1 response, and dual-track z-scores. Flagged chunks are highlighted as red shaded regions.

Usage

## S3 method for class 'inough_detected'
plot(x, id, ...)

Arguments

x

An inough_detected object.

id

Character scalar — participant ID to plot.

...

Ignored.

Value

A patchwork object (invisibly).


Generate interactive HTML report

Description

Creates a self-contained HTML file with a participant browser, diagnostic plots, chunk details, and summary statistics.

Usage

report(x, ...)

## S3 method for class 'inough_detected'
report(x, file = NULL, custom_plot = NULL, ...)

Arguments

x

An inough_detected object.

...

Arguments passed to methods.

file

Output file path. If NULL (default), uses a tempfile and opens in the browser.

custom_plot

Optional per-trial variable to show as an extra panel in the participant view. A list with three fields:

  • data: a data.frame with columns id, trial_idx, value. trial_idx is 1-based within participant, matching the trial order fed to inough_signals.

  • title: string shown as the panel label.

  • description: string shown as the panel blurb.

Value

Invisibly returns the file path.


Contiguous TRUE regions to start/end data.frame

Description

Contiguous TRUE regions to start/end data.frame

Usage

rle_regions(above, trial_idx)

Robust z-score (median / MAD); returns NULL if MAD = 0

Description

Robust z-score (median / MAD); returns NULL if MAD = 0

Usage

robust_zscore(x)

Centered weighted rolling mean

Description

weights is the full weight vector of length 2*k+1 (center at position k+1). At the edges the weight vector is cropped and renormalized so the SD formula stays valid.

Usage

rolling_wmean(x, weights)
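A minimal sketch consistent with the description above (the package's actual implementation may differ):

```r
# Centered weighted rolling mean; at the edges the weight vector is
# cropped to the available neighbours and renormalized.
rolling_wmean <- function(x, weights) {
  k <- (length(weights) - 1) / 2
  n <- length(x)
  sapply(seq_len(n), function(i) {
    lo <- max(1, i - k); hi <- min(n, i + k)
    w  <- weights[(lo - i + k + 1):(hi - i + k + 1)]  # crop at the edges
    sum(w * x[lo:hi]) / sum(w)                        # renormalize
  })
}

rolling_wmean(c(1, 1, 1, 5, 1, 1, 1), rep(1, 3))
# 1.000 1.000 2.333 2.333 2.333 1.000 1.000
```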

Stage 1: Candidate Screening

Description

Flags regions where the absolute rolling mean of the lag-1 response exceeds the per-participant threshold: |roll_resp| > threshold. Catches both repetition (positive) and switching (negative) stereotypy with a single track. Overlapping regions are merged; chunks shorter than min_chunk are discarded. Boundary extension mode is controlled by heuristics$boundary_mode.

Usage

screen(trial, participant, bail_ids, control, heuristics)

Arguments

trial

Trial data frame from inough_signals.

participant

Participant data frame from inough_signals.

bail_ids

Character vector of bailed-out participant IDs.

control

inough_control object.

heuristics

inough_heuristics object.

Value

A list with $candidates (raw threshold crossings) and $chunks (after merge + min-length filter + boundary extension).


Example dual-task data with diverse inattention profiles

Description

A minimal anonymized subset of the Dual Task (Gabor orientation under motor interference) dataset, intended for demonstrating the inough pipeline. Twenty participants were sampled to span a range of attention profiles: clean performers, two bail-out cases (one for response stereotypy, one for chance-level accuracy), participants with localized inattention chunks, and participants with extended inattention periods.

Usage

task_example

Format

A data frame with one row per trial and the following columns:

participant

Anonymized participant identifier (factor, P01–P20).

block

Block index within the session (integer, >= 1; practice block excluded).

trial

Trial index within the participant's session (integer).

stim

Stimulus identifier (integer).

weight

Stimulus weight / contrast (numeric).

orient

Gabor orientation code (integer).

cue_type

Cue type code (integer).

response

Participant's response (integer, two unique values).

correct

Trial accuracy (integer, 0 or 1).

Details

Participant identifiers have been replaced with arbitrary codes (P01–P20) and any session timing information has been removed.

Source

A subset of the Dual Task (s_9) data collected in the COST/Kraken consciousness study (Krakow site). Participant IDs have been re-coded for anonymity.

Examples

data(task_example)
head(task_example)


signals <- inough_signals(
  task_example,
  correct  = correct ~ stim + weight + orient + cue_type + block,
  response = response ~ response,
  id       = "participant"
)
det <- inough_detect(signals)
summary(det)


Stage 3: Formal Test

Description

Welch t-test on accuracy residuals (chunk vs comparison set) with LZ-informed local FDR. When test_start column is present (from spurious-accuracy trimming), uses it for inside trials while still excluding the full chunk from the comparison set.

Usage

test_chunks(chunks, trial, participant, fdr_alpha, comparison)

Arguments

chunks

Data frame from Stage 2 (id, start, end, and optionally test_start, spurious_start).

trial

Trial data frame from inough_signals.

participant

Participant data frame from inough_signals.

fdr_alpha

Significance threshold for local FDR (passed down from inough_detect). A chunk is flagged when lfdr < fdr_alpha.

comparison

"clean" or "rest".

Details

Local FDR is computed per chunk:

lfdr = pi0 * f0(t) / (pi0 * f0(t) + pi1 * f1(t))

where pi0 = min(1, LZ) is the LZ-informed null prior, f0 is the central t density (null), and f1 is a non-central t density with noncentrality parameter reflecting the expected accuracy drop to chance.

Value

Data frame with id, start, end, t_stat, df, p_raw, p_adj, lfdr, effect_size, significant.
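The local-FDR combination in Details can be sketched as follows, assuming pi1 = 1 - pi0 (a complement prior, which the documentation does not state explicitly) and an illustrative noncentrality parameter:

```r
# Local FDR for one chunk's t statistic, mixing a central t null
# with a noncentral t alternative; pi1 = 1 - pi0 is an assumption.
local_fdr <- function(t_stat, df, lz, ncp) {
  pi0 <- min(1, lz)                 # LZ-informed null prior
  f0  <- dt(t_stat, df)             # central t density under H0
  f1  <- dt(t_stat, df, ncp = ncp)  # noncentral t density under H1
  pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)
}

local_fdr(t_stat = -4.0, df = 30, lz = 0.9, ncp = -3)  # small: likely real
local_fdr(t_stat = -0.5, df = 30, lz = 0.9, ncp = -3)  # near 1: likely null
```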