The PFCI package implements Penalized Fast
Causal Inference, a two-stage procedure for learning causal
structure in high-dimensional settings with potential latent variables
and selection bias. The method combines graphical lasso screening with
the FCI algorithm to produce a Partial Ancestral Graph (PAG). Under
sparsity, it runs substantially faster than standard FCI/RFCI while
maintaining accuracy.
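The two-stage idea can be illustrated with a rough sketch built directly on the glasso and pcalg packages. This is not the PFCI implementation: the toy chain data, the penalty rho = 0.1, and the zero threshold are assumptions chosen only for illustration, and PFCI may select and apply its penalty differently.

```r
# Illustrative sketch only; assumes the glasso and pcalg packages are installed.
library(glasso)
library(pcalg)

set.seed(1)
n <- 200
p <- 10
# Toy Gaussian data from a chain V1 -> V2 -> ... -> V10 (assumed example data)
X <- matrix(rnorm(n * p), n, p)
for (j in 2:p) X[, j] <- 0.8 * X[, j - 1] + X[, j]
colnames(X) <- paste0("V", seq_len(p))

# Stage 1: graphical lasso screening on the sample correlation matrix.
# rho = 0.1 is an arbitrary penalty, not a recommended value.
S <- cor(X)
gl <- glasso(S, rho = 0.1)
# Pairs whose estimated precision entry is zero are excluded up front.
gaps <- abs(gl$wi) < 1e-8
gaps <- gaps & t(gaps)   # fixedGaps must be symmetric
diag(gaps) <- FALSE

# Stage 2: FCI restricted to the pairs that survived screening.
suff <- list(C = S, n = n)
pag <- fci(suff, indepTest = gaussCItest, alpha = 0.05,
           labels = colnames(X), fixedGaps = gaps)
pag@amat   # PAG adjacency matrix in pcalg's edge-mark encoding
```

Screening reduces the number of conditional independence tests FCI has to run, which is where the speed-up in sparse problems would come from.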
PFCI is available on CRAN. Its core functionality depends on
pcalg (also on CRAN) and graph (distributed through
Bioconductor).
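A typical installation sequence might look like the following. It assumes the Bioconductor dependency is installed through BiocManager before the CRAN package; other dependencies may be pulled in automatically or may need the same treatment.

```r
# Install the Bioconductor dependency first (assumes BiocManager is the
# preferred route; install it from CRAN if it is missing).
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("graph")

# Then install PFCI (and pcalg, if not already present) from CRAN.
install.packages("PFCI")
```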
The standard three-step workflow is simulate, fit, evaluate:
library(PFCI)
# Step 1: simulate a sparse DAG with p = 100 nodes
sim <- simulate_pfci_toy(p = 100, n = 100, edge_prob = 0.02, seed = 1)
# Step 2: fit PFCI
fit <- pfci_fit(sim$X, alpha = 0.05)
print(fit)
# Step 3: evaluate against ground truth
met <- pfci_metrics(sim, fit)
met
The print(fit) call reports runtime and tuning
parameters. The met list contains SHD (structural Hamming distance), F1,
MCC (Matthews correlation coefficient), Precision, Recall, and Time.
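For readers unfamiliar with these scores, the base-R sketch below computes skeleton-level versions of them from two adjacency matrices. The helper skeleton_metrics is hypothetical and not part of PFCI; it ignores edge orientation, so its SHD counts only missing and extra adjacencies, and pfci_metrics may define the quantities differently.

```r
# Hypothetical helper (not part of PFCI): compares two graphs by adjacency only.
skeleton_metrics <- function(true_amat, est_amat) {
  ut <- upper.tri(true_amat)
  # An edge is present if either direction is marked in the matrix.
  truth <- (true_amat != 0 | t(true_amat) != 0)[ut]
  est   <- (est_amat  != 0 | t(est_amat)  != 0)[ut]
  # as.numeric avoids integer overflow in the MCC products for large graphs.
  tp <- as.numeric(sum(truth & est))
  fp <- as.numeric(sum(!truth & est))
  fn <- as.numeric(sum(truth & !est))
  tn <- as.numeric(sum(!truth & !est))
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  f1  <- 2 * precision * recall / (precision + recall)
  mcc <- (tp * tn - fp * fn) /
    sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  c(SHD = fp + fn, Precision = precision, Recall = recall, F1 = f1, MCC = mcc)
}

# Toy example: true edges 1-2, 2-3, 3-4; estimated edges 1-2, 2-3, 1-4.
true_amat <- matrix(0, 4, 4)
true_amat[cbind(c(1, 2, 3), c(2, 3, 4))] <- 1
est_amat <- matrix(0, 4, 4)
est_amat[cbind(c(1, 2, 1), c(2, 3, 4))] <- 1
m <- skeleton_metrics(true_amat, est_amat)
m
```

In this toy case there is one extra and one missing adjacency, so the skeleton SHD is 2 and precision, recall, and F1 all equal 2/3.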
To simulate and evaluate under latent confounding, use the
simulate_with_latent and metrics_with_latent
functions; their help pages (?simulate_with_latent and
?metrics_with_latent) describe the arguments.
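To make the setting concrete, the base-R sketch below shows what latent confounding looks like in simulated data. It does not call simulate_with_latent, whose arguments are not shown here; the variable names, coefficients, and sample size are assumptions chosen for illustration.

```r
set.seed(1)
n <- 2000
# L is an unobserved common cause of X1 and X2; X3 depends on X2 only.
L  <- rnorm(n)
X1 <- 0.9 * L + rnorm(n)
X2 <- 0.9 * L + rnorm(n)
X3 <- 0.7 * X2 + rnorm(n)
# The analyst only ever sees X1, X2, X3; L is dropped from the data.
X_obs <- cbind(X1, X2, X3)

# X1 and X2 are marginally correlated although neither causes the other.
r_marginal <- cor(X_obs)["X1", "X2"]

# Adjusting for L (possible only because this is a simulation) removes
# most of that association.
r_given_L <- cor(resid(lm(X1 ~ L)), resid(lm(X2 ~ L)))

c(marginal = r_marginal, given_L = r_given_L)
```

A method that assumes no hidden variables would tend to place a direct edge between X1 and X2 here, whereas a PAG can represent the dependence as arising from a latent common cause.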
PFCI is approximately 3x faster than RFCI at p = 1000
while maintaining equal or better F1 and MCC. See Table 1 of Pal, Ghosh,
and Yang (2025) for full simulation results across p = 100
to p = 1000.
Pal, S., Ghosh, D., and Yang, S. (2025). Penalized FCI for Causal Structure Learning in a Sparse DAG for Biomarker Discovery in Parkinson’s Disease. Annals of Applied Statistics. doi:10.48550/arXiv.2507.00173