nmfkc 0.8.8

Removed the `B.L1` penalty (and its `gamma` alias)

B.L1 placed an L1 penalty on the fitted coefficient field B. Because B is dense and tracks the overall reconstruction magnitude, B.L1 acted as a crude global shrinkage that pulled the fit toward zero and degraded prediction without producing useful structural sparsity. It has been removed from nmfkc() and from the test-set B refit inside nmfkc.cv(). Use C.L1 for sparsity / variable selection on the parameter matrix C (individual entries are driven to exactly zero). Passing B.L1/gamma now has no effect (silently ignored via ...).

MAP penalties for `nmfre()`

nmfre() gains three optional penalties (default 0, via ...), acting as Gaussian priors on the basis/coefficients and orthogonal to the random-effect machinery (U, lambda, sigma2, tau2 are unchanged):
- X.L2.smooth: path-graph row smoothness of the basis X — well suited to longitudinal / ordered-row models.
- X.L2.ortho: column orthogonality of X.
- C.L2: ridge on Theta = C. For C.signed = TRUE the C-step stays a closed-form solve (a Sylvester ridge-least-squares via the eigenbases of X'X and AA'); for C.signed = FALSE it is added to the MU denominator.

Penalties enter the fixed-lambda inner objective; the EM variance updates are untouched. (L1 sparsity on Theta is intentionally not offered: it would break the signed closed-form step and conflict with the random-effect shrinkage that already regularizes the model.)
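The closed-form signed C-step with a ridge can be illustrated in isolation. The sketch below assumes a plain objective ||Y - X C A||^2 + lambda * ||C||^2 (no random effects, and the exact scaling of C.L2 in the package is not stated here); ridge_sylvester() is a hypothetical helper, not a package function. It diagonalizes X'X and AA' so that the Sylvester-type normal equations become an element-wise division.

```r
# Sketch: ridge least squares for C in  min ||Y - X C A||_F^2 + lambda * ||C||_F^2.
# Normal equations: (X'X) C (AA') + lambda * C = X'Y A'  (a Sylvester-type system).
# ridge_sylvester() is a hypothetical helper, not a function from the package.
ridge_sylvester <- function(Y, X, A, lambda) {
  eX  <- eigen(crossprod(X), symmetric = TRUE)    # X'X = P diag(s) P'
  eA  <- eigen(tcrossprod(A), symmetric = TRUE)   # AA' = R diag(t) R'
  RHS <- tcrossprod(crossprod(X, Y), A)           # X'Y A'
  G   <- crossprod(eX$vectors, RHS) %*% eA$vectors  # rotate: P' (X'Y A') R
  D   <- outer(eX$values, eA$values) + lambda     # s_i * t_j + lambda
  eX$vectors %*% (G / D) %*% t(eA$vectors)        # rotate back: C = P (G / D) R'
}

set.seed(1)
X <- matrix(runif(20 * 3), 20, 3)    # basis (rows x rank)
A <- matrix(rnorm(2 * 15), 2, 15)    # covariates (covariates x samples)
Y <- matrix(rnorm(20 * 15), 20, 15)  # observations (rows x samples)
lambda <- 0.5
C <- ridge_sylvester(Y, X, A, lambda)

# Residual of the normal equations; should be numerically zero.
resid <- crossprod(X) %*% C %*% tcrossprod(A) + lambda * C -
  tcrossprod(crossprod(X, Y), A)
max(abs(resid))
```

Because only two small eigendecompositions are needed, the cost does not depend on how many times the ridge weight changes: D is the only term that involves lambda.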

**`nmfae` deprecated in favour of `nmf.rrr`**

The canonical implementation of the three-layer NMF-RRR model now lives under the nmf.rrr* / nmf.rrr.signed* names (nmf.rrr, .inference, .ecv, .cv, .rank, .DOT, .heatmap, .kernel.beta.cv, .rename, and the six signed variants). The former nmfae* / nmfae.signed* names are now thin deprecated wrappers that emit .Deprecated() and forward to their nmf.rrr* counterpart. Fitted objects keep the legacy S3 classes (e.g. class = c("nmf.rrr", "nmfae", "nmf")), so all S3 methods and saved objects continue to work unchanged.

**`nmf.sem` deprecated in favour of `nmf.ffb`**

The canonical implementation of the NMF-FFB (feed-forward + feedback) model now lives under the nmf.ffb* names (nmf.ffb, nmf.ffb.inference, nmf.ffb.cv, nmf.ffb.split, nmf.ffb.DOT). The former nmf.sem* names are now thin deprecated wrappers that emit .Deprecated() and forward to their nmf.ffb* counterpart. Fitted objects keep class = c("nmf.ffb", "nmf.sem", "nmf"), so all S3 methods and existing saved objects continue to work unchanged.

`C.L2` ridge for the signed families

nmfkc.signed() and nmf.rrr.signed() gain a C.L2 ridge (default 0, via ...) on the signed coefficient matrix C = Cp - Cn, penalizing C.L2 * ||Cp - Cn||^2. Because only the difference Cp - Cn (= C) enters the model, the penalty has zero gradient on the unidentified common mode Cp + Cn; it is injected symmetrically into the Cp/Cn multiplicative updates (num_Cp += C.L2*Cn, den_Cp += C.L2*Cp) across both the unweighted and weighted paths, and added to the tracked objective.
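The zero-gradient claim is easy to check numerically. The following is a minimal sketch with made-up matrices (the names penalty, shift, grad_Cp and grad_Cn are illustrative only): shifting Cp and Cn by the same amount leaves the penalty unchanged, and the two gradients are equal and opposite.

```r
# Sketch: the ridge C.L2 * ||Cp - Cn||^2 sees only the difference C = Cp - Cn.
set.seed(1)
Cp <- matrix(runif(6), 2, 3)         # non-negative "positive part"
Cn <- matrix(runif(6), 2, 3)         # non-negative "negative part"
C.L2 <- 0.1
penalty <- function(Cp, Cn) C.L2 * sum((Cp - Cn)^2)

shift <- matrix(0.7, 2, 3)           # common-mode shift added to both parts
penalty(Cp, Cn)
penalty(Cp + shift, Cn + shift)      # same value: the common mode is not penalized

# Gradient w.r.t. Cp is 2 * C.L2 * (Cp - Cn). In a multiplicative update the
# part proportional to Cn joins the numerator and the part proportional to Cp
# joins the denominator, matching the num_Cp / den_Cp terms quoted above.
grad_Cp <- 2 * C.L2 * (Cp - Cn)
grad_Cn <- -grad_Cp                  # equal and opposite
```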

Basis penalties extended to the signed families

nmfkc.signed() now accepts X.L2.ortho (column orthogonality) and X.L2.smooth (path-graph row smoothness), matching nmfkc(). Both default to 0 (off), are passed via ..., and are skipped when X.restriction = "fixed". The penalties are folded into both the fast unweighted and the weighted MU paths and into the tracked objective.
nmfae.signed() now accepts X1.L2.ortho / X2.L2.ortho (orthogonality of the response-basis columns and covariate-basis rows), matching nmfae(). Default 0, via ..., wired into both MU paths and the objective.

`by` option: grouping order of coefficient tables

The print() methods for the inference summaries (nmfkc.inference, nmfae/nmfae.inference, nmfae.signed.inference, nmfkc.net.inference) and summary.nmfre() gain a by argument controlling how the significance table is grouped: by = "covariate" (default, unchanged behaviour) lists all bases within each covariate (1-1, 1-2, …), while by = "basis" lists all covariates within each basis (1-1, 2-1, …). The default reproduces the previous ordering exactly.
The symmetric-network model (nmfkc.net, tri-type) follows the same rule: since it sets A = X', the parameter matrix C’s column factor (Basis.col) is the covariate slot and its row factor (Basis.row) is the basis slot, so by = "covariate" groups by Basis.col and by = "basis" groups by Basis.row. (The bi-type has no free C, only C = I, so there is no coefficient table.)

Classed CV objects with `print` / `plot`

nmfkc.ecv(), nmfkc.cv() and nmfkc.bicv() now return classed objects ("nmfkc.ecv" / "nmfkc.cv" / "nmfkc.bicv") with print() and plot() methods — the rank sweeps (ecv, bicv) plot a score-vs-rank curve with a marker, matching nmfae.ecv() / nmfre.ecv(). Field access (cv$sigma, etc.) is unchanged; nmfkc.ecv() also now returns the swept rank vector.

Naming / API consistency pass (aligned to the nmfkc house style)

Fit objects now report runtime as numeric seconds everywhere (was a preformatted string in nmfkc()); print() formats it for display.
nmfkc.net() fit objects now return sigma (RMSE), for parity with nmfkc() / nmfkc.signed().
nmfre.ecv() returns the held-out RMSE as $sigma (was $sigma.ecv) to match nmfkc.ecv().
New predict.nmfre(): fixed-effect prediction for new covariates, or the in-sample BLUP fit when newA is omitted.
nmf.sem/nmf.ffb fit objects now carry the shared "nmf" class, so the common coef/fitted/residuals fallbacks apply.
Fold-count argument unified to nfolds (nmfre.ecv was nfold; legacy names accepted via ...); summary.nmfre() CI toggle is ci.show (object-first); nmfae/nmfae.signed fit objects gained an iter alias of niter.

`X.L2.smooth`: row-smoothness penalty on the basis

New penalty X.L2.smooth (nonnegative, default 0) adds a term proportional to tr(X' L X), with L the path-graph Laplacian over the rows, i.e. it penalizes squared differences between adjacent rows and yields gently-varying (smooth) bases. This is useful when the rows of X have a natural order (e.g. time points). Like X.L2.ortho, it slots into the multiplicative X-step, preserving non-negativity and monotone descent. Default 0 reproduces prior results exactly.
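To make the penalty concrete, here is a small base-R sketch (path_laplacian() is a hypothetical helper, and the scaling constant used by the package is not assumed). It builds the path-graph Laplacian from a first-difference operator and confirms that the quadratic form equals the sum of squared differences between adjacent rows.

```r
# Sketch: path-graph Laplacian over P ordered rows and the smoothness penalty.
path_laplacian <- function(P) {
  D <- diff(diag(P))    # (P-1) x P first-difference operator
  crossprod(D)          # L = D'D: tridiagonal, diagonal degrees 1, 2, ..., 2, 1
}

set.seed(1)
X <- matrix(runif(5 * 2), 5, 2)      # 5 ordered rows (e.g. time points), rank 2
L <- path_laplacian(nrow(X))

quad_form <- sum(diag(t(X) %*% L %*% X))   # tr(X' L X)
adj_diffs <- sum(diff(X)^2)                # squared differences of adjacent rows
all.equal(quad_form, adj_diffs)
```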

Public API: `nmf.rrr` and `nmf.ffb` are the documented names

The legacy nmfae* and nmf.sem* families are now marked internal (@keywords internal): they remain exported and fully functional for backward compatibility, but no longer appear in the reference index / pkgdown site. Their documented, user-facing names are the NMF-RRR aliases (nmf.rrr*) and the NMF-FFB aliases (nmf.ffb*), which now each have their own self-contained help page (previously nmf.ffb* shared the nmf.sem* pages).

`X.init = "kmeans++"` basis initialization

New basis-initialization option X.init = "kmeans++" (alias "kmeanspp") seeds the k-means centres by D^2 weighting (Arthur & Vassilvitskii, 2007, SODA) before Lloyd refinement, giving a more careful, O(log k)-competitive initialization than uniform-random seeding. Available in every optimizer: nmfkc, nmfre, nmf.sem, nmfkc.net, nmfkc.signed (shared initializer), and nmfae / nmfae.signed, which now forward X.init to their internal nmfkc() basis-init steps. The default remains "kmeans" (unchanged results); nstart is not used for "kmeans++" (one careful seeding replaces random restarts).
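For readers unfamiliar with D^2 seeding, a minimal base-R sketch follows. kmeanspp_seed() and sqdist() are hypothetical helpers written for illustration, not the package's shared initializer, and the toy data layout (one observation per row) is an assumption.

```r
# Sketch of D^2-weighted (k-means++) seeding followed by Lloyd refinement.
sqdist <- function(Z, i) rowSums(sweep(Z, 2, Z[i, ])^2)  # squared distance to row i

kmeanspp_seed <- function(Z, k) {
  idx <- integer(k)
  idx[1] <- sample.int(nrow(Z), 1)          # first centre: uniform at random
  d2 <- sqdist(Z, idx[1])
  for (j in seq_len(k)[-1]) {
    idx[j] <- sample.int(nrow(Z), 1, prob = d2)  # probability proportional to D^2
    d2 <- pmin(d2, sqdist(Z, idx[j]))       # distance to the nearest chosen centre
  }
  Z[idx, , drop = FALSE]
}

set.seed(1)
Z <- rbind(matrix(rnorm(40, mean = 0), 20, 2),
           matrix(rnorm(40, mean = 8), 20, 2))
centres <- kmeanspp_seed(Z, k = 2)
fit <- kmeans(Z, centers = centres, algorithm = "Lloyd")  # refine from the seeds
```

Points already chosen have D^2 = 0 and so cannot be drawn again, which is why a single careful seeding can stand in for several random restarts.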

`nmf.rrr` / `nmfae` family: `rank1` / `rank2` arguments

The two basis ranks of the NMF-RRR (tri-factorized) family are now the symmetric rank1 (response basis X1) and rank2 (covariate basis X2, default rank1), replacing the asymmetric rank / rank.encoder. Applies across the whole family: nmfae/nmfae.signed and their .ecv/.cv/.rank/.kernel.beta.cv helpers (and the nmf.rrr* aliases). The legacy rank / rank.encoder (and Q / R) remain accepted for backward compatibility, so existing calls keep working.

`nmfre.ecv`: rank selection for NMF-RE

New nmfre.ecv() selects the basis rank by Wold-style element-wise (entry-holdout) cross-validation with iterative imputation, scoring the held-out prediction RMSE (sigma.ecv). The held-out entries of a column are predicted from that column’s retained entries via the BLUP, so, unlike nmfkc.ecv() (zero-weight mask, fixed-effect prediction), it evaluates the full NMF-RE model including the random effects. Returns a "nmfre.ecv" object with print/plot methods (the plot marks the minimizing rank). Sign convention follows C.signed; CV tolerances are loosened by default and overridable via ....

`nmfkc.DOT`: signed-coefficient graphs

New argument C.signed lets nmfkc.DOT() draw graphs when the coefficient matrix is signed (real-valued), e.g. from nmfre(C.signed = TRUE) or the *.signed fits. In signed mode threshold is an absolute-value cut, edge widths scale by the absolute coefficient, and negative edges are drawn as black dashed lines (positive edges solid) with their signed numeric labels. Default C.signed = NULL auto-detects from result$C.signed or from negative entries in the coefficients; FALSE restores the historical non-negative behaviour. The basis X is always non-negative, so its edges are unaffected.

`nmfre`: marginal-NLL convergence trace

nmfre() now records nll.trace, the marginal negative log-likelihood (random effects integrated out), which the ECM algorithm decreases monotonically. plot.nmfre() displays this instead of the fixed-lambda penalized objective (objfunc.iter), which is not monotone across outer iterations because it jumps when lambda is updated.

`nmfre`: optimization and inference fully separated

nmfre() now performs optimization only, mirroring the nmfkc() / nmfkc.inference() split. The wild.bootstrap argument and all inference outputs (coefficients, C.se, C.se.boot, C.ci.*, C.p.side, sigma2.used, …) are removed from nmfre(); the inline inference block is gone, making the function lighter and easier to maintain. Obtain standard errors, z-values, p-values, and confidence intervals for C by passing the fit to nmfre.inference(fit, Y, A). summary() prints the coefficient table only after inference has been run.

`nmfre`: EM/ECM algorithm and sign-free fixed effects (paper port)

nmfre() is re-implemented to follow the NMF-RE mixed model of the Psychometrika manuscript. The optimizer is now an outer-inner ECM: the inner loop is a fixed-lambda block-coordinate descent (random-effect ridge BLUP for U, complete-EM semi-NMF step for the basis X including the posterior-variance term, and a fixed-effect update for C); the outer loop runs the EM M-steps for sigma2 and tau2 until lambda stabilizes.
New formal argument C.signed (logical, default TRUE, recommended, matches the paper). TRUE makes the fixed-effect coefficients (C) real-valued, updated by exact least squares, with a two-sided test (interior null) and no projection of the bootstrap replicates. FALSE restores the historical non-negative variant (multiplicative update, one-sided/boundary test). A character value ("signed" / "nonneg") is also accepted for backward compatibility.
C.signed is the single switch for the whole estimation scheme: it also selects the basis (X) update rule (TRUE → complete-EM semi-NMF, FALSE → positive-part multiplicative update), reproducing the paper’s pairing. X is non-negative in both cases. (x.postvar remains an advanced toggle for the posterior-variance term of the semi-NMF step.)
No cap is imposed on df_U; it is reported as a diagnostic only. dfU.control is now deprecated and inert. Output gains the logical C.signed; summary.nmfre() reports the sign convention and p-value side.
Removed the exported helper nmfre.dfU.scan() (and its print method): it scanned df_U cap rates, which no longer exist now that the variance components are estimated. The df.rate argument is retained but inert.

`nmf.rrr`: NMF-RRR names for the `nmfae` family

New nmf.rrr / nmf.rrr.signed (and .inference, .ecv, .cv, .rank, .DOT, .heatmap, .kernel.beta.cv, .rename) are thin aliases of the corresponding nmfae* / nmfae.signed* functions, matching the NMF-RRR (tri-factorized non-negative reduced-rank regression) name used in Satoh & Tokuda. The legacy nmfae* names remain fully functional (no deprecation). nmf.rrr() / nmf.rrr.signed() prepend the NMF-RRR class to the fit; all existing S3 methods are reused by inheritance.
The fitted bases are now labelled Resp (response basis X1) and Cov (covariate basis X2) instead of Dec/Enc, in both nmfae()/nmfae.signed() (and so nmf.rrr*), matching the response/covariate co-clustering reading.

`nmfae()`: Kullback-Leibler divergence objective

nmfae() gains method = c("EU", "KL") (mirroring nmfkc()). "EU" (default, unchanged) minimises the Frobenius distance; "KL" minimises the generalised Kullback-Leibler divergence via Lee-Seung multiplicative updates for all three factors of the model (the numerator carries the ratio of Y to the fitted values, the denominator the column/weight sums). Weights, L1/L2 penalties and the encoder structure are supported in both modes. For "KL" the residual SE is not on the data scale; the fit records the objective used.
nmfae() and nmfre() now honour nstart (previously silently ignored): it is forwarded to the nmfkc() initialisation step(s) (k-means multi-start). Default keeps the historical single-start behaviour; a larger value gives a more stable initialisation and is recommended before inference. (nmfkc(), nmfae.signed(), nmfkc.net() and nmfkc.signed() already supported nstart; all expose it via ....)
Bug fix in nmfre(): a character X.init (e.g. "runif", "nndsvd", "kmeans") previously fell through unresolved and crashed in .nmfre.normalize.X() (“‘x’ must be an array of at least two dimensions”). X.init now accepts its default, a named init method forwarded to nmfkc() (so random-init multi-start works), or a numeric basis matrix (used as-is, with the remaining parameters estimated given that fixed X).

`nmfkc.inference()`: re-fit wild bootstrap for singular information

New method = "refit" (alongside the default backward-compatible "onestep") performs a residual wild (multiplier) bootstrap that re-estimates C to convergence with the basis X held fixed, using no information matrix. It stays valid when the Fisher information is singular (over-parameterised / kernel covariates) or the estimate lies on the boundary, where the one-step / sandwich SE is unreliable. With X fixed there is no label switching or scale ambiguity, so element-wise SE/CI of C are valid even for Q > 1. The bootstrap SE/CI become primary and the p-value is a two-sided bootstrap p-value.
wild.dist selects the multiplier distribution ("rademacher", "mammen", "exp"), orthogonal to method; wild.unit ("element" / "column") the granularity. Raw draws are returned in $C.boot.draws so any identifiable functional (e.g. a contrast or fitted curve) and its percentile band can be formed. nmfkc.net.inference() inherits the mode by delegation.
Internal: the wild-bootstrap engine is factored into shared helpers in R/inference-boot.R (.wild.multipliers, .boot.onestep, .boot.refit, .refit.C.MU, .boot.summarize). The previously duplicated one-step loop in nmfkc.inference(), nmfre() / nmfre.inference(), nmfae.inference() and nmfae.signed.inference() now all call the shared .boot.onestep() (behaviour unchanged; nmfae.signed uses project = FALSE for signed C).
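The three wild.dist choices named above can be sketched as mean-0, variance-1 multiplier generators. This is an illustration only: wild_multipliers() is a hypothetical helper (not the internal .wild.multipliers), the Mammen case uses the standard two-point distribution, and the "exp" case is assumed here to be a centred unit exponential, which the entry does not spell out.

```r
# Sketch: mean-0, variance-1 multipliers for a residual wild bootstrap.
wild_multipliers <- function(n, dist = c("rademacher", "mammen", "exp")) {
  dist <- match.arg(dist)
  switch(dist,
    rademacher = sample(c(-1, 1), n, replace = TRUE),
    mammen = {
      s5 <- sqrt(5)
      p_low <- (s5 + 1) / (2 * s5)           # P(w = -(sqrt(5) - 1) / 2)
      ifelse(runif(n) < p_low, -(s5 - 1) / 2, (s5 + 1) / 2)
    },
    exp = rexp(n) - 1                        # assumption: centred Exp(1)
  )
}

set.seed(1)
E <- matrix(rnorm(12), 3, 4)                 # stand-in residual matrix
# One multiplier per entry, i.e. the "element" granularity.
W <- matrix(wild_multipliers(length(E), "mammen"), nrow(E), ncol(E))
E_star <- E * W                              # perturbed residuals for one replicate
```

A per-column variant would draw ncol(E) multipliers and recycle each down its column, which preserves within-column dependence in the residuals.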

nmfkc 0.8.2

`nmfkc.net.DOT()`: default layout is now `"neato"`

The layout choices are reordered by recommendation (neato, fdp, twopi, circo, dot), so the default changes from "fdp" to "neato", which separates community graphs more clearly. Raising threshold (e.g. 0.2–0.3) further declutters weak membership edges.

Bug fix: `nmfkc.net.DOT()` mis-detected `type = "bi"` as `"tri"`

The bi-vs-tri auto-detection ignored the result’s $type field and fell back to all.equal(C, diag(Q)), which fails when C carries dimnames (it reports a names mismatch). A type = "bi" fit was therefore treated as "tri", drawing the inter-class interaction layer that the bi model (with C = I) should not have. Detection now uses $type first (falling back to the dimnames-safe identity check), so "bi" correctly draws no inter-class edges.
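The failure mode is plain base-R behaviour and can be reproduced without the package. In the sketch below the dimnames are made up for illustration, and the two "safe" variants are generic options rather than a statement of which one the package adopted.

```r
# all.equal() compares attributes by default, so dimnames break the identity test.
Q <- 3
C <- diag(Q)
dimnames(C) <- list(paste0("Basis", 1:Q), paste0("Basis", 1:Q))

naive  <- isTRUE(all.equal(C, diag(Q)))                            # FALSE
safe_a <- isTRUE(all.equal(C, diag(Q), check.attributes = FALSE))  # TRUE
safe_b <- isTRUE(all.equal(unname(C), diag(Q)))                    # TRUE
```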

`nmfkc.bicv()` / `nmfkc.consensus()`: leaner signatures

Fine-tuning arguments move into ... (same safe defaults): nmfkc.bicv() is now nmfkc.bicv(Y, rank, ...) (nfolds = 2 per Owen & Perry, plus seed, nnls.maxit, via ...), and nmfkc.consensus() is nmfkc.consensus(Y, A, rank, nrun, keep.consensus, ...) (seed, pac.range via ...). Existing named-argument calls are unaffected.

`nmfkc.ard()`: simpler, safer interface

The signature is trimmed to the essentials nmfkc.ard(Y, rank, nrun, plot, ...); everything else (prior, seed, a, b, maxit, epsilon, tol) moves into ... with the same safe defaults, so a typical call is just nmfkc.ard(Y, rank = K).
nrun now defaults to 10 (was 1): ARD is a sensitive point estimate, and several restarts give a stable modal rank by default.
The help now states explicitly that the implementation is the Euclidean (beta = 2) case of Tan & Fevotte (2013) and that the default b is an empirical energy scale, not the paper’s method-of-moments value (Eq. 38).

`nmfkc.ard()`: better default prior scale

The default b is now the initial per-component energy scale (nrow(Y) + ncol(Y)) / K * mean(Y) instead of a fixed 0.001 * mean(Y). The old fixed fraction over-pruned (winner-take-all collapse onto one dominant component) when (F + N)/K was large; the new scale-aware default recovers genuine low-rank structure stably (e.g. a clean rank-3 signal: relevance 1, 0.99, 0.87, 0, ..., all restarts agree).

New `nmfkc.ard()`: ARD rank determination (Tan & Fevotte 2013, prototype)

Automatic Relevance Determination for the NMF rank (Euclidean). Fits NMF once at an over-complete rank and prunes automatically: each component carries a relevance weight with an inverse-gamma prior and the multiplicative updates gain a penalty (L2 half-normal / L1 exponential) that drives unsupported components to zero. The number of surviving components is the estimated rank – no rank scan. Returns an "nmfkc.ard" object with print and a relevance-bar plot. Plain NMF only; a sensitive point estimate (depends on prior / start / init), so a complement to the CV / consensus engines, not a sole criterion.

New `nmfkc.consensus()`: consensus-clustering rank selection (Brunet 2004)

The bioinformatics-standard stability approach, as a lightweight engine like nmfkc.ecv / nmfkc.bicv. For each rank it runs NMF nrun times from random initializations (X.init = "runif"), builds the consensus matrix from the per-run hard clusterings, and returns two stability scores per rank: cophenetic (cophenetic correlation coefficient, Brunet et al. 2004) and dispersion (Kim & Park 2007, in [0,1]). Unlike the CV engines, a good rank maximizes stability. Optional keep.consensus = TRUE returns the consensus matrices.
Also reports pac, the Proportion of Ambiguous Clustering (Senbabaoglu et al. 2014; fraction of consensus entries in the ambiguous interval pac.range, default (0.1, 0.9)). Lower is better and it is more sensitive than the often-saturated cophenetic. The print/criteria-plot show all three metrics.
Returns an "nmfkc.consensus" object with print and plot methods: plot(cs) (type = "criteria") draws the stability curves; plot(cs, type = "heatmap", rank = ...) draws the consensus matrix heatmap(s) reordered by hierarchical clustering (default = all ranks in a n2mfrow grid; mfrow overridable).
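As a rough illustration of how a consensus matrix and its PAC score relate, consider the sketch below. consensus_matrix() and pac() are hypothetical helpers; using only the distinct sample pairs and an open interval are assumptions that may differ from the package's exact convention.

```r
# Sketch: consensus matrix from hard clusterings and its PAC score.
consensus_matrix <- function(labels) {           # labels: runs x samples
  M <- matrix(0, ncol(labels), ncol(labels))
  for (r in seq_len(nrow(labels))) {
    M <- M + outer(labels[r, ], labels[r, ], "==")  # 1 where two samples co-cluster
  }
  M / nrow(labels)                               # fraction of runs with co-clustering
}

pac <- function(M, pac.range = c(0.1, 0.9)) {
  v <- M[upper.tri(M)]                           # distinct sample pairs only
  mean(v > pac.range[1] & v < pac.range[2])      # share of ambiguous entries
}

# Same partition in every run (labels merely permuted): fully stable.
stable   <- rbind(c(1, 1, 2, 2), c(2, 2, 1, 1), c(1, 1, 2, 2))
# A different pairing in every run: each pair co-clusters in 1 of 3 runs.
unstable <- rbind(c(1, 1, 2, 2), c(1, 2, 1, 2), c(1, 2, 2, 1))

pac(consensus_matrix(stable))     # 0
pac(consensus_matrix(unstable))   # 1
```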

New `nmfkc.bicv()`: bi-cross-validation for rank selection

Owen & Perry’s (2009) bi-cross-validation (BCV), a lightweight CV engine in the spirit of nmfkc.ecv: it returns the held-out error per rank (objfunc, sigma) and nothing more. Holds out a row-block and a column-block at once, fits NMF only on the retained block, and predicts the held-out block by folding the held-out rows/columns onto the fixed factors via non-negative regression (no information leakage, unlike element-wise nmfkc.ecv). nfolds = 2 (leave out half rows / half columns) per Owen & Perry’s recommendation.

**`*.rank`: eff.rank.idx shown for context (no best marker)**

The broken-stick-corrected effective-rank index (eff.rank.idx, green) is drawn for context only and no longer carries a “Best (Max)” marker: it is a factor-utilization diagnostic (most even relative to the random null), not a predictive rank optimum. The recommended rank is driven solely by the ECV minimum and the R-squared elbow.

**`*.rank`: broken-stick-corrected effective-rank index**

The *.rank criteria table gains effective.rank.expected (the broken-stick / uniform-Dirichlet null exp(H_Q - 1), H_Q = the Q-th harmonic number) and effective.rank.index, the [0, 1] index (effective.rank - expected) / (Q - expected) (clamped). The index anchors 0 at the random null and 1 at perfect evenness, removing the small-rank inflation of the raw effective.rank / Q. Its maximum is a meaningful rank, so the diagnostics plot now draws this corrected index (green, eff.rank.idx) with a restored “Best (Max)” marker in place of the raw ratio.
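The index can be computed directly from the two formulas quoted above. effective_rank_index() below is a hypothetical helper that follows them term by term; it is a sketch, not the package's code.

```r
# Sketch: broken-stick-corrected effective-rank index.
effective_rank_index <- function(effective.rank, Q) {
  H_Q <- sum(1 / seq_len(Q))            # Q-th harmonic number
  expected <- exp(H_Q - 1)              # broken-stick / uniform-Dirichlet null
  idx <- (effective.rank - expected) / (Q - expected)
  pmin(pmax(idx, 0), 1)                 # clamp to [0, 1]
}

null3 <- exp(sum(1 / 1:3) - 1)          # expected effective rank under the null, Q = 3
effective_rank_index(3, Q = 3)          # perfect evenness
effective_rank_index(null3, Q = 3)      # exactly at the random null
effective_rank_index(1.2, Q = 3)        # below the null, clamped
```

The term H_Q - 1 is the expected Shannon entropy (in nats) of a uniformly random split of the unit interval into Q pieces, which is why its exponential serves as the null effective rank.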

**`*.rank` results gain `plot()` / `print()` methods**

The rank-selection functions (nmfkc.rank(), nmfkc.net.rank(), nmfkc.signed.rank(), nmfae.rank(), nmfae.signed.rank()) now return a classed object ("nmf.rank"). plot() redraws the three-criterion diagnostics plot (honouring main, xlab, ylab, lwd) and print() shows the recommended rank, the per-criterion best ranks, and the criteria table. As before the constructor draws immediately when plot = TRUE; the $rank.best and $criteria fields are unchanged, so existing code keeps working.

New `nmf.cluster.flow()`: cluster-flow diagram across ranks

nmf.cluster.flow() and nmf.cluster.criteria() now treat the supplied fits as a generic list of results (kept in the given order, not sorted by rank), so the same rank fitted as different models is also supported. Both gain a names argument for the x-axis tick labels (default: each result’s $rank), and in nmf.cluster.flow() the reference argument is now the index (1-based position) of the result that defines the colours, not a rank value, defaulting to the central result floor(length(fits) / 2) + 1 (e.g. the 2nd of 2 or 3 results).
The adjusted Rand index (ARI) between each pair of adjacent ranks is now computed and printed along the top of the figure (and returned in $ARI, length length(fits) - 1), summarizing how much the hard clustering changes from one rank to the next.
Each cluster box is now tinted by the dominant reference colour among the individuals it contains (the colour shared by the most member lines); ties are broken in favour of the earliest palette entry (the smallest reference-cluster id). This shows at a glance which reference cluster dominates each box at each rank.
nmf.cluster.flow() now inserts a gap the size of one average cluster between clusters in the per-rank layout and sizes each grey box exactly to the minimum/maximum position of its members, so the cluster boxes are clearly separated with the gaps maximized. Each rank is normalized to the full height independently.
The cluster number is the dominant-factor index (argmax of the coefficient) of each fit, kept as-is so it matches the factor/basis numbering of the supplied models. A factor that never dominates any individual leaves an empty, unused cluster number (a gap, e.g. labels 2, 3 with no 1) – this is correct and consistent with the fit, and the labels are not renumbered.
nmf.cluster.flow() now returns a classed object with a dedicated plot() method, so the diagram can be (re)drawn with plot(fl, col = , lwd = , xlab = , ylab = , main = ) – the colour vector (indexed by reference cluster), line width, axis labels and title are all honoured. The constructor still draws immediately by default (plot = TRUE) and forwards graphical arguments to the plot method; use plot = FALSE to build the object and plot it later. Its print() method shows the adjacent-rank ARI and the full cluster table.
nmf.cluster.flow(fits, reference = ) takes a list of models fitted at different ranks (any non-negative MU family) and draws an alluvial / Sankey-style diagram of how the hard sample clustering changes with the rank Q: each individual flows left-to-right across the ranks (x-axis), its vertical position is set by its cluster (clusters reordered per rank by a barycenter heuristic to reduce crossings), and lines are coloured by the cluster at the reference rank, so one can watch the reference clusters split or merge. At every rank a translucent grey box is drawn around each cluster’s members with the cluster number centred inside, so the grouping and labels are visible at all ranks (not only the reference). The default line palette is now a strong, well-separated qualitative set (ColorBrewer, no pale colours) and can be overridden with col. Returns (invisibly) the cluster table with rows = individuals, columns = rank, entries = cluster number.
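The adjacent-rank ARI mentioned in this section can be reproduced from hard labels alone. The sketch below uses the standard pair-counting formula; adjusted_rand_index() and the toy label vectors are hypothetical and not taken from the package.

```r
# Sketch: adjusted Rand index between two hard clusterings of the same samples.
adjusted_rand_index <- function(a, b) {
  comb2 <- function(x) x * (x - 1) / 2         # number of pairs, choose(x, 2)
  tab <- table(a, b)                           # contingency table of the two labelings
  sum_ij <- sum(comb2(tab))
  sum_a  <- sum(comb2(rowSums(tab)))
  sum_b  <- sum(comb2(colSums(tab)))
  expected  <- sum_a * sum_b / comb2(length(a))  # agreement expected by chance
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Hard labels of six individuals at three successive ranks (toy values).
labels <- list(c(1, 1, 1, 2, 2, 2),
               c(1, 1, 3, 2, 2, 2),
               c(1, 1, 3, 2, 2, 4))
# One ARI per adjacent pair, so the result has length(labels) - 1 entries.
ARI <- mapply(adjusted_rand_index, head(labels, -1), tail(labels, -1))
```

The index is invariant to relabeling, so it is unaffected by the fact that cluster numbers follow each fit's own factor numbering.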

New `nmf.cluster.criteria()`: sample-clustering quality across ranks

nmf.cluster.criteria(fits, Y) takes a list of fits (one per rank; a single fit is also accepted) and reports the clustering-quality criteria silhouette, CPCC, and dist.cor for each rank, returning a per-rank $criteria table (mirroring nmf.cluster.flow()). It has plot() (line plot of the three criteria vs rank) and print() (the table) methods, and draws immediately when plot = TRUE. Works for any family (nmfkc, nmfkc.signed, nmfae, nmfae.signed, nmfkc.net, nmfre, nmf.sem/nmf.ffb; the last needs the exogenous block via Y2). These are clustering-stability diagnostics, deliberately separate from the rank-selection *.rank functions (r.squared / effective rank / ECV).
Hard sample clustering needs a non-negative coefficient/score matrix (a valid membership simplex). nmf.cluster.criteria() detects this from the actual coefficient: when it is non-negative the hard-label silhouette (and cluster sizes) are returned; when it is signed silhouette is NA while the distance-based CPCC and dist.cor are still computed. (ARI is not reported here – it compares two clusterings, e.g. across ranks or resamples, so it is not a single-fit quantity.)
nmfkc.rank() no longer carries ARI, silhouette, CPCC, or dist.cor in its criteria table – those clustering-stability metrics now live in nmf.cluster.criteria(). All five *.rank functions return the same five columns (rank, effective.rank, effective.rank.ratio, r.squared, sigma.ecv). Per-rank fits use detail = "fast", so the expensive O(N^2) distance computations are skipped during rank selection. rank.best is unchanged. The *.rank functions now emit a one-line message pointing to nmf.cluster.criteria() for clustering quality.

Rank-selection functions for the other NMF families

New nmfkc.net.rank(), nmfkc.signed.rank(), nmfae.rank() (paired ranks) and nmfae.signed.rank() (paired) bring nmfkc.rank-style rank selection to the other multiplicative-update models. Each reports the three criteria that are well defined for every family (r.squared, the effective rank (utilization), and the element-wise CV error sigma.ecv) and returns list(rank.best, criteria). (nmf.ffb / nmfre are not covered: they do not support the element masking that ECV needs.)
nmfkc.rank() plot simplified and unified. All *.rank functions now share one back-end .rank.finish() and draw the same concise three-criterion figure: r.squared (red), eff.rank (green), and sigma.ecv (blue, right axis), each as a line with points, rank-number labels, and a highlighted best marker – “Best (Elbow)” for the R-squared knee, “Best (Peak)” for the effective-rank utilization, and “Best (Min)” for the CV minimum. nmfkc.rank() still computes ARI, silhouette, CPCC, and dist.cor into its criteria table, but no longer plots them.
The four new *.rank functions gain a detail argument matching nmfkc.rank: "full" (default) runs the element-wise CV and reports sigma.ecv; "fast" skips the (expensive) CV, so the plot shows only r.squared and eff.rank and the recommended rank falls back to the R-squared elbow.

Internal: shared element-wise CV helpers

The four element-wise cross-validation functions (nmfkc.ecv(), nmfae.ecv(), nmfkc.signed.ecv(), nmfae.signed.ecv()) now build their folds through a single internal helper .ecv.make.folds(), removing four near-identical copies of the fold-partitioning loop. nmfkc.net.ecv() keeps its symmetric upper-triangle folds.
All five element-wise CV functions now share one config-indexed loop driver .ecv.run(labels, nfolds, run_one, progress): the single-rank ones (nmfkc.ecv(), nmfkc.net.ecv(), nmfkc.signed.ecv()) and the rank-grid ones (nmfae.ecv(), nmfae.signed.ecv()). Each supplies a model-specific run_one(i, k) closure (mask fold, refit config i, return held-out loss) and an optional progress callback; .ecv.run() handles the config-by-fold loop, the objfunc/sigma/objfunc.fold aggregation, and naming. This removes the last copies of the CV-loop machinery, including the per-grid reshaping in nmfae.ecv().
The refactor is behaviour-preserving: for the same seed the folds and all CV values (objfunc, sigma, objfunc.fold, names/labels) are byte-for-byte identical to before, verified across EU and KL losses, the symmetric (upper-triangle) case, and both paired and full grids.
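For orientation, element-wise fold partitioning can be sketched in a few lines. make_folds() below is a hypothetical stand-in written for illustration; it is not the internal .ecv.make.folds(), and its seeding and balancing choices are assumptions.

```r
# Sketch: partition the entries of a matrix into nfolds element-wise folds.
make_folds <- function(n_elements, nfolds, seed = 123) {
  set.seed(seed)                                            # same seed, same folds
  fold_id <- sample(rep_len(seq_len(nfolds), n_elements))   # balanced, shuffled labels
  split(seq_len(n_elements), fold_id)                       # entry indices per fold
}

Y <- matrix(1:12, 3, 4)
folds <- make_folds(length(Y), nfolds = 5)

# Zero-weight mask for fold 1: held-out entries get weight 0 in the fit.
W <- matrix(1, nrow(Y), ncol(Y))
W[folds[[1]]] <- 0
```

Every entry lands in exactly one fold, so looping over the folds scores each entry of Y exactly once as held-out data.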

Unified summary print blocks

New shared internal helpers .print.fit.statistics() and .print.structure.diagnostics() render the “Statistics” / “Goodness of fit” and “Structure Diagnostics” blocks for summary.nmfkc(), summary.nmfae(), and summary.nmfkc.net() (incl. the signed variant). Labels are padded to a common width so values are column-aligned, fields absent from a given model are skipped automatically (e.g. nmfkc.net has no residual SE), and any future fit statistic or sparsity row is now added in one place instead of per-summary.

Effective Rank in all five MU-family summaries

summary() now reports the Effective Rank as x.xx / Q (NN.N%) – the absolute value, the nominal rank, and the utilization ratio effective.rank / Q as a percentage – for nmfkc(), nmfkc.net(), nmfae(), nmf.ffb() / nmf.sem(), and nmfre() — previously only nmfkc() showed it. Each is computed by the new shared internal helper .effective.rank(B) from the model’s natural coefficient/score matrix: the coefficients (nmfkc), the latent encoding (nmfae), the node membership (nmfkc.net), the latent scores (nmf.ffb), and the BLUP scores (nmfre). NA at .

Rank-selection diagnostics: silhouette / CPCC fixed, IC removed

silhouette is now computed in the original data space. It used to be evaluated on the rank-Q B.prob simplex, whose dimension changes with Q; that made it monotone in Q (always favouring the smallest rank) and hid genuine cluster structure. It is now the standard mean silhouette width over dist(t(Y)) (the fixed original-data sample distances) with the per-sample hard labels (the k-means convention). On data with real clusters it now shows an interior optimum (e.g. the road-OD network peaks at the same rank as the cross-validation minimum).
CPCC is now the classic cophenetic correlation of dist(t(B)). It used to be computed from the soft co-membership t(B.prob) %*% B.prob, which was nearly flat across Q. It is now cor(dist(t(B)), cophenetic(hclust(dist(t(B))))), i.e. how well a hierarchical clustering of the rank-Q coefficient distances reproduces those distances (Sokal & Rohlf). It now varies with Q and recovers an interior optimum.
Removed ICp, AIC, and BIC from nmfkc()’s criterion list, from summary.nmfkc(), and from nmfkc.rank()’s table. Empirically (across three real datasets) ICp was monotone increasing (always selecting the smallest Q) and AIC monotone decreasing (always selecting the largest Q); for NMF, where the parameter count grows with Q, these information criteria do not have a usable interior optimum, so they were misleading rather than informative.
The internal helper .silhouette.simple() (centroid-approximate, took a B.prob matrix) was replaced by .silhouette.mean(D, labels), which returns the exact mean silhouette width from a distance matrix and labels.
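A mean silhouette width computed from a distance matrix and hard labels can be sketched as follows. silhouette_mean() is a hypothetical re-implementation for illustration (not the internal .silhouette.mean()), and the singleton convention s = 0 is an assumption borrowed from common practice.

```r
# Sketch: exact mean silhouette width from distances D and hard labels.
silhouette_mean <- function(D, labels) {
  D <- as.matrix(D)
  cl <- unique(labels)
  if (length(cl) < 2) return(NA_real_)           # undefined for a single cluster
  s <- numeric(nrow(D))
  for (i in seq_len(nrow(D))) {
    own <- labels == labels[i]
    if (sum(own) == 1) { s[i] <- 0; next }       # singleton cluster: s = 0
    a <- sum(D[i, own]) / (sum(own) - 1)         # mean distance within own cluster
    b <- min(vapply(setdiff(cl, labels[i]),      # nearest other cluster
                    function(k) mean(D[i, labels == k]), numeric(1)))
    s[i] <- (b - a) / max(a, b)
  }
  mean(s)
}

Y <- rbind(c(0, 0.1, 0.2, 5, 5.1, 5.2))          # 1 variable x 6 samples (columns)
D <- dist(t(Y))                                  # fixed sample distances in data space
good <- silhouette_mean(D, c(1, 1, 1, 2, 2, 2))  # labels match the two groups
bad  <- silhouette_mean(D, c(1, 2, 1, 2, 1, 2))  # labels cut across the groups
```

Because D is computed once from Y, labels from fits at different ranks are scored against the same distances, which is what makes the criterion comparable across ranks.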

Breaking change: symmetric NMF removed from `nmfkc()`

The Y.symmetric = "bi" / "tri" option (deprecated in v0.7.x) has been removed from nmfkc() and nmfkc.ecv(). Symmetric NMF of network data now lives exclusively in the dedicated nmfkc.net() / nmfkc.net.ecv() functions, which use the correct Frobenius bilateral-gradient updates. Passing Y.symmetric to nmfkc() or nmfkc.ecv() now stops with a message pointing to the replacement: nmfkc.net(Y, rank, type = "tri") (types "tri", "bi", "signed"). This also removes the bi/tri code branches (cube-root damping, fixed C = I, tri C-update, upper-triangle CV folds) from nmfkc(), simplifying the core function.

New diagnostic: effective rank

nmfkc() now reports criterion$effective.rank, the effective rank of the fit: exp of the Shannon entropy of the explained-variance distribution p_k = var(B[k, ]) / sum_j var(B[j, ]). By the trace identity sum_k var(B[k, ]) = tr(Cov(B)), each p_k is the exact fraction of the total coefficient variance carried by factor k, so the entropy is a genuine additive decomposition (variances add; standard deviations do not, which is why variance — not sd — is the natural partner for the entropy here). It ranges in [1, Q] and counts how many latent factors actively shape across-sample variation (dead, zero-variance factors drop out). This is the PCA-style explained-variance / effective-dimensionality measure and reuses the exp(entropy) functional form of Roy & Vetterli (2007).
summary.nmfkc() prints Effective Rank: x.xx / Q.
nmfkc.rank() adds an effective.rank column to its criteria table. When effective rank plateaus well below the nominal rank, the extra factors are not carrying additional coefficient variance — a signal that the rank is over-specified.
nmfkc.rank(plot = TRUE) overlays an eff.rank curve (effective rank divided by nominal rank, in [0, 1], solid green line) on the diagnostics plot. A peak in this utilization curve marks the rank at which the latent factors carry the most evenly distributed variance.
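
As a minimal sketch of the definition above (not the package's implementation), effective rank can be computed directly from a coefficient matrix. The function name `effective_rank()` and the toy matrices are assumptions made for illustration.

```r
# exp(Shannon entropy) of the explained-variance shares p_k = var(B[k, ]) / sum_j var(B[j, ])
effective_rank <- function(B) {
  v <- apply(B, 1, var)      # per-factor coefficient variance
  p <- v / sum(v)            # shares of total variance
  p <- p[p > 0]              # dead (zero-variance) factors drop out: 0 * log(0) := 0
  exp(-sum(p * log(p)))
}

# Three factors with equal variance: effective rank equals the nominal rank
B_even <- rbind(c(1, 2, 3, 4), c(4, 3, 2, 1), c(1, 3, 2, 4))
# Third factor is constant across samples, so only two factors are active
B_dead <- rbind(c(1, 2, 3, 4), c(4, 3, 2, 1), c(2, 2, 2, 2))

er_even <- effective_rank(B_even)
er_dead <- effective_rank(B_dead)

# Utilization as plotted by the eff.rank curve: effective rank / nominal rank
utilization <- er_dead / nrow(B_dead)
```

Here `er_even` equals the nominal rank 3, while the constant third row in `B_dead` reduces the effective rank to 2 and the utilization to 2/3.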

Diagnostics cleanup: B.prob crispness metrics

Removed B.prob.sd.min and B.prob.entropy.mean from nmfkc()’s criterion list, from summary.nmfkc(), and from nmfkc.rank()’s criteria table and plot. All three B.prob.* peakedness metrics are monotone in the rank Q, so they carry no peak/elbow signal for rank selection (verified empirically); the principled rank signals are ECV, the R-squared elbow, and the new effective.rank utilization.
B.prob.max.mean (clustering crispness) is retained, but only in summary.nmfkc() (“Clustering Crispness”) and the criterion list. At a fixed Q it remains a useful confidence check — the mean dominant-cluster membership — before treating B.cluster as hard labels. It is no longer shown in nmfkc.rank() (cross-Q), where its 1/Q baseline shift makes it misleading.
summary.nmfkc() no longer prints “Clustering Entropy” (it duplicated the crispness information).

Improvements

Unified three-variant R² across all NMF functions. Every NMF variant (nmfkc(), nmfae(), nmfae.signed(), nmfkc.net(), nmfkc.signed(), nmfre()) now returns three goodness-of-fit summaries on the same scale, computed by the new internal helper .r.squared.all():
- r.squared: Pearson $\mathrm{cor}(Y, \hat{Y})^2$ (scale-invariant, in $[0, 1]$). Unchanged from before.
- r.squared.uncentered: $1 - \|Y - \hat{Y}\|_F^2 / \|Y\|_F^2$. Baseline = the zero matrix (natural for non-negative factorizations without an intercept); matches the “uncentered R²” of intercept-free regression.
- r.squared.centered: $1 - \|Y - \hat{Y}\|_F^2 / \|Y - \bar{Y}\|_F^2$, where $\bar{Y}$ holds the per-row means. Baseline = per-row mean; the standard (“centered”) multivariate-regression $R^2$; equals 0 when the model predicts the row mean.
The two suffixed variants differ only in their baseline (denominator); both use the Frobenius norm in the numerator. Naming follows the centered/uncentered distinction used by statistics software (e.g. statsmodels). All three respect Y.weights == 0 masking (the standard NA-hold-out convention). For nmfre() the same three variants are also reported on the fixed-only prediction as r.squared.fixed.*. Displayed by all summary.* methods.
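
The three summaries can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the internal `.r.squared.all()` helper: the name `r_squared_all()` is hypothetical, and computing the per-row mean over unmasked entries only is an assumption about how masking interacts with the centered baseline.

```r
# Hypothetical sketch of the three R-squared variants with weight-zero masking
r_squared_all <- function(Y, Yhat, W = NULL) {
  if (is.null(W)) W <- matrix(1, nrow(Y), ncol(Y))
  keep <- W != 0 & !is.na(Y)                        # entries that count
  # Per-row mean over kept entries (assumption), expanded to a full matrix
  row_mean <- rowSums(Y * keep, na.rm = TRUE) / rowSums(keep)
  Ybar <- matrix(row_mean, nrow(Y), ncol(Y))
  y <- Y[keep]; f <- Yhat[keep]; m <- Ybar[keep]
  ss_res <- sum((y - f)^2)                          # shared Frobenius numerator
  c(r.squared            = cor(y, f)^2,
    r.squared.uncentered = 1 - ss_res / sum(y^2),
    r.squared.centered   = 1 - ss_res / sum((y - m)^2))
}

Y <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13), 3, 4)

# A perfect fit gives 1 on all three scales
r2_perfect <- r_squared_all(Y, Y)

# Predicting the row mean gives centered R-squared = 0 but uncentered > 0
Y_rowmean <- matrix(rowMeans(Y), nrow(Y), ncol(Y))
r2_rowmean <- r_squared_all(Y, Y_rowmean)
```

The row-mean prediction shows why the baseline matters: the same residuals score 0 against the centered baseline yet remain well above 0 against the zero-matrix baseline.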

Bug Fixes

nmfkc.net(): r.squared now correctly excludes weight-zero (NA-masked) entries when Y.weights is supplied or auto-masking is in effect, matching the convention used by nmfkc(), nmfae(), nmfae.signed(), and nmfkc.signed(). Previously the correlation was computed over the full matrix including replaced-NA cells, giving a distorted r.squared.

Documentation

nmfkc(): removed Examples 3 & 4 (deprecated Y.symmetric = "bi"/"tri"); the documentation now points users to \link{nmfkc.net}() for symmetric NMF.
summary.nmf.sem(): example code, @param, and @seealso updated to use the canonical nmf.ffb name (the S3 method continues to dispatch correctly via c("nmf.ffb", "nmf.sem") inheritance).

nmfkc 0.7.3

Documentation

README and nmf-sem-with-nmfkc.Rmd vignette code now reference the canonical nmf.ffb.* aliases (nmf.ffb(), nmf.ffb.cv(), nmf.ffb.DOT()) instead of the legacy nmf.sem.* names. Both names continue to work; the change only affects what users see on the GitHub Pages homepage and in the vignette source.

nmfkc 0.7.2

Headline: NMF-FFB rebrand and full bootstrap inference

nmf.ffb* family added as the canonical alias for nmf.sem* (Satoh 2025, arXiv:2512.18250 adopts “NMF-FFB” — Non-negative Matrix Factorization with Feed-Forward + Feedback — as the model’s canonical name). nmf.sem* continues to work and shares the same return classes (c("nmf.ffb", "nmf.sem") and c("nmf.ffb.inference", "nmf.sem.inference", ...)), so existing scripts are unaffected.
nmf.sem.inference() / nmf.ffb.inference(): replaced the legacy 1-step Newton wild bootstrap with a full X-fixed pair bootstrap. Resamples columns of (Y1, Y2), refits (C1, C2) with X held at the original fit, and reports per-element support_rate = mean(|c_b| > threshold) together with percentile CIs. Significance markers (* / ** / *** at sup > 0.95 / 0.99 / 0.999) follow the lavaan convention. Both Theta_1 (feedback) and Theta_2 (exogenous) are inference targets (previous version covered only Theta_2).
nmf.sem() / nmf.ffb(): now runs nmfkc(Y1, A = Y2) internally by default when X.init is a string method, forwarding X.init, X.L2.ortho, epsilon, maxit, seed. The feedforward fit is used both as the X warm-start and as the baseline for SC.map. nmfkc.baseline = FALSE opts out.
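
The reporting logic of the pair bootstrap can be illustrated on a toy problem. The sketch below is not the package's procedure: it resamples columns of a one-row `(Y1, Y2)` pair and refits a single intercept-free least-squares coefficient as a stand-in for the X-fixed refit of (C1, C2). The `threshold` value and all variable names are assumptions.

```r
set.seed(123)
N  <- 200
Y2 <- matrix(runif(N), 1, N)                          # one exogenous row
Y1 <- 0.8 * Y2 + matrix(rnorm(N, sd = 0.1), 1, N)     # one response row

# Stand-in refit: least-squares slope through the origin on resampled columns
refit <- function(idx) sum(Y1[, idx] * Y2[, idx]) / sum(Y2[, idx]^2)

n_boot <- 500
c_b <- vapply(seq_len(n_boot),
              function(b) refit(sample.int(N, N, replace = TRUE)),
              numeric(1))

threshold    <- 0.05                                  # hypothetical threshold
support_rate <- mean(abs(c_b) > threshold)            # share of replicates above threshold
ci           <- quantile(c_b, c(0.025, 0.975))        # percentile CI

# Significance markers at support > 0.95 / 0.99 / 0.999
stars <- if (support_rate > 0.999) "***" else
         if (support_rate > 0.99)  "**"  else
         if (support_rate > 0.95)  "*"   else ""
```

Resampling whole columns keeps each `(Y1, Y2)` pair together, which is what distinguishes a pair bootstrap from a residual-based wild bootstrap.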

Bug Fixes

nmf.sem.inference(): fixed dimension bug in the Leontief identity matrix (I_mat <- diag(Q) should have been diag(P1)); previously every replicate was silently marked invalid when P1 != Q.
nmfkc.net(): now auto-masks NA entries of Y (parity with the other four NMF variants); previously errored at the min(Y) < 0 check when Y contained NA.
nmfkc(): Fixed C matrix asymmetry in tri-symmetric NMF (Y.symmetric = "tri"). The C update was using stale B and XB computed from the old X; now B and XB are recomputed after X is updated. Also fixed column reordering to permute both rows and columns of C. Previously the relative asymmetry could reach ~46%; now it is at machine precision (~1e-14).
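
The Leontief dimension fix is easy to see on a toy example. The sketch assumes the feedback structure described in these notes (Y1 ≈ X (C1 Y1 + C2 Y2), so that X C1 is P1 × P1); the matrices are simulated and the variable names are illustrative rather than the package's internals.

```r
set.seed(1)
P1 <- 4; P2 <- 3; Q <- 2
X  <- matrix(runif(P1 * Q), P1, Q)
X  <- sweep(X, 2, colSums(X), "/")            # columns sum to 1
C1 <- matrix(runif(Q * P1, 0, 0.2), Q, P1)    # small feedback keeps I - X C1 invertible
C2 <- matrix(runif(Q * P2), Q, P2)

# X %*% C1 is P1 x P1, so the identity must be diag(P1), not diag(Q)
I_mat <- diag(P1)
M <- solve(I_mat - X %*% C1, X %*% C2)        # equilibrium operator, P1 x P2

# Fixed-point check: Y1 = M Y2 satisfies Y1 = X (C1 Y1 + C2 Y2)
Y2 <- matrix(runif(P2 * 5), P2, 5)
Y1 <- M %*% Y2
fixed_point_gap <- max(abs(Y1 - X %*% (C1 %*% Y1 + C2 %*% Y2)))

# The old dimension fails whenever P1 != Q
old_attempt <- try(diag(Q) - X %*% C1, silent = TRUE)
```

With `P1 != Q` the subtraction in `old_attempt` is non-conformable, which is consistent with every replicate having been marked invalid before the fix.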

Improvements

Y.weights semantics unified to lm()-style weighted least squares across nmfkc(), nmfae(), nmfkc.net(), nmfkc.signed(), nmfae.signed(): loss is now sum(W * (Y - Yhat)^2) (linear in W, matching lm()’s weights argument). Binary masks (W ∈ {0, 1}; the standard ECV / NA-mask case) are unaffected since W = W^2.
All MU functions now emit a "maximum iterations (N) reached..." warning when maxit is exhausted without meeting the relative-tolerance criterion (previously silent in nmfae, nmfae.signed, nmfkc.net, nmfkc.signed, nmfre, and nmf.sem).
All MU functions now share maxit = 5000 as the default (was 5000 / 20000 / 50000 inconsistently). Together with the maxit warning above, users see explicit feedback when 5000 is insufficient and can opt into a larger cap.
New shared internal helper .init_X_method() for X initialization via "nndsvd" / "kmeans" / "kmeansar" / "runif" / numeric matrix. All NMF families now use the same dispatch logic; previous ad-hoc inline implementations are removed.
nmf.sem() returns SC.map (input-output structural fidelity: correlation between the equilibrium operator and the feedforward baseline mapping; Satoh 2025 §4.SC.map) automatically when nmfkc.baseline is supplied or computed internally.
summary.nmf.sem(): rewritten to display the full-bootstrap inference output — separate Theta_1 / Theta_2 blocks with Estimate | CI_low | CI_high | support | Pr(>0) | sig, plus a bootstrap meta-info header.
coef.nmf.sem(): now returns a long-format data frame with rows for every entry of both C1 and C2 (Type | Basis | Covariate | Estimate); previously returned only the C2 matrix when no inference had been run. Schema matches the inference-augmented output for uniformity.
plot.nmf.sem(): default trace is now objfunc.full (loss + penalties — the actual monotonically-decreasing quantity that the multiplicative updates minimize) instead of objfunc (reconstruction only). New argument which = "full" | "reconstruction" | "both".
nmf.sem.DOT(): significance stars now appear on Theta_1 (feedback Y1 → F) edges in addition to Theta_2 (exogenous Y2 → F); X (F → Y1) edges remain unstarred since the basis is not the inference target.
plot.nmfae.ecv(): Heatmap cell text color is now always black for better readability on light-colored cells.
nmfkc(): X.init = "runif" now supports nstart > 1 for multi-start initialization. Multiple random starting points are evaluated with 10 standard NMF iterations, and the best (lowest Frobenius error) is selected.
nmfae(), nmfre(): r.squared is now computed as cor(Y, fitted)^2 (squared correlation between observed and fitted values), consistent with nmfkc(). Previously nmfae() used 1 - SS_res/SS_tot and nmfre() used the same regression-style R-squared, which can behave unexpectedly for intercept-free non-negative models.
nmfkc.kernel.beta.nearest.med(): added a candidates argument controlling the bandwidth grid. Options: "7points" (new default, t = {-1,-2/3,-1/3,0,1/3,2/3,1}), "4points" (t = {-1/2, 0, 1/2, 1}), or a user-supplied numeric vector of values. Previously the grid silently differed between the no-landmark (Uk = NULL; 4 points) and landmark (7 points) branches.
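
To make the Y.weights item above concrete, the following sketch compares the lm()-style loss, which is linear in W, with a form that squares the weights. The toy matrices, the NA-to-zero-weight masking step, and the helper names are assumptions for illustration only.

```r
Y    <- matrix(c(1, 2, NA, 4, 5, 6), 2, 3)
Yhat <- matrix(c(1.5, 2, 3, 3.5, 5, 7), 2, 3)

# NA auto-masking: missing cells get weight 0 and a placeholder value
W_mask <- ifelse(is.na(Y), 0, 1)
Y0     <- ifelse(is.na(Y), 0, Y)

loss_linear  <- function(W) sum(W * (Y0 - Yhat)^2)    # lm()-style, linear in W
loss_squared <- function(W) sum((W * (Y0 - Yhat))^2)  # quadratic in W

# Binary mask: W == W^2, so both forms agree
binary_linear  <- loss_linear(W_mask)
binary_squared <- loss_squared(W_mask)

# Non-binary weights: the two forms diverge
W2 <- 2 * W_mask
nonbinary_linear  <- loss_linear(W2)
nonbinary_squared <- loss_squared(W2)
```

Doubling the weights doubles the linear loss but quadruples the squared one, so only analyses with non-binary weights are affected by the change in semantics.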

New Functions (Signed NMF family)

nmfkc.signed(): NMF-KC with signed covariate/coefficient. Model $Y \approx X C A$ with $X \ge 0$, $C$ (signed), $A$ real-valued. Uses Ding et al. (2010) sign-splitting + Direct MU; $Y$ may also contain negative entries (semi-NMF regression). Supports Y.weights for element-wise masking.
nmfkc.signed.cv(), nmfkc.signed.ecv(): column-wise and element-wise k-fold CV for rank selection on signed data.
nmfae.signed(): Three-layer autoencoder with a signed bottleneck. The non-negative decoder and encoder bases preserve soft clustering on both sides while the bottleneck can carry negative weights (e.g., anti-correlated properties). Hybrid warm-start (from nmfae()) + Direct MU with multi-restart.
nmfae.signed.ecv(): element-wise CV for (decoder-rank, encoder-rank) selection.
nmfae.signed.inference(): sandwich SE + wild bootstrap for the bottleneck coefficients (no non-negativity projection is applied, since they are signed).
S3 methods predict.*.signed(), plot.*.signed(), summary.*.signed(), and nmfae.signed.rename() helper.

New Functions (Network NMF family)

nmfkc.net(): Single unified entry point for symmetric NMF of network data, with type = "tri" | "bi" | "signed". All three variants use the Frobenius-full bilateral gradient (supersedes the one-sided approximation in nmfkc(Y.symmetric = ...)). type = "signed" supports a signed C via Ding et al. (2010) sign-splitting, preserving a non-negative X for soft clustering while allowing inter-cluster repulsion. The returned object’s fields are uniform across types: the fields specific to the signed variant are unpopulated for tri/bi and hold populated matrices for signed. C is always populated (identity for bi, non-negative for tri, signed for signed).
nmfkc.net.ecv(): Element-wise cross-validation with upper-triangle folds (mirrored to the lower triangle to prevent symmetry leakage). Unified entry point for type = "tri" | "bi" | "signed" (calls nmfkc.net() with the matching type for each fold).
nmfkc.net.DOT(): Graphviz DOT visualization for symmetric NMF networks. Displays basis-to-node membership edges and inter-basis interaction edges (C matrix) with significance stars. Now has signed parameter (auto-detected from class) to render negative C entries as dashed edges.
nmfkc.net.inference(): Statistical inference for symmetric NMF. Wrapper around nmfkc.inference() with A = t(X). Returns off-diagonal C coefficients with sandwich SE and wild bootstrap.
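
The upper-triangle fold scheme described for nmfkc.net.ecv() can be sketched as follows. This is an illustrative assumption about the mechanics, not the package's code: whether the diagonal is assigned to folds, and the variable names, are choices made for the example.

```r
set.seed(1)
n <- 6        # number of nodes in the symmetric network
nfolds <- 3

# Assign folds on the upper triangle only (diagonal included here by assumption)
fold <- matrix(NA_integer_, n, n)
ut <- upper.tri(fold, diag = TRUE)
fold[ut] <- sample(rep_len(seq_len(nfolds), sum(ut)))

# Mirror to the lower triangle so y_ij and y_ji always share a fold
fold[lower.tri(fold)] <- t(fold)[lower.tri(fold)]

# Weight matrix for fold 1: held-out cells get weight 0, the rest weight 1
W_fold1 <- (fold != 1) * 1
```

Mirroring matters because a symmetric matrix stores each edge twice; if the two copies landed in different folds, the held-out value would be visible in the training set.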

Deprecations

nmfkc(Y, Y.symmetric = "bi"|"tri"): Deprecated in favor of nmfkc.net(Y, type = "bi"|"tri"). The old implementation uses a one-sided gradient approximation that empirically converges for non-negative C but is theoretically incorrect and does not extend to signed C. The deprecated branch still works in v0.6.8 (with a deprecation warning) and will be removed in a future release.

Parameter Renames (old names remain usable for backward compatibility)

nmf.sem.DOT(): weight_scale_y2f → weight_scale_c2, weight_scale_fy1 → weight_scale_x1 (matrix-name-based naming, consistent with nmfae.DOT() and nmfkc.DOT()).
nmf.sem.DOT(): sig.level moved to after threshold for consistency with other .DOT functions.

Documentation

README, vignettes, and roxygen @title / @description updated to use NMF-FFB as the canonical model name (with “(formerly NMF-SEM)” attached on first mention for discoverability of the legacy term). File names (R/nmf.sem.R, vignettes/nmf-sem-with-nmfkc.Rmd, man/nmf.sem.Rd), function names (nmf.sem*), and S3 classes ("nmf.sem") are unchanged so URLs and existing scripts continue to work.

nmfkc 0.6.7

Bug Fixes

Added fitted.nmfae() and residuals.nmfae() S3 methods; previously fitted() on an nmfae object silently returned NULL because the wrong field name ($XB instead of $Y1hat) was used.

Naming Unification (old names remain usable for backward compatibility)

Coefficient tables: all inference functions now use Basis / Covariate columns (was Factor/Exogenous in nmf.sem.inference(), Decoder/Encoder in nmfae.inference()).
Wild bootstrap defaults unified: wild.B = 500, wild.seed = 123 across all inference functions.
First argument of all .DOT functions renamed to result for consistency.
CV tuning parameters (nfolds, seed, shuffle) moved to ... in nmfkc.ecv(), nmfae.ecv(), nmfae.cv(), nmf.sem.cv(); div also accepted for backward compatibility.

nmfkc 0.6.6

New Functions

nmfkc.criterion(): Extracted criterion computation from nmfkc() as a standalone exported function. Supports detail = "full" / "fast" / "minimal" to control computation cost.
nmfre.inference(): Separated statistical inference from nmfre() optimization. Returns coefficient table with SE, z-values, and p-values via wild bootstrap.
nmf.sem.inference(): Statistical inference for the C2 parameter matrix in NMF-SEM. Uses sandwich SE and wild bootstrap.
S3 methods coef(), fitted(), residuals() for all model classes (nmfkc, nmfae, nmfre, nmf.sem).
S3 methods plot() for nmfre and nmf.sem (convergence diagnostics).
summary.nmf.sem(): Stability diagnostics, fit statistics, and C2 coefficient table.

Parameter Renames (old names remain usable for backward compatibility)

nmfkc(), nmfkc.rank(): save.time / save.memory → detail
nmfae(): Q → rank, R → rank.encoder
nmfre(): Q → rank, dfU.cap.rate → df.rate
nmfre.dfU.scan(), nmfkc.ar.degree.cv(): Q → rank
nmfkc.residual.plot(): Y_XB_palette → fitted.palette, E_palette → residual.palette
nmfkc.kernel.beta.nearest.med(): block_size → block.size, sample_size → sample.size

Other Improvements

hide.isolated option added to all .DOT functions (default TRUE).
nmf.sem.DOT(): Added sig.level parameter; C2 edges decorated with significance stars.
nmfkc(): Added X.restriction = "none" option and X.init = "kmeansar" initialization.
Added arXiv/DOI references to roxygen documentation for all main functions.
@section Lifecycle: Experimental added to nmfae().
Removed mc.cores parallel option from nmfae.ecv() for CRAN compliance.

nmfkc 0.6.0

Bug Fixes

Fixed variable T shadowing TRUE in information criterion computation.
Fixed nmfkc.ecv() to use KL divergence for evaluation when method="KL".
Added performance flags (save.time=TRUE) to nmfkc.ecv() inner calls.
Fixed zero-division in nmfkc.rank() elbow normalization when R-squared values are identical.
Fixed parameter name mismatch (rank → Q) in nmfkc.rank() call to nmfkc.ecv().
Fixed descending loop in nmf.sem.split() when P=2.
Added input validation for n.exogenous in nmf.sem.split().

Documentation

Added roxygen documentation for summary.nmfkc() and print.summary.nmfkc().
Added @return for plot.nmfkc() and predict.nmfkc().
Added missing @return items (method, n.missing, n.total, rank, mae) to nmfkc().

Code Quality

Replaced T/F with TRUE/FALSE.
Replaced 1:length() with seq_along().
Changed default font from Meiryo to Arial in DOT functions.
Aligned nmf.sem.cv() defaults with nmf.sem().

nmfkc 0.5.8

Graphviz DOT Output Consolidation and Cleanup

Harmonized all DOT-generating functions (nmf.sem.DOT, nmfkc.DOT, nmfkc.ar.DOT) for consistent structure, naming conventions, and visualization logic.
Standardized node and edge formatting rules, including unified cluster behavior, color schemes, and edge-scaling conventions.
Implemented threshold-aware coefficient labeling so that displayed numerical precision aligns with the visualization threshold, preventing misleadingly detailed labels.
Removed unused or redundant DOT fragments and improved compatibility across Graphviz engines.
Enhanced layout readability through consistent indentation, node grouping, and suppression of isolated nodes in specific visualization modes (e.g., type = "YA" in nmfkc.DOT).
Refactored and expanded internal DOT helper functions (.nmfkc_dot_format_coef, .nmfkc_dot_digits_from_threshold, .nmfkc_dot_cluster_nodes, etc.) for better maintainability and uniform behavior.
New Function: Implemented nmfkc.ecv() for Element-wise Cross-Validation (Wold’s CV).
- This function randomly masks elements of the observation matrix to evaluate structural reconstruction error.
- It provides a statistically robust criterion for rank selection, avoiding the monotonic error decrease often seen in standard column-wise CV.
- Supports vector input for rank to evaluate multiple ranks simultaneously.
Missing Value & Weight Support:
- nmfkc() and nmfkc.cv() now fully support missing values (NA) and observation weights via the hidden argument Y.weights (passed through ...).
- If Y contains NAs, they are automatically detected and masked (assigned a weight of 0) during optimization.
Rank Selection Diagnostics (nmfkc.rank):
- Dual-Axis Visualization: The plot now displays fitting metrics ($R^2$, etc.) on the left axis and ECV Sigma (RMSE) on the right axis (blue line).
- Automatic Best Rank labeling: The plot explicitly marks the “Best” rank based on two criteria:
  - Elbow: Geometric elbow point of the $R^2$ curve.
  - Min: Minimum error point of the Element-wise CV.
- save.time defaults to FALSE, enabling the robust Element-wise CV calculation by default.
Argument Standardization:
- Unified the rank argument name to rank across all functions (nmfkc, nmfkc.cv, nmfkc.ecv, nmfkc.rank).
- The legacy argument Q is still supported for backward compatibility but internally mapped to rank.
Summary Improvements:
- Updated summary() and print() methods to report:
  - Sparsity of Basis ($X$) and Coefficients ($B$).
  - Clustering Entropy (indicating “Crisp” vs “Ambiguous” clustering).
  - Clustering Crispness (Mean Max Probability).
  - Number and percentage of missing values in $Y$.
Other Improvements:
- Added a validation check in nmfkc.ar() to ensure the input Y has no missing values (as they cannot be propagated to the covariate matrix A in VAR models).
- Refined nmfkc.residual.plot() layout margins for better visibility of titles.
- Updated documentation to reflect all changes.
Regularization Update:
The regularization scheme has been revised from L2 (ridge) to L1 (lasso-type) penalties.
- gamma now controls the L1 penalty on the coefficient matrix ( B = C A ), promoting sparsity in sample-wise coefficients.
- A new argument lambda has been added to control the L1 penalty on the parameter matrix ( C ), encouraging sparsity in the shared template structure.
  Both parameters can be passed through the ellipsis (...) to nmfkc() and related functions.
Function Signature Simplification: Many less-frequently used arguments in nmfkc() (e.g., gamma, X.restriction, X.init) and in nmfkc.cv() (e.g., div, seed) have been moved into the ellipsis (...) for a cleaner function signature.
Performance Improvement: The internal function .silhouette.simple was vectorized and optimized to reduce computational cost, particularly for the calculation of a(i) and b(i).
Removed the fast.calc option from the nmfkc() function.
Added the X.init argument to the nmfkc() function, allowing selection between 'kmeans' and 'nndsvd' initialization methods.
The penalty term has been changed from tr(CC') to tr(BB') = tr(CAA'C').
Implemented the internal .z and xnorm functions.
Added the fast.calc option to the nmfkc() function.
Optimized internal calculations for improved performance.
Updated citation("nmfkc") and added AIC/BIC to the output.
Implemented the nmfkc.ar.stationarity() function.
Modified the z() function.
Used crossprod() for faster matrix multiplication.
Implemented the nmfkc.ar.DOT() function.
Added logic to sort the columns of X to form a unit matrix in special cases.
Implemented nmfkc.kernel.beta.cv() and nmfkc.ar.degree.cv() functions.
Set the default column names of X to Basis1, Basis2, etc.
Added X.prob and X.cluster to the return object.
Skipped CPCC and silhouette calculations when save.time = TRUE.
Added a prototype for the nmfkc.ar() function.
Added the criterion argument to the nmfkc() function to support multiple criteria.
Updated the nmfkc.rank() function.
Added the criterion argument to the nmfkc.rank() function.
Implemented the save.time argument.
Implemented the nmfkc.rank() function.
Implemented the nstart option from the kmeans() function.
Added an experimental implementation of the nmfkc.rank() function.
Removed zero-variance columns and rows with a warning.
Added source and references to the documentation.
Renamed several components for clarity:
- nmfkcreg to nmfkc
- create.kernel to nmfkc.kernel
- nmfkcreg.cv to nmfkc.cv
- P to B.prob
- cluster to B.cluster
- unit to X.column
- trace to print.trace
- dims to print.dims
Added the r.squared argument to the nmfkcreg.cv() function.
In nmfkcreg():
- Added the dims argument to check matrix sizes.
- Added the unit argument to normalize the basis matrix columns.
Modified the create.kernel() function to support prediction.
Updated examples on GitHub.
Removed the YHAT return value; use XB instead.
Added the cluster return value for hard clustering.
