
Robust nonlinear optimization for R
When optim() fails on your ill-conditioned model,
arcopt is designed to succeed. It uses Adaptive
Regularization with Cubics (ARC) to handle indefinite Hessians and
escape saddle points automatically.
```r
# Install from GitHub
pak::pak("marcus-waldman/arcopt")
# Or with devtools
devtools::install_github("marcus-waldman/arcopt")
```

```r
library(arcopt)
# Rosenbrock function - a classic difficult optimization problem
result <- arcopt(
x0 = c(-1.2, 1),
fn = function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2,
gr = function(x) c(
-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
200 * (x[2] - x[1]^2)
),
hess = function(x) matrix(c(
1200 * x[1]^2 - 400 * x[2] + 2, -400 * x[1],
-400 * x[1], 200
), 2, 2)
)
result$par
#> [1] 1 1
```

arcopt offers two optimization strategies:
Exact Hessian mode: best when you can compute the Hessian analytically or via automatic differentiation.
```r
# Provide fn, gr, and hess
result <- arcopt(x0, fn, gr, hess)
```

Quasi-Newton mode: best when computing the Hessian is expensive or unavailable. This mode uses BFGS/SR1 approximations. arcopt seeds the initial approximation \(B_0\) with a one-time finite-difference Hessian computed from the supplied gradient (cost: \(2n\) gradient evaluations at startup), which dramatically improves saddle-escape behavior compared to the identity initialization used in classical BFGS/SR1.
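The seeding step amounts to central differences on the supplied gradient, two evaluations per coordinate. A minimal sketch of the idea (the function name is illustrative, not part of the package):

```r
# Finite-difference Hessian from a gradient function: 2n gradient evals
fd_hessian_from_gradient <- function(gr, x, h = 1e-5) {
  n <- length(x)
  H <- matrix(0, n, n)
  for (i in seq_len(n)) {
    e    <- numeric(n)
    e[i] <- h
    H[, i] <- (gr(x + e) - gr(x - e)) / (2 * h)  # column i: d(grad)/dx_i
  }
  (H + t(H)) / 2  # symmetrize away finite-difference asymmetry
}
```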
```r
# Just provide fn and gr - no Hessian needed
result <- arcopt(x0, fn, gr, control = list(use_qn = TRUE))
```

| Scenario | Recommendation |
|---|---|
| Analytic Hessian available | Use exact Hessian (default) |
| Hessian via autodiff (torch, Stan) | Use exact Hessian |
| Hessian expensive (n > 100) | Use use_qn = TRUE |
| No Hessian function | Use use_qn = TRUE |
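For example, per the table above, a moderately large problem can be run entirely Hessian-free. A sketch with an illustrative 200-parameter smooth objective (not from the package):

```r
# Exact 200x200 Hessians would dominate per-iteration cost here,
# so quasi-Newton mode is the better fit.
n  <- 200
fn <- function(x) sum((x - 1)^2) + 0.1 * sum(sin(x)^2)
gr <- function(x) 2 * (x - 1) + 0.1 * sin(2 * x)

result <- arcopt(x0 = rep(0, n), fn = fn, gr = gr,
                 control = list(use_qn = TRUE))
```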
Tested on 8 standard functions with 10 starting points each (80 test cases):
| Method | Convergence | Avg Iterations | Hessian Evals |
|---|---|---|---|
| Exact Hessian | 100% | 22 | 1,443 |
| BFGS QN | 100% | 28 | 0 |
| SR1 QN | 99% | 61 | 0 |
BFGS Quasi-Newton matches the exact-Hessian method's 100% convergence with only about 27% more iterations (28 vs. 22) and zero Hessian evaluations.
On problems with symmetric saddle points (e.g., growth-mixture
models, factor models with rotational symmetry),
qn_method = "hybrid" combines SR1’s ability to represent
indefinite curvature with BFGS’s fast local convergence and matches
full-Hessian solvers’ saddle-escape reliability using only gradient
information.
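A gradient-only call in this regime looks like the following (assuming fn and gr as in the quick start):

```r
# Hybrid quasi-Newton: SR1 curvature representation + BFGS local convergence
result <- arcopt(x0, fn, gr,
                 control = list(use_qn = TRUE, qn_method = "hybrid"))
```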
```r
# Optimize with bounds: 0 <= x <= 0.5
result <- arcopt(
x0 = c(0.1, 0.1),
fn = fn, gr = gr, hess = hess,
lower = c(0, 0),
upper = c(0.5, 0.5)
)
```

Standard optimizers struggle when the Hessian is indefinite or nearly singular, as is common in mixture models, factor models, and other complex likelihoods.
Cubic regularization automatically handles negative curvature without line searches or trust-region adjustments. It provably escapes saddle points and converges to local minima.
Instead of the quadratic model used by Newton’s method:
m(s) = f + g's + (1/2) s'Hs
ARC adds a cubic term:
m(s) = f + g's + (1/2) s'Hs + (σ/3)||s||³
This cubic term prevents unbounded steps when the Hessian has negative eigenvalues.
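Concretely, the cubic model is a one-liner; here is an illustrative sketch (not package code):

```r
# ARC model m(s) at trial step s, with f = f(x), g = gradient, H = Hessian,
# and sigma = adaptive regularization weight
cubic_model <- function(f, g, H, s, sigma) {
  f + sum(g * s) + 0.5 * drop(t(s) %*% H %*% s) + (sigma / 3) * sum(s^2)^1.5
}
```

Because the cubic term grows faster than the quadratic one, m(s) stays bounded below even when H has negative eigenvalues, so the subproblem always has a finite minimizer.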
arcopt() automatically adapts its subproblem solver to the local landscape:

- TR fallback (on by default): when cubic steps stagnate on a flat ridge, the solver switches to a trust-region subproblem.
- QN polish (opt-in): once inside a quadratic basin, the solver switches to BFGS updates and skips further hess() calls until convergence or revert, saving wall-clock time on expensive Hessians (Stan AD, finite differences).

```r
result <- arcopt(x0, fn, gr, hess, control = list(
tr_fallback_enabled = TRUE, # cubic->TR on flat-ridge (default)
qn_polish_enabled = TRUE # cubic<->BFGS polish in basin (opt-in)
))
result$diagnostics$solver_mode_final # "cubic", "tr", or "qn_polish"
result$diagnostics$ridge_switches # count of cubic->TR transitions
result$diagnostics$qn_polish_switches # count of cubic->polish transitions
result$diagnostics$qn_polish_reverts  # count of polish->cubic reversions
```

See design/design-principles.qmd §3a for the σ↔r duality framing, and todo/tr-fallback-hybrid-briefing.md / todo/qn-polish-mode-briefing.md for the detector designs.
```r
arcopt(x0, fn, gr, hess,
control = list(
# Quasi-Newton mode (no exact Hessian needed)
use_qn = TRUE,
qn_method = "hybrid", # recommended for saddle-prone problems;
# also "bfgs", "sr1"
# Convergence tolerances
gtol_abs = 1e-5, # Gradient norm
maxit = 1000, # Max iterations
# Hybrid solver-mode dispatch
tr_fallback_enabled = TRUE, # cubic->TR on stagnation
qn_polish_enabled = FALSE, # cubic<->polish in quadratic basin
# Output
trace = 1, # Depth of saved trace data (0/1/2/3)
verbose = FALSE # Print one line per iteration to console
)
)
```

The full set of advanced controls (cubic regularization tuning,
TR-fallback thresholds, polish-mode thresholds, QN routing) lives under
?arcopt_advanced_controls.
| Feature | arcopt | optim(BFGS) | optim(L-BFGS-B) |
|---|---|---|---|
| Handles indefinite Hessian | ✓ | ✗ | ✗ |
| Escapes saddle points | ✓ | ✗ | ✗ |
| Box constraints | ✓ | ✗ | ✓ |
| Hessian-free mode | ✓ | ✓ | ✓ |
| Best for | Complex likelihoods | Smooth convex | Large bound-constrained |
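As a concrete point of comparison, both solvers can be run on the quick-start Rosenbrock problem (a sketch; the differences in the rows above matter most on indefinite, saddle-heavy objectives rather than this well-studied test case):

```r
fn <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
gr <- function(x) c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
                    200 * (x[2] - x[1]^2))

optim(c(-1.2, 1), fn, gr, method = "BFGS")$par    # base R quasi-Newton
arcopt(c(-1.2, 1), fn, gr,
       control = list(use_qn = TRUE))$par         # arcopt, Hessian-free
```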
arcopt is a local optimizer. Use something else for:

- Global optimization: DEoptim, GenSA, or nloptr with global algorithms
- Derivative-free optimization: nloptr with COBYLA, or dfoptim
- Very large-scale problems: lbfgs or sparse methods
- Linear or quadratic programming: lpSolve, quadprog

arcopt is best for 2-500 parameters, nonconvex objectives, and problems that need robust convergence.
```r
result <- arcopt(x0, fn, gr, hess)
result$par # Optimal parameters
result$value # Optimal function value
result$gradient # Gradient at solution
result$hessian # Hessian at solution (exact in cubic/tr mode;
# BFGS approximation if solver_mode_final == "qn_polish")
result$converged # TRUE if converged
result$iterations # Number of iterations
result$evaluations # List: fn, gr, hess counts
result$message # Why it stopped
# Solver-mode diagnostics (nested sublist; most users do not inspect this).
# See ?arcopt_advanced_controls for the full schema.
result$diagnostics$solver_mode_final # "cubic", "tr", or "qn_polish"
result$diagnostics$ridge_switches # count of cubic->TR transitions
result$diagnostics$radius_final # final TR radius (NA if no switch)
result$diagnostics$qn_polish_switches # count of cubic->qn_polish transitions
result$diagnostics$qn_polish_reverts # count of qn_polish->cubic reversions
result$diagnostics$hess_evals_at_polish_switch # baseline for Hessian-eval savings
```
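A minimal post-fit sanity check using the fields documented above:

```r
stopifnot(result$converged)       # fail fast if the run did not converge
sqrt(sum(result$gradient^2))      # gradient norm at the solution; compare to gtol_abs
result$evaluations                # fn / gr / hess call counts
```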
Adaptive Regularization with Cubics (ARC)

Cartis, C., Gould, N. I. M., & Toint, P. L. (2011). Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Mathematical Programming, 127(2), 245-295.
Nesterov, Y., & Polyak, B. T. (2006). Cubic regularization of Newton method and its global performance. Mathematical Programming, 108(1), 177-205.
Quasi-Newton with Cubic Regularization
Kamzolov, D., Ziu, K., Agafonov, A., & Takáč, M. (2025). Accelerated Adaptive Cubic Regularized Quasi-Newton Methods. Journal of Optimization Theory and Applications, 208. https://doi.org/10.1007/s10957-025-02804-3
Kamzolov, D., Ziu, K., Agafonov, A., & Takáč, M. (2023). Cubic Regularization is the Key! The First Accelerated Quasi-Newton Method with a Global Convergence Rate of O(k^-2) for Convex Functions. arXiv:2302.04987.
Benson, H. Y., & Shanno, D. F. (2018). Cubic regularization in symmetric rank-1 quasi-Newton methods. Mathematical Programming Computation, 10(4), 457-494.
MIT © 2025 Marcus Waldman
Issues and PRs welcome at: https://github.com/marcus-waldman/arcopt/issues