Local Optimization
Overview
Local optimization finds a point where an objective function f(x) achieves a minimum value relative to all nearby points in the search space. Unlike global optimization, which seeks the absolute minimum across the entire domain, local methods focus on identifying a local minimum—a point where the function value is lower than at all neighboring points, though not necessarily the lowest possible value overall.
Local optimization is fundamental to countless applications: training neural networks, calibrating econometric models, fitting nonlinear regression models, tuning control systems, and solving engineering design problems. The efficiency of local methods makes them practical for high-dimensional problems where global search would be computationally prohibitive.
The defining characteristic of local optimization is that the solution depends on the starting point. For smooth, convex functions, any local minimum is also the global minimum, and the initial guess matters little. However, for non-convex functions with multiple valleys and peaks, different starting points may converge to different local minima. This sensitivity requires careful initialization and sometimes multiple runs from different starting values.
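This sensitivity is easy to demonstrate with a one-dimensional double-well function (the function below is a hypothetical example chosen for illustration):

```python
from scipy.optimize import minimize

# A double-well function with local minima near x = +1 and x = -1;
# the tilt term 0.2*x makes the left well slightly deeper (the global minimum).
f = lambda x: (x[0]**2 - 1)**2 + 0.2 * x[0]

res_right = minimize(f, x0=[2.0])    # starts in the right basin
res_left = minimize(f, x0=[-2.0])    # starts in the left basin

print(res_right.x, res_right.fun)    # local minimum near x ≈ 0.97
print(res_left.x, res_left.fun)      # global minimum near x ≈ -1.03
```

Both runs succeed, but only the second finds the global minimum; a multistart strategy would compare the objective values and keep the best.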
Derivative-Free Methods
Derivative-free optimization (also called “direct search” or “pattern search”) algorithms rely solely on function evaluations without computing or approximating gradients. These methods are essential when:
- The objective function is noisy or discontinuous
- Derivatives are unavailable (e.g., black-box simulations, legacy code)
- Computing gradients is expensive relative to function evaluations
- The function is non-smooth (e.g., containing abs(), max(), or conditional logic)
The Nelder-Mead simplex method maintains a simplex (a geometric shape with n+1 vertices in n dimensions) and iteratively reflects, expands, or contracts the simplex based on function values at its vertices. It’s robust and requires no derivative information, making it popular for exploratory optimization. However, it can stagnate on ill-conditioned problems and may require many function evaluations for high-dimensional spaces.
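A minimal sketch with SciPy, using the classic Rosenbrock test function (an illustrative choice, not part of the original text):

```python
import numpy as np
from scipy.optimize import minimize

# Rosenbrock function: smooth, but with a narrow curved valley that
# stresses the simplex's reflect/expand/contract moves.
def rosen(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

res = minimize(rosen, x0=[-1.2, 1.0], method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})
print(res.x, res.nfev)  # converges to ≈ [1, 1], with many function evaluations
```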
Powell’s method performs sequential one-dimensional minimizations along a set of conjugate directions, updating these directions to accelerate convergence. It exploits the structure of the problem without computing gradients and often outperforms Nelder-Mead on smooth functions.
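SciPy exposes this as `method="Powell"`; a quick sketch on an ill-scaled quadratic (an illustrative function):

```python
import numpy as np
from scipy.optimize import minimize

# Powell's method: sequential line minimizations along conjugate directions,
# no gradients required.
f = lambda x: (x[0] - 1)**2 + 10 * (x[1] + 2)**2

res = minimize(f, x0=[5.0, 5.0], method="Powell")
print(res.x)  # ≈ [1, -2]
```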
For single-variable problems, Brent’s method combines the robustness of golden-section search with the speed of inverse parabolic interpolation. It’s the gold standard for univariate minimization when derivatives are unavailable.
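In SciPy, Brent's method is the default for `minimize_scalar`; a minimal example (the quadratic is an illustrative choice):

```python
from scipy.optimize import minimize_scalar

# Brent's method: golden-section robustness plus inverse parabolic
# interpolation speed, no derivatives needed.
res = minimize_scalar(lambda x: (x - 2)**2 + 1, method="brent")
print(res.x, res.fun)  # x ≈ 2.0, f ≈ 1.0
```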
Gradient-Based Methods
Gradient-based optimization uses first-order derivative information (the gradient \nabla f(x)) and sometimes second-order information (the Hessian H(x)) to guide the search toward a minimum. These methods converge much faster than derivative-free approaches on smooth functions, typically requiring far fewer iterations.
The BFGS algorithm (Broyden–Fletcher–Goldfarb–Shanno) is a quasi-Newton method that builds an approximation to the Hessian matrix using gradient information from successive iterations. It offers superlinear convergence without the computational cost of computing and inverting the full Hessian. BFGS is the default choice for unconstrained smooth optimization and is remarkably robust even on moderately ill-conditioned problems. The method is described in detail in Numerical Optimization by Nocedal and Wright.
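A sketch with SciPy, supplying an analytic gradient via `jac` so BFGS avoids finite differences (the Rosenbrock function and its gradient are a standard test case, used here for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosen_grad(x):
    # Exact gradient of the Rosenbrock function.
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

res = minimize(rosen, x0=[-1.2, 1.0], method="BFGS", jac=rosen_grad)
print(res.x, res.nit)  # ≈ [1, 1], typically in far fewer iterations than Nelder-Mead
```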
L-BFGS-B (Limited-memory BFGS with Bounds) extends BFGS to handle box constraints (x_{lb} \leq x \leq x_{ub}) while using limited memory to store Hessian approximations. This makes it suitable for large-scale problems with thousands of variables.
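Box constraints are passed via `bounds`; in this illustrative example the unconstrained minimum lies outside the box, so the solution lands on the bound:

```python
import numpy as np
from scipy.optimize import minimize

# Quadratic with unconstrained minimum at [2, 3], outside the box,
# so L-BFGS-B clips the solution to the upper bounds.
f = lambda x: (x[0] - 2)**2 + (x[1] - 3)**2
bounds = [(0.0, 1.5), (0.0, 1.5)]

res = minimize(f, x0=[0.5, 0.5], method="L-BFGS-B", bounds=bounds)
print(res.x)  # ≈ [1.5, 1.5]
```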
Sequential Quadratic Programming (SQP) is the industry standard for nonlinear constrained optimization. It iteratively solves a quadratic programming subproblem that approximates the original nonlinear program (NLP), updating the approximation with each iteration. SQP handles both equality and inequality constraints and converges rapidly near the solution under mild regularity conditions.
Automatic Differentiation
Automatic differentiation (AD) computes exact derivatives by systematically applying the chain rule to elementary operations in a computational graph. Unlike numerical differentiation (finite differences), AD produces derivatives at machine precision without truncation error. Unlike symbolic differentiation, AD scales efficiently to large, complex programs.
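The idea can be sketched in a few lines with forward-mode dual numbers (a toy illustration of the chain-rule mechanics, not a production AD system):

```python
# Minimal forward-mode AD: each Dual carries (value, derivative) and
# arithmetic propagates both through the chain rule exactly.
class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

x = Dual(2.0, 1.0)    # seed derivative 1 at x = 2
y = f(x)
print(y.val, y.der)   # 17.0, 14.0 — exact, no truncation error
```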
CasADi is an open-source framework specifically designed for AD-based optimization. It constructs a symbolic expression graph from mathematical expressions, applies AD to generate exact gradients and Hessians, and interfaces with high-performance solvers such as IPOPT, alongside its own SQP-based methods. CasADi is particularly powerful for optimal control problems, trajectory optimization, and model predictive control, where derivatives are essential but manual computation is impractical.
The key advantage of AD is efficiency: with reverse-mode AD, the full gradient can be computed at a cost that is a small constant multiple of one function evaluation, independent of the number of variables. This makes it feasible to optimize functions with thousands or even millions of parameters, and it is central to modern deep learning, where frameworks like PyTorch and TensorFlow rely on reverse-mode AD (backpropagation) to train neural networks.
Constrained vs. Unconstrained Optimization
Unconstrained optimization seeks to minimize f(x) over the entire space \mathbb{R}^n. The optimality condition is \nabla f(x^*) = 0 (the gradient vanishes at the minimum), and the Hessian H(x^*) must be positive semidefinite. Methods like BFGS and Nelder-Mead are designed for this setting.
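The first-order condition can be verified numerically after a solve; a sketch using SciPy's finite-difference gradient check on an illustrative quadratic:

```python
import numpy as np
from scipy.optimize import minimize, approx_fprime

# Unconstrained quadratic with minimum at [3, -1].
f = lambda x: (x[0] - 3)**2 + (x[1] + 1)**2

res = minimize(f, x0=[0.0, 0.0])          # default method (BFGS) for this setting
grad = approx_fprime(res.x, f, 1e-8)      # numerical gradient at the solution
print(res.x, grad)                        # gradient ≈ [0, 0] at the minimizer
```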
Constrained optimization adds restrictions:
- Box constraints: x_{lb} \leq x \leq x_{ub} restrict variables to intervals
- Linear constraints: Ax = b or Ax \leq b
- Nonlinear constraints: g(x) = 0 (equality) or h(x) \leq 0 (inequality)
Constrained problems require specialized algorithms. L-BFGS-B and TNC handle box constraints by projecting iterates onto the feasible region. SLSQP (Sequential Least Squares Programming) and SQP methods handle general nonlinear constraints using Lagrange multipliers and KKT conditions.
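A sketch of a nonlinear equality constraint with SLSQP in SciPy (the objective and constraint are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

# Minimize x0^2 + x1^2 subject to x0 * x1 = 1.
# By symmetry the solution in the positive quadrant is x = [1, 1].
f = lambda x: x[0]**2 + x[1]**2
cons = {"type": "eq", "fun": lambda x: x[0] * x[1] - 1}

res = minimize(f, x0=[2.0, 0.5], method="SLSQP", constraints=[cons])
print(res.x)  # ≈ [1, 1]
```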
Native Excel capabilities
Excel provides several built-in tools for local optimization:
Solver Add-in: Excel’s primary optimization tool implements a variant of the Generalized Reduced Gradient (GRG) algorithm for nonlinear programming. It handles unconstrained and constrained problems with up to 200 decision variables and supports bounds, linear constraints, and nonlinear constraints. The GRG algorithm uses numerical finite differences to approximate gradients, which can be slow and numerically unstable for poorly scaled or noisy functions. Solver is adequate for small-scale problems but lacks the efficiency and robustness of modern quasi-Newton methods.
Goal Seek: A simplified tool for single-variable problems where you want to find the input that produces a specific output. It uses a basic bisection-like method or secant method and is limited to one decision variable and one target cell. Suitable for simple what-if analysis but not true optimization.
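Goal Seek's task—finding the input that hits a target output—is equivalent to root-finding on model(x) − target. A sketch of that analogy using SciPy's `brentq` (this illustrates the equivalence, not Excel's internal algorithm; the model function is hypothetical):

```python
from scipy.optimize import brentq

# Goal Seek analogue: find x such that model(x) equals the target,
# i.e. a root of model(x) - target on a bracketing interval.
model = lambda x: x**3 + x      # monotone, so the root is unique
target = 10.0

x = brentq(lambda v: model(v) - target, a=0.0, b=5.0)
print(x, model(x))              # model(x) ≈ 10.0 at x = 2
```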
Limitations: Excel Solver uses outdated algorithms compared to modern libraries like SciPy and NLopt. The 200-variable limit excludes most real-world data science and machine learning applications. Numerical gradient approximation can fail on noisy, discontinuous, or ill-scaled functions. Solver also lacks support for automatic differentiation, limiting its efficiency on problems where exact derivatives could be computed.
Python-based functions offer access to state-of-the-art algorithms (BFGS, L-BFGS-B, SQP), automatic differentiation frameworks (CasADi), and the ability to handle problems with thousands of variables.
Third-party Excel add-ins
Frontline Solvers (Premium Solver, Analytic Solver): Developed by the creators of Excel’s built-in Solver, these commercial add-ins offer higher variable limits (up to millions), faster algorithms, and support for stochastic programming. Premium Solver includes GRG2, SQP, and barrier methods for nonlinear optimization.
OpenSolver: An open-source add-in that extends Excel with COIN-OR optimization engines, particularly strong for linear and mixed-integer programming. While primarily focused on LP/MILP, it provides an alternative interface for calling external solvers.
AIMMS: An advanced optimization modeling platform that integrates with Excel for large-scale optimization and decision support. AIMMS supports a wide range of solvers and is used in operations research and supply chain optimization.
Tools
| Tool | Description |
|---|---|
| CA_MINIMIZE | Minimize a multivariate function using CasADi with automatic differentiation. |
| MINIMIZE | Minimize a multivariate function using SciPy’s minimize routine. |
| MINIMIZE_SCALAR | Minimize a single-variable function using SciPy’s minimize_scalar. |