Curve Fitting

Overview

Curve fitting is the process of constructing a mathematical function that best approximates a series of data points. At its heart, curve fitting transforms empirical observations into predictive models, enabling interpolation, extrapolation, and scientific insight. The fundamental question is deceptively simple: given a set of (x, y) pairs, what function f(x) best captures the underlying relationship?

This discipline bridges theory and experiment across virtually every quantitative field. In chemistry and biochemistry, curve fitting extracts kinetic parameters from reaction rates and binding assays. In engineering, it models system responses, material properties, and signal characteristics. In economics and finance, it reveals trends and cycles and supports forecasting. The ubiquity of curve fitting reflects a deeper truth: real-world phenomena rarely present themselves as clean equations, so the equations must be inferred from noisy, incomplete data.

The Fitting Process

All curve fitting involves three core elements:

  1. The Model: A mathematical expression f(x; \theta) parameterized by \theta = (\theta_1, \theta_2, \ldots, \theta_n). This could be a simple line (y = mx + b), a nonlinear function (e.g., Michaelis-Menten), or an arbitrary user-defined expression.

  2. The Objective: A loss function that quantifies the quality of fit. The most common is the sum of squared residuals (SSR): \text{SSR}(\theta) = \sum_{i=1}^{N} [y_i - f(x_i; \theta)]^2, where N is the number of data points. Minimizing SSR yields the least-squares estimate.

  3. The Solver: An optimization algorithm that searches the parameter space for the \theta that minimizes the objective. Models that are linear in their parameters have closed-form solutions; nonlinear models require iterative methods.
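The three elements above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind any tool on this page: the Michaelis-Menten form is used because the text mentions it, while the data values, the starting guess, and the choice of SciPy's Nelder-Mead solver are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def michaelis_menten(x, theta):
    """The model f(x; theta), with theta = (vmax, km)."""
    vmax, km = theta
    return vmax * x / (km + x)


def ssr(theta, x, y):
    """The objective: sum of squared residuals for a given theta."""
    residuals = y - michaelis_menten(x, theta)
    return float(np.sum(residuals ** 2))


# Synthetic, noise-free data from known parameters (illustrative values only).
x = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
true_theta = np.array([10.0, 3.0])
y = michaelis_menten(x, true_theta)

# The solver: a derivative-free Nelder-Mead search from a rough initial guess.
result = minimize(
    ssr,
    x0=[5.0, 1.0],
    args=(x, y),
    method="Nelder-Mead",
    options={"xatol": 1e-10, "fatol": 1e-14, "maxiter": 5000},
)
theta_hat = result.x
print("estimated (vmax, km):", theta_hat)
```

Swapping the model function or the objective leaves the solver call unchanged, which is why the three elements are usually treated as separate concerns.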

Least Squares Methods

Least squares regression is the workhorse of curve fitting. When the model is linear in its parameters (e.g., polynomial regression), the solution is analytical via normal equations or matrix decomposition. When the model is nonlinear (e.g., exponential decay, sigmoid growth), iterative algorithms are required.
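The contrast between the two cases can be sketched as follows. The quadratic and exponential-decay models, the data, and the starting guess are assumptions chosen for illustration; the linear case uses NumPy's SVD-based `np.linalg.lstsq`, and the nonlinear case uses the iterative `scipy.optimize.least_squares`.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.linspace(0.0, 4.0, 9)

# Case 1: linear in the parameters, y = c0 + c1*x + c2*x^2.
# The design matrix has columns 1, x, x^2; the solution is obtained
# directly from a matrix decomposition, with no iteration.
y_poly = 1.0 - 2.0 * x + 0.5 * x**2
A = np.vander(x, N=3, increasing=True)
coeffs, *_ = np.linalg.lstsq(A, y_poly, rcond=None)

# Case 2: nonlinear in the parameters, y = a * exp(-k*x).
# No closed form exists, so the solver iterates from an initial guess.
y_exp = 3.0 * np.exp(-0.7 * x)


def residuals(params, x, y):
    """Residual vector y - f(x; a, k) that least_squares minimizes in norm."""
    a, k = params
    return y - a * np.exp(-k * x)


fit = least_squares(residuals, x0=[1.0, 0.1], args=(x, y_exp))
a_hat, k_hat = fit.x

print("polynomial coefficients:", coeffs)
print("exponential parameters:", a_hat, k_hat)
```

The nonlinear fit depends on the starting guess `x0`; a poor guess can lead to slow convergence or a local minimum, which is not a concern in the linear case.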

Pre-Built Models for Domain-Specific Applications

While general-purpose fitting functions accept arbitrary model expressions, many scientific and engineering domains use canonical functional forms repeatedly. Pre-built model functions streamline workflows by encapsulating domain expertise; the Models table below lists the available families.

Model Selection and Validation

Choosing the right model is both an art and a science. Overfitting occurs when a model captures noise rather than signal (high variance, poor generalization). Underfitting occurs when the model is too simple to capture the true relationship (high bias). Tools for model selection include:
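One widely used check, shown here as an illustration and not as the source's own list, is hold-out validation: fit on one subset of the data and measure error on points the fit never saw. The sketch below assumes a cubic ground truth, Gaussian noise, and three candidate polynomial degrees; all of these are invented for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)


def truth(x):
    """Assumed underlying relationship (a cubic), used only to simulate data."""
    return x**3 - x


# Separate training and validation samples, both with additive noise.
x_train = np.linspace(-1.5, 1.5, 30)
x_val = np.linspace(-1.4, 1.4, 25)
y_train = truth(x_train) + rng.normal(0.0, 0.1, x_train.size)
y_val = truth(x_val) + rng.normal(0.0, 0.1, x_val.size)


def mse(model, x, y):
    """Mean squared error of a fitted model on the given points."""
    return float(np.mean((y - model(x)) ** 2))


# errors[degree] = (training MSE, validation MSE)
errors = {}
for degree in (1, 3, 9):
    model = Polynomial.fit(x_train, y_train, deg=degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_val, y_val))

for degree, (train_err, val_err) in errors.items():
    print(f"degree {degree}: train MSE = {train_err:.4f}, validation MSE = {val_err:.4f}")
```

Training error can only decrease as the degree grows, so it cannot reveal overfitting on its own. The validation error is the quantity to compare: a degree that is too low leaves it large (underfitting), and a degree that is too high typically stops improving it or makes it worse.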

Native Excel Capabilities

Excel provides several built-in curve fitting tools, but they are primarily limited to simple cases:

  • LINEST: Fits linear and polynomial models using least squares. Returns regression coefficients and statistics but is limited to models that are linear in their parameters.

  • LOGEST: Fits exponential models of the form y = b \cdot m^x by linearizing via logarithmic transformation.

  • Trendline: Adds fitted curves to charts (linear, polynomial, exponential, logarithmic, power). Provides R^2 values and equation display, but lacks flexibility for custom models or parameter constraints.

  • Solver Add-in: Can minimize the sum of squared residuals for arbitrary models but requires manual setup of objective cells and is cumbersome for routine fitting tasks.

Limitations: Apart from Solver, native Excel tools handle only models that are linear in their parameters or can be linearized, and they offer little support for parameter constraints, uncertainty propagation, or model composition. Excel also lacks dedicated nonlinear least-squares algorithms such as Levenberg-Marquardt or trust-region methods.
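The logarithmic linearization that LOGEST is described as using can be reproduced in Python for comparison. This is a sketch of the technique, not of Excel's internals; the data are invented and generated exactly from y = b \cdot m^x with b = 2 and m = 1.5.

```python
import numpy as np

# Exact data from y = b * m**x with b = 2.0, m = 1.5 (illustrative values).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * 1.5 ** x

# Taking logs gives ln(y) = ln(b) + x * ln(m), which is a straight line in x.
# An ordinary linear least-squares fit then recovers ln(m) and ln(b).
slope, intercept = np.polyfit(x, np.log(y), deg=1)
m_hat = np.exp(slope)
b_hat = np.exp(intercept)

print("m =", m_hat, "b =", b_hat)
```

Because the fit minimizes squared error in ln(y) instead of y, noisy data are weighted differently than in a direct nonlinear fit, so the two approaches generally give slightly different parameters.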

Third-Party Add-ins and Standalone Packages

  • XLSTAT: Comprehensive statistical software with advanced regression capabilities, including nonlinear regression, weighted least squares, and robust fitting methods.

  • SigmaPlot: Offers extensive curve fitting with a library of over 100 built-in equations, automatic initial parameter estimation, and detailed goodness-of-fit statistics. Popular in scientific research.

  • TableCurve 2D/3D (SYSTAT): Automatically tests thousands of model equations to find the best fit. Ideal for exploratory data analysis when the functional form is unknown.

  • DataFit: Specializes in nonlinear regression with an extensive model library and statistical output. Allows custom model definition.

Least Squares

Tool           Description
CA_CURVE_FIT   Fit an arbitrary symbolic model to data using CasADi and automatic differentiation.
CURVE_FIT      Fit a model expression to xdata, ydata using scipy.optimize.curve_fit.
LM_FIT         Fit data using lmfit’s built-in models with optional model composition.
MINUIT_FIT     Fit an arbitrary model expression to data using iminuit least-squares minimization with uncertainty estimates.
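For orientation, a direct call to scipy.optimize.curve_fit, the library function that CURVE_FIT is described as using, might look like the sketch below. It does not show how the CURVE_FIT tool itself is implemented; the three-parameter decay model, the simulated data, the starting values, and the bounds are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def exp_decay(x, a, k, c):
    """Illustrative model: exponential decay toward a constant offset."""
    return a * np.exp(-k * x) + c


# Simulated measurements with a small amount of Gaussian noise.
rng = np.random.default_rng(42)
xdata = np.linspace(0.0, 5.0, 40)
ydata = exp_decay(xdata, 2.5, 1.3, 0.5) + rng.normal(0.0, 0.02, xdata.size)

# p0 is the initial guess; bounds keep the amplitude and rate non-negative.
popt, pcov = curve_fit(
    exp_decay,
    xdata,
    ydata,
    p0=[1.0, 1.0, 0.0],
    bounds=([0.0, 0.0, -np.inf], [np.inf, np.inf, np.inf]),
)

# One-standard-deviation parameter uncertainties from the covariance matrix.
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(("a", "k", "c"), popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

The square roots of the diagonal of `pcov` are the usual first-order uncertainty estimates; they assume the model is correct and the noise is roughly Gaussian.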

Models

Every tool in this table fits its model family to data using scipy.optimize.curve_fit. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html for details.

Tool             Model family
ADSORPTION       adsorption
AGRICULTURE      agriculture
BINDING_MODEL    binding_model
CHROMA_PEAKS     chroma_peaks
DOSE_RESPONSE    dose_response
ELECTRO_ION      electro_ion
ENZYME_BASIC     enzyme_basic
ENZYME_INHIBIT   enzyme_inhibit
EXP_ADVANCED     exp_advanced
EXP_DECAY        exp_decay
EXP_GROWTH       exponential growth
GROWTH_POWER     growth_power
GROWTH_SIGMOID   growth_sigmoid
MISC_PIECEWISE   misc_piecewise
PEAK_ASYM        peak_asym
POLY_BASIC       poly_basic
RHEOLOGY         rheology
SPECTRO_PEAKS    spectro_peaks
STAT_DISTRIB     stat_distrib
STAT_PARETO      stat_pareto
WAVEFORM         waveform
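A pre-built model tool of this kind can be thought of as a named function plus a call to scipy.optimize.curve_fit. The sketch below is hypothetical: the `MODELS` registry, the `fit_named_model` helper, and the three example functions are invented for illustration and are not the actual contents or signatures of the tools listed above.

```python
import numpy as np
from scipy.optimize import curve_fit


def exp_decay(x, a, k):
    return a * np.exp(-k * x)


def michaelis_menten(x, vmax, km):
    return vmax * x / (km + x)


def logistic(x, top, k, x0):
    return top / (1.0 + np.exp(-k * (x - x0)))


# Hypothetical registry mapping a model name to its functional form.
MODELS = {
    "exp_decay": exp_decay,
    "michaelis_menten": michaelis_menten,
    "logistic": logistic,
}


def fit_named_model(name, xdata, ydata, p0):
    """Fit the registered model `name`; return parameters and standard errors."""
    model = MODELS[name]
    popt, pcov = curve_fit(model, xdata, ydata, p0=p0)
    return popt, np.sqrt(np.diag(pcov))


# Usage with noise-free synthetic data (vmax = 4.0, km = 1.5).
x = np.linspace(0.1, 10.0, 25)
y = michaelis_menten(x, 4.0, 1.5)
popt, perr = fit_named_model("michaelis_menten", x, y, p0=[2.0, 1.0])
print("fitted (vmax, km):", popt)
```

Keeping the functional forms in one place means the caller only supplies a model name, data, and a starting guess, which is the convenience the pre-built tools aim to provide.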