CURVE_FIT

Overview

The CURVE_FIT function is an Excel-facing wrapper around SciPy’s scipy.optimize.curve_fit. It performs non-linear least-squares fitting of a user-selected model to paired x and y observations. The wrapper exposes a small, Excel-friendly surface: a model identifier, xdata and ydata as rectangular 2D lists (Excel ranges), and a few commonly-used optional controls (initial parameter guesses, simple bounds, and the solver method). It intentionally simplifies the SciPy interface to be usable from spreadsheets.

The underlying optimization minimizes the sum of squared residuals $S(\theta)$ over parameters $\theta$ for a model function $f(x;\theta)$ :

S(\theta) = \sum_{i=1}^{M} \left(y_i - f(x_i;\theta)\right)^2

Popular models provided by the wrapper include Boltzmann, Box-Lucas (single-exponential approach to an asymptote), and two-term exponential growth. Example model forms used by the wrapper include:

\text{Box-Lucas: }\quad f(x;a,b) = a\left(1 - e^{-b x}\right) \text{Boltzmann: }\quad f(x;A_1,A_2,x_0,dx) = \frac{A_1 - A_2}{1 + e^{(x-x_0)/dx}} + A_2 \text{Two-term exponential: }\quad f(x;y_0,A_1,t_1,A_2,t_2) = y_0 + A_1 e^{x/t_1} + A_2 e^{x/t_2}

For full SciPy behavior, see the SciPy documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

The wrapper simplifies or omits several SciPy curve_fit features that are not practical from Excel:

No sigma or absolute_sigma weighting. Excel ranges make passing weight vectors or covariance matrices error-prone.
No callable jac (Jacobian) can be passed from Excel; numerical estimation is used by SciPy as needed.
No check_finite / nan_policy options; inputs are validated and must be numeric.
No full_output or advanced low-level solver kwargs; the wrapper returns parameter names and fitted values only.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:


=CURVE_FIT(model_id, xdata, ydata, [initial_guess], [bounds], [method])

model_id (string, required): Identifier of the model to fit (e.g. "box_lucas_one", "boltzmann", "exp_grow_two").
xdata (2D list, required): A rectangular 2D list (Excel range). Each row represents one observation; the first column contains the predictor x value. Must have at least two rows.
ydata (2D list, required): A rectangular 2D list (Excel range) matching xdata row count. Each row’s first column is the observed y value.
initial_guess (2D list, optional, default: model-specific heuristic): A single-row 2D list containing initial guesses for the model parameters in the order the model expects. Example: [[7.0, 1.0]].
bounds (2D list, optional, default: unbounded): A two-row 2D list of the form [[lower_1, lower_2, ...], [upper_1, upper_2, ...]]. Use None (or leave a cell blank in Excel) to indicate an unbounded side for a parameter.
method (string, optional, default: solver-chosen): One of SciPy’s supported methods for curve_fit (e.g. "trf", "dogbox", "lm"). Note: "lm" does not support bounds and requires number of observations >= number of parameters.

Return value: A 2D list where the first row contains the parameter names (strings) and the second row contains the fitted parameter values (floats). On error the function returns a single string describing the problem.

Examples

Below are four examples that match the demo_cases in const_curve_fit.py. Outputs are rounded to 3 decimal places.

Example 1: Box-Lucas basic

Inputs:

xdata	ydata
0	0.000
1	3.935
2	6.321
3	7.769
4	8.647
5	9.179
6	9.502
7	9.698
8	9.817
9	9.889

Excel formula:


=CURVE_FIT("box_lucas_one", {0;1;2;3;4;5;6;7;8;9}, {0;3.934693;6.321206;7.768698;8.646647;9.179150;9.502129;9.698026;9.816844;9.888910})

Expected output:

a	b
10.000	0.500

Example 2: Boltzmann basic

Inputs:

xdata	ydata
0	1.000
1	1.119
2	1.303
3	1.569
4	1.964
5	3.000
6	3.964
7	4.431
8	4.697
9	4.872
10	4.958

Excel formula:


=CURVE_FIT("boltzmann", {0;1;2;3;4;5;6;7;8;9;10}, {1.0;1.119203;1.302948;1.568848;1.963527;3.0;3.963527;4.431152;4.697052;4.872263;4.958144})

Expected output:

A1	A2	x0	dx
5.000	1.000	5.000	1.000

Example 3: Two-term exponential (exp_grow_two) basic

Inputs (7 rows):

xdata	ydata
0	4.000
1	12.873
2	41.171
3	135.914
4	448.169
5	1490.479
6	4951.020

Excel formula:


=CURVE_FIT("exp_grow_two", {0;1;2;3;4;5;6}, {4.0;12.873127;41.170565;135.913848;448.169143;1490.478805;4951.020565})

Expected output:

y_0	A_1	t_1	A_2	t_2
1762.802	2.193	2.034	-1760.998	-13486.838

Example 4: Box-Lucas with optional arguments

Inputs:

xdata	ydata	initial_guess	bounds (lower row)	bounds (upper row)
1	6.706	7.0	0.0	20.0
2	10.865		0.0	10.0
3	12.975
4	14.184
5	15.032
6	15.579
7	15.955
8	16.212

Excel formula:


=CURVE_FIT("box_lucas_one", {1;2;3;4;5;6;7;8}, {6.706;10.8647;12.9747;14.1837;15.0324;15.5788;15.9552;16.2123}, {7.0,1.0}, {{0.0,0.0},{20.0,10.0}})

Expected output:

a	b
8.000	0.800

Notes

When writing array constants in Excel use {1,2} for a row, {1;2} for a column, and {1,2;3,4} for a matrix. Do not use Excel array constants inside Markdown tables; use them only in the Excel formula code blocks above.
If you need more models added to the wrapper (different functional forms), add them to the MODELS dictionary in func_curve_fit.py and provide a stable parameter-ordering guess function.

Completion: created docs_curve_fit.md next to func_curve_fit.py, const_curve_fit.py, and spec_curve_fit.md in the same directory.

Python Code


from typing import List, Union, Optional
from scipy.optimize import curve_fit as scipy_curve_fit
from scipy import special as sc
import math
import inspect
import numpy as np
 
Number = Union[float, int]
 
MODELS = {
    "Allometric1": {
        # y = a * x^b
        "model": lambda x, a, b: a * np.power(x, b),
    "guess": lambda xa, ya: (float(np.max(ya)), 1.0),
    "bounds": [[None, None], [None, None]]
    },
    "Beta": {
        # simple 3-parameter shape: y = A * x^(p-1) * (1-x)^(q-1) approximated on [0,1]
        "model": lambda x, A, p, q: A * np.power(x, p-1) * np.power(1.0 - x, q-1),
    "guess": lambda xa, ya: (float(np.max(ya)), 2.0, 2.0),
    # A >= 0, p>1, q>1 to match Origin's beta parameter restrictions
    "bounds": [[0.0, 1.0000001, 1.0000001], [None, None, None]]
    },
    "Boltzmann": {
        "model": lambda x, A1, A2, x0, dx: (A1 - A2) / (1.0 + np.exp((x - x0) / dx)) + A2,
    "guess": lambda xa, ya: (float(np.max(ya)), float(np.min(ya)), float(np.median(xa)), 1.0),
    "bounds": [[None, None, None, None], [None, None, None, None]]
    },
    "Dhyperbl": {
        # decaying hyperbola-like: y = a / (b + x)
        "model": lambda x, a, b: a / (b + x),
    "guess": lambda xa, ya: (float(max(ya)), 1.0),
    "bounds": [[None, None], [None, None]]
    },
    "ExpAssoc": {
        # y = A*(1 - exp(-B*x))
        "model": lambda x, A, B: A * (1 - np.exp(-B * x)),
    "guess": lambda xa, ya: (float(np.max(ya)), 1.0),
    "bounds": [[None, 0.0], [None, None]]
    },
    "ExpDec1": {
        # single exponential decay: y = A * exp(-B*x)
        "model": lambda x, A, B: A * np.exp(-B * x),
    "guess": lambda xa, ya: (float(np.max(ya)), 0.5),
    "bounds": [[None, 0.0], [None, None]]
    },
    "ExpDec2": {
        # two-term decay: y = A1*exp(-B1*x) + A2*exp(-B2*x)
        "model": lambda x, A1, B1, A2, B2: A1 * np.exp(-B1 * x) + A2 * np.exp(-B2 * x),
    "guess": lambda xa, ya: (float(max(ya)), 1.0, float(min(ya)), 0.1),
    "bounds": [[None, 0.0, None, 0.0], [None, None, None, None]]
    },
    "ExpDec3": {
        # three-term decay: y = A1*exp(-B1*x) + A2*exp(-B2*x) + A3*exp(-B3*x)
        "model": lambda x, A1, B1, A2, B2, A3, B3: A1 * np.exp(-B1 * x) + A2 * np.exp(-B2 * x) + A3 * np.exp(-B3 * x),
    "guess": lambda xa, ya: (float(max(ya)), 1.0, float(max(ya))/2, 0.5, float(min(ya)), 0.1),
    "bounds": [[None, 0.0, None, 0.0, None, 0.0], [None, None, None, None, None, None]]
    },
    "ExpGrow1": {
        # single exponential growth: y = A * exp(B*x)
        "model": lambda x, A, B: A * np.exp(B * x),
    "guess": lambda xa, ya: (float(min(ya))+1.0, 0.5),
    "bounds": [[None, 0.0], [None, None]]
    },
    "ExpGrow2": {
        # two-term growth: y = A1*exp(B1*x) + A2*exp(B2*x)
        "model": lambda x, A1, B1, A2, B2: A1 * np.exp(B1 * x) + A2 * np.exp(B2 * x),
    "guess": lambda xa, ya: (float(min(ya)), 0.5, float(max(ya)-min(ya)), 0.1),
    "bounds": [[None, 0.0, None, 0.0], [None, None, None, None]]
    },
    "Gauss": {
        # Gaussian: y = A * exp(-(x-x0)^2 / (2*sigma^2))
        "model": lambda x, A, x0, sigma: A * np.exp(-np.power(x - x0, 2) / (2.0 * sigma * sigma)),
    "guess": lambda xa, ya: (float(max(ya)), float(np.median(xa)), 1.0),
    "bounds": [[None, None, 0.0], [None, None, None]]
    },
    "GaussAmp": {
        # Gaussian with amplitude parameterized differently: y = y0 + A*exp(...)
        "model": lambda x, y0, A, x0, sigma: y0 + A * np.exp(-np.power(x - x0, 2) / (2.0 * sigma * sigma)),
    "guess": lambda xa, ya: (float(min(ya)), float(max(ya)-min(ya)), float(np.median(xa)), 1.0),
    "bounds": [[None, None, None, 0.0], [None, None, None, None]]
    },
    "Hyperbl": {
        # hyperbola: y = a / (b + x)
        "model": lambda x, a, b: a / (b + x),
    "guess": lambda xa, ya: (float(max(ya)), 1.0),
    "bounds": [[None, None], [None, None]]
    },
    "Logistic": {
        # logistic: y = A / (1 + exp(-k*(x-x0)))
        "model": lambda x, A, k, x0: A / (1.0 + np.exp(-k * (x - x0))),
    "guess": lambda xa, ya: (float(max(ya)), 1.0, float(np.median(xa))),
    "bounds": [[None, 0.0, None], [None, None, None]]
    },
    "LogNormal": {
        # log-normal: y = A * exp(- (ln(x)-mu)^2 / (2*sigma^2)) / x
        "model": lambda x, A, mu, sigma: A * np.exp(-np.power(np.log(x) - mu, 2) / (2.0 * sigma * sigma)) / x,
        # Protect against non-positive x in the guess; xa is a numpy array when used
        "guess": lambda xa, ya: (
            float(max(ya)),
            float(np.mean(np.log(xa[xa > 0])) if np.any(xa > 0) else 1.0),
            1.0,
        ),
        # sigma > 0
        "bounds": [[None, None, 0.0], [None, None, None]]
    },
    "Lorentz": {
        # Lorentzian: y = A * (w^2 / ((x-x0)^2 + w^2))
        "model": lambda x, A, x0, w: A * (w * w) / (np.power(x - x0, 2) + w * w),
    "guess": lambda xa, ya: (float(max(ya)), float(np.median(xa)), 1.0),
    "bounds": [[None, None, 0.0], [None, None, None]]
    },
    "Poisson": {
        # Poisson pmf-like shape for integer x: y = A * (lambda^x * exp(-lambda) / x!) approximated via gamma
        # Use gamma(x+1) for factorial to support non-integer x inputs gracefully
        "model": lambda x, A, lam: A * (np.power(lam, x) * np.exp(-lam) / sc.gamma(x + 1.0)),
    "guess": lambda xa, ya: (float(max(ya)), float(np.mean(xa))),
    "bounds": [[None, 0.0], [None, None]]
    },
    "Pulse": {
        # simple pulse: y = A for x in [x0-width/2, x0+width/2] else 0 -> approximated with heaviside
        "model": lambda x, A, x0, w: A * (np.heaviside((x - (x0 - 0.5 * w)), 0.5) - np.heaviside((x - (x0 + 0.5 * w)), 0.5)),
    "guess": lambda xa, ya: (float(max(ya)), float(np.median(xa)), 1.0),
    "bounds": [[0.0, None, 0.0], [None, None, None]]
    },
    "Rational0": {
        # simple rational: y = (a + b*x) / (1 + c*x)
        "model": lambda x, a, b, c: (a + b * x) / (1.0 + c * x),
    "guess": lambda xa, ya: (float(min(ya)), 1.0, 0.1),
    "bounds": [[None, None, None], [None, None, None]]
    },
    "Sine": {
        # sine wave: y = A * sin(omega*x + phi) + y0
        "model": lambda x, A, omega, phi, y0: A * np.sin(omega * x + phi) + y0,
    "guess": lambda xa, ya: (float((max(ya)-min(ya))/2), 2.0 * np.pi / (np.max(xa)-np.min(xa)+1e-6), 0.0, float(np.mean(ya))),
    "bounds": [[None, 0.0, None, None], [None, None, None, None]]
    },
    "Voigt": {
        # Voigt: convolution of Gaussian (wG = Gaussian FWHM) and Lorentzian (wL = Lorentzian FWHM)
        # Parameters: y0, xc, A, wG, wL (matches Origin). A is the area under the peak.
        # sigma = wG / (2*sqrt(2*ln2)); gamma (HWHM) = wL / 2
        # Standard normalized Voigt: V(x) = Re[wofz(z)] / (sigma*sqrt(2*pi)), where
        # z = ((x-xc) + i*gamma) / (sigma*sqrt(2)). Multiplying by A gives area A.
        "model": lambda x, y0, xc, A, wG, wL: (
            np.asarray(y0, dtype=np.float64)
            + np.asarray(A, dtype=np.float64)
            * (
                np.real(
                    sc.wofz(
                        ((np.asarray(x, dtype=np.float64) - xc) + 1j * (wL / 2.0))
                        / ((wG / (2.0 * np.sqrt(2.0 * np.log(2.0)))) * np.sqrt(2.0))
                    )
                )
                / ((wG / (2.0 * np.sqrt(2.0 * np.log(2.0)))) * np.sqrt(2.0 * np.pi))
            )
        ),
        "guess": lambda xa, ya: (
            float(np.min(ya)),                                     # y0 offset
            float(xa[np.argmax(ya)]),                              # xc center at max y
            max(0.0, float(np.trapz(ya - np.min(ya), xa))),        # A approximate area
            float((np.max(xa) - np.min(xa)) / 6.0) or 1.0,         # wG guess (approx width)
            float((np.max(xa) - np.min(xa)) / 12.0) or 0.5,        # wL guess (narrower)
        ),
        "bounds": [[None, None, None, 0.0, 0.0], [None, None, None, None, None]]
    }
}
 
def curve_fit(model_id: str,
                     xdata: List[List[Number]],
                     ydata: List[List[Number]],
                     initial_guess: Optional[List[List[Number]]] = None,
                     bounds: Optional[List[List[Number]]] = None,
                     method: Optional[str] = None) -> Union[List[List[Union[str, float]]], str]:
    """
    Fit a model specified by model_id to xdata, ydata using scipy.optimize.curve_fit.
 
    Args:
        model_id: String identifier for the model (e.g. "exp_gro_two", "exp_grow_two").
        xdata: 2D list where each row is a sample; must have at least two rows.
        ydata: 2D list matching xdata in length; each row contains the y value in its first element.
        initial_guess: Optional 2D list representing initial guess for parameters.
        bounds: Optional 2D list [[lower_list],[upper_list]] for parameters.
        method: Optional method string passed to curve_fit (e.g., 'trf', 'dogbox', 'lm').
 
    Returns:
        2D list where first row is parameter names and second row are the fitted values, or an error string on failure.
 
    This example function is provided as-is without any representation of accuracy.
    """
    def _ensure_2d_list(arg: Union[List[List[Number]], object]) -> np.ndarray:
        """
        Validate that arg is a 2D list with at least two rows and return a 1-D numpy
        array (float64) of the first column values. This wrapper intentionally does
        NOT support vector predictors (multi-column x rows) for now and will raise
        on single-sample inputs.
        """
        if not isinstance(arg, list) or len(arg) < 2:
            raise ValueError("must be a 2D list with at least two rows")
        vals = []
        for i, row in enumerate(arg):
            if not isinstance(row, list) or len(row) == 0:
                raise ValueError(f"row {i} must be a non-empty list")
            # We only accept the first element of each row as the scalar value
            try:
                vals.append(float(row[0]))
            except Exception:
                raise ValueError(f"row {i} contains non-numeric value")
        arr = np.asarray(vals, dtype=np.float64)
        return arr
 
    if model_id not in MODELS:
        return f"Invalid model_id: {model_id}. Supported models: {list(MODELS.keys())}" 
    model_info = MODELS[model_id]
    model_func = model_info["model"]
    guess_func = model_info["guess"]
    # Derive parameter names from the model function signature (skip the first arg, typically 'x')
    try:
        sig = inspect.signature(model_func)
        # parameters preserves order; skip parameter 0 (x)
        param_names = [p.name for i, p in enumerate(sig.parameters.values()) if i > 0]
    except Exception:
        return "Internal error: unable to inspect model function signature."
    n_params = len(param_names)
 
    try:
        x_arr = _ensure_2d_list(xdata)
    except ValueError as e:
        return f"Invalid input for xdata: {e}"
    try:
        y_arr = _ensure_2d_list(ydata)
    except ValueError as e:
        return f"Invalid input for ydata: {e}"
 
    if x_arr.shape[0] != y_arr.shape[0]:
        return "Invalid input: xdata and ydata must have the same number of rows."
 
    # Already ensured float64 in _ensure_2d_list
    xa = x_arr
    ya = y_arr
 
    # Derive initial guess (p0) and normalize to a 1-D tuple of float64
    if initial_guess is not None:
        if not isinstance(initial_guess, list) or len(initial_guess) == 0 or not isinstance(initial_guess[0], list):
            return f"Invalid input: initial_guess must be a 2D list (e.g. [[{','.join(param_names)}]])."
        try:
            p0_raw = initial_guess[0]
        except Exception as e:
            return f"Invalid initial_guess structure: {e}"
    else:
        # allow guess_func to return array-like or tuple/list
        try:
            p0_raw = guess_func(xa, ya)
        except Exception as e:
            return f"Initial guess generation error: {e}"
 
    # Normalize p0 to a flat tuple of float64 and validate length
    try:
        p0_arr = np.asarray(p0_raw, dtype=np.float64).ravel()
        p0 = tuple(float(x) for x in p0_arr.tolist())
    except Exception as e:
        return f"Invalid initial_guess values: {e}"
    if len(p0) != n_params:
        return f"Invalid input: initial_guess must contain {n_params} values."
 
    # Bounds handling: prefer user-provided bounds, otherwise model-specific defaults if present
    model_bounds = model_info.get('bounds') if isinstance(model_info, dict) else None
    if bounds is None and model_bounds is not None:
        bounds = model_bounds
 
    # If bounds provided (either by user or model defaults), validate and normalize
    if bounds is not None:
        if not isinstance(bounds, list) or len(bounds) != 2 or not isinstance(bounds[0], list) or not isinstance(bounds[1], list):
            return f"Invalid input: bounds must be a 2D list like [[lower1,...],[upper1,...]]."
        try:
            raw_lower = bounds[0]
            raw_upper = bounds[1]
            if not isinstance(raw_lower, list) or not isinstance(raw_upper, list):
                return f"Invalid input: bounds must be a 2D list like [[lower1,...],[upper1,...]]."
            if len(raw_lower) != n_params or len(raw_upper) != n_params:
                return f"Invalid input: bounds must contain {n_params} values for lower and {n_params} for upper."
            lower = []
            upper = []
            # Map None to -inf for lower and +inf for upper; otherwise convert to float
            for v in raw_lower:
                if v is None:
                    lower.append(-np.inf)
                else:
                    lower.append(float(v))
            for v in raw_upper:
                if v is None:
                    upper.append(np.inf)
                else:
                    upper.append(float(v))
        except Exception as e:
            return f"Invalid bounds values: {e}"
        # Convert to numpy arrays (preserve original behavior otherwise)
        bounds_tuple = (np.asarray(lower, dtype=np.float64), np.asarray(upper, dtype=np.float64))
    else:
        bounds_tuple = None
 
    # method vs bounds validation: Levenberg-Marquardt (lm) cannot handle bounds
    if method == 'lm' and bounds_tuple is not None:
        return "Invalid input: method='lm' cannot be used with bounds; use 'trf' or 'dogbox' instead."
 
    # With lm, number of observations (M) must be >= number of parameters (N)
    if method == 'lm' and xa.shape[0] < n_params:
        return f"Invalid input: method='lm' requires number of observations >= number of parameters ({xa.shape[0]} < {n_params})."
 
    # Wrap the model to ensure outputs are float64 (SciPy recommends float64)
    def _model(x, *params):
        try:
            return np.asarray(model_func(x, *params), dtype=np.float64)
        except Exception as e:
            # Return exception as a string via our interface
            raise
 
    try:
        curve_fit_kwargs = dict(p0=p0)
        if bounds_tuple is not None:
            curve_fit_kwargs['bounds'] = bounds_tuple
        if method is not None:
            curve_fit_kwargs['method'] = method
        curve_fit_kwargs['maxfev'] = 10000
        popt, pcov = scipy_curve_fit(_model, xa, ya, **curve_fit_kwargs)
    except Exception as e:
        return f"curve_fit error: {e}"
 
    try:
        fitted_vals = [float(v) for v in popt]
    except Exception as e:
        return f"Result parsing error: {e}"
    for v in fitted_vals:
        if math.isnan(v) or math.isinf(v):
            return "Fitting produced invalid numeric values (NaN or inf)."
    return [param_names, fitted_vals]