SMOOTH_SPLINE

Overview

The SMOOTH_SPLINE function fits a smoothing cubic spline through noisy data points, producing a smooth curve that balances fidelity to the data with overall smoothness. Unlike interpolating splines that pass exactly through each point, smoothing splines allow for controlled deviation from the data to reduce the influence of noise and outliers.

This implementation uses the make_smoothing_spline function from SciPy, which solves a regularized weighted linear regression problem. The smoothing spline is found by minimizing the following objective function:

\sum_{i=1}^{n} w_i |y_i - f(x_i)|^2 + \lambda \int_{x_1}^{x_n} (f''(u))^2 \, du

where f is the spline function, w_i are the weights for each data point, and \lambda (lambda) is the regularization parameter that controls the tradeoff between data fidelity and smoothness. The first term penalizes deviation from the observed data, while the second term penalizes the curvature of the fitted curve (measured by the squared second derivative).

The smoothing parameter \lambda plays a critical role: larger values produce smoother curves with less sensitivity to individual data points, while smaller values yield curves that follow the data more closely. When \lambda is not specified, the algorithm automatically selects an optimal value using Generalized Cross-Validation (GCV), a statistical technique that estimates the prediction error without requiring a separate validation set. This approach is based on the work of Grace Wahba and the FORTRAN implementation by Woltring.
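
The tradeoff controlled by \lambda can be checked directly with SciPy. The following is a minimal sketch, assuming synthetic noisy sine data and two arbitrary illustrative values of lam. The helper names rss and roughness are hypothetical and are not part of SciPy or of SMOOTH_SPLINE.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

# Synthetic noisy data (illustrative assumption, not taken from the examples on this page)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

def rss(spline):
    """Residual sum of squares at the data points (first term of the objective with w_i = 1)."""
    return float(np.sum((y - spline(x)) ** 2))

def roughness(spline, n_grid=2001):
    """Approximate the integral of f''(u)^2 over [x_1, x_n] with the trapezoidal rule."""
    u = np.linspace(x[0], x[-1], n_grid)
    f2 = spline.derivative(2)(u) ** 2
    return float(np.sum(0.5 * (f2[1:] + f2[:-1]) * np.diff(u)))

tight = make_smoothing_spline(x, y, lam=1e-3)  # small lambda: follows the data closely
loose = make_smoothing_spline(x, y, lam=1e3)   # large lambda: heavily smoothed
auto = make_smoothing_spline(x, y)             # lam=None: value chosen by GCV

for name, s in [("lam=1e-3", tight), ("lam=1e3", loose), ("GCV", auto)]:
    print(f"{name:>8}: RSS={rss(s):.4f}, roughness={roughness(s):.4f}")
```

The small-lambda fit should report the lower residual sum of squares and the higher roughness, and the large-lambda fit the reverse.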

The weights parameter allows different data points to have different levels of influence on the fitted spline. Points with higher weights will be fit more closely, which is useful when some measurements are known to be more reliable than others.
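
The influence of a weight can be seen by fitting the same data twice. This is a minimal sketch, assuming made-up data with one point off the trend and a fixed, arbitrary lam so that only the weights differ between the two fits. The variable names are illustrative.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

x = np.arange(8, dtype=float)
# The point at x = 4 departs from the otherwise linear trend (illustrative data)
y = np.array([0.0, 1.0, 2.0, 3.0, 8.0, 5.0, 6.0, 7.0])
k = 4
lam = 1.0  # fixed so that the two fits differ only in their weights

# Fit 1: every point has the same weight
uniform = make_smoothing_spline(x, y, w=np.ones_like(x), lam=lam)

# Fit 2: the point at index k is weighted 100 times more than the others
w_heavy = np.ones_like(x)
w_heavy[k] = 100.0
weighted = make_smoothing_spline(x, y, w=w_heavy, lam=lam)

# Absolute residual at the heavily weighted point under each fit
res_uniform = float(abs(y[k] - uniform(x[k])))
res_weighted = float(abs(y[k] - weighted(x[k])))
print(f"residual at x={x[k]}: uniform={res_uniform:.4f}, weighted={res_weighted:.4f}")
```

The residual at the heavily weighted point should shrink, since the curve is pulled toward it.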

Smoothing splines are widely used in exploratory data analysis, trend extraction from time series, and nonparametric regression. For theoretical background, see the chapter on Smoothing Splines in The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=SMOOTH_SPLINE(x, y, x_new, w, lam)
  • x (list[list], required): The x-coordinates of the data points
  • y (list[list], required): The y-coordinates of the data points
  • x_new (list[list], required): The x-coordinates at which to evaluate the spline
  • w (list[list], optional, default: null): Weights for spline fitting
  • lam (float, optional, default: null): Smoothing factor (lambda)

Returns (list[list]): A single-column 2D list of spline values evaluated at x_new, or an error message (str) if the input is invalid.

Examples

Example 1: Default weights, automatic lambda (GCV)

Inputs:

x    y     x_new
0    0     0.5
1    1     1.5
2    0.5   2.5
3    2     3.5
4    1.5

Excel formula:

=SMOOTH_SPLINE({0;1;2;3;4}, {0;1;0.5;2;1.5}, {0.5;1.5;2.5;3.5})

Expected output:

Result
0.397
0.8162
1.22
1.6

Example 2: Custom weights, automatic lambda (GCV)

Inputs:

x    y     x_new   w
0    1     0.5     1
1    2     1.5     2
2    1.5   2.5     1
3    3     3.5     1
4    2.5           1

Excel formula:

=SMOOTH_SPLINE({0;1;2;3;4}, {1;2;1.5;3;2.5}, {0.5;1.5;2.5;3.5}, {1;2;1;1;1})

Expected output:

Result
1.501
1.898
2.268
2.612

Example 3: Default weights, explicit lambda

Inputs:

x    y     x_new   lam
0    0     1       0.1
1    0.8   2
2    0.9   3
3    0.1
4    1

Excel formula:

=SMOOTH_SPLINE({0;1;2;3;4}, {0;0.8;0.9;0.1;1}, {1;2;3}, , 0.1)

Expected output:

Result
0.7079
0.7249
0.4193

Example 4: Custom weights and explicit lambda

Inputs:

x    y     x_new   w    lam
1    2     1.5     1    0.5
2    3.5   2.5     1
3    3     3.5     2
4    5     4.5     1
5    4.5           1

Excel formula:

=SMOOTH_SPLINE({1;2;3;4;5}, {2;3.5;3;5;4.5}, {1.5;2.5;3.5;4.5}, {1;1;2;1;1}, 0.5)

Expected output:

Result
2.557
3.172
3.853
4.528

Python Code

from scipy.interpolate import make_smoothing_spline as scipy_make_smoothing_spline

def smooth_spline(x, y, x_new, w=None, lam=None):
    """
    Smoothing cubic spline.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.make_smoothing_spline.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): The x-coordinates of the data points
        y (list[list]): The y-coordinates of the data points
        x_new (list[list]): The x-coordinates at which to evaluate the spline
        w (list[list], optional): Weights for spline fitting. Default is None.
        lam (float, optional): Smoothing factor (lambda). Default is None.

    Returns:
        list[list]: A single-column 2D list of spline values evaluated at x_new, or an error message (str) if the input is invalid.
    """
    def to2d(val):
        """Convert scalar to 2D list if needed."""
        return [[val]] if not isinstance(val, list) else val

    def flatten(arr):
        """Flatten 2D list to 1D list."""
        return [item for sublist in arr for item in sublist]

    try:
        # Normalize inputs to 2D lists
        x = to2d(x)
        y = to2d(y)
        x_new = to2d(x_new)

        # Flatten to 1D arrays for scipy
        x_flat = flatten(x)
        y_flat = flatten(y)
        x_new_flat = flatten(x_new)

        # Check for valid lengths
        if len(x_flat) != len(y_flat):
            return "Invalid input: x and y must have the same length."

        # make_smoothing_spline requires at least 5 data points
        if len(x_flat) < 5:
            return "Invalid input: x and y must have at least 5 data points."

        # Process weights if provided
        w_flat = None
        if w is not None:
            w = to2d(w)
            w_flat = flatten(w)
            if len(w_flat) != len(x_flat):
                return "Invalid input: w must have the same length as x and y."
            if any(wi <= 0 for wi in w_flat):
                return "Invalid input: weights must be positive."

        # Validate lam
        if lam is not None and lam < 0:
            return "Invalid input: lam must be non-negative."

        # Sort data by x, keeping each y value (and weight) with its point
        order = sorted(range(len(x_flat)), key=lambda i: x_flat[i])
        x_flat = [x_flat[i] for i in order]
        y_flat = [y_flat[i] for i in order]
        if w_flat is not None:
            w_flat = [w_flat[i] for i in order]

        # Check for duplicates in x
        for i in range(len(x_flat) - 1):
            if x_flat[i] == x_flat[i+1]:
                return "Invalid input: x values must be unique."

        # Create and evaluate spline
        spline = scipy_make_smoothing_spline(x_flat, y_flat, w=w_flat, lam=lam)
        result = spline(x_new_flat)

        # Convert result to 2D column vector
        return [[float(val)] for val in result]

    except Exception as e:
        return str(e)
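
As a quick cross-check of the reshaping logic, the inputs of Example 1 can be passed to SciPy directly. This is a minimal sketch rather than a replacement for the function above: it skips all validation, and the helper name column is hypothetical.

```python
from scipy.interpolate import make_smoothing_spline

# Column ranges as they would arrive from a worksheet (values from Example 1)
x = [[0], [1], [2], [3], [4]]
y = [[0], [1], [0.5], [2], [1.5]]
x_new = [[0.5], [1.5], [2.5], [3.5]]

def column(values):
    """Flatten a 2D list into a flat list of floats (hypothetical helper)."""
    return [float(v) for row in values for v in row]

# w and lam are left at their defaults, so lambda is chosen by GCV
spline = make_smoothing_spline(column(x), column(y))

# Return a single-column 2D list, one row per x_new value
result = [[float(v)] for v in spline(column(x_new))]
print(result)
```

The result has one row per entry of x_new, which is the shape Excel expects for a column spill.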
