SMOOTH_SPLINE
Overview
The SMOOTH_SPLINE function fits a smoothing cubic spline through noisy data points, producing a smooth curve that balances fidelity to the data with overall smoothness. Unlike interpolating splines that pass exactly through each point, smoothing splines allow for controlled deviation from the data to reduce the influence of noise and outliers.
This implementation uses the make_smoothing_spline function from SciPy, which solves a regularized weighted linear regression problem. The smoothing spline is found by minimizing the following objective function:
$$\sum_{i=1}^{n} w_i |y_i - f(x_i)|^2 + \lambda \int_{x_1}^{x_n} (f''(u))^2 \, du$$
where $f$ is the spline function, $w_i$ are the weights for each data point, and $\lambda$ (lambda) is the regularization parameter that controls the tradeoff between data fidelity and smoothness. The first term penalizes deviation from the observed data, while the second term penalizes the curvature of the fitted curve (measured by the squared second derivative).
The smoothing parameter $\lambda$ plays a critical role: larger values produce smoother curves with less sensitivity to individual data points, while smaller values yield curves that follow the data more closely. When $\lambda$ is not specified, the algorithm automatically selects an optimal value using Generalized Cross-Validation (GCV), a statistical technique that estimates the prediction error without requiring a separate validation set. This approach is based on the work of Grace Wahba and the FORTRAN implementation by Woltring.
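The tradeoff in the objective can be checked numerically. The sketch below is illustrative only: the synthetic data, the two lambda values, and the helper names `roughness` and `data_misfit` are assumptions made for this example, not part of SMOOTH_SPLINE. It fits the same noisy data with a small and a large lambda and compares the two terms of the objective.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline


def roughness(spline, lo, hi, n=4001):
    """Approximate the curvature penalty: the integral of f''(u)^2 over [lo, hi]."""
    u = np.linspace(lo, hi, n)
    v = spline.derivative(2)(u) ** 2  # squared second derivative on a fine grid
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(u)))  # trapezoid rule


def data_misfit(spline, x, y):
    """Sum of squared residuals with all weights equal to 1."""
    return float(np.sum((y - spline(x)) ** 2))


# Synthetic noisy data (hypothetical, fixed seed for repeatability)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = np.sin(x) + 0.2 * rng.standard_normal(x.size)

tight = make_smoothing_spline(x, y, lam=1e-3)  # small lambda: follows the data
loose = make_smoothing_spline(x, y, lam=1e2)   # large lambda: much smoother
auto = make_smoothing_spline(x, y)             # lam=None: lambda chosen by GCV

for name, s in [("lam=1e-3", tight), ("lam=1e2", loose), ("GCV", auto)]:
    print(f"{name:>9}: misfit={data_misfit(s, x, y):.4f}  "
          f"roughness={roughness(s, x[0], x[-1]):.4f}")
```

The larger lambda should report a smaller roughness and a larger misfit than the smaller lambda; the GCV fit typically lands between the two.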
The weights parameter allows different data points to have different levels of influence on the fitted spline. Points with higher weights will be fit more closely, which is useful when some measurements are known to be more reliable than others.
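As a small illustration of the effect of weights, the sketch below fits the same data twice with a fixed lambda: once with uniform weights and once with a larger weight on a single point. The data, the weight of 10, and lam=1.0 are arbitrary values chosen for this example.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

x = np.arange(10, dtype=float)
y = 0.5 * x
y[5] += 3.0  # one point sits well off the linear trend

lam = 1.0  # fixed so that only the weights differ between the two fits

uniform = make_smoothing_spline(x, y, lam=lam)

w = np.ones_like(x)
w[5] = 10.0  # treat the point at index 5 as more reliable than the rest
weighted = make_smoothing_spline(x, y, w=w, lam=lam)

resid_uniform = abs(float(y[5] - uniform(x[5])))
resid_weighted = abs(float(y[5] - weighted(x[5])))
print(f"residual at x=5, uniform weights:  {resid_uniform:.4f}")
print(f"residual at x=5, weight 10 on it:  {resid_weighted:.4f}")
```

With the higher weight, the fitted curve should pass closer to the point at index 5, so the second residual is expected to be smaller than the first.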
Smoothing splines are widely used in exploratory data analysis, trend extraction from time series, and nonparametric regression. For theoretical background, see the chapter on Smoothing Splines in The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman.
This example function is provided as-is without any representation of accuracy.
Excel Usage
=SMOOTH_SPLINE(x, y, x_new, w, lam)
x (list[list], required): The x-coordinates of the data points
y (list[list], required): The y-coordinates of the data points
x_new (list[list], required): The x-coordinates at which to evaluate the spline
w (list[list], optional, default: null): Weights for spline fitting
lam (float, optional, default: null): Smoothing factor (lambda)
Returns (list[list]): A 2D list (single column) of spline values evaluated at x_new, or an error message (str) if the input is invalid.
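The list[list] types reflect how ranges are passed between Excel and Python. The sketch below assumes that a column range such as {0;1;2;3;4} arrives as a list of one-element rows and a row range as a single row; that mapping, and the helper name `to_column`, are assumptions for illustration.

```python
def flatten(arr):
    """Flatten a 2D list (list of rows) into a 1D list, row by row."""
    return [item for row in arr for item in row]


def to_column(values):
    """Wrap 1D values as a single-column 2D list, the shape returned to Excel."""
    return [[float(v)] for v in values]


column_range = [[0], [1], [2], [3], [4]]  # assumed shape of {0;1;2;3;4}
row_range = [[0, 1, 2, 3, 4]]             # assumed shape of {0,1,2,3,4}

# Both orientations flatten to the same 1D sequence
print(flatten(column_range))
print(flatten(row_range))

# Results go back as one value per row
print(to_column([0.5, 1.5]))
```

Because both orientations flatten to the same sequence, the function does not depend on whether the inputs are supplied as rows or columns.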
Examples
Example 1: Demo case 1
Inputs:
| x | y | x_new |
|---|---|---|
| 0 | 0 | 0.5 |
| 1 | 1 | 1.5 |
| 2 | 0.5 | 2.5 |
| 3 | 2 | 3.5 |
| 4 | 1.5 | |
Excel formula:
=SMOOTH_SPLINE({0;1;2;3;4}, {0;1;0.5;2;1.5}, {0.5;1.5;2.5;3.5})
Expected output:
| Result |
|---|
| 0.397 |
| 0.8162 |
| 1.22 |
| 1.6 |
Example 2: Demo case 2
Inputs:
| x | y | x_new | w |
|---|---|---|---|
| 0 | 1 | 0.5 | 1 |
| 1 | 2 | 1.5 | 2 |
| 2 | 1.5 | 2.5 | 1 |
| 3 | 3 | 3.5 | 1 |
| 4 | 2.5 | | 1 |
Excel formula:
=SMOOTH_SPLINE({0;1;2;3;4}, {1;2;1.5;3;2.5}, {0.5;1.5;2.5;3.5}, {1;2;1;1;1})
Expected output:
| Result |
|---|
| 1.501 |
| 1.898 |
| 2.268 |
| 2.612 |
Example 3: Demo case 3
Inputs:
| x | y | x_new | lam |
|---|---|---|---|
| 0 | 0 | 1 | 0.1 |
| 1 | 0.8 | 2 | |
| 2 | 0.9 | 3 | |
| 3 | 0.1 | | |
| 4 | 1 | | |
Excel formula:
=SMOOTH_SPLINE({0;1;2;3;4}, {0;0.8;0.9;0.1;1}, {1;2;3}, , 0.1)
Expected output:
| Result |
|---|
| 0.7079 |
| 0.7249 |
| 0.4193 |
Example 4: Demo case 4
Inputs:
| x | y | x_new | w | lam |
|---|---|---|---|---|
| 1 | 2 | 1.5 | 1 | 0.5 |
| 2 | 3.5 | 2.5 | 1 | |
| 3 | 3 | 3.5 | 2 | |
| 4 | 5 | 4.5 | 1 | |
| 5 | 4.5 | | 1 | |
Excel formula:
=SMOOTH_SPLINE({1;2;3;4;5}, {2;3.5;3;5;4.5}, {1.5;2.5;3.5;4.5}, {1;1;2;1;1}, 0.5)
Expected output:
| Result |
|---|
| 2.557 |
| 3.172 |
| 3.853 |
| 4.528 |
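The inputs of Example 4 can also be passed straight to SciPy, which may help when checking the spreadsheet result outside Excel. This is a minimal sketch that calls make_smoothing_spline directly rather than going through the wrapper; it prints the values without asserting what they should be.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

# Inputs copied from Example 4
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 3.5, 3.0, 5.0, 4.5])
w = np.array([1.0, 1.0, 2.0, 1.0, 1.0])
x_new = np.array([1.5, 2.5, 3.5, 4.5])

spline = make_smoothing_spline(x, y, w=w, lam=0.5)

# Shape the output as a single column, like the Excel result
result = [[float(v)] for v in spline(x_new)]
for row in result:
    print(row)
```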
Python Code
from scipy.interpolate import make_smoothing_spline as scipy_make_smoothing_spline


def smooth_spline(x, y, x_new, w=None, lam=None):
    """
    Smoothing cubic spline.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.make_smoothing_spline.html
    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): The x-coordinates of the data points.
        y (list[list]): The y-coordinates of the data points.
        x_new (list[list]): The x-coordinates at which to evaluate the spline.
        w (list[list], optional): Weights for spline fitting. Default is None.
        lam (float, optional): Smoothing factor (lambda). Default is None.

    Returns:
        list[list]: A 2D list of spline values evaluated at x_new, or an error message (str) if invalid.
    """
    def to2d(val):
        """Convert scalar to 2D list if needed."""
        return [[val]] if not isinstance(val, list) else val

    def flatten(arr):
        """Flatten 2D list to 1D list."""
        return [item for sublist in arr for item in sublist]

    try:
        # Normalize inputs to 2D lists
        x = to2d(x)
        y = to2d(y)
        x_new = to2d(x_new)

        # Flatten to 1D lists for scipy
        x_flat = flatten(x)
        y_flat = flatten(y)
        x_new_flat = flatten(x_new)

        # Check for valid lengths
        if len(x_flat) != len(y_flat):
            return "Invalid input: x and y must have the same length."
        # make_smoothing_spline itself rejects inputs with fewer than 5 points
        if len(x_flat) < 5:
            return "Invalid input: x and y must have at least 5 data points."

        # Process weights if provided
        w_flat = None
        if w is not None:
            w = to2d(w)
            w_flat = flatten(w)
            if len(w_flat) != len(x_flat):
                return "Invalid input: w must have the same length as x and y."
            if any(wi <= 0 for wi in w_flat):
                return "Invalid input: weights must be positive."

        # Validate lam
        if lam is not None and lam < 0:
            return "Invalid input: lam must be non-negative."

        # Sort data by x, keeping y (and w, if given) aligned with it
        if w_flat is not None:
            data = sorted(zip(x_flat, y_flat, w_flat), key=lambda row: row[0])
            x_flat, y_flat, w_flat = (list(col) for col in zip(*data))
        else:
            data = sorted(zip(x_flat, y_flat), key=lambda row: row[0])
            x_flat, y_flat = (list(col) for col in zip(*data))

        # Check for duplicates in x (adjacent after sorting)
        for i in range(len(x_flat) - 1):
            if x_flat[i] == x_flat[i + 1]:
                return "Invalid input: x values must be unique."

        # Create and evaluate spline
        spline = scipy_make_smoothing_spline(x_flat, y_flat, w=w_flat, lam=lam)
        result = spline(x_new_flat)

        # Convert result to 2D column vector
        return [[float(val)] for val in result]
    except Exception as e:
        return str(e)
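The sort-and-check step exists because make_smoothing_spline expects strictly increasing x values. The same preparation can be written with NumPy instead of sorted() and zip(). The helper below, `sort_and_check`, is a hypothetical alternative sketched for comparison, not part of the function above; unlike the function above, it raises an exception instead of returning a message.

```python
import numpy as np


def sort_and_check(x, y, w=None):
    """Sort x ascending, reorder y and w to match, and reject repeated x values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")  # permutation that sorts x
    x, y = x[order], y[order]
    if w is not None:
        w = np.asarray(w, dtype=float)[order]
    # After sorting, duplicates are adjacent, so a zero difference reveals them
    if np.any(np.diff(x) == 0):
        raise ValueError("x values must be unique.")
    return x, y, w


xs, ys, ws = sort_and_check([2, 0, 1], [20, 0, 10], w=[3, 1, 2])
print(xs, ys, ws)
```

Applying one permutation to all three arrays keeps each y value and weight attached to its original x value.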