THEILSLOPES

Overview

The THEILSLOPES function performs robust linear regression using the Theil-Sen estimator, a non-parametric method that fits a line to sample points by computing the median of slopes between all pairs of data points. Unlike ordinary least squares (OLS) regression, the Theil-Sen estimator is highly resistant to outliers and does not require normally distributed residuals, making it ideal for analyzing real-world data with anomalies or skewed distributions.

The algorithm, developed by Henri Theil (1950) and extended by Pranab K. Sen (1968), computes the slope m as the median of all pairwise slopes:

m = \text{median}\left\{ \frac{y_j - y_i}{x_j - x_i} \mid i < j, \; x_i \neq x_j \right\}
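
As a rough illustration of this formula, the sketch below computes the pairwise slopes directly with NumPy and compares their median with SciPy's result. The four data points are made up for the example and are not taken from this page's examples.

```python
from itertools import combinations

import numpy as np
from scipy.stats import theilslopes

# Made-up sample data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

# Slope of the line through every pair (i, j) with i < j and x_i != x_j.
pairwise = [
    (y[j] - y[i]) / (x[j] - x[i])
    for i, j in combinations(range(len(x)), 2)
    if x[i] != x[j]
]

manual_slope = float(np.median(pairwise))
scipy_slope = float(theilslopes(y, x).slope)

print(manual_slope, scipy_slope)  # the two values should agree
```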

Once the slope is determined, the intercept b is calculated using one of two methods:

  • Separate (default): b = \text{median}(y) - m \cdot \text{median}(x)
  • Joint: b = \text{median}(y - m \cdot x)
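
The two intercept rules generally give different values once the data are not perfectly linear. A minimal sketch, using made-up data and the `method` argument of `scipy.stats.theilslopes`, evaluates both formulas by hand and prints SciPy's values next to them:

```python
import numpy as np
from scipy.stats import theilslopes

# Made-up, deliberately non-linear data so the two rules disagree.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 5.0, 2.0])

m = theilslopes(y, x).slope  # the slope is computed before the intercept

b_separate = np.median(y) - m * np.median(x)  # "separate" rule
b_joint = np.median(y - m * x)                # "joint" rule

print(b_separate, theilslopes(y, x, method="separate").intercept)
print(b_joint, theilslopes(y, x, method="joint").intercept)
```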

The resulting line y = mx + b passes above and below approximately equal numbers of data points. The function also returns a confidence interval for the slope; following Sen (1968), its lower and upper bounds are selected by rank from the sorted pairwise slopes.

The Theil-Sen estimator has a breakdown point of approximately 29.3% (1 - 1/\sqrt{2}), meaning that up to about 29.3% of the input points can be arbitrarily corrupted before the slope estimate can be driven arbitrarily far from the true value. This makes it substantially more robust than OLS regression, which has a breakdown point of 0%: a single extreme point can distort the fit without bound. The method has been widely applied in environmental monitoring, climatology, remote sensing, and trend analysis, where outlier contamination is common.
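
To make the contrast with OLS concrete, here is a small sketch on made-up data: ten points on the line y = 2x + 1, two of which (20%, below the breakdown point) are replaced with large values. `np.polyfit` stands in for OLS here; that choice is an assumption of this example, not part of the function described on this page.

```python
import numpy as np
from scipy.stats import theilslopes

# Made-up data on the line y = 2x + 1, with 2 of 10 points corrupted.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[8], y[9] = 100.0, 120.0

ols_slope, ols_intercept = np.polyfit(x, y, 1)  # ordinary least squares fit
ts = theilslopes(y, x)                          # Theil-Sen fit

print("OLS:      ", ols_slope, ols_intercept)
print("Theil-Sen:", ts.slope, ts.intercept)
```

On this data the Theil-Sen fit recovers a slope of 2 and an intercept of 1, while the least-squares slope is pulled far above 2 by the two corrupted points.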

This implementation uses SciPy’s theilslopes function from the scipy.stats module. For related approaches, see repeated median regression (Siegel’s method), which offers a higher breakdown point of 50%. For more background on the mathematical foundations, refer to the Theil-Sen estimator Wikipedia article.
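
For a feel of the difference between the two breakdown points, the following sketch compares `scipy.stats.theilslopes` with `scipy.stats.siegelslopes` on made-up data where 40% of the points are corrupted. The data and the contamination level are assumptions chosen to sit between 29.3% and 50%.

```python
import numpy as np
from scipy.stats import siegelslopes, theilslopes

# Made-up data on y = 2x + 1 with 4 of 10 points (40%) corrupted:
# above Theil-Sen's ~29.3% breakdown point, below Siegel's 50%.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[6:] = [100.0, 120.0, 140.0, 160.0]

ts_slope = theilslopes(y, x)[0]                      # first element is the slope
siegel_slope, siegel_intercept = siegelslopes(y, x)  # repeated median regression

print("Theil-Sen slope:", ts_slope)
print("Siegel slope:   ", siegel_slope)
```

Here the repeated median estimate still returns the underlying slope of 2, whereas the Theil-Sen slope is dominated by the corrupted pairs.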

This example function is provided as-is without any representation of accuracy.

Excel Usage

=THEILSLOPES(y, x, alpha, theilslopes_method)
  • y (list[list], required): Dependent variable values (one value per row)
  • x (list[list], optional, default: null): Independent variable values (one value per row). If omitted, the row indices 0, 1, ..., n-1 are used
  • alpha (float, optional, default: 0.95): Confidence level for the confidence interval of the slope
  • theilslopes_method (str, optional, default: “separate”): Method for calculating the intercept, either “separate” or “joint”; the slope estimate is the same for both

Returns (list[list]): 2D list [[slope, intercept, low, high]], where low and high bound the confidence interval of the slope, or an error string.
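
SciPy documents `alpha` as symmetric around 0.5, so 0.1 and 0.9 both request a 90% interval. The sketch below, on made-up noisy data with a fixed seed, checks that behaviour by calling `scipy.stats.theilslopes` directly and shows how the interval bounds widen as the confidence level rises.

```python
import numpy as np
from scipy.stats import theilslopes

# Made-up noisy linear data; the seed only makes the run repeatable.
rng = np.random.default_rng(0)
x = np.arange(30, dtype=float)
y = 0.5 * x + rng.normal(scale=2.0, size=x.size)

r90 = theilslopes(y, x, alpha=0.90)
r10 = theilslopes(y, x, alpha=0.10)  # interpreted the same as alpha=0.90
r99 = theilslopes(y, x, alpha=0.99)  # higher confidence, wider interval

print(r90.low_slope, r90.high_slope)
print(r10.low_slope, r10.high_slope)
print(r99.low_slope, r99.high_slope)
```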

Examples

Example 1: x omitted (row indices used)

Inputs:

y alpha theilslopes_method
1 0.95 separate
2
3
4
5

Excel formula:

=THEILSLOPES({1;2;3;4;5}, , 0.95, "separate")

Expected output:

Result
1 1 1 1

Example 2: Exactly linear data

Inputs:

y x
2 1
4 2
6 3
8 4
10 5

Excel formula:

=THEILSLOPES({2;4;6;8;10}, {1;2;3;4;5})

Expected output:

Result
2 0 2 2

Example 3: Single outlier

Inputs:

y x alpha theilslopes_method
1 1 0.95 separate
2 2
3 3
100 4
5 5

Excel formula:

=THEILSLOPES({1;2;3;100;5}, {1;2;3;4;5}, 0.95, "separate")

Expected output:

Result
1 0 -95 97
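
The wide interval in this example follows from how few pairs there are. As a sketch of the reasoning (assuming SciPy's rank-based interval, which picks its bounds from the sorted pairwise slopes), five points give only ten slopes, and at the 95% level the selected bounds are the smallest and largest of them:

```python
from itertools import combinations

import numpy as np
from scipy.stats import theilslopes

# Data from this example: the fourth y value is the outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 2.0, 3.0, 100.0, 5.0])

# All ten pairwise slopes, sorted in ascending order.
slopes = np.sort(
    [(y[j] - y[i]) / (x[j] - x[i]) for i, j in combinations(range(len(x)), 2)]
)
res = theilslopes(y, x, alpha=0.95)

print(slopes)
print(res.slope, res.intercept, res.low_slope, res.high_slope)
```

The median slope stays at 1 despite the outlier, but the interval spans every pairwise slope, from -95 to 97.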

Example 4: Joint intercept method with alpha = 0.9

Inputs:

y x alpha theilslopes_method
1 1 0.9 joint
2 2
3 3
4 4
5 5

Excel formula:

=THEILSLOPES({1;2;3;4;5}, {1;2;3;4;5}, 0.9, "joint")

Expected output:

Result
1 0 1 1

Python Code

import math
from scipy.stats import theilslopes as scipy_theilslopes

def theilslopes(y, x=None, alpha=0.95, theilslopes_method='separate'):
    """
    Compute the Theil-Sen estimator for a set of points (robust linear regression).

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.theilslopes.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        y (list[list]): Dependent variable values (one value per row)
        x (list[list], optional): Independent variable values (one value per row). Default is None, which uses the row indices 0..n-1.
        alpha (float, optional): Confidence level for the confidence interval of the slope. Default is 0.95.
        theilslopes_method (str, optional): Method for calculating the intercept. Valid options: 'separate', 'joint'. Default is 'separate'.

    Returns:
        list[list]: 2D list [[slope, intercept, low, high]], or error string.
    """
    # Validate y
    if not isinstance(y, list) or len(y) < 2:
        return "Invalid input: y must be a 2D list with at least two rows."
    try:
        y_flat = [float(row[0]) if isinstance(row, list) else float(row) for row in y]
    except Exception:
        return "Invalid input: y must contain numeric values."
    n = len(y_flat)
    # Validate x
    if x is None:
        x_flat = list(range(n))
    else:
        if not isinstance(x, list) or len(x) != n:
            return "Invalid input: x must be a 2D list with the same number of rows as y."
        try:
            x_flat = [float(row[0]) if isinstance(row, list) else float(row) for row in x]
        except Exception:
            return "Invalid input: x must contain numeric values."
    # Validate alpha
    try:
        alpha_val = float(alpha)
    except Exception:
        return "Invalid input: alpha must be a float."
    if not (0 < alpha_val < 1):
        return "Invalid input: alpha must be between 0 and 1."
    # Validate theilslopes_method
    if theilslopes_method not in ("separate", "joint"):
        return "Invalid input: theilslopes_method must be 'separate' or 'joint'."
    try:
        res = scipy_theilslopes(y_flat, x_flat, alpha=alpha_val, method=theilslopes_method)
        # Convert NumPy scalars to plain Python floats for the returned 2D list
        vals = [float(v) for v in (res.slope, res.intercept, res.low_slope, res.high_slope)]
    except Exception as e:
        return f"scipy.stats.theilslopes error: {e}"
    # Ensure no NaN/inf in result
    if not all(math.isfinite(v) for v in vals):
        return "Computation resulted in invalid value."
    return [vals]
