SPEARMANR

Overview

The SPEARMANR function calculates the Spearman rank-order correlation coefficient between two datasets, along with an associated p-value for testing statistical significance. This nonparametric measure assesses the monotonicity of the relationship between two variables—that is, whether one variable tends to increase (or decrease) as the other increases, regardless of the exact functional form.

Unlike the Pearson correlation, which measures linear relationships and requires normally distributed data, the Spearman correlation operates on ranked data, making it robust to outliers and applicable to ordinal variables. The coefficient ranges from −1 to +1, where +1 indicates a perfect positive monotonic relationship (as one variable increases, so does the other), −1 indicates a perfect negative monotonic relationship, and 0 indicates no monotonic association.

The Spearman coefficient is computed by first converting each dataset to ranks, then calculating the Pearson correlation on those ranks. For datasets without tied values, this is equivalent to:

\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}

where d_i is the difference between the ranks of corresponding observations and n is the number of observations. When ties are present, the implementation uses a more general formula based on the Pearson correlation of ranked values.

When to Choose: Spearman vs. Pearson

Choosing the right correlation coefficient depends on your data’s distribution and the nature of the relationship you want to measure.

Feature	Pearson	Spearman
Relationship Type	Linear	Monotonic
Data Scale	Interval or Ratio	Ordinal or Continuous
Distribution	Bivariate Normal	Any (Non-parametric)
Sensitivity to Outliers	High	Low (Robust)
Assumptions	Homoscedasticity, no outliers	None

Python Example: Comparing Spearman and Pearson

This example demonstrates how Spearman correlation remains robust even when a relationship is nonlinear or contains outliers.

import numpy as np
from scipy.stats import spearmanr, pearsonr

# 1. Monotonic but nonlinear relationship (exponential)
x = np.linspace(1, 10, 100)
y = np.exp(x)

# Pearson (linear) will be < 1, Spearman (monotonic) will be 1
corr_p, _ = pearsonr(x, y)
corr_s, _ = spearmanr(x, y)

print(f"Nonlinear (Exponential): Pearson={corr_p:.3f}, Spearman={corr_s:.3f}")

# 2. Linear relationship with an extreme outlier
x_out = np.linspace(1, 10, 10)
y_out = 2 * x_out + 1
y_out[-1] = -100 # Add outlier

corr_p_out, _ = pearsonr(x_out, y_out)
corr_s_out, _ = spearmanr(x_out, y_out)

print(f"With Outlier: Pearson={corr_p_out:.3f}, Spearman={corr_s_out:.3f}")

This implementation uses the SciPy library’s scipy.stats.spearmanr function. The returned p-value tests the null hypothesis that the two samples have no ordinal correlation. Note that the p-value calculation is most accurate for large samples (>500 observations); for smaller samples, permutation tests may provide more reliable significance estimates. For theoretical background, see Zwillinger & Kokoska (2000), CRC Standard Probability and Statistics Tables and Formulae, and Spearman’s rank correlation coefficient on Wikipedia.

The function returns a two-element array containing the correlation coefficient and the p-value. If either input is constant (no variance), the correlation is undefined and the function returns zeros.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=SPEARMANR(x, y)

x (list[list], required): First set of observations (column vector)
y (list[list], required): Second set of observations (column vector), same length as x

Returns (list[list]): 2D list [[correlation, p_value]], or error message string.

Example 1: Demo case 1

Inputs:

x	y
1	5
2	6
3	7
4	8
5	9

Excel formula:

=SPEARMANR({1;2;3;4;5}, {5;6;7;8;9})

Expected output:

Result
1	1.40427e-24

Example 2: Demo case 2

Inputs:

x	y
1	9
2	8
3	7
4	6
5	5

Excel formula:

=SPEARMANR({1;2;3;4;5}, {9;8;7;6;5})

Expected output:

Result
-1	1.40427e-24

Example 3: Demo case 3

Inputs:

x	y
1	1
1	2
1	3
1	4

Excel formula:

=SPEARMANR({1;1;1;1}, {1;2;3;4})

Expected output:

Result
0	0

Example 4: Demo case 4

Inputs:

x	y
1	5
2	3
3	6
4	2
5	1

Excel formula:

=SPEARMANR({1;2;3;4;5}, {5;3;6;2;1})

Expected output:

Result
-0.7	0.18812

Python Code

Show Code

from scipy.stats import spearmanr as scipy_spearmanr
import math

def spearmanr(x, y):
    """
    Calculate a Spearman rank-order correlation coefficient with associated p-value.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.spearmanr.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First set of observations (column vector)
        y (list[list]): Second set of observations (column vector), same length as x

    Returns:
        list[list]: 2D list [[correlation, p_value]], or error message string.
    """
    # Helper function to normalize input to 2D list
    def to2d(data):
        return [[data]] if not isinstance(data, list) else data

    try:
      # Normalize inputs to 2D lists
      x = to2d(x)
      y = to2d(y)

      # Validate dimensions and structure
      if not isinstance(x, list) or not isinstance(y, list):
        return "Error: x and y must be lists"
      if not x or not y or not isinstance(x[0], list) or not isinstance(y[0], list):
        return "Error: x and y must be 2D lists"
      if len(x) < 2 or len(y) < 2:
        return "Error: x and y must have at least two rows"
      if len(x) != len(y):
        return "Error: x and y must have the same length"

      # Extract numeric values and validate
      try:
        x_flat = [float(row[0]) for row in x]
        y_flat = [float(row[0]) for row in y]
      except (ValueError, TypeError, IndexError):
        return "Error: x and y must contain numeric values"

      # Handle constant input: correlation undefined, return zeros
      if len(set(x_flat)) == 1 or len(set(y_flat)) == 1:
        return [[0.0, 0.0]]

      # Compute correlation and p-value
      result = scipy_spearmanr(x_flat, y_flat)
      corr = float(result.statistic)
      pval = float(result.pvalue)

      # Check for invalid results
      if math.isnan(corr) or math.isinf(corr) or math.isnan(pval) or math.isinf(pval):
        return "Error: Calculation resulted in non-finite values"

      return [[corr, pval]]
    except Exception as e:
      return f"Error: {str(e)}"

Online Calculator

x *

First set of observations (column vector)

y *

Second set of observations (column vector), same length as x