KENDALLTAU

Overview

The KENDALLTAU function calculates Kendall’s tau, a non-parametric measure of correlation for ordinal data. Unlike Pearson correlation, which measures linear relationships, Kendall’s tau assesses the strength of association between two rankings or ordinal variables by counting the number of concordant and discordant pairs of observations.

Maurice G. Kendall introduced the statistic in his 1938 paper “A New Measure of Rank Correlation” (Biometrika, Vol. 30). It ranges from -1 to +1: values close to +1 indicate strong agreement between the rankings, values close to -1 indicate strong disagreement, and values near 0 suggest no association.

This implementation uses the SciPy library’s scipy.stats.kendalltau function and supports two variants:

  • Tau-b (default): Adjusts for tied ranks and is suitable when both variables may contain ties. It is computed as:
\tau_b = \frac{P - Q}{\sqrt{(P + Q + T)(P + Q + U)}}
  • Tau-c (Stuart’s tau-c): A variant normalized for rectangular tables, computed as:
\tau_c = \frac{2(P - Q)}{n^2 (m - 1) / m}

In these formulas, P is the number of concordant pairs, Q is the number of discordant pairs, T is the number of pairs tied only in x, U is the number of pairs tied only in y, n is the sample size, and m is the smaller of the number of unique values in x or y. A pair tied in both x and y is counted in neither T nor U. Both variants reduce to Kendall’s original tau-a when no ties are present.
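The two formulas can be checked by hand with a brute-force pair count. The following is a minimal sketch for illustration only, not SciPy's implementation (which is more efficient). The helper names pair_counts, tau_b, and tau_c are hypothetical, and the sketch assumes neither input is constant, since a constant input makes the denominator zero.

```python
from itertools import combinations
from math import sqrt


def pair_counts(x, y):
    """Return (P, Q, T, U) by comparing every pair of observations."""
    P = Q = T = U = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue      # tied in both x and y: counted in neither T nor U
        elif dx == 0:
            T += 1        # tied only in x
        elif dy == 0:
            U += 1        # tied only in y
        elif dx * dy > 0:
            P += 1        # concordant: x and y move in the same direction
        else:
            Q += 1        # discordant: x and y move in opposite directions
    return P, Q, T, U


def tau_b(x, y):
    P, Q, T, U = pair_counts(x, y)
    return (P - Q) / sqrt((P + Q + T) * (P + Q + U))


def tau_c(x, y):
    P, Q, _, _ = pair_counts(x, y)
    n = len(x)
    m = min(len(set(x)), len(set(y)))  # smaller number of unique values
    return 2 * (P - Q) / (n**2 * (m - 1) / m)
```

For x = [12, 2, 1, 12, 2] and y = [1, 4, 7, 1, 0], pair_counts gives P = 2, Q = 6, T = 1, U = 0, so tau_b evaluates to -4 / sqrt(72), about -0.4714, and tau_c (with n = 5, m = 3) evaluates to -0.48.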

The function also returns a p-value for testing the null hypothesis that there is no association between the two variables (\tau = 0). For more information on Kendall’s tau, see the SciPy documentation and the Kendall rank correlation coefficient article on Wikipedia.
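To give some intuition for the p-value, the sketch below computes a two-sided permutation p-value by brute force: it counts how many orderings of y produce a concordance score P - Q at least as extreme as the observed one. This is an illustration under stated assumptions (no ties, very small n, since it enumerates all n! orderings) and is not how SciPy computes its result. The function names are hypothetical.

```python
from itertools import combinations, permutations


def concordance_score(x, y):
    """Return S = P - Q, the concordant minus discordant pair count."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tied
    return s


def exact_two_sided_pvalue(x, y):
    """Share of orderings of y with |S| at least as large as observed.

    Assumes no ties and small n; the loop visits all n! permutations.
    """
    observed = abs(concordance_score(x, y))
    total = extreme = 0
    for perm in permutations(y):
        total += 1
        if abs(concordance_score(x, perm)) >= observed:
            extreme += 1
    return extreme / total
```

With four untied observations in perfect agreement, only 2 of the 24 orderings (the identical and the fully reversed one) reach |S| = 6, so the p-value is 2/24, about 0.0833.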

This example function is provided as-is without any representation of accuracy.

Excel Usage

=KENDALLTAU(x, y, kendalltau_variant)
  • x (list[list], required): Array of rankings or observations. Must be the same length as y.
  • y (list[list], required): Array of rankings or observations. Must be the same length as x.
  • kendalltau_variant (str, optional, default: “b”): Which variant of Kendall’s tau to return: “b” for tau-b or “c” for tau-c.

Returns (list[list]): 2D list [[tau, p_value]], or error message string.

Examples

Example 1: Kendall’s tau-b with tied values

Inputs:

x y kendalltau_variant
12 1 b
2 4
1 7
12 1
2 0

Excel formula:

=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0}, "b")

Expected output:

Result
-0.4714 0.2827

Example 2: Kendall’s tau-c with tied values

Inputs:

x y kendalltau_variant
12 1 c
2 4
1 7
12 1
2 0

Excel formula:

=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0}, "c")

Expected output:

Result
-0.48 0.2827

Example 3: Perfect positive correlation

Inputs:

x y kendalltau_variant
1 1 b
2 2
3 3
4 4

Excel formula:

=KENDALLTAU({1;2;3;4}, {1;2;3;4}, "b")

Expected output:

Result
1 0.0833

Example 4: Perfect negative correlation

Inputs:

x y kendalltau_variant
1 4 b
2 3
3 2
4 1

Excel formula:

=KENDALLTAU({1;2;3;4}, {4;3;2;1}, "b")

Expected output:

Result
-1 0.0833

Python Code

from scipy.stats import kendalltau as scipy_kendalltau

def kendalltau(x, y, kendalltau_variant='b'):
    """
    Calculate Kendall's tau, a correlation measure for ordinal data.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kendalltau.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Array of rankings or observations. Must be the same length as y.
        y (list[list]): Array of rankings or observations. Must be the same length as x.
        kendalltau_variant (str, optional): Defines which variant of Kendall's tau is returned. Valid options: 'b' (tau-b), 'c' (tau-c). Default is 'b'.

    Returns:
        list[list]: 2D list [[tau, p_value]], or error message string.
    """
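    def to2d(val):
        # A single value may arrive as a scalar; wrap it so it looks like a 1x1 range.
        return [[val]] if not isinstance(val, list) else val

    def flatten(arr):
        # Collapse a 2D range (or a plain 1D list) into a flat list of values.
        flat = []
        for row in arr:
            if isinstance(row, list):
                flat.extend(row)
            else:
                flat.append(row)
        return flat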

    x = to2d(x)
    y = to2d(y)

    try:
        x_array = [float(val) for val in flatten(x)]
        y_array = [float(val) for val in flatten(y)]
    except (ValueError, TypeError):
        return "Invalid input: x and y must contain numeric values."

    if len(x_array) != len(y_array):
        return "Invalid input: x and y must have the same length."

    if len(x_array) < 2:
        return "Invalid input: arrays must contain at least 2 elements."

    if kendalltau_variant not in ["b", "c"]:
        return "Invalid input: kendalltau_variant must be 'b' or 'c'."

    try:
        result = scipy_kendalltau(x_array, y_array, variant=kendalltau_variant)
        return [[float(result.statistic), float(result.pvalue)]]
    except Exception as e:
        return f"scipy.stats.kendalltau error: {e}"
