KENDALLTAU
Overview
The KENDALLTAU function calculates Kendall’s tau, a correlation measure for ordinal data. Kendall’s tau is a measure of the correspondence between two rankings, with values close to 1 indicating strong agreement and values close to -1 indicating strong disagreement. This implementation supports both tau-b (default) and tau-c variants, which differ in their normalization methods but produce identical hypothesis tests.
The calculation is based on the number of concordant and discordant pairs:

$$\tau_b = \frac{P - Q}{\sqrt{(P + Q + T)(P + Q + U)}}$$

$$\tau_c = \frac{2(P - Q)}{n^2 (m - 1)/m}$$

where $P$ is the number of concordant pairs, $Q$ the number of discordant pairs, $T$ the number of tied pairs only in x, $U$ the number of tied pairs only in y, $n$ is the total number of samples, and $m$ is the number of unique values in either x or y, whichever is smaller. For more details, see the scipy.stats documentation.
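
The pair counts described above can be turned into both statistics by hand. The following is a minimal sketch, assuming the tau-b and tau-c definitions given in the scipy.stats documentation; the function name `kendall_tau_from_pairs` and the variable names `P`, `Q`, `T`, `U`, `n`, `m` are chosen here for illustration and are not part of KENDALLTAU. It is a quadratic-time check and does not reflect how scipy computes the result internally.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_from_pairs(x, y):
    """Return (tau_b, tau_c) computed directly from pair counts (illustrative helper)."""
    P = Q = T = U = 0  # concordant, discordant, tied only in x, tied only in y
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both x and y: counted in neither T nor U
        elif dx == 0:
            T += 1
        elif dy == 0:
            U += 1
        elif dx * dy > 0:
            P += 1
        else:
            Q += 1
    n = len(x)
    m = min(len(set(x)), len(set(y)))
    # Note: both denominators are zero when x or y is constant; not handled here.
    tau_b = (P - Q) / sqrt((P + Q + T) * (P + Q + U))
    tau_c = 2 * (P - Q) / (n**2 * (m - 1) / m)
    return tau_b, tau_c


tau_b, tau_c = kendall_tau_from_pairs([12, 2, 1, 12, 2], [1, 4, 7, 1, 0])
print(round(tau_b, 6), round(tau_c, 6))  # -0.471405 -0.48
```

For this data the counts are 2 concordant pairs, 6 discordant pairs, 1 pair tied only in x, and 1 pair tied in both, which gives the tau-b and tau-c values listed in Examples 1 and 2.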
This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel:
=KENDALLTAU(x, y, [variant])
- x (2D list, required): Array of rankings or observations. Must be the same length as y.
- y (2D list, required): Array of rankings or observations. Must be the same length as x.
- variant (string, optional, default=“b”): Defines which variant of Kendall’s tau is returned. Options are “b” (tau-b) or “c” (tau-c).
The function returns a 2D list containing a single row with two elements: the tau statistic (float) and the p-value for a test of no association (float). If the input is invalid, it returns an error message (string) instead.
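
Because the return value is either a 2D list or a string, calling code has to check the type before reading the numbers. A minimal sketch of that check follows; the helper name `unpack_kendalltau_result` is hypothetical and only illustrates the documented return shapes.

```python
def unpack_kendalltau_result(result):
    """Split the documented return value into (tau, p_value, error)."""
    if isinstance(result, str):
        # Invalid input is reported as a plain message string.
        return None, None, result
    # Valid input yields one row holding [tau, p-value].
    tau, p_value = result[0]
    return tau, p_value, None


print(unpack_kendalltau_result([[-0.471405, 0.282745]]))
# (-0.471405, 0.282745, None)
print(unpack_kendalltau_result("Invalid input: x and y must have the same length."))
# (None, None, 'Invalid input: x and y must have the same length.')
```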
Examples
Example 1: Basic Kendall’s Tau (tau-b)
This example calculates Kendall’s tau-b correlation between two sets of rankings.
Inputs:
x | y |
---|---|
12 | 1 |
2 | 4 |
1 | 7 |
12 | 1 |
2 | 0 |
Excel formula:
=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0})
Expected output:
Tau | P-value |
---|---|
-0.471405 | 0.282745 |
This shows a moderate negative association between the rankings, although with a p-value of about 0.28 it is not statistically significant at the 0.05 level.
Example 2: Kendall’s Tau-c
This example calculates Kendall’s tau-c correlation using the same data.
Inputs:
x | y |
---|---|
12 | 1 |
2 | 4 |
1 | 7 |
12 | 1 |
2 | 0 |
Excel formula:
=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0}, "c")
Expected output:
Tau | P-value |
---|---|
-0.48 | 0.282745 |
This shows tau-c has a different normalization but the same p-value.
Example 3: Perfect Positive Correlation
This example demonstrates perfect positive correlation.
Inputs:
x | y |
---|---|
1 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
Excel formula:
=KENDALLTAU({1;2;3;4}, {1;2;3;4})
Expected output:
Tau | P-value |
---|---|
1.0 | 0.083333 |
This shows perfect positive correlation. With only four observations, the p-value of about 0.083 is not significant at the conventional 0.05 level.
Example 4: Perfect Negative Correlation
This example demonstrates perfect negative correlation.
Inputs:
x | y |
---|---|
1 | 4 |
2 | 3 |
3 | 2 |
4 | 1 |
Excel formula:
=KENDALLTAU({1;2;3;4}, {4;3;2;1})
Expected output:
Tau | P-value |
---|---|
-1.0 | 0.083333 |
This shows perfect negative correlation. With only four observations, the p-value of about 0.083 is not significant at the conventional 0.05 level.
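
The p-value of 0.083333 in Examples 3 and 4 can be reproduced with a brute-force permutation check: of the 4! = 24 possible orderings of four untied values, only the identical and the fully reversed ordering are as extreme as the observed one, giving 2/24. The sketch below assumes data without ties and a two-sided test; the function names are illustrative, and this enumeration is not how scipy computes its exact p-value.

```python
from itertools import combinations, permutations


def concordant_minus_discordant(x, y):
    """Return P - Q for two equally long sequences without ties."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        score += 1 if (x1 - x2) * (y1 - y2) > 0 else -1
    return score


def exact_two_sided_pvalue(x, y):
    """Share of orderings of y whose |P - Q| is at least the observed one."""
    observed = abs(concordant_minus_discordant(x, y))
    scores = [abs(concordant_minus_discordant(x, perm)) for perm in permutations(y)]
    return sum(s >= observed for s in scores) / len(scores)


print(round(exact_two_sided_pvalue([1, 2, 3, 4], [1, 2, 3, 4]), 6))  # 0.083333
print(round(exact_two_sided_pvalue([1, 2, 3, 4], [4, 3, 2, 1]), 6))  # 0.083333
```

The enumeration grows factorially with the sample size, so it is only practical for very small inputs like these.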
Python Code
from scipy.stats import kendalltau as scipy_kendalltau


def kendalltau(x, y, variant="b"):
    """
    Calculate Kendall's tau, a correlation measure for ordinal data.

    Args:
        x: 2D list of rankings or observations. Must be the same length as y.
        y: 2D list of rankings or observations. Must be the same length as x.
        variant: Defines which variant of Kendall's tau is returned. Options are "b" (default) or "c".

    Returns:
        2D list containing [tau statistic, p-value] (list of floats),
        or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Handle case where Excel passes single values as scalars
    if not isinstance(x, list):
        x = [[x]]
    if not isinstance(y, list):
        y = [[y]]

    # Convert 2D lists to 1D arrays
    try:
        x_flat = []
        for row in x:
            if isinstance(row, list):
                x_flat.extend(row)
            else:
                x_flat.append(row)
        y_flat = []
        for row in y:
            if isinstance(row, list):
                y_flat.extend(row)
            else:
                y_flat.append(row)

        # Convert to numeric arrays
        x_array = [float(val) for val in x_flat]
        y_array = [float(val) for val in y_flat]
    except (ValueError, TypeError):
        return "Invalid input: x and y must contain numeric values."

    # Check that arrays have the same length
    if len(x_array) != len(y_array):
        return "Invalid input: x and y must have the same length."

    # Check minimum length
    if len(x_array) < 2:
        return "Invalid input: arrays must contain at least 2 elements."

    # Validate variant parameter
    if variant not in ["b", "c"]:
        return "Invalid input: variant must be 'b' or 'c'."

    try:
        # Calculate Kendall's tau
        result = scipy_kendalltau(x_array, y_array, variant=variant)
        tau = float(result.statistic)
        pvalue = float(result.pvalue)
        # Return as 2D list (single row, two columns)
        return [[tau, pvalue]]
    except Exception as e:
        return f"scipy.stats.kendalltau error: {e}"
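
The input-handling steps in the function above can be seen in isolation with a small sketch. It assumes that Excel ranges arrive as 2D lists with one inner list per row and that a single cell arrives as a scalar, as the comments in the function suggest; the helper name `flatten_2d` is hypothetical and does not exist in the original code.

```python
def flatten_2d(values):
    """Flatten a scalar, a 1D list, or a 2D list into a flat list of floats."""
    if not isinstance(values, list):
        values = [[values]]  # a single cell passed as a scalar
    flat = []
    for row in values:
        if isinstance(row, list):
            flat.extend(row)
        else:
            flat.append(row)
    # float() raises ValueError or TypeError for non-numeric cells
    return [float(v) for v in flat]


print(flatten_2d([[12], [2], [1]]))  # column range -> [12.0, 2.0, 1.0]
print(flatten_2d([[12, 2, 1]]))      # row range    -> [12.0, 2.0, 1.0]
print(flatten_2d(5))                 # single cell  -> [5.0]
```

Row and column ranges flatten to the same 1D sequence, which is why only the total number of values in x and y has to match.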