WEIGHTEDTAU
Overview
The WEIGHTEDTAU function computes a weighted version of Kendall's tau correlation coefficient. The weighted tau gives more influence to exchanges involving high-importance elements, with importance determined by rank. The default parameters compute the additive hyperbolic version (τ_h), which has been shown to provide the best balance between important and unimportant elements. The weighting is defined by a rank array that assigns importance to each element (smaller rank values mean higher importance, with 0 the highest), and a weigher function that assigns a weight to each element based on its rank. The weight of an exchange between elements $i$ and $j$ is
$$w(r_i, r_j) = \frac{1}{r_i + 1} + \frac{1}{r_j + 1}$$
for the additive hyperbolic case, where $r_i$ and $r_j$ are the zero-based ranks of elements $i$ and $j$. For more details, see the scipy.stats documentation.
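To make the role of the exchange weights concrete, the following is a minimal quadratic-time sketch, not SciPy's actual algorithm. It assumes the coefficient is the weighted sum of concordant minus discordant pairs, normalized by the weight of the pairs not tied in x and not tied in y. The names `sign`, `hyperbolic_weight`, and `naive_weighted_tau` are hypothetical and the ranks are passed in explicitly.

```python
from math import sqrt


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def hyperbolic_weight(rank):
    """Hyperbolic weigher: a zero-based rank r maps to 1 / (r + 1)."""
    return 1.0 / (rank + 1)


def naive_weighted_tau(x, y, ranks, additive=True):
    """Quadratic-time weighted tau for explicitly supplied zero-based ranks.

    Assumed definition (illustrative only): weighted concordance divided by
    the geometric mean of the untied pair weights in x and in y.
    Raises ZeroDivisionError if x or y is entirely tied.
    """
    n = len(x)
    concordance = 0.0  # weighted concordant pairs minus discordant pairs
    norm_x = 0.0       # total weight of pairs that are not tied in x
    norm_y = 0.0       # total weight of pairs that are not tied in y
    for i in range(n):
        for j in range(i + 1, n):
            wi = hyperbolic_weight(ranks[i])
            wj = hyperbolic_weight(ranks[j])
            weight = wi + wj if additive else wi * wj
            sx = sign(x[i] - x[j])
            sy = sign(y[i] - y[j])
            concordance += weight * sx * sy
            norm_x += weight * sx * sx
            norm_y += weight * sy * sy
    return concordance / sqrt(norm_x * norm_y)


# Index-based ranks (element i has rank i) on the data used in the examples.
x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
print(round(naive_weighted_tau(x, y, ranks=[0, 1, 2, 3, 4]), 3))  # -0.516
```

With index-based ranks this sketch agrees with the value listed in Example 3. For real work, prefer `scipy.stats.weightedtau`, which is far more efficient than this pairwise loop.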
This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel:
=WEIGHTEDTAU(x, y, [rank], [additive])
x (2D list, required): Array of scores. Must be a column vector (single column).
y (2D list, required): Array of scores of the same length as x. Must be a column vector (single column).
rank (bool, optional, default=True): If True, compute the coefficient twice, once with ranks taken from the decreasing lexicographical order of (x, y) and once from that of (y, x), and average the two results. If False, use element indices as ranks.
additive (bool, optional, default=True): If True, compute the weight of an exchange by adding the weights of the two elements; if False, by multiplying them.
The function returns the weighted tau correlation coefficient (float), or an error message (string) if the input is invalid. No p-value is returned: the null distribution of the statistic is unknown, so SciPy itself reports the p-value as NaN.
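The rank=True behavior described above can be sketched in plain Python. This is an illustrative reconstruction under assumptions, not SciPy's implementation: it assumes hyperbolic weights 1/(r + 1), a pairwise definition of the coefficient, and hypothetical helper names (`lexicographic_ranks`, `tau_for_ranks`, `averaged_weighted_tau`).

```python
from math import sqrt


def lexicographic_ranks(primary, secondary):
    """Zero-based ranks from the decreasing lexicographical order of (primary, secondary)."""
    order = sorted(range(len(primary)),
                   key=lambda i: (primary[i], secondary[i]),
                   reverse=True)
    ranks = [0] * len(primary)
    for position, index in enumerate(order):
        ranks[index] = position  # largest pair gets rank 0 (most important)
    return ranks


def tau_for_ranks(x, y, ranks, additive=True):
    """Pairwise weighted tau with hyperbolic weights (assumed definition)."""
    concordance = norm_x = norm_y = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            wi, wj = 1.0 / (ranks[i] + 1), 1.0 / (ranks[j] + 1)
            weight = wi + wj if additive else wi * wj
            sx = (x[i] > x[j]) - (x[i] < x[j])  # sign of x[i] - x[j]
            sy = (y[i] > y[j]) - (y[i] < y[j])  # sign of y[i] - y[j]
            concordance += weight * sx * sy
            norm_x += weight * sx * sx
            norm_y += weight * sy * sy
    return concordance / sqrt(norm_x * norm_y)


def averaged_weighted_tau(x, y, additive=True):
    """Average the coefficients obtained with (x, y) ranks and with (y, x) ranks."""
    first = tau_for_ranks(x, y, lexicographic_ranks(x, y), additive)
    second = tau_for_ranks(y, x, lexicographic_ranks(y, x), additive)
    return (first + second) / 2


x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
print(round(averaged_weighted_tau(x, y), 3))                  # -0.567
print(round(averaged_weighted_tau(x, y, additive=False), 3))  # -0.622
```

Averaging the two orderings makes the result symmetric in x and y, which a single ranking derived from one of the arrays would not be.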
Examples
Example 1: Basic Weighted Tau (Default Parameters)
This example calculates the weighted tau using default additive hyperbolic weighting with rank averaging.
Inputs:
x | y | rank | additive |
---|---|---|---|
12 | 1 | True | True |
2 | 4 | | |
1 | 7 | | |
12 | 1 | | |
2 | 0 | | |
Excel formula:
=WEIGHTEDTAU({12;2;1;12;2}, {1;4;7;1;0})
Expected output:
Result |
---|
-0.567 |
This shows a moderate negative weighted correlation between the two variables.
Example 2: Multiplicative Weighting
This example uses multiplicative weighting instead of additive weighting.
Inputs:
x | y | rank | additive |
---|---|---|---|
12 | 1 | True | False |
2 | 4 | | |
1 | 7 | | |
12 | 1 | | |
2 | 0 | | |
Excel formula:
=WEIGHTEDTAU({12;2;1;12;2}, {1;4;7;1;0}, True, False)
Expected output:
Result |
---|
-0.622 |
Multiplicative weighting yields a somewhat stronger negative value here (-0.622, compared with -0.567 under additive weighting).
Example 3: Index-Based Ranking
This example uses element indices directly as ranks instead of data-based ranking.
Inputs:
x | y | rank | additive |
---|---|---|---|
12 | 1 | False | True |
2 | 4 | | |
1 | 7 | | |
12 | 1 | | |
2 | 0 | | |
Excel formula:
=WEIGHTEDTAU({12;2;1;12;2}, {1;4;7;1;0}, False)
Expected output:
Result |
---|
-0.516 |
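With index-based ranks, elements that appear earlier in the arrays carry more weight regardless of their scores. Here the result (-0.516) is slightly weaker than with the default ranking (-0.567).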
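The arithmetic behind this value can be traced by hand. The worked calculation below is a sketch that assumes hyperbolic weights $w_i = 1/(i+1)$ for index $i$, additive exchange weights, and normalization by the weight of pairs not tied in x and not tied in y. It is a hand check, not output taken from SciPy.

```latex
\begin{aligned}
(w_0,\dots,w_4) &= \left(1,\ \tfrac12,\ \tfrac13,\ \tfrac14,\ \tfrac15\right),
\qquad \text{total pair weight} = 4\sum_{i=0}^{4} w_i = \tfrac{548}{60} \\[4pt]
\text{concordant pairs } (0,4),(3,4):\quad
 & \tfrac{6}{5} + \tfrac{9}{20} = \tfrac{99}{60} \\[4pt]
\text{discordant pairs } (0,1),(0,2),(1,2),(1,3),(2,3),(2,4):\quad
 & \tfrac32 + \tfrac43 + \tfrac56 + \tfrac34 + \tfrac{7}{12} + \tfrac{8}{15} = \tfrac{332}{60} \\[4pt]
\text{pairs tied in } x:\ (0,3),(1,4):\quad
 & \tfrac54 + \tfrac{7}{10} = \tfrac{117}{60}
 \;\Rightarrow\; N_x = \tfrac{548 - 117}{60} = \tfrac{431}{60} \\[4pt]
\text{pairs tied in } y:\ (0,3):\quad
 & \tfrac54 = \tfrac{75}{60}
 \;\Rightarrow\; N_y = \tfrac{548 - 75}{60} = \tfrac{473}{60} \\[4pt]
\tau &= \frac{\tfrac{99}{60} - \tfrac{332}{60}}{\sqrt{N_x N_y}}
      = \frac{-233}{\sqrt{431 \cdot 473}} \approx -0.516
\end{aligned}
```

The six discordant pairs outweigh the two concordant ones, and the heaviest exchanges, those involving element 0, are mostly discordant, which is why the coefficient is clearly negative.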
Example 4: Different Data Set
This example demonstrates the function with a different data set showing positive correlation.
Inputs:
x | y | rank | additive |
---|---|---|---|
1 | 2 | True | True |
2 | 3 | | |
3 | 4 | | |
4 | 5 | | |
5 | 6 | | |
Excel formula:
=WEIGHTEDTAU({1;2;3;4;5}, {2;3;4;5;6})
Expected output:
Result |
---|
1.000 |
This shows perfect positive weighted correlation for monotonically increasing data.
Python Code
from scipy.stats import weightedtau as scipy_weightedtau


def weightedtau(x, y, rank=True, additive=True):
    """
    Compute a weighted version of Kendall's tau correlation coefficient.

    Args:
        x: 2D list containing scores as a column vector.
        y: 2D list containing scores as a column vector, same length as x.
        rank: If True, use decreasing lexicographical rank averaging (x,y) and (y,x).
            If False, use element indices as ranks.
        additive: If True, compute weights by addition; if False, by multiplication.

    Returns:
        The weighted tau correlation coefficient (float), or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Handle scalar inputs by converting to 2D lists
    if not isinstance(x, list):
        x = [[x]]
    if not isinstance(y, list):
        y = [[y]]

    # Validate x input
    if not isinstance(x, list) or len(x) == 0:
        return "Invalid input: x must be a non-empty 2D list."

    # Check if x is a 2D list and extract column vector
    x_values = []
    try:
        if isinstance(x[0], list):
            # 2D list - should be a column vector
            if any(len(row) != 1 for row in x):
                return "Invalid input: x must be a column vector (single column)."
            x_values = [float(row[0]) for row in x]
        else:
            # 1D list - convert to column vector format
            x_values = [float(val) for val in x]
    except (ValueError, TypeError):
        return "Invalid input: x must contain numeric values."

    # Validate y input
    if not isinstance(y, list) or len(y) == 0:
        return "Invalid input: y must be a non-empty 2D list."

    # Check if y is a 2D list and extract column vector
    y_values = []
    try:
        if isinstance(y[0], list):
            # 2D list - should be a column vector
            if any(len(row) != 1 for row in y):
                return "Invalid input: y must be a column vector (single column)."
            y_values = [float(row[0]) for row in y]
        else:
            # 1D list - convert to column vector format
            y_values = [float(val) for val in y]
    except (ValueError, TypeError):
        return "Invalid input: y must contain numeric values."

    # Check if x and y have the same length
    if len(x_values) != len(y_values):
        return "Invalid input: x and y must have the same length."

    # Check minimum length requirement
    if len(x_values) < 2:
        return "Invalid input: x and y must have at least 2 elements."

    # Validate rank parameter
    if not isinstance(rank, bool):
        return "Invalid input: rank must be a boolean value."

    # Validate additive parameter
    if not isinstance(additive, bool):
        return "Invalid input: additive must be a boolean value."

    try:
        # Call scipy's weightedtau function
        result = scipy_weightedtau(x_values, y_values, rank=rank, additive=additive)
        return result.statistic
    except Exception as e:
        return f"scipy.stats.weightedtau error: {e}"
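The x and y validation steps in the function above are identical apart from the parameter name. One possible refactoring, shown here only as a sketch, moves them into a helper. `column_values` is a hypothetical name that is not part of the original code, and its `(values, error)` return convention is an assumption made for illustration.

```python
def column_values(data, name):
    """Flatten an Excel-style column vector into a list of floats.

    Returns (values, None) on success or (None, error_message) on failure,
    mirroring the error strings used by the wrapper function.
    """
    # A scalar cell is treated as a 1x1 range
    if not isinstance(data, list):
        data = [[data]]
    if len(data) == 0:
        return None, f"Invalid input: {name} must be a non-empty 2D list."
    try:
        if isinstance(data[0], list):
            # 2D list: every row must hold exactly one value
            if any(len(row) != 1 for row in data):
                return None, f"Invalid input: {name} must be a column vector (single column)."
            return [float(row[0]) for row in data], None
        # 1D list: accept it as an already-flattened column
        return [float(val) for val in data], None
    except (ValueError, TypeError):
        return None, f"Invalid input: {name} must contain numeric values."


values, error = column_values([[12], [2], [1], [12], [2]], "x")
print(values)  # [12.0, 2.0, 1.0, 12.0, 2.0]
print(error)   # None
```

A caller would check `error` first and return it unchanged, which keeps the messages seen in Excel the same as in the original function.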