SOMERSD
Overview
The SOMERSD function calculates Somers’ D, an asymmetric measure of ordinal association between two variables. Like Kendall’s tau, Somers’ D measures the correspondence between two rankings, but it is normalized differently and is asymmetric: in general, $D(Y|X) \neq D(X|Y)$. Somers’ D is useful for analyzing the strength and direction of association between an independent (row) variable and a dependent (column) variable, especially for ordinal data or contingency tables. The calculation can be performed on two lists of rankings or on a 2D contingency table. For more details, see the scipy.stats.somersd documentation.
This example function is provided as-is without any representation of accuracy.
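To make the asymmetry concrete, here is a minimal pure-Python sketch that computes Somers’ D by counting pairs directly: the number of concordant pairs minus the number of discordant pairs, divided by the number of pairs that are not tied on the independent variable. The helper name `somers_d_pairs` is hypothetical and is not part of SciPy or of the SOMERSD function; it only illustrates the definition and returns no p-value.

```python
from itertools import combinations


def somers_d_pairs(x, y):
    """Somers' D(Y|X) by direct pair counting (illustrative helper, not a SciPy API)."""
    concordant = discordant = untied_on_x = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # pairs tied on the independent variable are excluded
        untied_on_x += 1
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 here means tied on y only: it counts in the denominator only
    if untied_on_x == 0:
        raise ValueError("all observations are tied on x")
    return (concordant - discordant) / untied_on_x


x = [1, 1, 2, 2]
y = [1, 2, 3, 4]
d_yx = somers_d_pairs(x, y)  # y treated as dependent on x
d_xy = somers_d_pairs(y, x)  # x treated as dependent on y
print(d_yx, d_xy)
```

Because `x` contains ties and `y` does not, the two denominators differ, so swapping the arguments changes the result.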
Usage
To use the function in Excel:
=SOMERSD(x, [y], [alternative])
x (2D list, required): Either a 2D list of rankings (as a column vector) or a 2D contingency table. If a 2D list of rankings, it must have at least two rows.
y (2D list, optional): 2D list of rankings (as a column vector) with the same number of rows as x. Ignored if x is a contingency table.
alternative (str, optional, default=“two-sided”): Defines the alternative hypothesis. One of “two-sided”, “less”, or “greater”.
The function returns a 2D list with two elements: the Somers’ D statistic and the p-value, or an error message (string) if the input is invalid.
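The effect of the alternative argument can be seen by calling scipy.stats.somersd directly, which is the function SOMERSD wraps. The sketch below uses made-up rankings and assumes a SciPy version in which `somersd` accepts the `alternative` keyword.

```python
from scipy.stats import somersd

# Illustrative rankings with a weak positive association
x = [1, 2, 3, 4, 5]
y = [3, 1, 5, 2, 4]

two_sided = somersd(x, y, alternative="two-sided")
greater = somersd(x, y, alternative="greater")
less = somersd(x, y, alternative="less")

# The statistic is the same in all three cases; only the p-value changes
print(two_sided.statistic)
print(two_sided.pvalue, greater.pvalue, less.pvalue)
```

Since the statistic is positive here, the “greater” p-value is half the two-sided one, and the “less” and “greater” p-values sum to 1.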
Examples
Example 1: Contingency Table (Hotel Cleanliness and Satisfaction)
Inputs:
x | | | | |
---|---|---|---|---|
27 | 25 | 14 | 7 | 0 |
7 | 14 | 18 | 35 | 12 |
1 | 3 | 2 | 7 | 17 |
Excel formula:
=SOMERSD({27,25,14,7,0;7,14,18,35,12;1,3,2,7,17})
Expected output:
Statistic | P-value |
---|---|
0.603277 | 1.00071E-27 |
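As a cross-check, the same table can be passed straight to scipy.stats.somersd. The sketch below also transposes the table, which swaps the roles of the independent (row) and dependent (column) variables and should therefore give a different statistic. The variable names are illustrative only.

```python
import numpy as np
from scipy.stats import somersd

# Contingency table from Example 1 (rows: independent variable, columns: dependent variable)
table = np.array([
    [27, 25, 14, 7, 0],
    [7, 14, 18, 35, 12],
    [1, 3, 2, 7, 17],
])

d_col_given_row = somersd(table).statistic    # columns dependent on rows
d_row_given_col = somersd(table.T).statistic  # rows dependent on columns

print(d_col_given_row, d_row_given_col)
```

The first value corresponds to the statistic shown in the expected output; the second is smaller because the column totals produce fewer ties than the row totals, which enlarges the denominator.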
Example 2: Two Rankings (Perfect Agreement)
Inputs:
x | y |
---|---|
1 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
5 | 5 |
Excel formula:
=SOMERSD({1;2;3;4;5}, {1;2;3;4;5})
Expected output:
Statistic | P-value |
---|---|
1 | 0 |
Example 3: Two Rankings (Perfect Disagreement)
Inputs:
x | y |
---|---|
1 | 5 |
2 | 4 |
3 | 3 |
4 | 2 |
5 | 1 |
Excel formula:
=SOMERSD({1;2;3;4;5}, {5;4;3;2;1})
Expected output:
Statistic | P-value |
---|---|
-1 | 0 |
Example 4: Two Rankings (Random Association)
Inputs:
x | y |
---|---|
1 | 3 |
2 | 1 |
3 | 5 |
4 | 2 |
5 | 4 |
Excel formula:
=SOMERSD({1;2;3;4;5}, {3;1;5;2;4})
Expected output:
Statistic | P-value |
---|---|
0.2 | 0.3613104285261787 |
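The two input forms described in the overview are interchangeable: a pair of rankings can be cross-tabulated into a contingency table, and the table yields the same statistic. A minimal sketch using the Example 4 data, assuming `scipy.stats.contingency.crosstab` is available in the installed SciPy version:

```python
from scipy.stats import somersd
from scipy.stats.contingency import crosstab

x = [1, 2, 3, 4, 5]
y = [3, 1, 5, 2, 4]

# crosstab returns the unique elements and the table of counts; index 1 is the count table
table = crosstab(x, y)[1]

from_rankings = somersd(x, y)
from_table = somersd(table)

print(table)
print(from_rankings.statistic, from_table.statistic)
```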
Python Code
import math

from scipy.stats import somersd as scipy_somersd


def somersd(x, y=None, alternative="two-sided"):
    """
    Calculate Somers' D, an asymmetric measure of ordinal association between two variables.

    Args:
        x: 2D list, either a column vector of rankings or a 2D contingency table. If rankings, must have at least two rows.
        y: 2D list, optional. Column vector of rankings, same number of rows as x. Ignored if x is a contingency table.
        alternative: str, optional. Defines the alternative hypothesis. One of "two-sided", "less", or "greater" (default: "two-sided").

    Returns:
        2D list: [[statistic, pvalue]], or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Validate x
    if not isinstance(x, list) or len(x) < 2:
        return "Invalid input: x must be a 2D list with at least two rows."
    # x is a contingency table if every row is a list with more than one column;
    # otherwise it is treated as a column vector of rankings
    is_contingency = all(isinstance(row, list) and len(row) > 1 for row in x)
    try:
        if is_contingency:
            # Validate that all elements are numbers (a failure is caught below)
            for row in x:
                for val in row:
                    float(val)
            result = scipy_somersd(x, alternative=alternative)
        else:
            # x and y must be column vectors; flatten them to 1D lists
            x_vec = [float(row[0]) if isinstance(row, list) else float(row) for row in x]
            if y is None:
                return "Invalid input: y must be provided when x is a vector."
            if not isinstance(y, list) or len(y) != len(x):
                return "Invalid input: y must be a 2D list with the same number of rows as x."
            y_vec = [float(row[0]) if isinstance(row, list) else float(row) for row in y]
            result = scipy_somersd(x_vec, y_vec, alternative=alternative)
        stat = result.statistic
        pval = result.pvalue
        if stat is None or pval is None:
            return "Invalid result: statistic or pvalue is None."
        # Convert NumPy scalars to plain floats and disallow NaN/inf
        stat, pval = float(stat), float(pval)
        if not math.isfinite(stat):
            return "Invalid result: statistic is NaN or infinite."
        if not math.isfinite(pval):
            return "Invalid result: pvalue is NaN or infinite."
        return [[stat, pval]]
    except Exception as e:
        return f"Error: {e}"