KRUSKAL
Overview
The KRUSKAL function performs the Kruskal-Wallis H-test for independent samples, a non-parametric statistical test used to determine whether there are statistically significant differences between the medians of two or more groups. Unlike one-way ANOVA, Kruskal-Wallis does not assume normality or equal variances, making it suitable for ordinal or non-normally distributed data. The test statistic is calculated as:

$$H = (N - 1) \frac{\sum_{i=1}^{g} n_i \left(\bar{r}_{i} - \bar{r}\right)^2}{\sum_{i=1}^{g} \sum_{j=1}^{n_i} \left(r_{ij} - \bar{r}\right)^2}$$

where $N$ is the total number of observations, $g$ is the number of groups, $n_i$ is the number of observations in group $i$, $r_{ij}$ is the rank (among all observations) of observation $j$ in group $i$, $\bar{r}_{i}$ is the average rank of group $i$, and $\bar{r}$ is the overall average rank. For more details, see the scipy.stats.kruskal documentation.
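The definitions above (group sizes, group average ranks, overall average rank) can be turned into a short computation. The following is a minimal sketch, assuming the statistic is the ratio of between-group to total rank variation scaled by N - 1; the helper name h_statistic is hypothetical and is not part of the wrapper. It uses scipy.stats.rankdata to rank the pooled observations and compares the result with scipy.stats.kruskal.

```python
from scipy.stats import kruskal, rankdata


def h_statistic(groups):
    """Kruskal-Wallis H from pooled ranks (hypothetical helper, not part of the wrapper)."""
    pooled = [x for g in groups for x in g]
    ranks = rankdata(pooled)      # average ranks; tied values share a rank
    n_total = len(pooled)
    mean_rank = ranks.mean()      # overall average rank
    between = 0.0
    start = 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        # n_i * (average rank of group i - overall average rank)^2
        between += len(g) * (group_ranks.mean() - mean_rank) ** 2
        start += len(g)
    total = ((ranks - mean_rank) ** 2).sum()
    return (n_total - 1) * between / total


# Three groups of three observations each (illustrative data).
groups = [[10, 20, 30], [15, 25, 35], [12, 22, 32]]
print(h_statistic(groups))
print(kruskal(*groups).statistic)
```

Both printed values should agree up to floating-point rounding, which is a quick way to check the hand computation against the library.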
This wrapper simplifies the function to accept only the most commonly used parameter: a 2D list of samples, with each column representing a group. Parameters such as nan_policy, axis, and keepdims are excluded for Excel compatibility. This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel:
=KRUSKAL(samples)
samples (2D list, required): Table where each column is a sample group and each row is an observation. Each group must have at least two values, and all groups must have the same number of values.
The function returns a single row with two values: the Kruskal-Wallis test statistic and the p-value, both as floats. If the input is invalid, it returns an error message (string).
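To illustrate the input and output shapes described above, here is a minimal sketch that calls scipy.stats.kruskal directly. It assumes the table arrives as a list of rows (one inner list per worksheet row), which is an assumption about how the range is passed and not something the text above states; the variable names are illustrative only.

```python
from scipy.stats import kruskal

# Table as a list of rows (assumed layout): two columns, three observations each.
rows = [
    [1.2, 1.1],
    [2.1, 1.4],
    [2.3, 1.2],
]

# Transpose so that each column becomes one sample group.
groups = [list(col) for col in zip(*rows)]

result = kruskal(*groups)

# Single output row: [statistic, p-value], converted to plain floats.
output = [[float(result.statistic), float(result.pvalue)]]
print(groups)
print(output)
```

Returning a 2D list with one row lets the two values spill into adjacent cells in a worksheet.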
Examples
Example 1: Basic Two Groups
Inputs:
samples | |
---|---|
1.2 | 1.1 |
2.1 | 1.4 |
2.3 | 1.2 |
Excel formula:
=KRUSKAL({1.2,1.1;2.1,1.4;2.3,1.2})
Expected output:
Statistic | p-value |
---|---|
1.765 | 0.184 |
Example 2: Three Groups
Inputs:
samples | ||
---|---|---|
1.2 | 1.1 | 2.0 |
2.1 | 1.4 | 2.2 |
2.3 | 1.2 | 2.4 |
Excel formula:
=KRUSKAL({1.2,1.1,2.0;2.1,1.4,2.2;2.3,1.2,2.4})
Expected output:
Statistic | p-value |
---|---|
4.235 | 0.120 |
Example 3: All Arguments Typical
Inputs:
samples | ||
---|---|---|
10 | 15 | 12 |
20 | 25 | 22 |
30 | 35 | 32 |
Excel formula:
=KRUSKAL({10,15,12;20,25,22;30,35,32})
Expected output:
Statistic | p-value |
---|---|
0.800 | 0.670 |
Example 4: Groups with Ties
Inputs:
samples | ||
---|---|---|
1 | 2 | 1 |
2 | 2 | 1 |
2 | 3 | 1 |
Excel formula:
=KRUSKAL({1,2,1;2,2,1;2,3,1})
Expected output:
Statistic | p-value |
---|---|
5.627 | 0.060 |
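Example 4 contains tied values. The scipy.stats.kruskal documentation describes its statistic as corrected for ties. The sketch below shows one standard way to apply such a correction: compute H from the group rank sums, then divide by 1 - sum(t^3 - t) / (N^3 - N), where t is the size of each block of tied values. The variable names are illustrative, and the sketch is not taken from the wrapper itself.

```python
from collections import Counter

from scipy.stats import kruskal, rankdata

# Columns of the Example 4 table, one list per group.
groups = [[1, 2, 2], [2, 2, 3], [1, 1, 1]]
pooled = [x for g in groups for x in g]
n_total = len(pooled)

# Uncorrected H from the rank sum of each group.
ranks = rankdata(pooled)  # tied values receive their average rank
h, start = 0.0, 0
for g in groups:
    rank_sum = ranks[start:start + len(g)].sum()
    h += rank_sum ** 2 / len(g)
    start += len(g)
h = 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

# Tie correction factor; values that occur once contribute zero.
tie_sizes = Counter(pooled).values()
correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)

h_corrected = h / correction
print(h_corrected)
print(kruskal(*groups).statistic)
```

The corrected value should match the statistic reported by scipy.stats.kruskal for the same groups.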
Python Code
from typing import List, Union

from scipy.stats import kruskal as scipy_kruskal


def kruskal(samples: List[List[float]]) -> Union[List[List[float]], str]:
    """
    Computes the Kruskal-Wallis H-test for independent samples.

    Args:
        samples: 2D list of float values. Each column represents a sample group.

    Returns:
        2D list with a single row: [statistic, pvalue], or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Validate input type
    if not isinstance(samples, list) or not samples:
        return "Invalid input: samples must be a 2D list with at least two columns (groups)."
    # Check each row is a list of numbers and all rows have the same length
    try:
        n_cols = None
        for row in samples:
            if not isinstance(row, list):
                return "Invalid input: samples must be a 2D list of numeric values."
            # Check all elements are float/int
            for v in row:
                if not isinstance(v, (int, float)):
                    return "Invalid input: all sample values must be numeric."
            if n_cols is None:
                n_cols = len(row)
            elif n_cols != len(row):
                return "Invalid input: all sample groups must have the same number of values."
        if n_cols < 2:
            return "Invalid input: samples must be a 2D list with at least two columns (groups)."
        if len(samples) < 2:
            return "Invalid input: each sample group must be a list with at least two values."
        # Each inner list is a row, so transpose to get one list per column (group)
        columns = [[float(x) for x in col] for col in zip(*samples)]
    except Exception:
        return "Invalid input: samples must be a 2D list of numeric values."
    # Run test
    try:
        result = scipy_kruskal(*columns)
    except Exception as e:
        return f"scipy.stats.kruskal error: {e}"
    # Convert np.float64 to float for output
    stat = float(result.statistic)
    pval = float(result.pvalue)
    if any(x is None or not isinstance(x, (int, float)) for x in [stat, pval]):
        return "Invalid output from scipy.stats.kruskal."
    return [[stat, pval]]