KRUSKAL

Overview

The KRUSKAL function performs the Kruskal-Wallis H-test, a non-parametric statistical test for comparing two or more independent samples to determine whether they originate from the same distribution. Named after William Kruskal and W. Allen Wallis, this test serves as a non-parametric alternative to one-way ANOVA when the assumption of normally distributed residuals cannot be met.

The test operates on ranks rather than raw values. All observations from all groups are combined and ranked from 1 to N (with tied values receiving the average of the ranks they would have obtained). The test statistic H is computed as:

H = \frac{12}{N(N+1)} \sum_{i=1}^{g} n_i \bar{r}_{i}^2 - 3(N+1)

where N is the total number of observations, g is the number of groups, n_i is the number of observations in group i, and \bar{r}_i is the average rank of all observations in group i. When ties are present, SciPy divides H by the correction factor 1 - \sum_j (t_j^3 - t_j) / (N^3 - N), where t_j is the number of observations sharing the j-th tied value.
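
The formula above can be checked by hand. The following is a minimal sketch, not the SciPy implementation: it pools the groups, ranks them with scipy.stats.rankdata (average ranks for ties), evaluates H, and applies the standard tie correction. The helper name kruskal_h_by_hand is hypothetical and exists only for this illustration; the data are the tied-values groups from Example 4.

```python
from collections import Counter

from scipy.stats import kruskal, rankdata


def kruskal_h_by_hand(groups):
    """Illustrative helper: tie-corrected H computed directly from the formula."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = rankdata(pooled)  # tied values receive the average rank

    # Accumulate n_i * (mean rank of group i)^2 over the groups.
    weighted = 0.0
    start = 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        weighted += len(g) * group_ranks.mean() ** 2
        start += len(g)

    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
    # where t is the size of each set of tied values.
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - tie_sum / (n_total ** 3 - n_total)
    return float(h / correction)


groups = [[1, 2, 2], [2, 2, 3], [1, 1, 1]]
h_manual = kruskal_h_by_hand(groups)
h_scipy = float(kruskal(*groups).statistic)
print(round(h_manual, 5), round(h_scipy, 5))
```

Both values should agree with each other and with the statistic listed for the tied-values example, which is a quick way to confirm that the formula and the tie correction are being read correctly.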

This implementation uses the scipy.stats.kruskal function from SciPy, which returns both the H statistic (corrected for ties) and a p-value calculated under the assumption that H follows a chi-squared distribution with g-1 degrees of freedom.
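
As a small illustration of the chi-squared approximation described above, the p-value can be recovered from H as the upper-tail probability of a chi-squared distribution with g-1 degrees of freedom. This sketch uses scipy.stats.chi2.sf and the three groups from Example 3; the variable names are chosen for illustration only.

```python
from scipy.stats import chi2, kruskal

groups = [[10, 20, 30], [15, 25, 35], [12, 22, 32]]
result = kruskal(*groups)

# Degrees of freedom: number of groups minus one.
df = len(groups) - 1

# Upper-tail (survival function) probability of chi-squared at H.
p_manual = float(chi2.sf(result.statistic, df))

print(p_manual, float(result.pvalue))
```

The two printed numbers should match, since the p-value is defined as this tail probability.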

The null hypothesis states that the population medians of all groups are equal. Rejecting the null hypothesis indicates that at least one sample stochastically dominates another, but does not identify which specific groups differ. Post-hoc tests such as Dunn’s test or pairwise Mann-Whitney tests with Bonferroni correction are typically used for follow-up comparisons. A common guideline is that each sample group should have at least 5 observations for the chi-squared approximation to be reliable. For more details, see the Wikipedia article on the Kruskal-Wallis test.
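
One of the follow-up approaches mentioned above, pairwise Mann-Whitney tests with a Bonferroni correction, might be sketched as follows. This is an illustrative outline rather than part of the KRUSKAL function: the helper name pairwise_mannwhitney_bonferroni is hypothetical, and it relies on scipy.stats.mannwhitneyu with a two-sided alternative. With groups as small as those in the examples, the adjusted p-values are only indicative.

```python
from itertools import combinations

from scipy.stats import mannwhitneyu


def pairwise_mannwhitney_bonferroni(groups):
    """Illustrative helper: {(i, j): adjusted p-value} for each pair of groups."""
    pairs = list(combinations(range(len(groups)), 2))
    adjusted = {}
    for i, j in pairs:
        p = mannwhitneyu(groups[i], groups[j], alternative="two-sided").pvalue
        # Bonferroni: multiply by the number of comparisons and cap at 1.
        adjusted[(i, j)] = min(1.0, float(p) * len(pairs))
    return adjusted


groups = [[1.2, 2.1, 2.3], [1.1, 1.4, 1.2], [2.0, 2.2, 2.4]]
for pair, p_adj in pairwise_mannwhitney_bonferroni(groups).items():
    print(pair, round(p_adj, 4))
```

Keys are 0-based group indices, so (0, 2) refers to the first and third rows of the input.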

This example function is provided as-is without any representation of accuracy.

Excel Usage

=KRUSKAL(samples)

samples (list[list], required): 2D list where each inner list represents a sample group of numeric values.

Returns (list[list]): 2D list [[statistic, p_value]], or error message string.

Example 1: Basic two-group comparison

Inputs:

samples
1.2	2.1	2.3
1.1	1.4	1.2

Excel formula:

=KRUSKAL({1.2,2.1,2.3;1.1,1.4,1.2})

Expected output:

statistic	p_value
1.76471	0.184039

Example 2: Three-group comparison

Inputs:

samples
1.2	2.1	2.3
1.1	1.4	1.2
2	2.2	2.4

Excel formula:

=KRUSKAL({1.2,2.1,2.3;1.1,1.4,1.2;2,2.2,2.4})

Expected output:

statistic	p_value
4.23529	0.120314

Example 3: Similar groups with high p-value

Inputs:

samples
10	20	30
15	25	35
12	22	32

Excel formula:

=KRUSKAL({10,20,30;15,25,35;12,22,32})

Expected output:

statistic	p_value
0.8	0.67032

Example 4: Groups with tied values

Inputs:

samples
1	2	2
2	2	3
1	1	1

Excel formula:

=KRUSKAL({1,2,2;2,2,3;1,1,1})

Expected output:

statistic	p_value
5.62667	0.0600046

Python Code


from scipy.stats import kruskal as scipy_kruskal

def kruskal(samples):
    """
    Computes the Kruskal-Wallis H-test for independent samples.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        samples (list[list]): 2D list where each inner list represents a sample group of numeric values.

    Returns:
        list[list]: 2D list [[statistic, p_value]], or error message string.
    """
    def to2d(x):
        # A single value may arrive as a scalar; wrap it so validation is uniform.
        return [[x]] if not isinstance(x, list) else x

    try:
        samples = to2d(samples)

        if not isinstance(samples, list) or len(samples) < 2:
            return "Error: samples must be a 2D list with at least two groups."

        groups = []
        for group in samples:
            if not isinstance(group, list) or len(group) < 1:
                return "Error: each sample group must be a non-empty list."
            for v in group:
                # bool is a subclass of int, so reject it explicitly.
                if isinstance(v, bool) or not isinstance(v, (int, float)):
                    return "Error: all sample values must be numeric."
            groups.append([float(x) for x in group])

        # Each group is passed to SciPy as a separate positional argument.
        result = scipy_kruskal(*groups)

        return [[float(result.statistic), float(result.pvalue)]]
    except Exception as e:
        return f"Error: {e}"
