MANNWHITNEYU

Overview

The MANNWHITNEYU function performs the Mann-Whitney U test (also known as the Wilcoxon rank-sum test or Mann-Whitney-Wilcoxon test), a nonparametric statistical test for comparing two independent samples. This test evaluates whether the distribution underlying one sample is stochastically different from the distribution underlying the other, making it particularly useful when data do not meet the normality assumptions required by parametric tests like the t-test.

The test was developed by Henry Mann and Donald Ransom Whitney in 1947, building on earlier work by Frank Wilcoxon. This implementation uses SciPy’s scipy.stats.mannwhitneyu function with its default method selection: SciPy computes an exact p-value when either sample has 8 or fewer observations and there are no ties, and otherwise uses an asymptotic normal approximation.

The U statistic is calculated by ranking all observations from both samples together, then computing:

U_1 = R_1 - \frac{n_1(n_1 + 1)}{2}

where R_1 is the sum of ranks for the first sample and n_1 is its sample size. The function returns U_1 (the statistic for the first sample) along with the p-value; the statistic for the second sample follows as U_2 = n_1 n_2 - U_1, where n_2 is the size of the second sample. For large samples, the test uses a normal approximation with mean \mu_U = \frac{n_1 n_2}{2} and a variance that is corrected for ties.
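
The rank-sum formula above can be sketched in plain Python. This is an illustrative sketch, not the code SciPy runs: the helper names average_ranks, u_statistic, and normal_approximation are hypothetical, tied values are assumed to receive the mean of their rank positions, and the variance uses the standard tie-corrected formula.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def u_statistic(x, y):
    """Return (U_1, U_2) for samples x and y."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[:n1])             # R_1: rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1          # U_1 + U_2 = n_1 * n_2


def normal_approximation(x, y):
    """Return (mean, standard deviation) of U under H0, corrected for ties."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Each group of t tied values contributes t^3 - t to the correction.
    tie_term = sum(t**3 - t for t in Counter(list(x) + list(y)).values())
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return mean, sqrt(var)


print(u_statistic([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
```

With no ties the correction term is zero and the variance reduces to n_1 n_2 (n_1 + n_2 + 1) / 12.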

The Mann-Whitney U test is appropriate when observations are independent, responses are at least ordinal, and the researcher wants to test whether one distribution is shifted relative to the other. It is robust to outliers and does not require equal variances, making it a preferred alternative to the independent samples t-test when parametric assumptions are violated. For more background on the test and its properties, see the Wikipedia article on the Mann-Whitney U test.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=MANNWHITNEYU(x, y, mwu_alternative)
  • x (list[list], required): First sample data as a 2D array of numeric values
  • y (list[list], required): Second sample data as a 2D array of numeric values
  • mwu_alternative (str, optional, default: "two-sided"): Defines the alternative hypothesis. Valid options: "two-sided", "less", "greater"

Returns (list[list]): 2D list [[statistic, p_value]], or error message string.

Examples

Example 1: Basic two-sided test with small samples

Inputs:

x y mwu_alternative
1 4 two-sided
2 5
3 6

Excel formula:

=MANNWHITNEYU({1;2;3}, {4;5;6}, "two-sided")

Expected output:

Result
0 0.1
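
The p-value of 0.1 can be checked by hand. The sketch below assumes no ties and enumerates every way the first sample could occupy n1 of the n1 + n2 rank positions, each equally likely under the null hypothesis. The function name exact_p_values is hypothetical, and the two-sided value doubles the smaller tail, which is one common convention rather than a description of SciPy's internals.

```python
from itertools import combinations


def exact_p_values(u1, n1, n2):
    """Exact p-values for an observed U_1, assuming no ties."""
    n = n1 + n2
    offset = n1 * (n1 + 1) // 2
    # U_1 for each possible set of ranks held by the first sample.
    us = [sum(c) - offset for c in combinations(range(1, n + 1), n1)]
    total = len(us)
    p_less = sum(u <= u1 for u in us) / total
    p_greater = sum(u >= u1 for u in us) / total
    p_two_sided = min(1.0, 2 * min(p_less, p_greater))
    return {"less": p_less, "greater": p_greater, "two-sided": p_two_sided}


print(exact_p_values(0, 3, 3))
```

For U_1 = 0 with three observations per sample, only 1 of the 20 possible rank assignments is as extreme in the lower tail, so the two-sided p-value is 2 × 1/20 = 0.1, while the "greater" p-value is 1.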

Example 2: One-sided test (greater) with non-overlapping samples

Inputs:

x y mwu_alternative
1 4 greater
2 5
3 6

Excel formula:

=MANNWHITNEYU({1;2;3}, {4;5;6}, "greater")

Expected output:

Result
0 1

Example 3: One-sided test (less) with reversed samples

Inputs:

x y mwu_alternative
4 1 less
5 2
6 3

Excel formula:

=MANNWHITNEYU({4;5;6}, {1;2;3}, "less")

Expected output:

Result
9 1

Example 4: Two-sided test with larger samples

Inputs:

x y mwu_alternative
1 5 two-sided
2 6
3 7
4 8

Excel formula:

=MANNWHITNEYU({1;2;3;4}, {5;6;7;8}, "two-sided")

Expected output:

Result
0 0.02857
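
When the two samples do not overlap and contain no ties, the exact two-sided p-value has a simple closed form, which explains the value above. A minimal sketch, with the hypothetical helper name separated_two_sided_p:

```python
from math import comb


def separated_two_sided_p(n1, n2):
    """Exact two-sided p-value for completely separated samples (no ties)."""
    # Only one of the comb(n1 + n2, n1) equally likely rank assignments
    # gives U_1 = 0, and only one gives U_1 = n1 * n2.
    return min(1.0, 2 / comb(n1 + n2, n1))


print(round(separated_two_sided_p(4, 4), 5))  # 0.02857
```

With four observations per sample there are 70 possible rank assignments, so the result is 2/70 ≈ 0.02857.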

Python Code

import math

from scipy.stats import mannwhitneyu as scipy_mannwhitneyu

def mannwhitneyu(x, y, mwu_alternative='two-sided'):
    """
    Performs the Mann-Whitney U rank test on two independent samples using scipy.stats.mannwhitneyu.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First sample data as a 2D array of numeric values
        y (list[list]): Second sample data as a 2D array of numeric values
        mwu_alternative (str, optional): Defines the alternative hypothesis. Valid options: 'two-sided', 'less', 'greater'. Default is 'two-sided'.

    Returns:
        list[list]: 2D list [[statistic, p_value]], or error message string.
    """
    def to2d(val):
        return [[val]] if not isinstance(val, list) else val

    x = to2d(x)
    y = to2d(y)

    # Validate x and y are 2D lists with at least two rows
    if not (isinstance(x, list) and all(isinstance(row, list) for row in x) and len(x) >= 2):
        return "Error: Invalid input: x must be a 2D list with at least two rows."
    if not (isinstance(y, list) and all(isinstance(row, list) for row in y) and len(y) >= 2):
        return "Error: Invalid input: y must be a 2D list with at least two rows."
    # Flatten x and y
    try:
        x_flat = [float(item) for row in x for item in row]
        y_flat = [float(item) for row in y for item in row]
    except Exception:
        return "Error: Invalid input: x and y must contain only numeric values."
    if len(x_flat) < 2 or len(y_flat) < 2:
        return "Error: Invalid input: x and y must each contain at least two values."
    # Validate alternative
    if mwu_alternative not in ('two-sided', 'less', 'greater'):
        return "Error: Invalid input: mwu_alternative must be 'two-sided', 'less', or 'greater'."
    # Run test
    try:
        res = scipy_mannwhitneyu(x_flat, y_flat, alternative=mwu_alternative)
        u_stat = float(res.statistic)
        p_val = float(res.pvalue)
        if not (math.isfinite(u_stat) and math.isfinite(p_val)):
            return "Error: Invalid result: U statistic or p-value is not finite."
        return [[u_stat, p_val]]
    except Exception as e:
        return f"Error: scipy.stats.mannwhitneyu error: {e}"
