MANNWHITNEYU

Overview

The MANNWHITNEYU function performs the Mann-Whitney U rank test on two independent samples to determine whether their population distributions differ. This non-parametric test is commonly used as an alternative to the independent t-test when the data do not meet the assumption of normality. The test ranks all values from both samples together and calculates the U statistic, which measures how the ranks of one sample compare with those of the other. The p-value is the probability, under the null hypothesis that the two distributions are equal, of obtaining a U statistic at least as extreme as the one observed.

The calculation is based on the following equations:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

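where $n_1$ and $n_2$ are the sample sizes, and $R_1$ and $R_2$ are the sums of ranks for each sample; the two values satisfy $U_1 + U_2 = n_1 n_2$. The function returns the U statistic for x as computed by SciPy, $R_1 - n_1(n_1+1)/2$ (equal to $n_1 n_2 - U_1$ in the notation above), together with the p-value for the specified alternative hypothesis.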
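As a worked illustration of the rank sums and U values, the following sketch applies the equations to the data from Example 1. The helper `average_ranks` is a hypothetical name introduced here for illustration, not part of the wrapper or of SciPy. The last value, `u_x`, uses the form $R_1 - n_1(n_1+1)/2$, which is the statistic listed in the Examples section.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
n1, n2 = len(x), len(y)

ranks = average_ranks(x + y)  # rank both samples together
r1 = sum(ranks[:n1])          # rank sum of x
r2 = sum(ranks[n1:])          # rank sum of y

# U values as defined by the equations in this section
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

# Statistic for x in the R_1 - n_1(n_1+1)/2 form
u_x = r1 - n1 * (n1 + 1) / 2

print(r1, r2)        # 6.0 15.0
print(u1, u2, u_x)   # 9.0 0.0 0.0
```

Because $U_1 + U_2 = n_1 n_2$, either value determines the other, so the choice of convention changes which number is reported but not the outcome of the test.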

For more details, see the scipy.stats.mannwhitneyu documentation.

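This wrapper simplifies the function by only supporting the most commonly used parameters: two sample arrays and the alternative hypothesis. Advanced options such as axis selection, method, continuity correction, NaN handling, and dimension keeping are not supported. The method is left at SciPy's default ('auto'), which selects the exact method for small samples without ties and the asymptotic normal approximation otherwise. The small samples in the examples below therefore report exact p-values.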
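To show why the choice between an exact and an asymptotic p-value matters for small samples, here is a minimal pure-Python sketch that computes both for tie-free data. It is not SciPy's implementation: the function names are hypothetical, the exact value is obtained by brute-force enumeration of rank assignments, and the normal approximation assumes a 0.5 continuity correction with no tie correction.

```python
from itertools import combinations
from math import erfc, sqrt


def u_from_rank_positions(positions, n1):
    """U for sample x given the 1-based ranks its values occupy (no ties)."""
    return sum(positions) - n1 * (n1 + 1) / 2


def exact_two_sided_p(u, n1, n2):
    """Exact two-sided p-value by enumerating every way x could occupy
    n1 of the n1 + n2 ranks (each is equally likely under the null)."""
    all_u = [u_from_rank_positions(c, n1)
             for c in combinations(range(1, n1 + n2 + 1), n1)]
    u_big = max(u, n1 * n2 - u)  # the larger of U and its complement
    tail = sum(1 for v in all_u if v >= u_big) / len(all_u)
    return min(1.0, 2 * tail)


def normal_two_sided_p(u, n1, n2):
    """Two-sided normal approximation (assumed 0.5 continuity correction)."""
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (abs(u - mean) - 0.5) / sd
    return min(1.0, erfc(z / sqrt(2)))  # erfc(z / sqrt(2)) = 2 * (1 - Phi(z))


p_exact_3 = exact_two_sided_p(0, 3, 3)    # 2/20
p_exact_4 = exact_two_sided_p(0, 4, 4)    # 2/70
p_normal_3 = normal_two_sided_p(0, 3, 3)

print(round(p_exact_3, 3))  # 0.1
print(round(p_exact_4, 3))  # 0.029
print(p_normal_3)           # noticeably smaller than the exact 0.1
```

With only 20 possible rank assignments for two samples of three values, the smallest achievable two-sided exact p-value is 0.1, which the normal approximation understates.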

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:


=MANNWHITNEYU(x, y, [alternative])

x (2D list, required): First sample data. Must be a 2D array (rectangular range) with at least two rows.
y (2D list, required): Second sample data. Must be a 2D array (rectangular range) with at least two rows.
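alternative (string, optional, default='two-sided'): Defines the alternative hypothesis. Must be one of 'two-sided', 'less', or 'greater'. With 'less', the alternative is that the distribution underlying x is stochastically less than the distribution underlying y; 'greater' is the reverse.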

The function returns a 2D array with one row: [U statistic, p-value], both as floats. If the input is invalid, it returns an error message (string).

Examples

Example 1: Basic Two-Sided Test

Inputs:

x	y	alternative
1.0	4.0	two-sided
2.0	5.0
3.0	6.0

Excel formula:


=MANNWHITNEYU({1.0;2.0;3.0}, {4.0;5.0;6.0}, "two-sided")

Expected output:

U statistic	p-value
0.000	0.100

Example 2: Greater Alternative

Inputs:

x	y	alternative
1.0	4.0	greater
2.0	5.0
3.0	6.0

Excel formula:


=MANNWHITNEYU({1.0;2.0;3.0}, {4.0;5.0;6.0}, "greater")

Expected output:

U statistic	p-value
0.000	1.000

Example 3: Less Alternative

Inputs:

x	y	alternative
4.0	1.0	less
5.0	2.0
6.0	3.0

Excel formula:


=MANNWHITNEYU({4.0;5.0;6.0}, {1.0;2.0;3.0}, "less")

Expected output:

U statistic	p-value
9.000	1.000

Example 4: Larger Samples, Two-Sided

Inputs:

x	y	alternative
1.0	5.0	two-sided
2.0	6.0
3.0	7.0
4.0	8.0

Excel formula:


=MANNWHITNEYU({1.0;2.0;3.0;4.0}, {5.0;6.0;7.0;8.0}, "two-sided")

Expected output:

U statistic	p-value
0.000	0.029
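The expected outputs of the four examples above can be checked by calling scipy.stats.mannwhitneyu directly with its default settings. This is a sketch rather than the wrapper itself: it assumes a SciPy version in which the default method picks the exact test for these small, tie-free samples, and the `cases` and `results` names are illustrative only.

```python
from scipy.stats import mannwhitneyu

# (x, y, alternative) for Examples 1 to 4
cases = [
    ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], "two-sided"),
    ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], "greater"),
    ([4.0, 5.0, 6.0], [1.0, 2.0, 3.0], "less"),
    ([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], "two-sided"),
]

results = []
for x, y, alternative in cases:
    res = mannwhitneyu(x, y, alternative=alternative)
    # Round to three decimals to match the tables above
    row = (round(float(res.statistic), 3), round(float(res.pvalue), 3))
    results.append(row)
    print(alternative, row)
```

Examples 2 and 3 both give a p-value of 1.000 because in each case the observed ordering is the most extreme one in the direction opposite to the stated alternative.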

Python Code


from scipy.stats import mannwhitneyu as scipy_mannwhitneyu
from typing import List, Union
 
def mannwhitneyu(x: List[List[float]], y: List[List[float]], alternative: str = 'two-sided') -> Union[List[List[float]], str]:
    """
    Performs the Mann-Whitney U rank test on two independent samples.
 
    Args:
        x: 2D list of float values. First sample data.
        y: 2D list of float values. Second sample data.
        alternative: Defines the alternative hypothesis ('two-sided', 'less', 'greater'). Default is 'two-sided'.
 
    Returns:
        2D list with one row: [U statistic, p-value]. Returns an error message (str) if input is invalid.
 
    This example function is provided as-is without any representation of accuracy.
    """
    # Validate x and y are 2D lists with at least two rows
    if not (isinstance(x, list) and all(isinstance(row, list) for row in x) and len(x) >= 2):
        return "Invalid input: x must be a 2D list with at least two rows."
    if not (isinstance(y, list) and all(isinstance(row, list) for row in y) and len(y) >= 2):
        return "Invalid input: y must be a 2D list with at least two rows."
    # Flatten x and y
    try:
        x_flat = [float(item) for row in x for item in row]
        y_flat = [float(item) for row in y for item in row]
    except Exception:
        return "Invalid input: x and y must contain only numeric values."
    if len(x_flat) < 2 or len(y_flat) < 2:
        return "Invalid input: x and y must each contain at least two values."
    # Validate alternative
    if alternative not in ('two-sided', 'less', 'greater'):
        return "Invalid input: alternative must be 'two-sided', 'less', or 'greater'."
    # Run test
    try:
        res = scipy_mannwhitneyu(x_flat, y_flat, alternative=alternative)
        u_stat = float(res.statistic)
        p_val = float(res.pvalue)
        # Disallow nan/inf
        if any([
            u_stat != u_stat or p_val != p_val,  # NaN check
            abs(u_stat) == float('inf') or abs(p_val) == float('inf')
        ]):
            return "Invalid result: U statistic or p-value is not finite."
        return [[u_stat, p_val]]
    except Exception as e:
        return f"scipy.stats.mannwhitneyu error: {e}"

Example Workbook

Link to Workbook