RANKSUMS

Overview

The RANKSUMS function computes the Wilcoxon rank-sum statistic and p-value for two independent samples. This nonparametric test, also known as the Mann-Whitney U test, is used to determine whether one of two samples of independent observations tends to have larger values than the other. It is especially useful when the data do not meet the assumptions required for parametric tests like the t-test. The test statistic is based on the ranks of the combined data:

W = \sum_{i=1}^{n_1} R_i

where $R_i$ is the rank of the $i$ -th observation in the first sample, and $n_1$ is the size of the first sample. The p-value is computed according to the specified alternative hypothesis. For more details, see the scipy.stats.ranksums documentation .

This wrapper exposes only the most commonly used parameters: the two sample arrays and the alternative hypothesis. Parameters related to axis selection, NaN handling, and broadcasting are omitted for simplicity in the Excel context. This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:


=RANKSUMS(x, y, [alternative])

x (2D list, required): First sample data. Must be a 2D array (rectangular range) with at least two rows.
y (2D list, required): Second sample data. Must be a 2D array (rectangular range) with at least two rows.
alternative (string, optional, default="two-sided"): Defines the alternative hypothesis. Must be one of "two-sided", "less", or "greater".

The function returns a single-row 2D array: [statistic, pvalue], both floats. If the input is invalid, an error message (string) is returned.

Examples

Example 1: Basic Case (Two Rows Each)

Inputs:

x		y
1.0	2.0	5.0	6.0
3.0	4.0	7.0	8.0

Excel formula:


=RANKSUMS({1,2;3,4}, {5,6;7,8})

Expected output:

statistic	pvalue
-2.309	0.021

Example 2: With Alternative ‘less’

Inputs:

x		y		alternative
1.0	2.0	5.0	6.0	less
3.0	4.0	7.0	8.0

Excel formula:


=RANKSUMS({1,2;3,4}, {5,6;7,8}, "less")

Expected output:

statistic	pvalue
-2.309	0.010

Example 3: With Alternative ‘greater’

Inputs:

x		y		alternative
1.0	2.0	5.0	6.0	greater
3.0	4.0	7.0	8.0

Excel formula:


=RANKSUMS({1,2;3,4}, {5,6;7,8}, "greater")

Expected output:

statistic	pvalue
-2.309	0.990

Example 4: Larger Samples

Inputs:

x		y
1.0	2.0	7.0	8.0
3.0	4.0	9.0	10.0
5.0	6.0	11.0	12.0

Excel formula:


=RANKSUMS({1,2;3,4;5,6}, {7,8;9,10;11,12})

Expected output:

statistic	pvalue
-2.882	0.004

Python Code


from scipy.stats import ranksums as scipy_ranksums
from typing import List, Union
 
def ranksums(x: List[List[float]], y: List[List[float]], alternative: str = 'two-sided') -> Union[List[List[float]], str]:
    """
    Computes the Wilcoxon rank-sum statistic and p-value for two independent samples.
 
    Args:
        x: 2D list of float values. First sample data.
        y: 2D list of float values. Second sample data.
        alternative: Defines the alternative hypothesis ('two-sided', 'less', 'greater'). Default is 'two-sided'.
 
    Returns:
        2D list with one row: [statistic, pvalue], or an error message (str) if input is invalid.
 
    This example function is provided as-is without any representation of accuracy.
    """
    # Validate x and y are 2D lists with at least two rows
    if not (isinstance(x, list) and all(isinstance(row, list) for row in x)) or len(x) < 2:
        return "Invalid input: x must be a 2D list with at least two rows."
    if not (isinstance(y, list) and all(isinstance(row, list) for row in y)) or len(y) < 2:
        return "Invalid input: y must be a 2D list with at least two rows."
    # Flatten x and y
    try:
        x_flat = [float(item) for row in x for item in row]
        y_flat = [float(item) for row in y for item in row]
    except Exception:
        return "Invalid input: x and y must contain only numeric values."
    if len(x_flat) < 2 or len(y_flat) < 2:
        return "Invalid input: x and y must each contain at least two values."
    # Validate alternative
    if alternative not in ('two-sided', 'less', 'greater'):
        return "Invalid input: alternative must be 'two-sided', 'less', or 'greater'."
    try:
        result = scipy_ranksums(x_flat, y_flat, alternative=alternative)
        stat, pvalue = result.statistic, result.pvalue
        # Disallow nan/inf
        if any([
            stat is None, pvalue is None,
            isinstance(stat, float) and (stat != stat or stat in (float('inf'), float('-inf'))),
            isinstance(pvalue, float) and (pvalue != pvalue or pvalue in (float('inf'), float('-inf')))
        ]):
            return "Invalid result: statistic or pvalue is nan or inf."
        return [[stat, pvalue]]
    except Exception as e:
        return f"scipy.stats.ranksums error: {e}"

Example Workbook

Link to Workbook