F_ONEWAY

Overview

The F_ONEWAY function performs a one-way analysis of variance (ANOVA) test to determine whether there are statistically significant differences between the means of two or more independent groups. This test is commonly used in experimental design to compare treatment effects, group performance, or any scenario where multiple independent samples need to be compared simultaneously.

One-way ANOVA tests the null hypothesis that all group population means are equal. The test statistic, known as the F-statistic, is computed as the ratio of between-group variance to within-group variance:

F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}} = \frac{\sum_{i} n_i (\bar{x}_i - \bar{x})^2 / (k - 1)}{\sum_{i} \sum_{j} (x_{ij} - \bar{x}_i)^2 / (N - k)}

where k is the number of groups, n_i is the sample size of group i, x_{ij} is the j-th observation in group i, \bar{x}_i is the mean of group i, \bar{x} is the overall mean, and N is the total sample size. A large F-statistic indicates that the variability between group means is large relative to the variability within groups, suggesting that at least one group mean differs from the others.
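
As an illustration of the formula, the following sketch computes the F-statistic by hand in plain Python for two small groups. The variable names (groups, ss_between, ss_within) are chosen for this example only and are not part of F_ONEWAY.

```python
# Two groups of three observations each (illustrative data).
groups = [[1.1, 1.3, 1.2], [2.2, 2.4, 2.3]]


def mean(values):
    return sum(values) / len(values)


k = len(groups)                    # number of groups
N = sum(len(g) for g in groups)    # total sample size
grand_mean = mean([x for g in groups for x in g])

# Between-group sum of squares: sum_i n_i * (xbar_i - xbar)^2
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: sum_i sum_j (x_ij - xbar_i)^2
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)  # mean square between groups
ms_within = ss_within / (N - k)    # mean square within groups
f_stat = ms_between / ms_within

print(round(f_stat, 4))  # 181.5
```

Here the group means are 1.2 and 2.3, so MS_between = 1.815 and MS_within = 0.01, giving F = 181.5.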

This implementation wraps the scipy.stats.f_oneway function from SciPy, which follows the algorithm described in Heiman (2001). It returns the F-statistic together with its p-value, which is the upper-tail probability of an F-distribution with k - 1 and N - k degrees of freedom.
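
For orientation, here is a minimal sketch that calls scipy.stats.f_oneway directly and compares its p-value with the upper tail of the F-distribution given by scipy.stats.f.sf. The data and variable names are illustrative, and the degrees of freedom are written out by hand for this two-group case.

```python
from scipy.stats import f, f_oneway

group_a = [1.1, 1.3, 1.2]
group_b = [2.2, 2.4, 2.3]

result = f_oneway(group_a, group_b)

# With k = 2 groups and N = 6 observations, the degrees of freedom are
# k - 1 = 1 (between groups) and N - k = 4 (within groups).
p_from_tail = f.sf(result.statistic, 2 - 1, 6 - 2)

print(result.statistic)  # F-statistic
print(result.pvalue)     # p-value reported by f_oneway
print(p_from_tail)       # upper-tail probability of F(1, 4)
```

The two p-values should agree up to floating-point rounding.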

The ANOVA test relies on three key assumptions: (1) independence of observations, (2) normally distributed populations, and (3) homoscedasticity (equal population variances across groups). When the equal variance assumption is violated, consider using Welch’s ANOVA or the non-parametric Kruskal-Wallis H-test. For further reading on ANOVA methodology, see McDonald’s Handbook of Biological Statistics.
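
One possible way to act on these assumptions is sketched below. It uses scipy.stats.levene as a check for equal variances (an assumption of this example; Levene's test is not part of F_ONEWAY) and scipy.stats.kruskal for the Kruskal-Wallis H-test. With only three observations per group, both tests have little power, so the output is illustrative only.

```python
from scipy.stats import kruskal, levene

groups = [
    [1.1, 1.3, 1.2],
    [2.2, 2.4, 2.3],
    [3.1, 3.3, 3.2],
]

# Levene's test: a small p-value is evidence against equal variances.
levene_result = levene(*groups)

# Kruskal-Wallis H-test: a rank-based alternative that does not assume
# normally distributed populations.
kruskal_result = kruskal(*groups)

print(levene_result.pvalue)
print(kruskal_result.statistic, kruskal_result.pvalue)
```

In this data every group has the same spread, so Levene's test gives no evidence of unequal variances, while the Kruskal-Wallis test still detects the shift between groups.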

This example function is provided as-is without any representation of accuracy.

Excel Usage

=F_ONEWAY(samples)
  • samples (list[list], required): 2D array of numeric values where each column represents a group/sample. Must have at least two columns (groups) and three rows.

Returns (list[list] or str): A 2D list with a single row containing [F statistic, p-value], or an error message string if the input is invalid or the result is not finite.
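
Because each column of samples is one group, a row-oriented 2D list has to be read column by column. A minimal sketch of that mapping in plain Python (the names samples and groups are illustrative):

```python
# Rows as they would arrive from a spreadsheet range; each column is a group.
samples = [
    [1.1, 2.2, 3.1],
    [1.3, 2.4, 3.3],
    [1.2, 2.3, 3.2],
]

# zip(*samples) transposes rows into columns, giving one list per group.
groups = [list(column) for column in zip(*samples)]

print(groups)
# [[1.1, 1.3, 1.2], [2.2, 2.4, 2.3], [3.1, 3.3, 3.2]]
```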

Examples

Example 1: Two groups of three observations

Inputs:

samples
1.1 2.2
1.3 2.4
1.2 2.3

Excel formula:

=F_ONEWAY({1.1,2.2;1.3,2.4;1.2,2.3})

Expected output:

Result
181.5 0.00018

Example 2: Three groups of three observations

Inputs:

samples
1.1 2.2 3.1
1.3 2.4 3.3
1.2 2.3 3.2

Excel formula:

=F_ONEWAY({1.1,2.2,3.1;1.3,2.4,3.3;1.2,2.3,3.2})

Expected output:

Result
301 0

Example 3: Four groups of three observations

Inputs:

samples
1 2 3 4
1.1 2.1 3.1 4.1
0.9 1.9 2.9 3.9

Excel formula:

=F_ONEWAY({1,2,3,4;1.1,2.1,3.1,4.1;0.9,1.9,2.9,3.9})

Expected output:

Result
500 0

Example 4: Two groups with larger values

Inputs:

samples
10 20
12 22
11 21

Excel formula:

=F_ONEWAY({10,20;12,22;11,21})

Expected output:

Result
150 0.00026

Python Code

import math

from scipy.stats import f_oneway as scipy_f_oneway

def f_oneway(samples):
    """
    Performs a one-way ANOVA test for two or more independent samples.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        samples (list[list]): 2D array of numeric values where each column represents a group/sample. Must have at least two columns (groups) and three rows.

    Returns:
        list[list] or str: A 2D list with a single row containing [F statistic, p-value], or an error message string if the input is invalid or the result is not finite.
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    samples = to2d(samples)

    # Validate samples is a 2D list with at least two columns (groups) and three rows
    if not (isinstance(samples, list) and all(isinstance(row, list) for row in samples)):
        return "Invalid input: samples must be a 2D list."
    n_rows = len(samples)
    if n_rows < 3:
        return "Invalid input: samples must have at least three rows."
    n_cols = len(samples[0])
    if n_cols < 2:
        return "Invalid input: samples must have at least two columns (groups)."
    # Reject ragged input so a short row is not misreported as non-numeric data
    if any(len(row) != n_cols for row in samples):
        return "Invalid input: all rows of samples must have the same number of columns."
    # Convert each column into a group of floats; non-numeric values fail here
    try:
        groups = [[float(row[col]) for row in samples] for col in range(n_cols)]
    except (TypeError, ValueError, OverflowError):
        return "Invalid input: samples must contain only numeric values."
    # Call scipy.stats.f_oneway
    try:
        result = scipy_f_oneway(*groups)
        stat = float(result.statistic)
        pvalue = float(result.pvalue)
    except Exception as e:
        return f"scipy.stats.f_oneway error: {e}"

    # Reject non-finite results (for example, when every group has zero variance)
    if not (math.isfinite(stat) and math.isfinite(pvalue)):
        return "Invalid result: output contains nan or inf."

    return [[stat, pvalue]]
