
TUKEY_HSD

Overview

The TUKEY_HSD function performs Tukey's Honest Significant Difference (HSD) test, a post hoc procedure that compares every pair of group means after a one-way ANOVA has returned a significant result. It identifies which specific group means differ while controlling the family-wise error rate. The calculation is based on the studentized range distribution and assumes equal variances among groups. The function wraps scipy.stats.tukey_hsd but simplifies the input to a single 2D list in which each column is a group. Only the equal-variance case (equal_var=True) is supported; the Games-Howell test (equal_var=False) is not available in this wrapper.

The test statistic is:

$$q = \frac{\lvert \bar{x}_i - \bar{x}_j \rvert}{\sqrt{\frac{MSE}{n}}}$$

where $\bar{x}_i$ and $\bar{x}_j$ are group means, $MSE$ is the mean squared error from ANOVA, and $n$ is the number of observations per group.
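As a rough illustration of the formula above, the sketch below computes q by hand using only the standard library. The helper name `q_statistic` is hypothetical and not part of the wrapper, the data are the two columns from Example 1, and the code assumes equal group sizes, which is the case the formula describes.

```python
import math


def q_statistic(groups, i, j):
    """Studentized range statistic for groups i and j (equal group sizes assumed)."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    k = len(groups)
    means = [sum(g) / n for g in groups]
    # MSE: pooled within-group sum of squares divided by its
    # degrees of freedom, k * (n - 1)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = ss_within / (k * (n - 1))
    return abs(means[i] - means[j]) / math.sqrt(mse / n)


groups = [[1.2, 1.5, 1.3, 1.4], [2.3, 2.1, 2.2, 2.4]]
q = q_statistic(groups, 0, 1)
print(round(q, 3))  # 13.943
```

The p-value is then the upper tail probability of the studentized range distribution at q, with k groups and k(n - 1) degrees of freedom.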

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=TUKEY_HSD(samples, [equal_var])
  • samples (2D list, required): Table of values, where each column is a group/sample and each row is an observation. Must have at least two columns and two rows.
  • equal_var (bool, optional, default=TRUE): If TRUE, assumes equal variances (Tukey-HSD/Tukey-Kramer). If FALSE, returns an error (Games-Howell not supported).

The function returns a 2D array of p-values (float or None) for each pairwise group comparison. If the input is invalid, it returns an error message (string). Each cell in the output array represents the p-value for the comparison between two groups; diagonal cells are always 1.0.
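One way to read the returned matrix is to walk the cells above the diagonal and keep the pairs whose p-value falls below a chosen threshold. The sketch below assumes the matrix is symmetric, as the example outputs suggest; `significant_pairs` and the `alpha` parameter are hypothetical names, and the matrix values are made up for illustration.

```python
def significant_pairs(pvalues, alpha=0.05):
    """Return (i, j, p) for each pair above the diagonal with p < alpha."""
    pairs = []
    for i, row in enumerate(pvalues):
        for j in range(i + 1, len(row)):
            p = row[j]
            # None marks a p-value that could not be computed; skip it
            if p is not None and p < alpha:
                pairs.append((i, j, p))
    return pairs


# Made-up matrix for illustration only, not output from a real test
example = [
    [1.0, 0.012, 0.640],
    [0.012, 1.0, None],
    [0.640, None, 1.0],
]
print(significant_pairs(example))  # [(0, 1, 0.012)]
```

Indices are zero-based here, so the pair (0, 1) corresponds to Group 1 versus Group 2 in the output tables.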

Examples

Example 1: Basic Two Groups

Inputs:

| samples | | equal_var |
|---|---|---|
| 1.2 | 2.3 | TRUE |
| 1.5 | 2.1 | |
| 1.3 | 2.2 | |
| 1.4 | 2.4 | |

Excel formula:

=TUKEY_HSD({1.2,2.3;1.5,2.1;1.3,2.2;1.4,2.4})

Expected output:

| | Group 1 | Group 2 |
|---|---|---|
| Group 1 | 1.000 | 0.000 |
| Group 2 | 0.000 | 1.000 |
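With only two groups, Tukey's HSD reduces to the pooled-variance two-sample t-test, which gives a quick sanity check on the result. The sketch below calls scipy.stats.tukey_hsd and scipy.stats.ttest_ind directly on the Example 1 columns rather than going through the Excel wrapper; the variable names are illustrative.

```python
from scipy.stats import tukey_hsd, ttest_ind

group_1 = [1.2, 1.5, 1.3, 1.4]
group_2 = [2.3, 2.1, 2.2, 2.4]

# Off-diagonal entry of the 2x2 p-value matrix
p_tukey = tukey_hsd(group_1, group_2).pvalue[0, 1]
# Pooled-variance (Student) two-sample t-test
p_ttest = ttest_ind(group_1, group_2, equal_var=True).pvalue

print(f"{p_tukey:.6f} {p_ttest:.6f}")
```

The two p-values should agree up to numerical precision, and both are small enough to display as 0.000 at three decimal places.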

Example 2: Three Groups, Equal Variance

Inputs:

| samples | | | equal_var |
|---|---|---|---|
| 1.2 | 2.3 | 3.1 | TRUE |
| 1.5 | 2.1 | 3.2 | |
| 1.3 | 2.2 | 3.3 | |
| 1.4 | 2.4 | 3.4 | |

Excel formula:

=TUKEY_HSD({1.2,2.3,3.1;1.5,2.1,3.2;1.3,2.2,3.3;1.4,2.4,3.4})

Expected output:

| | Group 1 | Group 2 | Group 3 |
|---|---|---|---|
| Group 1 | 1.000 | 0.000 | 0.000 |
| Group 2 | 0.000 | 1.000 | 0.000 |
| Group 3 | 0.000 | 0.000 | 1.000 |

Example 3: Three Groups, Unequal Variance (Not Supported)

Inputs:

| samples | | | equal_var |
|---|---|---|---|
| 1.2 | 2.3 | 3.1 | FALSE |
| 1.5 | 2.1 | 3.2 | |
| 1.3 | 2.2 | 3.3 | |
| 1.4 | 2.4 | 3.4 | |

Excel formula:

=TUKEY_HSD({1.2,2.3,3.1;1.5,2.1,3.2;1.3,2.2,3.3;1.4,2.4,3.4}, FALSE)

Expected output:

Result
Invalid input: Games-Howell test (equal_var=False) is not supported by scipy.stats.tukey_hsd.

Example 4: All Arguments Specified

Inputs:

| samples | | | equal_var |
|---|---|---|---|
| 5.1 | 6.2 | 7.3 | TRUE |
| 5.2 | 6.1 | 7.2 | |
| 5.3 | 6.3 | 7.1 | |
| 5.4 | 6.4 | 7.4 | |

Excel formula:

=TUKEY_HSD({5.1,6.2,7.3;5.2,6.1,7.2;5.3,6.3,7.1;5.4,6.4,7.4}, TRUE)

Expected output:

| | Group 1 | Group 2 | Group 3 |
|---|---|---|---|
| Group 1 | 1.000 | 0.000 | 0.000 |
| Group 2 | 0.000 | 1.000 | 0.000 |
| Group 3 | 0.000 | 0.000 | 1.000 |

Python Code

import math
from typing import List, Optional, Union

from scipy.stats import tukey_hsd as scipy_tukey_hsd


def tukey_hsd(samples: List[List[float]], equal_var: bool = True) -> Union[List[List[Optional[float]]], str]:
    """
    Performs Tukey's HSD test for equality of means over multiple treatments.

    Args:
        samples: 2D list of float values. Each column is a group/sample.
            Must be a 2D list with at least two columns and two rows.
        equal_var: If True, assumes equal variances (Tukey-HSD/Tukey-Kramer).
            If False, an error message is returned because the Games-Howell
            test is not supported by this wrapper.

    Returns:
        2D list of p-values for each pairwise comparison, or an error
        message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Validate samples
    if not isinstance(samples, list) or len(samples) < 2 or not all(isinstance(row, list) for row in samples):
        return "Invalid input: samples must be a 2D list with at least two rows."
    n_rows = len(samples)
    n_cols = len(samples[0]) if n_rows > 0 else 0
    if n_cols < 2 or n_rows < 2:
        return "Invalid input: samples must be a 2D list with at least two columns and two rows."
    # Check that every row has the same number of columns
    for row in samples:
        if len(row) != n_cols:
            return "Invalid input: all rows in samples must have the same number of columns."
    # Transpose rows into columns (one list per group)
    try:
        groups = [[float(samples[row][col]) for row in range(n_rows)] for col in range(n_cols)]
    except Exception:
        return "Invalid input: samples must contain only numeric values."
    # Only Tukey-HSD is supported
    if not equal_var:
        return "Invalid input: Games-Howell test (equal_var=False) is not supported by scipy.stats.tukey_hsd."
    # Run Tukey HSD
    try:
        result = scipy_tukey_hsd(*groups)
    except Exception as e:
        return f"scipy.stats.tukey_hsd error: {e}"
    # Extract the p-value matrix
    try:
        # result.pvalue is a numpy array; convert it to a 2D list
        pvals_list = result.pvalue.tolist()
        # Replace nan/inf with None so every cell is a finite float or empty
        for i in range(len(pvals_list)):
            for j in range(len(pvals_list[i])):
                v = pvals_list[i][j]
                if isinstance(v, float) and not math.isfinite(v):
                    pvals_list[i][j] = None
        return pvals_list
    except Exception as e:
        return f"Error extracting p-values: {e}"
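The transpose step in the function above, which turns spreadsheet rows into one list per group, can also be written with `zip`. This is a minimal alternative sketch, not the wrapper's own code; `rows_to_groups` is a hypothetical helper name.

```python
def rows_to_groups(samples):
    """Turn a row-major table into one list of floats per column (group)."""
    return [[float(v) for v in column] for column in zip(*samples)]


rows = [[1.2, 2.3], [1.5, 2.1], [1.3, 2.2], [1.4, 2.4]]
print(rows_to_groups(rows))  # [[1.2, 1.5, 1.3, 1.4], [2.3, 2.1, 2.2, 2.4]]
```

Note that `zip` silently truncates to the shortest row, so the row-length check in the wrapper would still be needed before transposing this way.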

Example Workbook

Link to Workbook
