TTEST_IND_FROM_STATS
Overview
The TTEST_IND_FROM_STATS function performs an independent two-sample t-test from summary statistics (means, standard deviations, and sample sizes) for each group, rather than from raw data. This is useful when only summary statistics are available, such as in published studies or aggregated reports. The function supports both the standard t-test (assuming equal variances) and Welch’s t-test (for unequal variances), and allows the alternative hypothesis to be specified. The calculation is based on the following equations.
With equal variances assumed:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$
For Welch’s t-test:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
where $\bar{x}_1, \bar{x}_2$ are the sample means, $s_1, s_2$ are the corrected sample standard deviations, and $n_1, n_2$ are the sample sizes. For more details, see the scipy.stats.ttest_ind_from_stats documentation.
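As a minimal sketch of the two statistics this function covers (the pooled equal-variance test and Welch’s test), the following uses only the standard library to compute the t-statistic and degrees of freedom from summary statistics. The names `pooled_t` and `welch_t` are illustrative and not part of this function or of SciPy; the Welch degrees of freedom use the standard Welch-Satterthwaite approximation.

```python
import math

def pooled_t(mean1, std1, n1, mean2, std2, n2):
    """Equal-variance (Student) t-statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its own degrees of freedom
    pooled_var = ((n1 - 1) * std1**2 + (n2 - 1) * std2**2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Welch t-statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = std1**2 / n1, std2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df

# Summary statistics taken from the first row of Example 1
t_pooled, df_pooled = pooled_t(1.0, 1.0, 10, 1.5, 1.0, 10)
t_welch, df_welch = welch_t(1.0, 1.0, 10, 1.5, 1.0, 10)
print(round(t_pooled, 3), df_pooled)
print(round(t_welch, 3), round(df_welch, 1))
```

With these inputs both variants give a t-statistic of about -1.118 with 18 degrees of freedom, because the two groups share the same standard deviation and sample size. The two tests diverge once the standard deviations or sample sizes differ.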
This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel:
=TTEST_IND_FROM_STATS(mean_one, std_one, nobs_one, mean_two, std_two, nobs_two, [equal_var], [alternative])
mean_one (2D list, required): Mean(s) of sample 1. Each row is a separate test.
std_one (2D list, required): Corrected sample standard deviation(s) of sample 1. Each row is a separate test.
nobs_one (2D list, required): Number(s) of observations in sample 1. Each row is a separate test.
mean_two (2D list, required): Mean(s) of sample 2. Each row is a separate test.
std_two (2D list, required): Corrected sample standard deviation(s) of sample 2. Each row is a separate test.
nobs_two (2D list, required): Number(s) of observations in sample 2. Each row is a separate test.
equal_var (bool, optional, default=TRUE): If TRUE, assumes equal population variances. If FALSE, performs Welch’s t-test.
alternative (string, optional, default="two-sided"): Defines the alternative hypothesis. Must be one of "two-sided", "less", or "greater".
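The function returns a 2D array with two columns: the t-statistic and p-value for each test. Each array input must contain at least two rows, and all array inputs must have the same number of elements. If the input is invalid, or if a computed t-statistic or p-value is nan or infinite, an error message (string) is returned instead.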
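How the `alternative` argument turns a t-statistic into a p-value can be sketched without SciPy. The code below is only an illustration under stated assumptions: it approximates the Student t cumulative distribution by integrating the density with Simpson's rule, and the names `t_pdf`, `t_cdf`, and `p_value` are hypothetical helpers, not the method SciPy uses internally.

```python
import math

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t: float, df: float, steps: int = 2000) -> float:
    """Approximate P(T <= t) via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    # The distribution is symmetric about zero, so half the mass lies below 0
    return 0.5 + area if t >= 0 else 0.5 - area

def p_value(t: float, df: float, alternative: str = "two-sided") -> float:
    """Map a t-statistic to a p-value for the chosen alternative hypothesis."""
    if alternative == "less":
        return t_cdf(t, df)
    if alternative == "greater":
        return 1.0 - t_cdf(t, df)
    if alternative == "two-sided":
        return 2.0 * (1.0 - t_cdf(abs(t), df))
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

# t = -1.118 with 18 degrees of freedom (10 + 10 - 2 observations)
t_stat, df = -1.118034, 18
print(round(p_value(t_stat, df, "two-sided"), 3))  # 0.278
print(round(p_value(t_stat, df, "less"), 3))       # 0.139
print(round(p_value(t_stat, df, "greater"), 3))    # 0.861
```

The one-sided "less" p-value is half the two-sided one here because the observed statistic is negative; the "greater" alternative gives the complementary probability.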
Examples
Example 1: Basic Equal Variance, Two-Sided Test
Inputs:
mean_one | std_one | nobs_one | mean_two | std_two | nobs_two | equal_var | alternative |
---|---|---|---|---|---|---|---|
1.0 | 1.0 | 10 | 1.5 | 1.0 | 10 | TRUE | two-sided |
2.0 | 1.0 | 10 | 2.5 | 1.0 | 10 | | |
Excel formula:
=TTEST_IND_FROM_STATS({1.0;2.0},{1.0;1.0},{10;10},{1.5;2.5},{1.0;1.0},{10;10})
Expected output:
t-statistic | p-value |
---|---|
-1.118 | 0.278 |
-1.118 | 0.278 |
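How the array constants in this formula line up row by row can be sketched in Python. This is only an illustration: it assumes each Excel column constant such as {1.0;2.0} reaches Python as a one-column 2D list, and the helper name `rows_as_tests` is hypothetical.

```python
from typing import List, Tuple

def rows_as_tests(*columns: List[List[float]]) -> List[Tuple[float, ...]]:
    """Flatten each 2D input and group the i-th values into one test."""
    flat = [[value for row in column for value in row] for column in columns]
    if len({len(values) for values in flat}) != 1:
        raise ValueError("all inputs must have the same number of elements")
    return list(zip(*flat))

tests = rows_as_tests(
    [[1.0], [2.0]],  # mean_one
    [[1.0], [1.0]],  # std_one
    [[10], [10]],    # nobs_one
    [[1.5], [2.5]],  # mean_two
    [[1.0], [1.0]],  # std_two
    [[10], [10]],    # nobs_two
)
for test in tests:
    print(test)
# (1.0, 1.0, 10, 1.5, 1.0, 10)
# (2.0, 1.0, 10, 2.5, 1.0, 10)
```

Each tuple holds the six summary statistics for one test, which is why the output has one row per input row.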
Example 2: Welch’s Test, One-Sided (Less)
Inputs:
mean_one | std_one | nobs_one | mean_two | std_two | nobs_two | equal_var | alternative |
---|---|---|---|---|---|---|---|
1.0 | 1.0 | 10 | 1.5 | 1.0 | 10 | FALSE | less |
2.0 | 1.0 | 10 | 2.5 | 1.0 | 10 | | |
Excel formula:
=TTEST_IND_FROM_STATS({1.0;2.0},{1.0;1.0},{10;10},{1.5;2.5},{1.0;1.0},{10;10},FALSE,"less")
Expected output:
t-statistic | p-value |
---|---|
-1.118 | 0.139 |
-1.118 | 0.139 |
Example 3: Greater Alternative
Inputs:
mean_one | std_one | nobs_one | mean_two | std_two | nobs_two | equal_var | alternative |
---|---|---|---|---|---|---|---|
2.0 | 1.0 | 10 | 1.0 | 1.0 | 10 | TRUE | greater |
3.0 | 1.0 | 10 | 2.0 | 1.0 | 10 | | |
Excel formula:
=TTEST_IND_FROM_STATS({2.0;3.0},{1.0;1.0},{10;10},{1.0;2.0},{1.0;1.0},{10;10},TRUE,"greater")
Expected output:
t-statistic | p-value |
---|---|
2.236 | 0.019 |
2.236 | 0.019 |
Example 4: All Arguments Specified
Inputs:
mean_one | std_one | nobs_one | mean_two | std_two | nobs_two | equal_var | alternative |
---|---|---|---|---|---|---|---|
5.0 | 2.0 | 20 | 4.0 | 2.0 | 20 | TRUE | two-sided |
6.0 | 2.0 | 20 | 5.0 | 2.0 | 20 | | |
Excel formula:
=TTEST_IND_FROM_STATS({5.0;6.0},{2.0;2.0},{20;20},{4.0;5.0},{2.0;2.0},{20;20},TRUE,"two-sided")
Expected output:
t-statistic | p-value |
---|---|
1.581 | 0.122 |
1.581 | 0.122 |
Python Code
from scipy.stats import ttest_ind_from_stats as scipy_ttest_ind_from_stats
from typing import List, Union

def ttest_ind_from_stats(
    mean_one: List[List[float]],
    std_one: List[List[float]],
    nobs_one: List[List[int]],
    mean_two: List[List[float]],
    std_two: List[List[float]],
    nobs_two: List[List[int]],
    equal_var: bool = True,
    alternative: str = 'two-sided'
) -> Union[List[List[float]], str]:
    """
    Performs a t-test for means of two independent samples using summary statistics.

    Args:
        mean_one: 2D list of float values. Mean(s) of sample 1.
        std_one: 2D list of float values. Corrected sample standard deviation(s) of sample 1.
        nobs_one: 2D list of int values. Number(s) of observations in sample 1.
        mean_two: 2D list of float values. Mean(s) of sample 2.
        std_two: 2D list of float values. Corrected sample standard deviation(s) of sample 2.
        nobs_two: 2D list of int values. Number(s) of observations in sample 2.
        equal_var: If True, assumes equal population variances. If False, performs Welch’s t-test. Default is True.
        alternative: Defines the alternative hypothesis ('two-sided', 'less', 'greater'). Default is 'two-sided'.

    Returns:
        2D list with two columns: [t-statistic, p-value] for each test, or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Helper to validate 2D list input
    def validate_2d_list(arr, name, type_):
        if not isinstance(arr, list) or len(arr) < 2:
            return f"Invalid input: {name} must be a 2D list with at least two rows."
        for row in arr:
            if not isinstance(row, list) or len(row) < 1:
                return f"Invalid input: {name} must be a 2D list."
            for val in row:
                if not isinstance(val, type_):
                    try:
                        type_(val)
                    except Exception:
                        return f"Invalid input: {name} must contain {type_.__name__} values."
        return None

    # Validate all inputs
    for arr, name, type_ in [
        (mean_one, "mean_one", float),
        (std_one, "std_one", float),
        (nobs_one, "nobs_one", int),
        (mean_two, "mean_two", float),
        (std_two, "std_two", float),
        (nobs_two, "nobs_two", int)
    ]:
        err = validate_2d_list(arr, name, type_)
        if err:
            return err
    if not isinstance(equal_var, bool):
        return "Invalid input: equal_var must be a boolean."
    if alternative not in ['two-sided', 'less', 'greater']:
        return "Invalid input: alternative must be 'two-sided', 'less', or 'greater'."

    # Flatten all arrays to match shapes
    def flatten(arr):
        return [val for row in arr for val in row]

    try:
        m1 = flatten(mean_one)
        s1 = flatten(std_one)
        n1 = flatten(nobs_one)
        m2 = flatten(mean_two)
        s2 = flatten(std_two)
        n2 = flatten(nobs_two)
        if not (len(m1) == len(s1) == len(n1) == len(m2) == len(s2) == len(n2)):
            return "Invalid input: all input arrays must have the same number of elements."
        results = []
        for i in range(len(m1)):
            res = scipy_ttest_ind_from_stats(
                mean1=m1[i], std1=s1[i], nobs1=n1[i],
                mean2=m2[i], std2=s2[i], nobs2=n2[i],
                equal_var=equal_var, alternative=alternative
            )
            t_stat, p_val = res.statistic, res.pvalue
            # Disallow nan/inf
            if any([
                isinstance(t_stat, float) and (t_stat != t_stat or t_stat in [float('inf'), float('-inf')]),
                isinstance(p_val, float) and (p_val != p_val or p_val in [float('inf'), float('-inf')])
            ]):
                return "Invalid result: t-statistic or p-value is nan or inf."
            results.append([float(t_stat), float(p_val)])
        return results
    except Exception as e:
        return f"scipy.stats.ttest_ind_from_stats error: {e}"
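
The nan/inf guard in the code above compares `x != x` and tests membership in a list of infinities. A minimal alternative sketch, assuming the same intent, uses `math.isfinite` from the standard library. The helper name `is_finite_result` is hypothetical and not part of the original function.

```python
import math

def is_finite_result(t_stat: float, p_val: float) -> bool:
    """Return True only when both values are ordinary finite numbers."""
    return math.isfinite(t_stat) and math.isfinite(p_val)

print(is_finite_result(-1.118, 0.278))       # True
print(is_finite_result(float("nan"), 0.5))   # False
print(is_finite_result(float("inf"), 0.0))   # False
```

A non-finite statistic can arise from degenerate inputs, for instance when both standard deviations are zero, so the standard error in the denominator is zero.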