Hypothesis Tests

Overview

Hypothesis testing is the formal procedure statisticians use to decide whether to reject a statistical hypothesis. It assesses whether an effect observed in the data could plausibly be explained by random chance alone, or whether the evidence against that explanation is strong enough to reject it.

The process involves:

  1. Null Hypothesis (H_0): The default position (e.g., “There is no difference between the groups”).
  2. Alternative Hypothesis (H_1): The claim we are testing for (e.g., “Group A is different from Group B”).
  3. P-value: The probability of obtaining results at least as extreme as the observed results, assuming H_0 is true. If p < \alpha (commonly 0.05), we reject H_0; otherwise we fail to reject it.
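To make the definition of the p-value concrete, here is a minimal sketch of a permutation test written with the Python standard library only. It is an illustration rather than any of the tools listed on this page: the function name permutation_p_value, the data, and the number of resamples are all made up for the example. Under H_0 the group labels are interchangeable, so the p-value is estimated as the share of random relabelings whose difference in means is at least as extreme as the observed one.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means by relabeling."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabeling, valid if H_0 is true
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Counting the observed labeling itself keeps the estimate above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.4, 12.9, 12.2, 12.6]
group_b = [10.9, 11.2, 10.7, 11.5, 11.0, 11.3]

alpha = 0.05
p = permutation_p_value(group_a, group_b)
decision = "reject H_0" if p < alpha else "fail to reject H_0"
print(f"p = {p:.4f}: {decision}")
```

Because every value in group_a exceeds every value in group_b, very few relabelings are as extreme as the observed split, so the estimated p-value falls well below 0.05.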

Figure 1: The T-Distribution: Critical regions for a one-sided T-test (df=10). If the calculated t-statistic falls into the red region, we reject the null hypothesis at the 5% significance level.
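The boundary of the critical region in the figure can be computed directly. The sketch below assumes the red region is the upper tail (the caption does not say which tail is shaded) and uses scipy.stats.t; the helper name in_rejection_region is hypothetical.

```python
from scipy import stats

df = 10       # degrees of freedom, as in the figure
alpha = 0.05  # significance level

# Upper-tail critical value: P(T > t_crit) = alpha when H_0 is true.
t_crit = stats.t.ppf(1 - alpha, df)


def in_rejection_region(t_stat):
    """Return True if a t-statistic falls in the upper-tail critical region."""
    return t_stat > t_crit


print(f"t_crit = {t_crit:.3f}")
print(in_rejection_region(2.0), in_rejection_region(1.5))
```

For a lower-tailed test the boundary would be stats.t.ppf(alpha, df) instead, which is the same value with the sign flipped because the t-distribution is symmetric.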

Types of Tests

Parametric vs Non-Parametric

  • Parametric tests (like T-tests and ANOVA) assume data follows a specific distribution (usually Normal). They have more power when assumptions are met.
  • Non-parametric tests (like Mann-Whitney U or Kruskal-Wallis) make fewer assumptions and are used when data is skewed or ordinal.
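As a rough illustration of the difference, the sketch below runs a parametric test and a rank-based test on the same skewed data. It calls scipy.stats directly rather than the tools listed on this page, and the simulated lognormal samples are an assumption chosen only to produce skew. The last step shows one practical consequence of relying on ranks: a monotone transformation such as the logarithm leaves the Mann-Whitney result unchanged, while a t-test on the transformed data would generally give a different answer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated, right-skewed samples (illustrative only).
a = rng.lognormal(mean=0.0, sigma=1.0, size=40)
b = rng.lognormal(mean=0.5, sigma=1.0, size=40)

# Parametric: Welch's t-test compares means and relies on approximate normality.
t_res = stats.ttest_ind(a, b, equal_var=False)

# Non-parametric: Mann-Whitney U uses only the ranks of the observations.
u_res = stats.mannwhitneyu(a, b, alternative="two-sided")

# Ranks are preserved by the logarithm, so the rank test does not change.
u_log = stats.mannwhitneyu(np.log(a), np.log(b), alternative="two-sided")

print(f"Welch t-test p-value:        {t_res.pvalue:.4f}")
print(f"Mann-Whitney U p-value:      {u_res.pvalue:.4f}")
print(f"Mann-Whitney U on log data:  {u_log.pvalue:.4f}")
```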

Native Excel Capabilities

Excel includes several standard tests:

  • T.TEST: Handles paired, equal variance, and unequal variance t-tests.
  • Z.TEST: For large samples with known variance.
  • CHISQ.TEST: For goodness of fit.
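For orientation, the three variants that Excel's T.TEST selects through its type argument (1 = paired, 2 = two-sample equal variance, 3 = two-sample unequal variance) correspond to the following scipy.stats calls. This is a sketch with made-up measurements, and it uses scipy directly rather than the wrapper tools described on this page.

```python
from scipy import stats

# Hypothetical measurements on the same six subjects.
before = [72.0, 75.5, 68.2, 80.1, 77.3, 70.4]
after = [70.1, 74.0, 68.9, 77.5, 75.0, 69.2]

# Paired t-test (T.TEST type 1): tests the mean of the pairwise differences.
paired = stats.ttest_rel(before, after)

# Two-sample t-test with pooled variance (T.TEST type 2).
pooled = stats.ttest_ind(before, after, equal_var=True)

# Welch's t-test, which does not assume equal variances (T.TEST type 3).
welch = stats.ttest_ind(before, after, equal_var=False)

for name, res in [("paired", paired), ("pooled", pooled), ("welch", welch)]:
    print(f"{name:>6}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

With equal sample sizes the pooled and Welch statistics coincide and only the degrees of freedom differ, so the Welch p-value is never smaller than the pooled one here. The paired test usually behaves quite differently because it removes between-subject variation.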

However, Python provides a much broader suite:

  • Normality Tests: SHAPIRO and NORMALTEST allow you to verify assumptions before running parametric tests (not built into Excel formulas).
  • Non-Parametric Alternatives: Robust tests like MANNWHITNEYU and WILCOXON (signed-rank) are critical for real-world non-normal data but are missing from standard Excel functions.
  • Detailed Output: Python functions often return full objects with test statistics, p-values, and degrees of freedom, rather than just a single p-value.
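One common workflow built on these pieces, sketched below under stated assumptions, is to check normality first and then choose between a parametric and a non-parametric test. The helper compare_two_groups and the simulated data are hypothetical, and the example calls scipy.stats directly. Note that pre-testing assumptions in this way is a convention rather than a rule, and some statisticians advise against it.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated, skewed samples (illustrative only).
sample_a = rng.exponential(scale=2.0, size=50)
sample_b = rng.exponential(scale=3.0, size=50)


def compare_two_groups(x, y, alpha=0.05):
    """Pick Welch's t-test or Mann-Whitney U based on Shapiro-Wilk results."""
    looks_normal = all(stats.shapiro(s).pvalue >= alpha for s in (x, y))
    if looks_normal:
        name = "Welch t-test"
        res = stats.ttest_ind(x, y, equal_var=False)
    else:
        name = "Mann-Whitney U"
        res = stats.mannwhitneyu(x, y, alternative="two-sided")
    # The result object carries both the statistic and the p-value.
    return name, res.statistic, res.pvalue


name, statistic, pvalue = compare_two_groups(sample_a, sample_b)
print(f"{name}: statistic = {statistic:.3f}, p = {pvalue:.4f}")
```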

Association Correlation

Tool              Description
BARNARD_EXACT     Perform Barnard’s exact test on a 2x2 contingency table.
BOSCHLOO_EXACT    Perform Boschloo’s exact test on a 2x2 contingency table.
CHI2_CONTINGENCY  Perform the chi-square test of independence for variables in a contingency table.
FISHER_EXACT      Perform Fisher’s exact test on a 2x2 contingency table.
KENDALLTAU        Calculate Kendall’s tau, a correlation measure for ordinal data.
PAGE_TREND_TEST   Perform Page’s L trend test for monotonic trends across treatments.
PEARSONR          Calculate the Pearson correlation coefficient and p-value for two datasets.
POINTBISERIALR    Calculate a point biserial correlation coefficient and its p-value.
SIEGELSLOPES      Compute the Siegel repeated medians estimator for robust linear regression using scipy.stats.siegelslopes.
SOMERSD           Calculate Somers’ D, an asymmetric measure of ordinal association between two variables.
SPEARMANR         Calculate a Spearman rank-order correlation coefficient with associated p-value.
THEILSLOPES       Compute the Theil-Sen estimator for a set of points (robust linear regression).
WEIGHTEDTAU       Compute a weighted version of Kendall’s tau correlation coefficient.
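As a brief sketch of what two kinds of tools in this category compute, the example below calls the underlying scipy.stats functions directly. The signatures of the tools themselves are not shown on this page, so nothing here should be read as their interface, and the data are made up.

```python
from scipy import stats

# A 2x2 contingency table of counts (rows: groups, columns: outcomes).
table = [[8, 2], [1, 5]]
odds_ratio, p_fisher = stats.fisher_exact(table)
print(f"Fisher exact: odds ratio = {odds_ratio:.1f}, p = {p_fisher:.4f}")

# Rank-based association between two short ordinal series.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
rho, p_rho = stats.spearmanr(x, y)
tau, p_tau = stats.kendalltau(x, y)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

The two coefficients differ because Spearman's rho is a correlation of ranks, while Kendall's tau counts concordant and discordant pairs (8 against 2 among the 10 pairs here).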

Independent Sample

Tool                Description
ALEXANDERGOVERN     Performs the Alexander-Govern test for equality of means across multiple independent samples with possible heterogeneity of variance.
ANDERSON_KSAMP      Performs the k-sample Anderson-Darling test to determine if samples are drawn from the same population.
ANSARI              Performs the Ansari-Bradley test for equal scale parameters (non-parametric) using scipy.stats.ansari.
BRUNNERMUNZEL       Computes the Brunner-Munzel nonparametric test for two independent samples.
BWS_TEST            Performs the Baumgartner-Weiss-Schindler test on two independent samples.
CVM_2SAMP           Performs the two-sample Cramér-von Mises test using scipy.stats.cramervonmises_2samp.
DUNNETT             Performs Dunnett’s test for multiple comparisons of means against a control group.
EPPS_SINGLE_2SAMP   Compute the Epps-Singleton test statistic and p-value for two samples.
F_ONEWAY            Performs a one-way ANOVA test for two or more independent samples.
FLIGNER             Performs the Fligner-Killeen test for equality of variances across multiple samples.
FRIEDMANCHISQUARE   Computes the Friedman test for repeated samples.
KRUSKAL             Computes the Kruskal-Wallis H-test for independent samples.
KS_2SAMP            Performs the two-sample Kolmogorov-Smirnov test for goodness of fit.
LEVENE              Performs the Levene test for equality of variances across multiple samples.
MANNWHITNEYU        Performs the Mann-Whitney U rank test on two independent samples using scipy.stats.mannwhitneyu.
MEDIAN_TEST         Performs Mood’s median test to determine if two or more independent samples come from populations with the same median.
MOOD                Perform Mood’s two-sample test for scale parameters.
POISSON_MEANS_TEST  Performs the Poisson means test (E-test) to compare the means of two Poisson distributions.
RANKSUMS            Computes the Wilcoxon rank-sum statistic and p-value for two independent samples.
TTEST_IND           Performs the independent two-sample t-test for the means of two groups.
TTEST_IND_STATS     Perform a t-test for means of two independent samples using summary statistics.
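The sketch below shows how a few of these tests relate to one another, again by calling scipy.stats directly on made-up data; it is not a description of the tools' own arguments. It compares three groups with a parametric and a rank-based test, checks the equal-variance assumption, and shows that a t-test computed from summary statistics matches the one computed from raw data.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three independent groups.
g1 = np.array([20.1, 21.4, 19.8, 22.0, 20.7, 21.1])
g2 = np.array([22.3, 23.1, 21.9, 24.0, 22.8, 23.5])
g3 = np.array([19.5, 20.2, 18.9, 20.8, 19.9, 20.4])

anova = stats.f_oneway(g1, g2, g3)  # parametric: compares group means
kw = stats.kruskal(g1, g2, g3)      # rank-based alternative
lev = stats.levene(g1, g2, g3)      # checks equality of variances

# A two-sample t-test from raw data and from summary statistics only.
from_raw = stats.ttest_ind(g1, g2)
from_summary = stats.ttest_ind_from_stats(
    g1.mean(), g1.std(ddof=1), len(g1),  # sample std uses ddof=1
    g2.mean(), g2.std(ddof=1), len(g2),
)

print(f"ANOVA p = {anova.pvalue:.4f}, Kruskal-Wallis p = {kw.pvalue:.4f}")
print(f"Levene p = {lev.pvalue:.4f}")
print(f"t from raw = {from_raw.statistic:.4f}, "
      f"t from summary = {from_summary.statistic:.4f}")
```

The summary-statistics form is useful when only published means, standard deviations, and sample sizes are available.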

One Sample

Tool           Description
BINOMTEST      Perform a binomial test for the probability of success in a Bernoulli experiment.
JARQUE_BERA    Perform the Jarque-Bera goodness of fit test for normality.
KSTEST         Performs the one-sample Kolmogorov-Smirnov test for goodness of fit.
KURTOSISTEST   Test whether the kurtosis of a sample is different from that of a normal distribution.
NORMALTEST     Test whether a sample differs from a normal distribution (omnibus test).
QUANTILE_TEST  Perform a quantile test to determine if a population quantile equals a hypothesized value.
SHAPIRO        Perform the Shapiro-Wilk test for normality.
SKEWTEST       Test whether the skewness of a sample is different from that of a normal distribution.
TTEST_1SAMP    Perform a one-sample t-test for the mean of a group of scores.
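Two of the one-sample tests are simple enough to check by hand, which makes them a useful sanity check. The sketch below uses scipy.stats directly with made-up data; scipy.stats.binomtest requires SciPy 1.7 or later, and the tools' own arguments may differ from what is shown.

```python
from scipy import stats

# One-sample t-test: is the population mean equal to 5.0?
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t_res = stats.ttest_1samp(sample, popmean=5.0)
print(f"t = {t_res.statistic:.4f}, p = {t_res.pvalue:.4f}")

# Binomial test: are 7 successes in 10 trials consistent with p = 0.5?
b_res = stats.binomtest(k=7, n=10, p=0.5)
print(f"binomial p = {b_res.pvalue:.5f}")
```

By hand, the sample mean is 5.1 and the standard error is about 0.0707, so t is the square root of 2. The two-sided binomial p-value is 2 * (120 + 45 + 10 + 1) / 1024 = 0.34375.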