KURTOSISTEST
Overview
The KURTOSISTEST function tests whether the kurtosis of a sample differs significantly from that of a normal distribution. Kurtosis is a measure of the “tailedness” of a probability distribution—specifically, the propensity to produce extreme values (outliers) relative to a normal distribution. This test is commonly used in normality testing and exploratory data analysis.
This implementation uses scipy.stats.kurtosistest from the SciPy library. The test follows the methodology described by Anscombe and Glynn (1983), which transforms the sample kurtosis into a z-score that approximately follows a standard normal distribution under the null hypothesis.
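The transformation described above can be illustrated in plain Python. This is a minimal sketch, assuming the formulas as they are commonly stated for the Anscombe-Glynn test; the helper name `anscombe_glynn_z` is hypothetical and is not part of SciPy or of the implementation on this page. It does not guard against zero variance or a zero denominator in the cube-root step.

```python
import math


def anscombe_glynn_z(sample):
    """Hypothetical helper: z-score for the sample kurtosis (Anscombe-Glynn style)."""
    n = len(sample)
    mean = sum(sample) / n
    # Biased central moments (divide by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    b2 = m4 / m2 ** 2  # Pearson kurtosis (normal distribution: 3)

    # Mean and variance of b2 when the data are normal.
    expected = 3.0 * (n - 1) / (n + 1)
    variance = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    x = (b2 - expected) / math.sqrt(variance)

    # Skewness of the sampling distribution of b2 when the data are normal.
    skew_b2 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
               * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6.0 + 8.0 / skew_b2 * (2.0 / skew_b2 + math.sqrt(1.0 + 4.0 / skew_b2 ** 2))

    # Cube-root transformation that brings the statistic close to N(0, 1).
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    ratio = (1.0 - 2.0 / a) / abs(denom)
    term = math.copysign(ratio ** (1.0 / 3.0), denom)
    return ((1.0 - 2.0 / (9.0 * a)) - term) / math.sqrt(2.0 / (9.0 * a))


z = anscombe_glynn_z(list(range(1, 21)))
print(round(z, 4))
```

For the integers 1 to 20, the sketch should agree with the statistic that `scipy.stats.kurtosistest` reports for the same data (about -1.7058).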
A normal distribution has an excess kurtosis of zero (kurtosis = 3 in the Pearson definition). Distributions with positive excess kurtosis are called leptokurtic and have heavier tails producing more outliers (e.g., Student’s t-distribution, Laplace distribution). Distributions with negative excess kurtosis are platykurtic with lighter tails producing fewer outliers (e.g., uniform distribution). For more background, see the Wikipedia article on kurtosis.
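The two definitions can be compared with a short calculation. This is a minimal sketch using the biased moment estimator; the helper names `pearson_kurtosis` and `excess_kurtosis` are hypothetical and chosen only for illustration.

```python
def pearson_kurtosis(sample):
    """Biased Pearson kurtosis m4 / m2**2 (normal distribution: 3)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return m4 / m2 ** 2


def excess_kurtosis(sample):
    """Excess (Fisher) kurtosis: Pearson kurtosis minus 3."""
    return pearson_kurtosis(sample) - 3.0


flat_sample = list(range(1, 21))   # evenly spread values, light tails
spiky_sample = [1] * 19 + [100]    # one extreme outlier, heavy tail

print(excess_kurtosis(flat_sample))   # negative: platykurtic
print(excess_kurtosis(spiky_sample))  # positive: leptokurtic
```

The evenly spread sample gives a negative excess kurtosis, while the single outlier pushes the second sample strongly positive.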
The function returns two values:
- z-statistic: A standardized score indicating how far the sample kurtosis deviates from the expected kurtosis of a normal distribution
- p-value: The two-sided probability of observing a statistic at least this extreme if the sample were drawn from a normal distribution
A small p-value (typically < 0.05) suggests that the sample kurtosis differs significantly from that of a normal distribution, indicating the data may not be normally distributed. This implementation returns an error for fewer than 20 values, because the asymptotic approximation underlying the z-score transformation is only reliable for larger samples.
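The link between the z-statistic and the p-value can be sketched with the standard library alone. This is a minimal sketch, assuming a two-sided test against the standard normal distribution; the helper names and the default `alpha` of 0.05 are hypothetical and only illustrate the decision rule described above.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal z-score: P(|Z| >= |z|)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def rejects_normal_kurtosis(z, alpha=0.05):
    """Hypothetical helper: True if the p-value falls below alpha."""
    return two_sided_p_value(z) < alpha


print(round(two_sided_p_value(-1.7058), 3))  # 0.088
print(rejects_normal_kurtosis(-1.7058))      # False
print(rejects_normal_kurtosis(4.694))        # True
```

A z-statistic of 4.694 corresponds to a p-value on the order of 1e-6, which is displayed as 0 when rounded to four decimal places.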
This example function is provided as-is without any representation of accuracy.
Excel Usage
=KURTOSISTEST(data)
data (list[list], required): 2D array of sample data values. Must contain at least 20 numeric values.
Returns (list[list]): 2D list [[statistic, p_value]], or error message string.
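Because the return value is either a 2D list or an error string, calling code needs to check the type before unpacking. This is a minimal sketch of one way to do that; the helper name `unpack_kurtosistest_result` is hypothetical and is not part of the function documented here.

```python
def unpack_kurtosistest_result(result):
    """Hypothetical helper: split a KURTOSISTEST-style result into parts.

    Returns (statistic, p_value, error); the error is None on success,
    and the two numbers are None when an error string was returned.
    """
    if isinstance(result, str):
        return None, None, result
    statistic, p_value = result[0]
    return statistic, p_value, None


# Illustrative inputs shaped like the documented return values.
ok = unpack_kurtosistest_result([[-1.7058, 0.088]])
bad = unpack_kurtosistest_result(
    "Error: Invalid input: data must contain at least 20 numeric values."
)
print(ok)
print(bad)
```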
Examples
Example 1: Uniform distribution 1 to 20
Inputs:
| data |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
| 15 |
| 16 |
| 17 |
| 18 |
| 19 |
| 20 |
Excel formula:
=KURTOSISTEST({1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20})
Expected output:
| statistic | p_value |
|---|---|
| -1.7058 | 0.088 |
Example 2: High kurtosis with extreme outlier
Inputs:
| data |
|---|
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 100 |
Excel formula:
=KURTOSISTEST({1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;100})
Expected output:
| statistic | p_value |
|---|---|
| 4.694 | 0 |
Example 3: Sequence with high outlier
Inputs:
| data |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
| 15 |
| 16 |
| 17 |
| 18 |
| 19 |
| 100 |
Excel formula:
=KURTOSISTEST({1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;100})
Expected output:
| statistic | p_value |
|---|---|
| 4.4793 | 0 |
Example 4: Normal-like data with near-normal kurtosis
Inputs:
| data |
|---|
| 2.1 |
| 3.5 |
| 4.2 |
| 5.1 |
| 5.8 |
| 6.2 |
| 6.5 |
| 7 |
| 7.3 |
| 7.8 |
| 8.2 |
| 8.5 |
| 9 |
| 9.3 |
| 9.8 |
| 10.5 |
| 11.2 |
| 12.1 |
| 13.5 |
| 15 |
Excel formula:
=KURTOSISTEST({2.1;3.5;4.2;5.1;5.8;6.2;6.5;7;7.3;7.8;8.2;8.5;9;9.3;9.8;10.5;11.2;12.1;13.5;15})
Expected output:
| statistic | p_value |
|---|---|
| 0.0345 | 0.9725 |
Python Code
import math
from scipy.stats import kurtosistest as scipy_kurtosistest


def kurtosistest(data):
    """
    Test whether the kurtosis of a sample is different from that of a normal distribution.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosistest.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of sample data values. Must contain at least 20 numeric values.

    Returns:
        list[list]: 2D list [[statistic, p_value]], or error message string.
    """
    def to2d(x):
        # Wrap a scalar so a single-cell input is treated as a 1x1 range.
        return [[x]] if not isinstance(x, list) else x

    data = to2d(data)
    if not isinstance(data, list) or not all(isinstance(row, list) for row in data):
        return "Error: Invalid input: data must be a 2D list."

    # Flatten the 2D range into a single list of floats.
    flat = []
    for row in data:
        for val in row:
            try:
                flat.append(float(val))
            except (TypeError, ValueError):
                return "Error: Invalid input: data must contain numeric values."

    if len(flat) < 20:
        return "Error: Invalid input: data must contain at least 20 numeric values."

    try:
        z, p = scipy_kurtosistest(flat)
    except Exception as e:
        return f"Error: scipy.stats.kurtosistest error: {e}"

    # NaN or infinite results occur, for example, when all values are equal.
    if math.isnan(z) or math.isnan(p) or math.isinf(z) or math.isinf(p):
        return "Error: Result contains NaN or infinity due to insufficient variance in data."

    return [[float(z), float(p)]]