SKEWNESS
Overview
The SKEWNESS function calculates the skewness of a dataset, a measure of the asymmetry of a probability distribution about its mean. Skewness is one of the key shape statistics used in descriptive statistics, alongside kurtosis, to characterize how data deviate from a normal distribution.
A perfectly symmetric distribution has a skewness of zero. Positive skewness indicates a longer right tail (right-skewed), while negative skewness indicates a longer left tail (left-skewed). A sample drawn from a normal distribution should therefore have a skewness close to zero.
This function uses the SciPy scipy.stats.skew implementation, which computes the Fisher-Pearson coefficient of skewness. The biased sample skewness (default) is calculated as:
$$g_1 = \frac{m_3}{m_2^{3/2}}$$
where $m_i = \frac{1}{N}\sum_{n=1}^{N}(x_n - \bar{x})^i$ is the biased sample $i$-th central moment and $\bar{x}$ is the sample mean.
When the bias parameter is set to FALSE, the function applies a correction factor to produce the adjusted Fisher-Pearson standardized moment coefficient:
$$G_1 = \frac{\sqrt{N(N-1)}}{N-2} \cdot g_1$$
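This adjusted formula (G_1) is the version used in Microsoft Excel’s SKEW function, as well as statistical packages like Minitab, SAS, and SPSS. Excel’s SKEW.P function instead returns the population (biased) value g_1. The bias correction requires at least 3 data points, because the correction factor has N-2 in its denominator.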
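The two formulas above can be checked by hand. The following is a minimal sketch in plain Python (no SciPy), using the data from Example 1; the helper names `central_moment`, `skew_biased`, and `skew_adjusted` are hypothetical and are not part of the function documented here.

```python
import math

def central_moment(xs, i):
    """Biased i-th central moment: the mean of (x - mean)**i."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** i for x in xs) / n

def skew_biased(xs):
    """g_1 = m_3 / m_2**1.5, corresponding to bias=TRUE."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def skew_adjusted(xs):
    """G_1 = sqrt(N(N-1)) / (N-2) * g_1, corresponding to bias=FALSE."""
    n = len(xs)
    if n < 3:
        raise ValueError("adjusted skewness needs at least 3 data points")
    return math.sqrt(n * (n - 1)) / (n - 2) * skew_biased(xs)

data = [1, 2, 3, 4, 10]
g1 = skew_biased(data)    # m_2 = 10 and m_3 = 36, so g_1 = 36 / 10**1.5
G1 = skew_adjusted(data)  # correction factor is sqrt(20) / 3
print(round(g1, 3), round(G1, 3))  # prints: 1.138 1.697
```

For symmetric data such as 1, 2, 3, 4, 5 the third central moment is zero, so both versions return zero regardless of the correction factor.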
Skewness is commonly used in financial analysis for assessing skewness risk, in quality control to detect process asymmetries, and in data science to determine whether data transformations are needed before applying statistical models that assume normality. For more theoretical background, see the Wikipedia article on skewness.
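As a rough illustration of the data-transformation use case, the sketch below compares the biased skewness of a right-skewed sample before and after a log transform. The choice of a log transform and the helper name `sample_skew` are assumptions made for this example; the source does not prescribe any particular transformation.

```python
import math

def sample_skew(xs):
    """Biased sample skewness g_1 = m_3 / m_2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

raw = [1, 2, 3, 4, 10]               # right-skewed sample from Example 1
logged = [math.log(x) for x in raw]  # log transform; needs strictly positive values

before = sample_skew(raw)
after = sample_skew(logged)
print(f"before: {before:.3f}, after: {after:.3f}")
```

For this sample the transformed data has a skewness much closer to zero, which is the kind of check one might run before fitting a model that assumes normality.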
This example function is provided as-is without any representation of accuracy.
Excel Usage
=SKEWNESS(data, bias)
data (list[list], required): 2D array of numeric values. Non-numeric values are ignored.
bias (bool, optional, default: true): If True, calculations are not corrected for statistical bias.
Returns (float): Skewness value, or an error message string if the input is invalid.
Examples
Example 1: Right skewed data returns positive skewness
Inputs:
| data | bias |
|---|---|
| 1 | true |
| 2 | |
| 3 | |
| 4 | |
| 10 | |
Excel formula:
=SKEWNESS({1;2;3;4;10}, TRUE)
Expected output:
1.138
Example 2: Left skewed data returns negative skewness
Inputs:
| data | bias |
|---|---|
| 1 | true |
| 7 | |
| 8 | |
| 9 | |
| 10 | |
Excel formula:
=SKEWNESS({1;7;8;9;10}, TRUE)
Expected output:
-1.138
Example 3: Symmetric data returns zero skewness (biased)
Inputs:
| data | bias |
|---|---|
| 1 | true |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
Excel formula:
=SKEWNESS({1;2;3;4;5}, TRUE)
Expected output:
0
Example 4: Symmetric data returns zero skewness (unbiased)
Inputs:
| data | bias |
|---|---|
| 1 | false |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
Excel formula:
=SKEWNESS({1;2;3;4;5}, FALSE)
Expected output:
0
Python Code
import math
from scipy.stats import skew as scipy_skew


def skewness(data, bias=True):
    """
    Calculate the skewness of a dataset.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skew.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric values. Non-numeric values are ignored.
        bias (bool, optional): If True, calculations are not corrected for statistical bias. Default is True.

    Returns:
        float: Skewness value (float), or error message string.
    """
    def to2d(x):
        # Wrap a scalar so a single cell is treated as a 1x1 array.
        return [[x]] if not isinstance(x, list) else x

    data = to2d(data)
    if not isinstance(data, list) or not all(isinstance(row, list) for row in data) or not data:
        return "Invalid input: data must be a non-empty 2D list."
    if not isinstance(bias, bool):
        return "Invalid input: bias must be a boolean."

    # Flatten the 2D input, keeping only values that convert to a finite float.
    flat_data = []
    for row in data:
        for item in row:
            try:
                val = float(item)
                if math.isfinite(val):
                    flat_data.append(val)
            except (ValueError, TypeError):
                continue

    # The bias correction divides by N - 2, so it needs at least 3 points.
    if len(flat_data) < 3 and not bias:
        return "Invalid input: At least 3 data points are required for unbiased skewness calculation."
    if len(flat_data) < 2:
        return "Invalid input: At least 2 data points are required for skewness calculation."

    try:
        result = scipy_skew(flat_data, bias=bias)
    except Exception as e:
        return f"scipy.stats.skew error: {e}"
    return float(result)
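The input-cleaning step in the function above can be looked at in isolation. The sketch below copies that loop into a standalone helper (the name `flatten_numeric` is hypothetical) to show which cell values survive. Because the filter relies on `float()`, numeric strings and booleans are converted rather than dropped, which may be worth keeping in mind when reading "non-numeric values are ignored".

```python
import math

def flatten_numeric(data):
    """Flatten a 2D list, keeping only values that convert to a finite float."""
    kept = []
    for row in data:
        for item in row:
            try:
                val = float(item)
            except (ValueError, TypeError):
                continue  # e.g. "a", "" or None cannot be converted
            if math.isfinite(val):  # drops nan and +/-inf
                kept.append(val)
    return kept

cells = [[1, "a", None], [2.5, float("nan"), "3"], [True, float("inf"), ""]]
print(flatten_numeric(cells))  # prints: [1.0, 2.5, 3.0, 1.0]
```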