BINNED_STATISTIC
Overview
The BINNED_STATISTIC
function computes a binned statistic (mean, sum, median, etc.) for one or more sets of data. It generalizes the histogram function to support various statistics, allowing you to analyze the distribution of values within specified bins. For more details, see the scipy.stats.binned_statistic documentation .
This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel:
=BINNED_STATISTIC(x, values, [statistic], [bins], [lowerlimit], [upperlimit])
x
(2D list, required): Data to bin (independent variable).values
(2D list, required): Data to compute statistics on (dependent variable).statistic
(str, optional, default=‘mean’): Statistic to compute (‘mean’, ‘sum’, ‘median’, ‘count’, ‘std’, ‘min’, ‘max’).bins
(int, optional, default=10): Number of bins.lowerlimit
(float, optional): Lower bound for the bin range.upperlimit
(float, optional): Upper bound for the bin range.
The function returns a 2D list of statistic values for each bin, or an error message (string) if the input is invalid.
Examples
Example 1: Mean in 2 Bins
Inputs:
x | values | statistic | bins | lowerlimit | upperlimit |
---|---|---|---|---|---|
1 | 1.0 | mean | 2 | ||
1 | 1.0 | ||||
2 | 2.0 | ||||
5 | 1.5 | ||||
7 | 3.0 |
Excel formula:
=BINNED_STATISTIC({1;1;2;5;7}, {1.0;1.0;2.0;1.5;3.0}, "mean", 2)
Expected output:
Result |
---|
1.33333333 |
2.25 |
Example 2: Sum in 2 Bins
Inputs:
x | values | statistic | bins | lowerlimit | upperlimit |
---|---|---|---|---|---|
1 | 1.0 | sum | 2 | ||
1 | 1.0 | ||||
2 | 2.0 | ||||
5 | 1.5 | ||||
7 | 3.0 |
Excel formula:
=BINNED_STATISTIC({1;1;2;5;7}, {1.0;1.0;2.0;1.5;3.0}, "sum", 2)
Expected output:
Result |
---|
4.0 |
4.5 |
Example 3: Median in 3 Bins, Custom Range
Inputs:
x | values | statistic | bins | lowerlimit | upperlimit |
---|---|---|---|---|---|
1 | 10 | median | 3 | 1 | 7 |
2 | 20 | ||||
3 | 30 | ||||
4 | 40 | ||||
5 | 50 | ||||
6 | 60 | ||||
7 | 70 |
Excel formula:
=BINNED_STATISTIC({1;2;3;4;5;6;7}, {10;20;30;40;50;60;70}, "median", 3, 1, 7)
Expected output:
Result |
---|
15.0 |
35.0 |
60.0 |
Example 4: Count in 2 Bins
Inputs:
x | values | statistic | bins | lowerlimit | upperlimit |
---|---|---|---|---|---|
1 | 10 | count | 2 | ||
2 | 20 | ||||
3 | 30 | ||||
4 | 40 | ||||
5 | 50 |
Excel formula:
=BINNED_STATISTIC({1;2;3;4;5}, {10;20;30;40;50}, "count", 2)
Expected output:
Result |
---|
2.0 |
3.0 |
Python Code
from scipy.stats import binned_statistic as scipy_binned_statistic
def binned_statistic(x, values, statistic='mean', bins=10, lowerlimit=None, upperlimit=None):
"""
Computes a binned statistic (mean, sum, median, etc.) for the input data.
Args:
x: 2D list or scalar of float values. Data to bin.
values: 2D list or scalar of float values. Data to compute statistics on.
statistic: Statistic to compute ('mean', 'sum', 'median', 'count', 'std', 'min', 'max'). Default is 'mean'.
bins: Number of bins (integer, default 10).
lowerlimit: Lower bound for the bin range (optional).
upperlimit: Upper bound for the bin range (optional).
Returns:
2D list of statistic values for each bin, or an error message (str) if input is invalid.
This example function is provided as-is without any representation of accuracy.
"""
# Convert scalars to 2D lists if needed
if not isinstance(x, list):
x = [[x]]
if not isinstance(values, list):
values = [[values]]
# Flatten 2D lists to 1D
try:
flat_x = [float(item) for row in x for item in (row if isinstance(row, list) else [row])]
flat_values = [float(item) for row in values for item in (row if isinstance(row, list) else [row])]
except Exception:
return "Invalid input: x and values must be 2D lists or scalars of numbers."
if len(flat_x) == 0 or len(flat_values) == 0:
return "Invalid input: x and values must not be empty."
if len(flat_x) != len(flat_values):
return "Invalid input: x and values must have the same length."
if statistic not in ('mean', 'sum', 'median', 'count', 'std', 'min', 'max'):
return "Invalid input: statistic must be one of 'mean', 'sum', 'median', 'count', 'std', 'min', 'max'."
try:
nbins = int(bins)
except Exception:
return "Invalid input: bins must be an integer."
if nbins < 1:
return "Invalid input: bins must be >= 1."
# Prepare range
range_arg = None
if lowerlimit is not None and upperlimit is not None:
try:
range_arg = (float(lowerlimit), float(upperlimit))
except Exception:
return "Invalid input: lowerlimit and upperlimit must be numbers."
try:
if range_arg:
res = scipy_binned_statistic(flat_x, flat_values, statistic=statistic, bins=nbins, range=range_arg)
else:
res = scipy_binned_statistic(flat_x, flat_values, statistic=statistic, bins=nbins)
stat = res.statistic
except Exception as e:
return f"scipy.stats.binned_statistic error: {e}"
# Return as 2D list (column vector)
return [[float(x)] if x is not None else [None] for x in stat]