Skip to Content

CUMFREQ

Overview

The CUMFREQ function computes the cumulative frequency histogram of the input data by dividing it into a specified number of bins and counting the cumulative number of observations up to each bin. This is useful for understanding the distribution of values in a dataset and for identifying percentiles or cumulative counts across ranges. It is a wrapper around SciPy’s scipy.stats.cumfreq function.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=CUMFREQ(data, numbins, [lowerlimit], [upperlimit])
  • data (2D list or scalar, required): Input data to analyze, as a column vector or a single value.
  • numbins (integer, required): Number of bins to use for the histogram.
  • lowerlimit (float, optional): Lower bound for the histogram range.
  • upperlimit (float, optional): Upper bound for the histogram range.

The function returns a single column of cumulative counts as a 2D list, where each row corresponds to a bin.

Examples

Example 1: Default range

Inputs:

datanumbins
14
2
3
4

Excel formula:

=CUMFREQ({1;2;3;4}, 4)

Expected output:

Result
1
2
3
4

Example 2: Duplicate values

Inputs:

datanumbins
12
1
2
2

Excel formula:

=CUMFREQ({1;1;2;2}, 2)

Expected output:

Result
2
4

Example 3: More bins

Inputs:

datanumbins
15
2
3
4
5

Excel formula:

=CUMFREQ({1;2;3;4;5}, 5)

Expected output:

Result
1
2
3
4
5

Example 4: Custom range

Inputs:

datanumbinslowerlimitupperlimit
1204
2
3

Excel formula:

=CUMFREQ({1;2;3}, 2, 0, 4)

Expected output:

Result
1
3

Python Code

from scipy.stats import cumfreq as _cumfreq def cumfreq(data, numbins=10, lowerlimit=None, upperlimit=None): """ Returns the cumulative frequency histogram for the input data. Args: data: 2D list or scalar of float values. Input data to analyze. numbins: Number of bins to use for the histogram (default 10). lowerlimit: Lower bound for the histogram range (optional). upperlimit: Upper bound for the histogram range (optional). Returns: 2D list of cumulative counts for each bin, or an error message as a string. This example function is provided as-is without any representation of accuracy. """ # Validate data try: if not isinstance(data, list): flat = [float(data)] else: if all(isinstance(row, list) and len(row) >= 1 for row in data): flat = [float(row[0]) for row in data] else: return "Invalid input: data must be a 2D list or scalar." except Exception: return "Invalid input: data must be numeric." # Require at least two data points if len(flat) < 2: return "Invalid input: data must contain at least two values." # Validate numbins try: nb = int(numbins) if nb <= 0: return "Invalid input: numbins must be a positive integer." except Exception: return "Invalid input: numbins must be an integer." # Validate limits drl = None if lowerlimit is not None or upperlimit is not None: if lowerlimit is None or upperlimit is None: return "Invalid input: both lowerlimit and upperlimit must be provided." try: low = float(lowerlimit) up = float(upperlimit) except Exception: return "Invalid input: limits must be numeric." drl = (low, up) # Compute cumulative frequency try: if drl: res = _cumfreq(flat, numbins=nb, defaultreallimits=drl) else: res = _cumfreq(flat, numbins=nb) counts = res.cumcount.tolist() except Exception as e: return f"scipy.stats.cumfreq error: {e}" # Convert to 2D list return [[c] for c in counts]

Example Workbook

Link to Workbook

Last updated on