Skip to Content

ZIPF

Overview

The ZIPF function computes values related to the Zipf (Zeta) discrete distribution, which describes the probability of integer values following a power law. This function can return the probability mass function (PMF), cumulative distribution function (CDF), survival function (SF), inverse CDF (quantile/ICDF), inverse SF (ISF), mean, variance, standard deviation, or median for a given value.

Excel does not provide a native Zipf distribution function. The Python function in Excel provided here supports all major distribution features, including PMF, CDF, survival function, inverse CDF (quantile), inverse survival function, and distribution statistics (mean, median, variance, standard deviation).

For more details, see the scipy.stats.zipf documentation.

Usage

To use the function in Excel:

=ZIPF(k, a, [mode], [loc])
  • k (float or 2D list, required): Value(s) at which to evaluate the distribution. For PMF, CDF, SF, ICDF, and ISF, this is the integer value (k >= 1).
  • a (float, required): Distribution parameter (a > 1).
  • mode (str, optional, default=“pmf”): Output type. One of "pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "std", or "median".
  • loc (float, optional, default=0): Location parameter (shifts the distribution).

The function returns a scalar or 2D list of floats (for array input), or an error message (string) if the input is invalid. The output depends on the selected mode:

  • pmf: Probability mass function at k.
  • cdf: Cumulative distribution function at k.
  • sf: Survival function (1 - CDF) at k.
  • icdf: Inverse CDF (quantile) for probability k.
  • isf: Inverse survival function for probability k.
  • mean: Mean of the distribution.
  • var: Variance of the distribution.
  • std: Standard deviation of the distribution.
  • median: Median of the distribution.

Examples

Example 1: PMF at k=2, a=2.5

Inputs:

kamodeloc
22.5pmf0

Excel formula:

=ZIPF(2, 2.5, "pmf", 0)

Expected output:

Result
0.1318

Example 2: CDF at k=2, a=2.5

Inputs:

kamodeloc
22.5cdf0

Excel formula:

=ZIPF(2, 2.5, "cdf", 0)

Expected output:

Result
0.8772

Example 3: Survival Function at k=2, a=2.5

Inputs:

kamodeloc
22.5sf0

Excel formula:

=ZIPF(2, 2.5, "sf", 0)

Expected output:

Result
0.1228

Example 4: Inverse CDF (ICDF) for probability k=0.5, a=2.5

Inputs:

kamodeloc
0.52.5icdf0

Excel formula:

=ZIPF(0.5, 2.5, "icdf", 0)

Expected output:

Result
1

Example 5: Mean, Variance, Std, Median

Inputs:

kamodeloc
12.5mean0
12.5var0
12.5std0
12.5median0

Excel formulas:

=ZIPF(1, 2.5, "mean", 0) =ZIPF(1, 2.5, "var", 0) =ZIPF(1, 2.5, "std", 0) =ZIPF(1, 2.5, "median", 0)

Expected outputs:

Result
1.9474
inf
inf
1

Python Code

from scipy.stats import zipf as scipy_zipf def zipf(k, a, mode="pmf", loc=0): """ Compute Zipf distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median. Args: k: Value(s) at which to evaluate (float or 2D list). a: Distribution parameter (float, a > 1). mode: Output type: 'pmf', 'cdf', 'sf', 'icdf', 'isf', 'mean', 'var', 'std', or 'median'. loc: Location parameter (float, default 0). Returns: Scalar or 2D list of floats, or error message (str) if invalid. Never returns inf as a float; returns 'inf' or '-inf' as a string instead. """ import math # Validate a try: a_val = float(a) if not (a_val > 1): return "Invalid input: a must be > 1." except Exception: return "Invalid input: a must be a number." # Validate loc try: loc_val = float(loc) except Exception: return "Invalid input: loc must be a number." # Validate mode valid_modes = ["pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "std", "median"] if not isinstance(mode, str) or mode not in valid_modes: return f"Invalid input: mode must be one of {valid_modes}." # Helper to process k (scalar or 2D list) def process_k(val): try: return float(val) except Exception: return None # Helper to convert inf to string def inf_to_str(val): if isinstance(val, float) and math.isinf(val): return "inf" if val > 0 else "-inf" return val # Handle statistics if mode == "mean": result = scipy_zipf.mean(a_val, loc=loc_val) return inf_to_str(result) if mode == "var": result = scipy_zipf.var(a_val, loc=loc_val) return inf_to_str(result) if mode == "std": result = scipy_zipf.std(a_val, loc=loc_val) return inf_to_str(result) if mode == "median": result = scipy_zipf.median(a_val, loc=loc_val) return inf_to_str(result) # PMF, CDF, SF, ICDF, ISF def compute(val): kval = process_k(val) if kval is None: return "Invalid input: k must be a number." if mode == "pmf": out = float(scipy_zipf.pmf(kval, a_val, loc=loc_val)) elif mode == "cdf": out = float(scipy_zipf.cdf(kval, a_val, loc=loc_val)) elif mode == "sf": out = float(scipy_zipf.sf(kval, a_val, loc=loc_val)) elif mode == "icdf": out = float(scipy_zipf.ppf(kval, a_val, loc=loc_val)) elif mode == "isf": out = float(scipy_zipf.isf(kval, a_val, loc=loc_val)) else: return "Invalid mode." return inf_to_str(out) # 2D list or scalar if isinstance(k, list): # 2D list if not all(isinstance(row, list) for row in k): return "Invalid input: k must be a scalar or 2D list." result = [] for row in k: result_row = [] for val in row: out = compute(val) if isinstance(out, str) and (out.startswith("Invalid") or out == "inf" or out == "-inf"): result_row.append(out) else: result_row.append(out) result.append(result_row) return result else: return compute(k)

Live Demo

Example Workbook

Link to Workbook

Last updated on