NBINOM
Overview
The NBINOM function computes values from the negative binomial distribution, a discrete probability distribution that models the number of failures that occur before achieving a specified number of successes in a sequence of independent Bernoulli trials. This distribution is commonly used in statistical modeling for count data, particularly when the data exhibits overdispersion (variance exceeds the mean).
The negative binomial distribution describes a sequence of independent and identically distributed (i.i.d.) Bernoulli trials, repeated until a predefined, non-random number of successes occurs. The probability mass function (PMF) for the number of failures k is defined as:
$$f(k) = \binom{k + n - 1}{n - 1} p^n (1 - p)^k$$
where $k \geq 0$ is the number of failures, $n > 0$ is the number of successes, and $0 < p \leq 1$ is the probability of success on each trial.
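As a quick check of the formula, the PMF can be evaluated directly with the standard library. This is a minimal sketch, not the implementation used by NBINOM: the helper name `nbinom_pmf` is hypothetical, and it assumes an integer n >= 1 so that the binomial coefficient can be computed with `math.comb`.

```python
from math import comb


def nbinom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k failures before the n-th success.

    Hypothetical helper for illustration; assumes integer n >= 1 and 0 < p <= 1.
    """
    if k < 0:
        return 0.0  # the distribution has no mass below zero failures
    return comb(k + n - 1, n - 1) * p**n * (1 - p)**k


# Same inputs as Example 1 (k=3, n=5, p=0.5): 35 / 256
print(nbinom_pmf(3, 5, 0.5))  # 0.13671875
```

Rounded to four decimals, this gives the 0.1367 reported in Example 1.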
This implementation uses SciPy’s scipy.stats.nbinom distribution object, which provides methods for the probability mass function (PMF), cumulative distribution function (CDF), survival function (SF), and their inverses (ICDF/ISF; SciPy names the inverse CDF ppf), as well as descriptive statistics: mean, variance, standard deviation, and median. The underlying computations use routines from the Boost C++ Libraries for numerical accuracy.
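The relationships among these quantities can be sketched without SciPy. The code below is an illustrative assumption, not how scipy.stats.nbinom computes its results: it sums the PMF directly, the function names `pmf`, `cdf`, `sf`, and `icdf` are hypothetical, and the inverse is restricted to probabilities strictly between 0 and 1.

```python
from math import comb


def pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for integer k >= 0, integer n >= 1, 0 < p <= 1."""
    return comb(k + n - 1, n - 1) * p**n * (1 - p)**k


def cdf(k: int, n: int, p: float) -> float:
    """P(K <= k), by direct summation of the PMF."""
    return sum(pmf(i, n, p) for i in range(int(k) + 1))


def sf(k: int, n: int, p: float) -> float:
    """P(K > k), the complement of the CDF."""
    return 1.0 - cdf(k, n, p)


def icdf(q: float, n: int, p: float) -> int:
    """Smallest k whose CDF is at least q (sketch limited to 0 < q < 1)."""
    if not (0 < q < 1):
        raise ValueError("q must be strictly between 0 and 1 in this sketch")
    k, total = 0, pmf(0, n, p)
    while total < q:
        k += 1
        total += pmf(k, n, p)
    return k


print(cdf(3, 5, 0.5))  # 0.36328125
print(sf(3, 5, 0.5))   # 0.63671875
```

Direct summation is only practical for small k; it is shown here because it makes the definitions of CDF, SF, and the inverse CDF explicit.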
An alternative parameterization expresses the distribution in terms of the mean number of failures $\mu$ to achieve $n$ successes, where the probability of success relates to the mean as $p = n / (n + \mu)$. This formulation is particularly useful in regression contexts and when modeling phenomena with heterogeneous event rates, such as insurance claim counts or ecological abundance data.
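The conversion between the two parameterizations is a one-line calculation. The sketch below uses hypothetical helper names (`p_from_mean`, `mean_and_variance`) and the standard closed forms for the mean, n(1 - p)/p, and the variance, n(1 - p)/p^2, of the number of failures.

```python
def p_from_mean(n: float, mu: float) -> float:
    """Success probability implied by n successes and a mean of mu failures."""
    if n <= 0 or mu < 0:
        raise ValueError("require n > 0 and mu >= 0")
    return n / (n + mu)


def mean_and_variance(n: float, p: float) -> tuple[float, float]:
    """Mean and variance of the failure count for parameters (n, p)."""
    mean = n * (1 - p) / p
    variance = n * (1 - p) / p**2
    return mean, variance


n, mu = 5, 5.0
p = p_from_mean(n, mu)
mean, variance = mean_and_variance(n, p)
print(p, mean, variance)  # 0.5 5.0 10.0
```

In terms of the mean, the variance equals mu + mu^2 / n, which is larger than mu whenever mu > 0; this is the overdispersion that makes the distribution attractive for count data.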
This example function is provided as-is without any representation of accuracy.
Excel Usage
=NBINOM(k, n, p, nbinom_mode, loc)
- k (list[list], required): Value(s) at which to evaluate the distribution (number of failures for pmf/cdf/sf, probability for icdf/isf, ignored for statistics modes).
- n (int, required): Number of successes (must be >= 0).
- p (float, required): Probability of success (0 < p <= 1).
- nbinom_mode (str, optional, default: "pmf"): Output type to compute.
- loc (float, optional, default: 0): Location parameter that shifts the distribution.
Returns (float): Distribution result, or an error message string. When k contains more than one value, a 2D list of results is returned.
Examples
Example 1: PMF with required arguments only
Inputs:
| k | n | p |
|---|---|---|
| 3 | 5 | 0.5 |
Excel formula:
=NBINOM(3, 5, 0.5)
Expected output:
0.1367
Example 2: CDF with mode parameter
Inputs:
| k | n | p | nbinom_mode |
|---|---|---|---|
| 3 | 5 | 0.5 | cdf |
Excel formula:
=NBINOM(3, 5, 0.5, "cdf")
Expected output:
0.3633
Example 3: Survival function with all parameters
Inputs:
| k | n | p | nbinom_mode | loc |
|---|---|---|---|---|
| 3 | 5 | 0.5 | sf | 0 |
Excel formula:
=NBINOM(3, 5, 0.5, "sf", 0)
Expected output:
0.6367
Example 4: Mean statistics mode
Inputs:
| k | n | p | nbinom_mode |
|---|---|---|---|
| 0 | 5 | 0.5 | mean |
Excel formula:
=NBINOM(0, 5, 0.5, "mean")
Expected output:
5
Python Code
from scipy.stats import nbinom as scipy_nbinom


def nbinom(k, n, p, nbinom_mode='pmf', loc=0):
    """
    Compute Negative Binomial distribution values using scipy.stats.nbinom.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        k (list[list]): Value(s) at which to evaluate the distribution (number of failures for pmf/cdf/sf, probability for icdf/isf, ignored for statistics modes).
        n (int): Number of successes (must be >= 0).
        p (float): Probability of success (0 < p <= 1).
        nbinom_mode (str, optional): Output type to compute (case-insensitive). Valid options: PMF, CDF, SF, ICDF, ISF, Mean, Var, Std, Median. Default is 'pmf'.
        loc (float, optional): Location parameter that shifts the distribution. Default is 0.

    Returns:
        float: Distribution result (a 2D list when k holds more than one value), or error message string.
    """
    def to2d(x):
        # Wrap a scalar so that scalars and 2D lists share one code path
        return [[x]] if not isinstance(x, list) else x

    # Validate n
    try:
        n_val = float(n)
        if n_val < 0:
            return "Invalid input: n must be >= 0."
        if n_val != int(n_val):
            return "Invalid input: n must be an integer."
        n_val = int(n_val)
    except Exception:
        return "Invalid input: n must be an integer."

    # Validate p
    try:
        p_val = float(p)
        if not (0 < p_val <= 1):
            return "Invalid input: p must be between 0 (exclusive) and 1 (inclusive)."
    except Exception:
        return "Invalid input: p must be a number."

    # Validate loc
    try:
        loc_val = float(loc)
    except Exception:
        return "Invalid input: loc must be a number."

    # Validate nbinom_mode (compared in lowercase so that "PMF" and "pmf" both work)
    valid_modes = {"pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "std", "median"}
    if not isinstance(nbinom_mode, str) or nbinom_mode.lower() not in valid_modes:
        return f"Invalid input: nbinom_mode must be one of {sorted(valid_modes)}."
    nbinom_mode = nbinom_mode.lower()

    # Helper to convert a single k value to float (None if not numeric)
    def process_k(val):
        try:
            return float(val)
        except Exception:
            return None

    # Handle statistics modes (k is ignored)
    if nbinom_mode == "mean":
        return float(scipy_nbinom.mean(n_val, p_val, loc=loc_val))
    if nbinom_mode == "var":
        return float(scipy_nbinom.var(n_val, p_val, loc=loc_val))
    if nbinom_mode == "std":
        return float(scipy_nbinom.std(n_val, p_val, loc=loc_val))
    if nbinom_mode == "median":
        return float(scipy_nbinom.median(n_val, p_val, loc=loc_val))

    # PMF, CDF, SF, ICDF, ISF
    def compute(val):
        kval = process_k(val)
        if kval is None:
            return "Invalid input: k must be a number."
        if nbinom_mode in ["icdf", "isf"]:
            # For the inverse functions, k is interpreted as a probability
            if not (0 <= kval <= 1):
                return "Invalid input: probability must be between 0 and 1."
        if nbinom_mode == "pmf":
            return float(scipy_nbinom.pmf(kval, n_val, p_val, loc=loc_val))
        elif nbinom_mode == "cdf":
            return float(scipy_nbinom.cdf(kval, n_val, p_val, loc=loc_val))
        elif nbinom_mode == "sf":
            return float(scipy_nbinom.sf(kval, n_val, p_val, loc=loc_val))
        elif nbinom_mode == "icdf":
            return float(scipy_nbinom.ppf(kval, n_val, p_val, loc=loc_val))
        elif nbinom_mode == "isf":
            return float(scipy_nbinom.isf(kval, n_val, p_val, loc=loc_val))

    # Normalize k to 2D list
    k = to2d(k)
    # Validate k is 2D list
    if not isinstance(k, list) or not all(isinstance(row, list) for row in k):
        return "Invalid input: k must be a scalar or 2D list."

    result = []
    for row in k:
        result_row = []
        for val in row:
            out = compute(val)
            if isinstance(out, str):
                # Propagate the first error message encountered
                return out
            result_row.append(out)
        result.append(result_row)

    # Return scalar if input was scalar (single element)
    if len(result) == 1 and len(result[0]) == 1:
        return result[0][0]
    return result