Skip to Content

NHYPERGEOM

Overview

The NHYPERGEOM function computes values related to the Negative Hypergeometric distribution, a discrete probability distribution describing the number of successes (e.g., red balls) observed before a fixed number of failures (e.g., blue balls) are drawn from a finite population without replacement. This function can return the probability mass function (PMF), cumulative distribution function (CDF), survival function (SF), inverse CDF (quantile/ICDF), inverse SF (ISF), mean, variance, standard deviation, or median for a given value.

Excel does not provide a native Negative Hypergeometric function. The Python function in Excel provided here supports PMF, CDF, SF, ICDF, ISF, and distribution statistics (mean, median, variance, standard deviation).

For more details, see the scipy.stats.nhypergeom documentation.

Usage

To use the function in Excel:

=NHYPERGEOM(k, M, n, r, [mode], [loc])
  • k (float or 2D list, required): Value(s) at which to evaluate the distribution. For PMF, CDF, SF, ICDF, and ISF, this is the integer value (0 ≤ k ≤ n). For statistics modes, this is ignored and can be set to 0.
  • M (int, required): Total number of balls (M ≥ 1).
  • n (int, required): Number of red balls (0 ≤ n ≤ M).
  • r (int, required): Number of blue balls to draw (0 ≤ r ≤ M-n).
  • mode (str, optional, default=“pmf”): Output type. One of "pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "std", or "median".
  • loc (float, optional, default=0): Location parameter (shifts the distribution).

The function returns a scalar or 2D list of floats (for array input), or an error message (string) if the input is invalid. The output depends on the selected mode:

  • pmf: Probability mass function at k.
  • cdf: Cumulative distribution function at k.
  • sf: Survival function (1 - CDF) at k.
  • icdf: Inverse CDF (quantile) for probability k.
  • isf: Inverse survival function for probability k.
  • mean: Mean of the distribution.
  • var: Variance of the distribution.
  • std: Standard deviation of the distribution.
  • median: Median of the distribution.

Examples

Example 1: PMF at k=3, M=20, n=7, r=12

Inputs:

kMnrmodeloc
320712pmf0

Excel formula:

=NHYPERGEOM(3, 20, 7, 12, "pmf", 0)

Expected output:

Result
0.0235

Example 2: CDF at k=3, M=20, n=7, r=12

Inputs:

kMnrmodeloc
320712cdf0

Excel formula:

=NHYPERGEOM(3, 20, 7, 12, "cdf", 0)

Expected output:

Result
0.0307

Example 3: Survival Function at k=3, M=20, n=7, r=12

Inputs:

kMnrmodeloc
320712sf0

Excel formula:

=NHYPERGEOM(3, 20, 7, 12, "sf", 0)

Expected output:

Result
0.9693

Example 4: Inverse CDF (ICDF) for probability k=0.5, M=20, n=7, r=12

Inputs:

kMnrmodeloc
0.520712icdf0

Excel formula:

=NHYPERGEOM(0.5, 20, 7, 12, "icdf", 0)

Expected output:

Result
6

Example 5: Mean, Variance, Std, Median

Inputs:

kMnrmodeloc
020712mean0
020712var0
020712std0
020712median0

Excel formulas:

=NHYPERGEOM(0, 20, 7, 12, "mean", 0) =NHYPERGEOM(0, 20, 7, 12, "var", 0) =NHYPERGEOM(0, 20, 7, 12, "std", 0) =NHYPERGEOM(0, 20, 7, 12, "median", 0)

Expected outputs:

Result
6.0
1.2
1.0954
6.0

Python Code

from scipy.stats import nhypergeom as scipy_nhypergeom def nhypergeom(k, M, n, r, mode="pmf", loc=0): """ Compute Negative Hypergeometric distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median. Args: k: Value(s) at which to evaluate (float or 2D list). M: Total number of balls (int, M >= 1). n: Number of red balls (int, 0 <= n <= M). r: Number of blue balls to draw (int, 0 <= r <= M-n). mode: Output type: 'pmf', 'cdf', 'sf', 'icdf', 'isf', 'mean', 'var', 'std', or 'median'. loc: Location parameter (float, default 0). Returns: Scalar or 2D list of floats, or error message (str) if invalid. """ # Validate M try: M_val = int(M) if not (M_val >= 1): return "Invalid input: M must be >= 1." except Exception: return "Invalid input: M must be an integer." # Validate n try: n_val = int(n) if not (0 <= n_val <= M_val): return "Invalid input: n must be between 0 and M." except Exception: return "Invalid input: n must be an integer." # Validate r try: r_val = int(r) if not (0 <= r_val <= M_val - n_val): return "Invalid input: r must be between 0 and M-n." except Exception: return "Invalid input: r must be an integer." # Validate loc try: loc_val = float(loc) except Exception: return "Invalid input: loc must be a number." # Validate mode valid_modes = ["pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "std", "median"] if not isinstance(mode, str) or mode not in valid_modes: return f"Invalid input: mode must be one of {valid_modes}." # Helper to process k (scalar or 2D list) def process_k(val): try: return float(val) except Exception: return None # Handle statistics if mode == "mean": return scipy_nhypergeom.mean(M_val, n_val, r_val, loc=loc_val) if mode == "var": return scipy_nhypergeom.var(M_val, n_val, r_val, loc=loc_val) if mode == "std": return scipy_nhypergeom.std(M_val, n_val, r_val, loc=loc_val) if mode == "median": return scipy_nhypergeom.median(M_val, n_val, r_val, loc=loc_val) # PMF, CDF, SF, ICDF, ISF def compute(val): kval = process_k(val) if kval is None: return "Invalid input: k must be a number." if mode == "pmf": return float(scipy_nhypergeom.pmf(kval, M_val, n_val, r_val, loc=loc_val)) elif mode == "cdf": return float(scipy_nhypergeom.cdf(kval, M_val, n_val, r_val, loc=loc_val)) elif mode == "sf": return float(scipy_nhypergeom.sf(kval, M_val, n_val, r_val, loc=loc_val)) elif mode == "icdf": return float(scipy_nhypergeom.ppf(kval, M_val, n_val, r_val, loc=loc_val)) elif mode == "isf": return float(scipy_nhypergeom.isf(kval, M_val, n_val, r_val, loc=loc_val)) # 2D list or scalar if isinstance(k, list): # 2D list if not all(isinstance(row, list) for row in k): return "Invalid input: k must be a scalar or 2D list." result = [] for row in k: result_row = [] for val in row: out = compute(val) if isinstance(out, str): return out result_row.append(out) result.append(result_row) return result else: return compute(k)

Live Demo

Example Workbook

Link to Workbook

Last updated on