Probability Distributions

Overview

Probability distributions are mathematical functions that describe how likely each possible value of a random variable is. They are the building blocks of statistics, enabling us to model uncertainty, calculate risks, and simulate complex systems.

This library provides a broad suite of distributions from SciPy (scipy.stats), covering continuous, discrete, and multivariate cases.

Core Functions

For each distribution (e.g., NORM, BETA, POISSON_DIST), we typically provide four standard methods:

  1. PDF / PMF: Probability Density Function (continuous) or Probability Mass Function (discrete). For a continuous distribution this is the height of the density curve at x; for a discrete one it is the probability P(X = x).
  2. CDF: Cumulative Distribution Function. The probability P(X ≤ x), which for a continuous distribution is the area under the density curve from −∞ to x.
  3. PPF: Percent Point Function (inverse CDF). Given a probability p, finds x such that P(X ≤ x) = p. For a discrete distribution, it returns the smallest x with P(X ≤ x) ≥ p.
  4. RVS: Random Variates. Generates random numbers from the distribution.

Figure 1: Distribution Concepts. The PDF (blue) shows likelihood density. The CDF (red) shows cumulative probability. The PPF (not shown) is the inverse of the CDF.
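
As a rough illustration of the four methods, the sketch below calls them directly on scipy.stats objects rather than through this library's wrappers, whose exact signatures are not shown here. The evaluation point, seed, and variable names are arbitrary choices for the example.

```python
import numpy as np
from scipy import stats

# Standard normal distribution, frozen with loc (mean) and scale (std dev).
normal = stats.norm(loc=0.0, scale=1.0)

x = 1.96
density = normal.pdf(x)         # PDF: height of the density curve at x
cum_prob = normal.cdf(x)        # CDF: P(X <= x), roughly 0.975 here
quantile = normal.ppf(cum_prob) # PPF: inverse CDF, recovers x

# RVS: five random draws; a seeded generator keeps the run reproducible.
samples = normal.rvs(size=5, random_state=np.random.default_rng(0))

# Discrete distributions expose pmf instead of pdf.
poisson = stats.poisson(mu=3.0)
prob_three = poisson.pmf(3)     # P(X = 3) for a Poisson with mean 3
```

The round trip from cdf to ppf is a quick sanity check that the two methods are inverses of each other for a continuous distribution.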

Native Excel Capabilities

Excel supports a solid range of basic distributions (NORM.DIST, T.DIST, BINOM.DIST, POISSON.DIST, GAMMA.DIST).

However, it lacks:

  - Multivariate Distributions: No native support for Multivariate Normal, Dirichlet, or Wishart distributions.
  - Advanced Continuous: No built-in support for Von Mises, Laplace, Levy, or Generalized Extreme Value distributions.
  - Advanced Discrete: No support for Zero-Inflated models, Planck, or Skellam distributions.
  - Consistency: Excel function names and parameters can be inconsistent (e.g., BETA.DIST vs. the legacy BETADIST).

The Python functions provided here unify these under a consistent API (following SciPy conventions) and extend the coverage to over 100 distributions.
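
To give a feel for what a consistent API means in practice, here is a minimal sketch that uses scipy.stats directly (not this library's wrappers). It runs the same method calls over several distributions that Excel lacks. The parameter values and the dictionary names are arbitrary choices for illustration.

```python
from scipy import stats

# Frozen distributions; shape parameters are passed positionally.
distributions = {
    "laplace": stats.laplace(loc=0.0, scale=1.0),  # continuous
    "genextreme": stats.genextreme(0.1),           # continuous, shape c
    "skellam": stats.skellam(3.0, 2.0),            # discrete, mu1 and mu2
    "planck": stats.planck(0.5),                   # discrete, shape lambda_
}

summary = {}
for name, dist in distributions.items():
    median = dist.ppf(0.5)  # same call for continuous and discrete cases
    summary[name] = {
        "median": float(median),
        "cdf_at_median": float(dist.cdf(median)),
        "mean": float(dist.mean()),
    }
```

The loop never needs to know which family a distribution belongs to, which is the property the unified interface relies on.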

Continuous Distributions

Tool         Description
BETA         Wrapper for scipy.stats.beta distribution providing multiple statistical methods.
CAUCHY       Wrapper for scipy.stats.cauchy distribution providing multiple statistical methods.
CHISQ        Compute various statistics and functions for the chi-squared distribution from scipy.stats.chi2.
EXPON        Exponential distribution function wrapping scipy.stats.expon.
F_DIST       Unified interface to the main methods of the F-distribution, including PDF, CDF, inverse CDF, survival function, and distribution statistics.
LAPLACE      Laplace distribution function supporting multiple methods.
LOGNORM      Compute lognormal distribution statistics and evaluations.
NORM         Normal (Gaussian) distribution function supporting multiple methods.
PARETO       Generalized Pareto distribution function supporting multiple methods.
T_DIST       Student’s t distribution function supporting multiple methods from scipy.stats.t.
UNIFORM      Uniform distribution function supporting multiple methods.
WEIBULL_MIN  Compute various functions of the Weibull minimum distribution using scipy.stats.weibull_min.
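
Because these tools follow SciPy conventions, it may help to see how SciPy's loc and scale parameters map to the textbook parameterizations. This sketch uses scipy.stats directly, and the parameter values are arbitrary. Whether the wrappers expose loc and scale under the same names is an assumption, not something confirmed here.

```python
import math
from scipy import stats

# Exponential with rate lam: SciPy takes scale = 1 / lam, not the rate.
lam = 2.0
expon = stats.expon(scale=1.0 / lam)
expon_mean = expon.mean()            # 1 / lam = 0.5

# Lognormal where log(X) ~ Normal(mu, sigma):
# SciPy's shape s is sigma and its scale is exp(mu).
mu, sigma = 0.5, 0.25
lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))
lognorm_median = lognorm.median()    # exp(mu)

# Weibull with shape c and scale 10: survival at x = scale is exp(-1).
weibull = stats.weibull_min(c=1.5, scale=10.0)
survival_at_scale = weibull.sf(10.0)
```

Passing a rate where SciPy expects a scale is a common source of silent errors, so checking a known mean or median as above is a cheap safeguard.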

Discrete Distributions

Tool          Description
BERNOULLI     Calculates properties of a Bernoulli discrete random variable.
BETABINOM     Compute Beta-binomial distribution values from scipy.stats.betabinom.
BETANBINOM    Compute Beta-negative-binomial distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
BINOM         Compute Binomial distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
BOLTZMANN     Compute Boltzmann distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
DLAPLACE      Compute Discrete Laplace distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
GEOM          Compute Geometric distribution values using scipy.stats.geom.
HYPERGEOM     Compute Hypergeometric distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
LOGSER        Compute Log-Series distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
NBINOM        Compute Negative Binomial distribution values using scipy.stats.nbinom.
NHYPERGEOM    Compute Negative Hypergeometric distribution values using scipy.stats.nhypergeom.
PLANCK        Compute Planck distribution values using scipy.stats.planck.
POISSON_DIST  Compute Poisson distribution values using scipy.stats.poisson.
RANDINT       Compute Uniform discrete distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
SKELLAM       Compute Skellam distribution values using scipy.stats.skellam.
YULESIMON     Compute Yule-Simon distribution values using scipy.stats.yulesimon.
ZIPF          Compute Zipf distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
ZIPFIAN       Compute Zipfian distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
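
Several of the tools above accept a method name such as PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median. One way such a dispatch could work is sketched below for the binomial case. The function binom_value, its signature, and the two lookup tables are hypothetical and written for illustration only; they are not the library's actual implementation.

```python
from scipy import stats

# Hypothetical mapping from the documented method names to SciPy's names.
# Note that ICDF corresponds to SciPy's ppf and variance to var.
_POINT_METHODS = {"pmf": "pmf", "cdf": "cdf", "sf": "sf",
                  "icdf": "ppf", "isf": "isf"}
_SUMMARY_METHODS = {"mean": "mean", "variance": "var",
                    "std": "std", "median": "median"}


def binom_value(method, n, p, value=None):
    """Hypothetical helper: evaluate one method of a Binomial(n, p)."""
    dist = stats.binom(n, p)
    key = method.lower()
    if key in _POINT_METHODS:
        # These methods need a point k (pmf, cdf, sf) or a probability q.
        if value is None:
            raise ValueError(f"method {method!r} requires a value")
        return float(getattr(dist, _POINT_METHODS[key])(value))
    if key in _SUMMARY_METHODS:
        return float(getattr(dist, _SUMMARY_METHODS[key])())
    raise ValueError(f"unknown method {method!r}")


pmf_at_3 = binom_value("PMF", n=10, p=0.5, value=3)        # 120/1024
median_via_icdf = binom_value("ICDF", n=10, p=0.5, value=0.5)
variance = binom_value("variance", n=10, p=0.5)            # n*p*(1-p) = 2.5
```

Keeping the name mapping in plain dictionaries means the same helper shape could be reused for the other discrete distributions by swapping the scipy.stats object.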

Multivariate Distributions

Tool                 Description
DIRICHLET            Computes the PDF, log-PDF, mean, variance, covariance, entropy, or draws random samples from a Dirichlet distribution.
DIRICHLET_MULTINOM   Computes the probability mass function, log probability mass function, mean, variance, or covariance of the Dirichlet multinomial distribution.
MATRIX_NORMAL        Computes the PDF, log-PDF, or draws random samples from a matrix normal distribution.
MULTINOMIAL          Compute the probability mass function, log-PMF, entropy, covariance, or draw random samples from a multinomial distribution.
MULTIVARIATE_NORMAL  Computes the PDF, CDF, log-PDF, log-CDF, entropy, or draws random samples from a multivariate normal distribution.
MULTIVARIATE_T       Computes the PDF, CDF, or draws random samples from a multivariate t-distribution.
MV_HYPERGEOM         Computes probability mass function, log-PMF, mean, variance, covariance, or draws random samples from a multivariate hypergeometric distribution.
ORTHO_GROUP          Draws random samples of orthogonal matrices from the O(N) Haar distribution using scipy.stats.ortho_group.
RANDOM_CORRELATION   Generates a random correlation matrix with specified eigenvalues.
SPECIAL_ORTHO_GROUP  Draws random samples from the special orthogonal group SO(N), returning orthogonal matrices with determinant +1.
UNIFORM_DIRECTION    Draws random unit vectors uniformly distributed on the surface of a hypersphere in the specified dimension.
UNITARY_GROUP        Generate a random unitary matrix of dimension N from the Haar distribution.
VONMISES_FISHER      Computes the PDF, log-PDF, entropy, or draws random samples from a von Mises-Fisher distribution on the unit hypersphere.
WISHART              Computes the PDF, log-PDF, or draws random samples from the Wishart distribution using scipy.stats.wishart.
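
As a small illustration of the multivariate case, the sketch below uses scipy.stats directly for three of the distributions listed above. The mean vector, covariance matrix, concentration parameters, and seed are arbitrary example values, and the wrappers' own argument names may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Multivariate normal: density at the mean and a few random draws.
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])
mvn = stats.multivariate_normal(mean=mean, cov=cov)
density_at_mean = mvn.pdf(mean)             # 1 / (2*pi*sqrt(det(cov)))
draws = mvn.rvs(size=4, random_state=rng)   # array of shape (4, 2)

# Dirichlet: the mean is alpha normalized to sum to one.
alpha = np.array([2.0, 3.0, 5.0])
dirichlet_mean = stats.dirichlet(alpha).mean()

# Random orthogonal matrix from the O(3) Haar distribution.
q = stats.ortho_group.rvs(dim=3, random_state=rng)
is_orthogonal = np.allclose(q @ q.T, np.eye(3))
```

Unlike the univariate tools, these distributions take vector or matrix parameters, so inputs in a spreadsheet setting would typically be passed as ranges rather than single cells.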