DIRICHLET
Overview
The DIRICHLET function computes properties of, and draws samples from, the Dirichlet distribution, a multivariate generalization of the Beta distribution. The Dirichlet is commonly used in Bayesian statistics and machine learning to model proportions that sum to 1, such as probabilities across multiple categories. The probability density function (PDF) of a $K$-dimensional Dirichlet is:

$$f(x; \alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$$

where $B(\alpha) = \frac{\prod_{i=1}^{K} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}$ is the multivariate Beta function, $x_i \ge 0$, $\sum_{i=1}^{K} x_i = 1$, and $\alpha_i > 0$.
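The Dirichlet density, $\prod_i x_i^{\alpha_i - 1} / B(\alpha)$, can be evaluated directly from its definition. The following is a minimal sketch using only the standard library; the helper name `dirichlet_pdf` is hypothetical and is not part of this wrapper or of SciPy. It works in log space with `math.lgamma` and assumes $x_i > 0$ wherever $\alpha_i \ne 1$.

```python
import math
from typing import Sequence


def dirichlet_pdf(x: Sequence[float], alpha: Sequence[float]) -> float:
    """Evaluate the Dirichlet density at x (hypothetical helper, for illustration)."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    # log B(alpha) = sum(lgamma(alpha_i)) - lgamma(sum(alpha_i))
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    # log of prod x_i^(alpha_i - 1); terms with alpha_i == 1 contribute 0 and are skipped
    log_kernel = sum((a - 1.0) * math.log(xi) for xi, a in zip(x, alpha) if a != 1.0)
    return math.exp(log_kernel - log_beta)


density = dirichlet_pdf([0.2, 0.8], [1.0, 2.0])
print(round(density, 3))  # 1.6
```

Working with log-gamma values avoids overflow in the Gamma function when the concentration parameters are large.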
Supported methods include PDF, log-PDF, mean, variance, covariance, entropy, and random sampling. For more details, see the scipy.stats.dirichlet documentation.
This wrapper simplifies the function to require only the most common parameters (x, alpha, and method) and excludes the random seed and sample size options. This example function is provided as-is without any representation of accuracy.
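When a fixed seed or more than one draw is needed, SciPy can be called directly. The sketch below assumes `scipy` is installed and uses the `size` and `random_state` arguments of `scipy.stats.dirichlet`; the seed value 42 is arbitrary.

```python
from scipy.stats import dirichlet

alpha = [1.0, 2.0]

# size and random_state are the options this wrapper leaves out.
samples = dirichlet(alpha).rvs(size=3, random_state=42)

print(samples.shape)        # (3, 2): one row per draw, one column per category
print(samples.sum(axis=1))  # each row sums to 1 (up to floating-point error)
```

Passing the same `random_state` again reproduces the same draws, which is useful for tests.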
Usage
To use the function in Excel:
=DIRICHLET(x, alpha, [method])
x (2D list, required for ‘pdf’ and ‘logpdf’): Table of $K$ columns; each row is a point at which to evaluate the function. Each row must sum to 1 and contain only non-negative values.
alpha (2D column vector, required): Table with one column and $K$ rows, specifying the concentration parameters. All values must be positive.
method (string, optional, default=pdf): Which property to compute. One of pdf, logpdf, mean, var, cov, entropy, rvs.
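The constraints on x can be checked before calling the function. This is a minimal sketch; `is_valid_point` is a hypothetical helper, and the tolerance of 1e-8 is an assumption chosen to match the check in the Python code on this page.

```python
from typing import Sequence


def is_valid_point(row: Sequence[float], k: int, tol: float = 1e-8) -> bool:
    """Return True if row has k entries in [0, 1] that sum to 1 within tol.

    Hypothetical helper for illustration; tol is an assumed tolerance.
    """
    if len(row) != k:
        return False
    if any(v < 0.0 or v > 1.0 for v in row):
        return False
    return abs(sum(row) - 1.0) <= tol


alpha = [[1.0], [2.0]]  # column vector with K = 2 rows
k = len(alpha)

print(is_valid_point([0.2, 0.8], k))   # True
print(is_valid_point([0.5, 0.6], k))   # False: sums to 1.1
print(is_valid_point([1.2, -0.2], k))  # False: entries outside [0, 1]
```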
The function returns a 2D list of results for each input point, or an error message (string) if the input is invalid. For pdf and logpdf, each input row returns a single value. For mean, var, and rvs, the result is a column vector. For cov, it is a matrix. For entropy, it is a single value.
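The return shapes can be illustrated by calling SciPy directly and wrapping the results as 2D lists. This is a sketch that assumes `scipy` is installed; the variable names are illustrative only.

```python
from scipy.stats import dirichlet

dist = dirichlet([1.0, 2.0])

# mean and var are length-K arrays; turn each into a K x 1 column.
mean_col = [[float(v)] for v in dist.mean()]
var_col = [[float(v)] for v in dist.var()]

# entropy is a scalar; wrap it as a 1 x 1 table.
entropy_cell = [[float(dist.entropy())]]

print(mean_col)
print(var_col)
print(entropy_cell)
```

Returning every result as a 2D list keeps the output compatible with a rectangular range of spreadsheet cells.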
Examples
Example 1: PDF Calculation
Inputs:
x | | alpha | |
---|---|---|---|
0.2 | 0.8 | 1.0 | 2.0 |
Excel formula:
=DIRICHLET({0.2,0.8}, {1.0;2.0})
Expected output:
Result |
---|
1.600 |
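As a check on this value, substituting $x = (0.2, 0.8)$ and $\alpha = (1, 2)$ into the Dirichlet density $\prod_i x_i^{\alpha_i - 1} / B(\alpha)$ gives:

$$f(x; \alpha) = \frac{\Gamma(1 + 2)}{\Gamma(1)\,\Gamma(2)} \cdot 0.2^{\,1 - 1} \cdot 0.8^{\,2 - 1} = \frac{2}{1 \cdot 1} \cdot 1 \cdot 0.8 = 1.6$$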
Example 2: Log-PDF Calculation
Inputs:
x | | alpha | |
---|---|---|---|
0.2 | 0.8 | 1.0 | 2.0 |
Excel formula:
=DIRICHLET({0.2,0.8}, {1.0;2.0}, "logpdf")
Expected output:
Result |
---|
0.470 |
Example 3: Mean Calculation
Inputs:
alpha | |
---|---|
1.0 | 2.0 |
Excel formula:
=DIRICHLET(, {1.0;2.0}, "mean")
Expected output:
Result |
---|
0.333 |
0.667 |
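These values follow from the standard formula for the Dirichlet mean, with $\alpha_0 = \sum_i \alpha_i = 1 + 2 = 3$:

$$\mathrm{E}[X_i] = \frac{\alpha_i}{\alpha_0}, \qquad \mathrm{E}[X] = \left(\tfrac{1}{3}, \tfrac{2}{3}\right) \approx (0.333,\ 0.667)$$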
Example 4: Covariance Matrix
Inputs:
alpha | |
---|---|
1.0 | 2.0 |
Excel formula:
=DIRICHLET(, {1.0;2.0}, "cov")
Expected output:
Result | |
---|---|
0.056 | -0.056 |
-0.056 | 0.056 |
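The matrix can be reproduced from the closed-form Dirichlet covariance, where the diagonal entries are $\alpha_i(\alpha_0 - \alpha_i) / (\alpha_0^2(\alpha_0 + 1))$ and the off-diagonal entries are $-\alpha_i\alpha_j / (\alpha_0^2(\alpha_0 + 1))$. The sketch below uses only the standard library; `dirichlet_cov` is a hypothetical helper, not the function on this page.

```python
from typing import List, Sequence


def dirichlet_cov(alpha: Sequence[float]) -> List[List[float]]:
    """Closed-form Dirichlet covariance matrix (hypothetical helper, for illustration)."""
    a0 = sum(alpha)
    denom = a0 * a0 * (a0 + 1.0)
    k = len(alpha)
    # Diagonal: alpha_i * (a0 - alpha_i) / denom; off-diagonal: -alpha_i * alpha_j / denom
    return [
        [
            (alpha[i] * (a0 - alpha[i]) if i == j else -alpha[i] * alpha[j]) / denom
            for j in range(k)
        ]
        for i in range(k)
    ]


cov = dirichlet_cov([1.0, 2.0])
for row in cov:
    print([round(v, 3) for v in row])
# [0.056, -0.056]
# [-0.056, 0.056]
```

Each row sums to zero because the components are constrained to sum to 1, so an increase in one component must be offset by the others.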
Python Code
from scipy.stats import dirichlet as scipy_dirichlet
from typing import List, Optional, Union


def dirichlet(x: Optional[List[List[float]]] = None, alpha: Optional[List[List[float]]] = None, method: str = 'pdf') -> Union[List[List[Optional[float]]], str]:
    """
    Computes the PDF, log-PDF, mean, variance, covariance, entropy, or draws random samples from a Dirichlet distribution.

    Args:
        x: 2D list of float values. Points at which to evaluate the function (required for 'pdf' and 'logpdf' methods).
        alpha: 2D list of float values (column vector). Concentration parameters of the distribution. Required.
        method: Which method to compute (str): 'pdf', 'logpdf', 'mean', 'var', 'cov', 'entropy', 'rvs'. Default is 'pdf'.

    Returns:
        2D list of results for each input point, or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Validate alpha: must be a non-empty column vector of positive numbers
    if not isinstance(alpha, list) or len(alpha) < 1 or not all(isinstance(row, list) and len(row) == 1 for row in alpha):
        return "Invalid input: alpha must be a 2D list (column vector) with at least one row."
    try:
        alpha_vec = [float(row[0]) for row in alpha]
    except Exception:
        return "Invalid input: alpha must contain numeric values."
    if any(a <= 0 for a in alpha_vec):
        return "Invalid input: all alpha values must be positive."
    d = len(alpha_vec)

    # Validate method
    valid_methods = {'pdf', 'logpdf', 'mean', 'var', 'cov', 'entropy', 'rvs'}
    if method not in valid_methods:
        return f"Invalid method: {method}. Must be one of {sorted(valid_methods)}."

    dist = scipy_dirichlet(alpha_vec)

    # PDF and logPDF require x
    if method in {'pdf', 'logpdf'}:
        if x is None or not isinstance(x, list) or len(x) < 1 or not all(isinstance(row, list) and len(row) == d for row in x):
            return f"Invalid input: x must be a 2D list with each row of length {d}."
        results = []
        for row in x:
            try:
                vals = [float(v) for v in row]
                # Rows outside the simplex yield None rather than an error
                if any(v < 0 or v > 1 for v in vals) or abs(sum(vals) - 1.0) > 1e-8:
                    results.append(None)
                    continue
                if method == 'pdf':
                    res = dist.pdf(vals)
                else:
                    res = dist.logpdf(vals)
                # Disallow nan/inf
                if res is None or not isinstance(res, (int, float)) or res != res or abs(res) == float('inf'):
                    results.append(None)
                else:
                    results.append(float(res))
            except Exception:
                results.append(None)
        # Always return 2D list of shape (n, 1)
        return [[r] for r in results]

    # Mean, var, cov, entropy
    if method == 'mean':
        res = dist.mean()
        return [[float(v)] for v in res]
    if method == 'var':
        res = dist.var()
        return [[float(v)] for v in res]
    if method == 'cov':
        cov = dist.cov()
        # Return as 2D list of shape (d, d)
        return [[float(cov[i, j]) for j in range(d)] for i in range(d)]
    if method == 'entropy':
        res = dist.entropy()
        return [[float(res)]]
    if method == 'rvs':
        # Draw one sample
        try:
            sample = dist.rvs(size=1)
            # sample is shape (1, d); return it as a column vector
            return [[float(v)] for v in sample[0]]
        except Exception as e:
            return f"scipy.dirichlet rvs error: {e}"
    return "Unknown error."