DIRICHLET

Overview

The DIRICHLET function computes properties of the Dirichlet distribution, a multivariate probability distribution that generalizes the Beta distribution to multiple dimensions. It is commonly used in Bayesian statistics as a conjugate prior for the multinomial distribution and plays a central role in topic modeling algorithms like Latent Dirichlet Allocation (LDA).

The Dirichlet distribution is defined over the probability simplex, meaning all sample points \mathbf{x} = (x_1, \ldots, x_K) must satisfy \sum_{i=1}^{K} x_i = 1 with 0 < x_i < 1. The distribution is parameterized by a vector of concentration parameters \boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_K), where each \alpha_i > 0. These parameters control the shape of the distribution: a larger \alpha_i relative to the others shifts probability mass toward component i, a larger total \sum_{i=1}^{K} \alpha_i concentrates the distribution around its mean, and values less than 1 push mass toward the corners and edges of the simplex.
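To make the simplex constraint and the effect of the concentration parameters concrete, the sketch below draws points by normalizing independent Gamma draws, a standard textbook construction. The helper name sample_dirichlet is hypothetical, and this is not claimed to be how the DIRICHLET function or SciPy draws its samples.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one point from Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws.

    Hypothetical helper for illustration only.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

rng = random.Random(0)  # fixed seed so repeated runs print the same points
for alpha in ([0.2, 0.2, 0.2], [1.0, 1.0, 1.0], [20.0, 20.0, 20.0]):
    point = sample_dirichlet(alpha, rng)
    # Every point lies on the simplex: components in [0, 1] that sum to 1.
    print(alpha, [round(v, 3) for v in point], "sum =", round(sum(point), 6))
```

With small concentrations a draw tends to put most of its weight on one component, while large equal concentrations tend to give draws close to (1/3, 1/3, 1/3).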

The probability density function is given by:

f(\mathbf{x}; \boldsymbol{\alpha}) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{i=1}^{K} x_i^{\alpha_i - 1}

where B(\boldsymbol{\alpha}) is the multivariate Beta function:

B(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^{K} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}

and \Gamma denotes the Gamma function.
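As a check on the formula, the density can be evaluated directly with the standard library. This is a minimal sketch that works in log space through math.lgamma; the function names are hypothetical, and it assumes every x_i is strictly positive.

```python
import math

def log_multivariate_beta(alpha):
    """log B(alpha) = sum_i log Gamma(alpha_i) - log Gamma(sum_i alpha_i)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def dirichlet_pdf(x, alpha):
    """Density f(x; alpha) from the formula above (hypothetical helper).

    Assumes x lies on the simplex with every x_i > 0.
    """
    log_kernel = sum((a - 1.0) * math.log(xi) for xi, a in zip(x, alpha))
    return math.exp(log_kernel - log_multivariate_beta(alpha))

# Same inputs as Example 1: x = (0.2, 0.8), alpha = (1, 2).
print(dirichlet_pdf([0.2, 0.8], [1.0, 2.0]))
```

For these inputs B(\boldsymbol{\alpha}) = \Gamma(1)\Gamma(2)/\Gamma(3) = 1/2, so the density is 2 \cdot 0.8 = 1.6 up to floating-point rounding, which agrees with Example 1. Its natural logarithm, about 0.47, agrees with Example 2.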

This implementation uses SciPy's dirichlet distribution object from the scipy.stats module. The function supports multiple methods: pdf and logpdf for density evaluation at the points in x, mean and var for the component-wise mean and variance, cov for the full covariance matrix, entropy for differential entropy, and rvs for drawing a single random sample.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=DIRICHLET(x, alpha, dirichlet_method)
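  • x (list[list], optional, default: null): Points at which to evaluate the distribution, one point per row. Each row needs one value per alpha entry, with values between 0 and 1 that sum to 1. A row that violates these constraints yields an empty string in its result cell. Required for the 'pdf' and 'logpdf' methods and ignored by the others.
  • alpha (list[list], default: null): Concentration parameters as a column vector, one row per component. All values must be positive. Required for every method; leaving it empty returns an error message.
  • dirichlet_method (str, optional, default: "pdf"): Method to compute. Options are 'pdf', 'logpdf', 'mean', 'var', 'cov', 'entropy', 'rvs'.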

Returns (list[list]): 2D list of results, or error message string.
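The argument shapes can be easy to mix up, so here is a small sketch of how the Excel array constants used in the examples correspond to Python 2D lists. The helper flatten_column is hypothetical and only mirrors the column-vector check described for alpha.

```python
# Excel array constants and the 2D lists they correspond to (Example 1 values).
x = [[0.2, 0.8]]        # {0.2,0.8}: one row, i.e. one point on the simplex
alpha = [[1.0], [2.0]]  # {1;2}: one column, one concentration per component

def flatten_column(column):
    """Hypothetical helper: turn [[a1], [a2], ...] into [a1, a2, ...]."""
    if not all(isinstance(row, list) and len(row) == 1 for row in column):
        raise ValueError("expected a column vector with one value per row")
    return [float(row[0]) for row in column]

alpha_vec = flatten_column(alpha)
# Every row of x needs one entry per concentration parameter.
assert all(len(row) == len(alpha_vec) for row in x)
print(alpha_vec, len(x), "point(s)")
```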

Examples

Example 1: PDF calculation for 2D Dirichlet

Inputs:

x           alpha    dirichlet_method
0.2  0.8    1        pdf
            2

Excel formula:

=DIRICHLET({0.2,0.8}, {1;2}, "pdf")

Expected output:

Result
1.6

Example 2: Log-PDF calculation for 2D Dirichlet

Inputs:

x           alpha    dirichlet_method
0.2  0.8    1        logpdf
            2

Excel formula:

=DIRICHLET({0.2,0.8}, {1;2}, "logpdf")

Expected output:

Result
0.47

Example 3: Mean of 2D Dirichlet distribution

Inputs:

alpha    dirichlet_method
1        mean
2

Excel formula:

=DIRICHLET(, {1;2}, "mean")

Expected output:

Result
0.3333
0.6667

Example 4: Covariance matrix of 2D Dirichlet

Inputs:

alpha    dirichlet_method
1        cov
2

Excel formula:

=DIRICHLET(, {1;2}, "cov")

Expected output:

Result
0.05556 -0.05556
-0.05556 0.05556
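The values in Examples 3 and 4 can be reproduced from the standard closed-form moments of the Dirichlet distribution. Writing \alpha_0 = \sum_{i=1}^{K} \alpha_i, the mean is \alpha_i / \alpha_0, the variance is \alpha_i (\alpha_0 - \alpha_i) / (\alpha_0^2 (\alpha_0 + 1)), and the covariance for i \neq j is -\alpha_i \alpha_j / (\alpha_0^2 (\alpha_0 + 1)). The sketch below uses a hypothetical helper, dirichlet_moments, and does not call SciPy.

```python
def dirichlet_moments(alpha):
    """Return (mean, covariance) of Dirichlet(alpha) from the closed forms.

    Hypothetical helper for checking the example values by hand.
    """
    a0 = sum(alpha)  # total concentration
    mean = [a / a0 for a in alpha]
    scale = a0 * a0 * (a0 + 1.0)
    cov = [
        [(a_i * (a0 - a_i) if i == j else -a_i * a_j) / scale
         for j, a_j in enumerate(alpha)]
        for i, a_i in enumerate(alpha)
    ]
    return mean, cov

mean, cov = dirichlet_moments([1.0, 2.0])
print([round(m, 4) for m in mean])
for row in cov:
    print([round(c, 5) for c in row])
```

For alpha = (1, 2) this gives a mean of (1/3, 2/3) and covariance entries of 2/36, matching the rounded values shown above. Each row of the covariance matrix sums to zero because the components are constrained to sum to 1.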

Python Code

from scipy.stats import dirichlet as scipy_dirichlet

def dirichlet(x=None, alpha=None, dirichlet_method='pdf'):
    """
    Computes the PDF, log-PDF, mean, variance, covariance, entropy, or draws random samples from a Dirichlet distribution.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.dirichlet.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list], optional): Points at which to evaluate the distribution, one point per row. Each row needs one value per alpha entry, with values in [0, 1] that sum to 1. Required for 'pdf' and 'logpdf'; ignored by the other methods. Default is None.
        alpha (list[list]): Concentration parameters as a column vector, one row per component. All values must be positive. Required for every method; passing None returns an error message.
        dirichlet_method (str, optional): Method to compute. One of 'pdf', 'logpdf', 'mean', 'var', 'cov', 'entropy', or 'rvs' (a single random sample). Default is 'pdf'.

    Returns:
        list[list]: 2D list of results, or error message string.
    """
    def to2d(val):
        return [[val]] if not isinstance(val, list) else val

    # Normalize alpha to 2D
    if alpha is None:
        return "Invalid input: alpha is required."
    alpha = to2d(alpha)
    if not isinstance(alpha, list) or len(alpha) < 1 or not all(isinstance(row, list) and len(row) == 1 for row in alpha):
        return "Invalid input: alpha must be a 2D list (column vector) with at least one row."
    try:
        alpha_vec = [float(row[0]) for row in alpha]
    except (TypeError, ValueError):
        return "Invalid input: alpha must contain numeric values."
    if any(a <= 0 for a in alpha_vec):
        return "Invalid input: all alpha values must be positive."
    d = len(alpha_vec)
    # Validate method
    valid_methods = {'pdf', 'logpdf', 'mean', 'var', 'cov', 'entropy', 'rvs'}
    if dirichlet_method not in valid_methods:
        return f"Invalid method: {dirichlet_method}. Must be one of {sorted(valid_methods)}."
    dist = scipy_dirichlet(alpha_vec)
    # PDF and logPDF require x
    if dirichlet_method in {'pdf', 'logpdf'}:
        if x is None:
            return f"Invalid input: x is required for '{dirichlet_method}' method."
        x = to2d(x)
        if not isinstance(x, list) or len(x) < 1 or not all(isinstance(row, list) and len(row) == d for row in x):
            return f"Invalid input: x must be a 2D list with each row of length {d}."
        results = []
        for row in x:
            try:
                vals = [float(v) for v in row]
                if any(v < 0 or v > 1 for v in vals) or abs(sum(vals) - 1.0) > 1e-8:
                    results.append("")
                    continue
                if dirichlet_method == 'pdf':
                    res = dist.pdf(vals)
                else:
                    res = dist.logpdf(vals)
                # Disallow nan/inf
                if res is None or not isinstance(res, (int, float)) or res != res or abs(res) == float('inf'):
                    results.append("")
                else:
                    results.append(float(res))
            except Exception:
                results.append("")
        # Always return 2D list of shape (n, 1)
        return [[r] for r in results]
    # Mean, var, cov, entropy
    if dirichlet_method == 'mean':
        res = dist.mean()
        return [[float(v)] for v in res]
    if dirichlet_method == 'var':
        res = dist.var()
        return [[float(v)] for v in res]
    if dirichlet_method == 'cov':
        cov = dist.cov()
        # Return as 2D list
        return [[float(cov[i, j]) for j in range(d)] for i in range(d)]
    if dirichlet_method == 'entropy':
        res = dist.entropy()
        return [[float(res)]]
    if dirichlet_method == 'rvs':
        # Draw one sample
        try:
            sample = dist.rvs(size=1)
            # sample is shape (1, d)
            return [[float(v)] for v in sample[0]]
        except Exception as e:
            return f"scipy.dirichlet rvs error: {e}"
    return "Unknown error."
