MULTINOMIAL

Overview

The MULTINOMIAL function computes probabilities and statistics for the multinomial distribution, a generalization of the binomial distribution to scenarios with more than two possible outcomes. This distribution models experiments where each of n independent trials results in exactly one of k mutually exclusive categories, each with a fixed probability.
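The experiment described above can be simulated directly: run n independent trials, let each one land in a single category, and tally the results. The sketch below uses only the Python standard library; the helper name sample_multinomial is hypothetical and is not part of the MULTINOMIAL function or of SciPy.

```python
import random
from collections import Counter

def sample_multinomial(n, probs, rng):
    """Draw one multinomial outcome by simulating n independent categorical trials.

    Hypothetical helper for illustration only.
    """
    # Each trial picks exactly one category index, weighted by its probability.
    outcomes = rng.choices(range(len(probs)), weights=probs, k=n)
    tally = Counter(outcomes)
    # Report a count for every category, including those never observed.
    return [tally[i] for i in range(len(probs))]

rng = random.Random(0)  # fixed seed so the draw is reproducible
counts = sample_multinomial(5, [0.3, 0.2, 0.5], rng)
print(counts, sum(counts))  # the counts always sum to n = 5
```

The resulting list has one entry per category and always sums to n, which is the same constraint each row of x must satisfy.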

The multinomial distribution is fundamental in statistical applications including categorical data analysis, text classification (bag-of-words models), genetics (allele frequency modeling), and any scenario involving counts across multiple categories. For detailed documentation, see the SciPy multinomial reference and the SciPy GitHub repository.

The probability mass function (PMF) gives the probability of observing a specific combination of counts (x_1, x_2, \ldots, x_k) across k categories:

f(x_1, \ldots, x_k) = \frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}

where n is the total number of trials, p_i is the probability of category i (with \sum_{i=1}^{k} p_i = 1), and each x_i is a nonnegative integer with \sum_{i=1}^{k} x_i = n. The multinomial coefficient \frac{n!}{x_1! \cdots x_k!} counts the number of distinct orderings of the n trials that produce these counts.
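As a check on the formula, the PMF can be evaluated by hand for the counts (2, 1, 2) with probabilities (0.3, 0.2, 0.5). This is a minimal sketch using only the standard library; multinomial_pmf is a hypothetical helper name, not the function documented on this page.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Evaluate the multinomial PMF directly from its closed form."""
    n = sum(counts)
    # Multinomial coefficient n! / (x_1! * ... * x_k!), kept exact with integers.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    # Product of p_i ** x_i over all categories.
    weight = prod(p ** x for p, x in zip(probs, counts))
    return coeff * weight

value = multinomial_pmf([2, 1, 2], [0.3, 0.2, 0.5])
print(round(value, 6))  # 0.135
```

Here the coefficient is 5! / (2! 1! 2!) = 30 and the probability product is 0.09 × 0.2 × 0.25 = 0.0045, giving 30 × 0.0045 = 0.135.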

This function supports multiple calculation methods: pmf returns the probability mass function value, logpmf returns the natural logarithm of the PMF (useful for numerical stability with small probabilities), entropy computes the Shannon entropy of the distribution, cov returns the covariance matrix between category counts, and rvs generates random samples from the distribution.
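To illustrate why a log-PMF helps with small probabilities, the sketch below evaluates the logarithm of the PMF with log-gamma terms and shows a case where exponentiating the result underflows to zero. It uses only the standard library, the helper name multinomial_logpmf is hypothetical, and it assumes p_i > 0 for every category with x_i > 0.

```python
from math import exp, lgamma, log

def multinomial_logpmf(counts, probs):
    """Natural log of the multinomial PMF, computed without forming the PMF itself.

    Hypothetical helper; assumes p > 0 wherever the matching count is positive.
    """
    n = sum(counts)
    # log(n!) = lgamma(n + 1), so the coefficient never overflows.
    log_coeff = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Categories with a zero count contribute nothing and are skipped.
    log_weight = sum(x * log(p) for x, p in zip(counts, probs) if x > 0)
    return log_coeff + log_weight

small = multinomial_logpmf([2, 1, 2], [0.3, 0.2, 0.5])
print(round(small, 3))  # -2.002

# With 2000 trials all landing in the first category, the PMF is 0.3 ** 2000.
tail = multinomial_logpmf([2000, 0, 0], [0.3, 0.2, 0.5])
print(tail)       # a finite log-probability near -2408
print(exp(tail))  # 0.0: the probability itself underflows in double precision
```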

When k = 2, the multinomial distribution reduces to the binomial distribution. For sampling without replacement from finite populations, see the related multivariate hypergeometric distribution.
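The reduction to the binomial case can be checked numerically. The following sketch writes out the k = 2 PMF and compares it with the binomial PMF for every possible success count. Both helper names are hypothetical, and the values n = 10 and p = 0.3 are arbitrary choices for illustration.

```python
from math import comb, factorial

def two_category_pmf(x1, x2, p1, p2):
    """Multinomial PMF written out for k = 2 categories (hypothetical helper)."""
    n = x1 + x2
    return factorial(n) // (factorial(x1) * factorial(x2)) * p1**x1 * p2**x2

def binomial_pmf(successes, n, p):
    """Binomial PMF: C(n, s) * p**s * (1 - p)**(n - s)."""
    return comb(n, successes) * p**successes * (1 - p) ** (n - successes)

n, p = 10, 0.3
# Largest disagreement between the two formulas over all success counts.
max_gap = max(
    abs(two_category_pmf(s, n - s, p, 1 - p) - binomial_pmf(s, n, p))
    for s in range(n + 1)
)
print(max_gap < 1e-12)  # True
```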

This example function is provided as-is without any representation of accuracy.

Excel Usage

=MULTINOMIAL(x, n, p, multinomial_method, size)
  • x (list[list], optional, default: null): 2D array of nonnegative integers representing counts for each category. Each row must sum to n. Required for ‘pmf’ and ‘logpmf’ methods.
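  • n (int, optional, default: null): Total number of trials (nonnegative integer). Although the default is null, a value is required for every method.
  • p (list[list], optional, default: null): 2D array of probabilities for each category. Each value must lie between 0 and 1, and all values must sum to 1. Although the default is null, a value is required for every method.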
  • multinomial_method (str, optional, default: “pmf”): The calculation method to use. Valid options: ‘pmf’, ‘logpmf’, ‘entropy’, ‘cov’, ‘rvs’.
  • size (int, optional, default: null): Number of random samples to draw when using ‘rvs’ method.

Returns (list[list]): 2D list of results, or error message string.

Examples

Example 1: PMF for multinomial outcome

Inputs:

x          n    p                multinomial_method
{2,1,2}    5    {0.3,0.2,0.5}    pmf

Excel formula:

=MULTINOMIAL({2,1,2}, 5, {0.3,0.2,0.5}, "pmf")

Expected output:

Result
0.135

Example 2: Log-PMF for multinomial outcome

Inputs:

x          n    p                multinomial_method
{2,1,2}    5    {0.3,0.2,0.5}    logpmf

Excel formula:

=MULTINOMIAL({2,1,2}, 5, {0.3,0.2,0.5}, "logpmf")

Expected output:

Result
-2.002

Example 3: Entropy of multinomial distribution

Inputs:

n    p                multinomial_method
5    {0.3,0.2,0.5}    entropy

Excel formula:

=MULTINOMIAL(, 5, {0.3,0.2,0.5}, "entropy")

Expected output:

Result
2.592
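For a small case like this one, the entropy can be cross-checked by brute force: enumerate every count vector that sums to n and accumulate -f ln f. The sketch below uses only the standard library and natural logarithms (so the result is in nats). The helper names are hypothetical, and this enumeration is only an illustration, not necessarily how SciPy computes the value.

```python
from math import factorial, log, prod

def compositions(n, k):
    """Yield every tuple of k nonnegative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def multinomial_pmf(counts, probs):
    """Closed-form multinomial PMF (hypothetical helper)."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))

def multinomial_entropy(n, probs):
    """Shannon entropy in nats, summed over all possible outcomes."""
    total = 0.0
    for counts in compositions(n, len(probs)):
        mass = multinomial_pmf(counts, probs)
        if mass > 0.0:  # outcomes with zero probability contribute nothing
            total -= mass * log(mass)
    return total

entropy = multinomial_entropy(5, [0.3, 0.2, 0.5])
print(round(entropy, 3))  # 2.592
```

With n = 5 and three categories there are only 21 outcomes to visit. The number of outcomes grows quickly with n and k, so this approach is only practical for small problems.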

Example 4: Covariance matrix of multinomial distribution

Inputs:

n    p                multinomial_method
5    {0.3,0.2,0.5}    cov

Excel formula:

=MULTINOMIAL(, 5, {0.3,0.2,0.5}, "cov")

Expected output:

Result
1.05 -0.3 -0.75
-0.3 0.8 -0.5
-0.75 -0.5 1.25
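The matrix above agrees with the standard closed form for multinomial covariances: Var(X_i) = n p_i (1 - p_i) on the diagonal and Cov(X_i, X_j) = -n p_i p_j off the diagonal. A minimal sketch that rebuilds it with plain Python lists follows; multinomial_cov is a hypothetical helper name.

```python
def multinomial_cov(n, probs):
    """Covariance matrix of multinomial counts: n * p_i * (delta_ij - p_j)."""
    k = len(probs)
    return [
        [n * probs[i] * ((1.0 if i == j else 0.0) - probs[j]) for j in range(k)]
        for i in range(k)
    ]

cov = multinomial_cov(5, [0.3, 0.2, 0.5])
for row in cov:
    print([round(v, 2) for v in row])
# [1.05, -0.3, -0.75]
# [-0.3, 0.8, -0.5]
# [-0.75, -0.5, 1.25]
```

Each row sums to zero because the counts are constrained to add up to n: an increase in one category must be offset by decreases in the others.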

Python Code

from scipy.stats import multinomial as scipy_multinomial

def multinomial(x=None, n=None, p=None, multinomial_method='pmf', size=None):
    """
    Compute the probability mass function, log-PMF, entropy, covariance, or draw random samples from a multinomial distribution.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multinomial.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list], optional): 2D array of nonnegative integers representing counts for each category. Each row must sum to n. Required for 'pmf' and 'logpmf' methods. Default is None.
        n (int, optional): Total number of trials (nonnegative integer). Default is None.
        p (list[list], optional): 2D array of probabilities for each category. All values must sum to 1. Default is None.
        multinomial_method (str, optional): The calculation method to use. Valid options: 'pmf', 'logpmf', 'entropy', 'cov', 'rvs'. Default is 'pmf'.
        size (int, optional): Number of random samples to draw when using 'rvs' method. Default is None.

    Returns:
        list[list]: 2D list of results, or error message string.
    """
    def to2d(val):
        return [[val]] if not isinstance(val, list) else val

    def is_nonneg_int(val):
        if isinstance(val, bool):
            return False
        if isinstance(val, int):
            return val >= 0
        if isinstance(val, float):
            # is_integer() returns False for NaN and infinity, whereas int(val) would raise
            return val >= 0 and val.is_integer()
        return False

    # Validate n
    if n is None or not is_nonneg_int(n):
        return "Invalid input: n must be a nonnegative integer."
    n = int(n)

    # Validate p
    if p is None:
        return "Invalid input: p must be a 2D list of probabilities."
    p = to2d(p)
    if not isinstance(p, list) or len(p) == 0:
        return "Invalid input: p must be a 2D list of probabilities."
    if not all(isinstance(row, list) and len(row) > 0 for row in p):
        return "Invalid input: p must be a 2D list of probabilities."
    # Flatten all rows into a single probability vector
    probs = []
    for row in p:
        for val in row:
            # The chained comparison is False for NaN, so NaN is rejected here as well
            if not isinstance(val, (int, float)) or not (0 <= val <= 1):
                return "Invalid input: probabilities in p must be between 0 and 1."
            probs.append(float(val))
    if abs(sum(probs) - 1.0) > 1e-8:
        return "Invalid input: probabilities in p must sum to 1."

    # Validate multinomial_method
    valid_methods = {'pmf', 'logpmf', 'entropy', 'cov', 'rvs'}
    if multinomial_method not in valid_methods:
        return f"Invalid input: multinomial_method must be one of {sorted(valid_methods)}."

    # Validate x for pmf/logpmf
    if multinomial_method in ['pmf', 'logpmf']:
        if x is None:
            return "Invalid input: x must be a 2D list of nonnegative integers for pmf/logpmf."
        x = to2d(x)
        if not isinstance(x, list) or len(x) == 0:
            return "Invalid input: x must be a 2D list of nonnegative integers for pmf/logpmf."
        if not all(isinstance(row, list) and len(row) == len(probs) for row in x):
            return "Invalid input: x must be a 2D list with same number of columns as p."
        for row in x:
            if not all(is_nonneg_int(val) for val in row):
                return "Invalid input: x must contain nonnegative integers."
            if sum(int(v) for v in row) != n:
                return "Invalid input: sum of each row in x must equal n."

    # Validate size for rvs
    if multinomial_method == 'rvs' and size is not None:
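        # `not size > 0` rejects NaN; is_integer() rejects infinity without raising, unlike int(size)
        if isinstance(size, bool) or not isinstance(size, (int, float)) or not size > 0 or (isinstance(size, float) and not size.is_integer()):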
            return "Invalid input: size must be a positive integer."
        size = int(size)

    # Compute results
    try:
        dist = scipy_multinomial(n, probs)
        if multinomial_method == 'pmf':
            result = [[float(dist.pmf([int(v) for v in row]))] for row in x]
        elif multinomial_method == 'logpmf':
            result = [[float(dist.logpmf([int(v) for v in row]))] for row in x]
        elif multinomial_method == 'entropy':
            result = [[float(dist.entropy())]]
        elif multinomial_method == 'cov':
            cov = dist.cov()
            result = [[float(val) for val in row] for row in cov.tolist()]
        elif multinomial_method == 'rvs':
            samples = dist.rvs(size=size if size is not None else 1)
            if samples.ndim == 1:
                result = [[int(val) for val in samples]]
            else:
                result = [[int(val) for val in row] for row in samples]
        else:
            return "Invalid multinomial_method."
    except Exception as e:
        return f"scipy.stats.multinomial error: {e}"
    return result
