BERNOULLI

Overview

The BERNOULLI function calculates properties of a Bernoulli random variable, which models a single trial with exactly two possible outcomes: success (1) or failure (0). It is the simplest discrete distribution and the building block for others such as the binomial and geometric distributions. It is named after the Swiss mathematician Jacob Bernoulli.

A Bernoulli distribution is characterized by a single parameter p, representing the probability of success. The probability mass function (PMF) is defined as:

f(k; p) = \begin{cases} 1 - p & \text{if } k = 0 \\ p & \text{if } k = 1 \end{cases}

where k \in \{0, 1\} and 0 \leq p \leq 1.
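
As a minimal sketch of the definition above, the PMF can be written directly in plain Python. The helper name bernoulli_pmf is hypothetical (it is not part of SciPy or of the BERNOULLI function); it only illustrates the two cases and that the probabilities sum to 1.

```python
def bernoulli_pmf(k, p):
    """Hypothetical helper: PMF of a Bernoulli(p) variable at k."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0  # no probability mass outside {0, 1}


p = 0.25
masses = [bernoulli_pmf(k, p) for k in (0, 1)]
print(masses)       # [0.75, 0.25]
print(sum(masses))  # 1.0
```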

The distribution has the following statistical properties (skewness and kurtosis are undefined when p is 0 or 1, because the variance is then zero):

  • Mean: \mu = p
  • Variance: \sigma^2 = p(1 - p)
  • Skewness: \frac{1 - 2p}{\sqrt{p(1-p)}}
  • Excess kurtosis: \frac{1 - 6p(1-p)}{p(1-p)}
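
The closed forms above can be checked against moments computed directly from the PMF. The sketch below is illustrative only: bernoulli_moments is a hypothetical helper, it assumes 0 < p < 1, and it treats the last expression in the list as excess kurtosis (the fourth standardized moment minus 3).

```python
import math


def bernoulli_moments(p):
    """Hypothetical helper: moments computed directly from the PMF (0 < p < 1)."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(k * f for k, f in pmf.items())
    # Central moments of order 2, 3 and 4.
    m2, m3, m4 = (
        sum((k - mean) ** r * f for k, f in pmf.items()) for r in (2, 3, 4)
    )
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    return mean, m2, skew, excess_kurt


p = 0.3
mean, var, skew, kurt = bernoulli_moments(p)
q = p * (1 - p)

# Compare with the closed-form expressions listed above.
assert math.isclose(mean, p)
assert math.isclose(var, q)
assert math.isclose(skew, (1 - 2 * p) / math.sqrt(q))
assert math.isclose(kurt, (1 - 6 * q) / q)
```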

This implementation uses SciPy’s scipy.stats.bernoulli distribution object, an instance of a subclass of rv_discrete. The function supports the following calculation methods: PMF, cumulative distribution function (CDF), survival function (SF), inverse CDF (ICDF), inverse survival function (ISF), mean, variance, skewness, and kurtosis.

Common applications include modeling coin flips, success/failure outcomes in quality control, binary classification labels, and as building blocks for more complex distributions like the binomial distribution (the number of successes in n independent Bernoulli trials that share the same p).
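
To illustrate the link to the binomial distribution, the sketch below adds up n independent Bernoulli trials by repeated convolution of their PMFs and compares the result with the closed-form binomial PMF. The helper convolve and the values of p and n are chosen for illustration only.

```python
import math


def convolve(a, b):
    """PMF of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


p, n = 0.3, 5
single_trial = [1 - p, p]  # Bernoulli PMF at k = 0 and k = 1

# Add up n independent Bernoulli(p) trials, one at a time.
total = [1.0]  # PMF of the constant 0
for _ in range(n):
    total = convolve(total, single_trial)

# Closed-form binomial PMF for comparison.
binomial = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
assert all(math.isclose(a, b) for a, b in zip(total, binomial))
```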

This example function is provided as-is without any representation of accuracy.

Excel Usage

=BERNOULLI(p, k, bernoulli_method, loc)
  • p (float, required): Probability of success (0 <= p <= 1).
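  • k (list[list], optional, default: null): Value(s) at which to evaluate the distribution. Required for pmf, cdf, sf, icdf, and isf; ignored for mean, var, skew, and kurt. For PMF/CDF/SF, this is the outcome value (0 or 1 when loc is 0). For ICDF/ISF, this is a probability between 0 and 1.
  • bernoulli_method (str, optional, default: “pmf”): The calculation method to use. Valid values (lowercase): pmf, cdf, sf, icdf, isf, mean, var, skew, kurt.
  • loc (float, optional, default: 0): Location parameter that shifts the distribution, so the possible outcomes become loc and loc + 1.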

Returns (float or list[list]): The result of the chosen method (a 2D list when k is a 2D list), or a str error message if the input is invalid.
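
The effect of loc can be seen by calling SciPy directly. This is a small sketch with illustrative values (p = 0.3, loc = 2), not output of the BERNOULLI function itself: the probability mass moves from {0, 1} to {2, 3}, the mean shifts by loc, and the variance is unchanged.

```python
from scipy.stats import bernoulli

p, loc = 0.3, 2
dist = bernoulli(p, loc=loc)  # frozen distribution shifted by loc

print(dist.pmf([0, 2, 3]))  # no mass at 0; mass 1 - p at loc and p at loc + 1
print(dist.mean())          # p + loc
print(dist.var())           # p * (1 - p), unaffected by loc
```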

Examples

Example 1: PMF at k=1 with p=0.5

Inputs:

p k bernoulli_method loc
0.5 1 pmf 0

Excel formula:

=BERNOULLI(0.5, 1, "pmf", 0)

Expected output:

Result
0.5

Example 2: CDF at k=0 with p=0.7

Inputs:

p k bernoulli_method loc
0.7 0 cdf 0

Excel formula:

=BERNOULLI(0.7, 0, "cdf", 0)

Expected output:

Result
0.3

Example 3: Survival function at k=0

Inputs:

p k bernoulli_method loc
0.5 0 sf 0

Excel formula:

=BERNOULLI(0.5, 0, "sf", 0)

Expected output:

Result
0.5

Example 4: Inverse CDF at 0.8

Inputs:

p k bernoulli_method loc
0.5 0.8 icdf 0

Excel formula:

=BERNOULLI(0.5, 0.8, "icdf", 0)

Expected output:

Result
1
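
The result above follows from the usual quantile convention for discrete distributions: the inverse CDF returns the smallest outcome whose CDF is at least the requested probability. A minimal sketch of that rule, assuming loc = 0 and 0 < q <= 1 (bernoulli_ppf is a hypothetical helper, and edge cases such as q = 0 are deliberately left out):

```python
def bernoulli_ppf(q, p):
    """Hypothetical helper: smallest k in {0, 1} with CDF(k) >= q."""
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    cdf_at_0 = 1 - p  # CDF(0) = P(X = 0)
    return 0 if q <= cdf_at_0 else 1


print(bernoulli_ppf(0.8, 0.5))  # 1, because CDF(0) = 0.5 < 0.8
print(bernoulli_ppf(0.4, 0.5))  # 0, because CDF(0) = 0.5 >= 0.4
```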

Example 5: Mean of distribution with p=0.2

Inputs:

p bernoulli_method loc
0.2 mean 0

Excel formula:

=BERNOULLI(0.2, , "mean", 0)

Expected output:

Result
0.2

Example 6: Variance of distribution with p=0.9

Inputs:

p bernoulli_method loc
0.9 var 0

Excel formula:

=BERNOULLI(0.9, , "var", 0)

Expected output:

Result
0.09

Example 7: PMF with vector input

Inputs:

p k bernoulli_method loc
0.5 {0, 1} pmf 0

Excel formula:

=BERNOULLI(0.5, {0,1}, "pmf", 0)

Expected output:

Result
0.5 0.5
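
For comparison, the same vector evaluation can be sketched with SciPy alone. The function in the Python Code section loops over the 2D list element by element; the snippet below instead relies on NumPy broadcasting in scipy.stats.bernoulli.pmf, which is a different route to the same values.

```python
from scipy.stats import bernoulli

k = [[0, 1]]  # a 1x2 range, written as a 2D list
result = bernoulli.pmf(k, 0.5).tolist()
print(result)  # one row with two values, each approximately 0.5
```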

Python Code

from scipy.stats import bernoulli as scipy_bernoulli

def bernoulli(p, k=None, bernoulli_method='pmf', loc=0):
    """
    Calculates properties of a Bernoulli discrete random variable.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.bernoulli.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        p (float): Probability of success (0 <= p <= 1).
        k (list[list], optional): Value(s) at which to evaluate the distribution. For PMF/CDF/SF, this is the value (0 or 1). For ICDF/ISF, this is the probability. Default is None.
        bernoulli_method (str, optional): The calculation method to use. Valid options (case-sensitive): 'pmf', 'cdf', 'sf', 'icdf', 'isf', 'mean', 'var', 'skew', 'kurt'. Default is 'pmf'.
        loc (float, optional): Location parameter that shifts the distribution. Default is 0.

    Returns:
        float or list[list]: The result of the chosen method (a 2D list when k is a 2D list), or a str error message if the input is invalid.
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    # Validate p
    try:
        p_val = float(p)
    except (ValueError, TypeError):
        return "Invalid input: p must be a number."

    if not (0 <= p_val <= 1):
        return "Invalid input: p must be between 0 and 1."

    # Validate loc
    try:
        loc_val = float(loc)
    except (ValueError, TypeError):
        return "Invalid input: loc must be a number."

    # Validate method
    valid_methods = {"pmf", "cdf", "sf", "icdf", "isf", "mean", "var", "skew", "kurt"}
    if bernoulli_method not in valid_methods:
        return f"Invalid method. Choose from {', '.join(sorted(valid_methods))}."

    dist = scipy_bernoulli(p_val, loc=loc_val)

    # Handle statistics (independent of k)
    if bernoulli_method == "mean":
        return float(dist.mean())
    elif bernoulli_method == "var":
        return float(dist.var())
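    elif bernoulli_method == "skew":
        # Frozen distributions have no skew() method; request it via stats().
        return float(dist.stats(moments="s"))
    elif bernoulli_method == "kurt":
        # moments="k" returns the excess (Fisher) kurtosis.
        return float(dist.stats(moments="k"))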

    # Handle methods requiring k
    if k is None:
        return f"Invalid input: k is required for {bernoulli_method} method."

    k_list = to2d(k)

    if not isinstance(k_list, list) or not all(isinstance(row, list) for row in k_list):
        return "Invalid input: k must be a scalar or 2D list."

    def compute(val):
        try:
            kval = float(val)
        except (ValueError, TypeError):
            return "Invalid input: k must be a number."

        if bernoulli_method == "pmf":
            return float(dist.pmf(kval))
        elif bernoulli_method == "cdf":
            return float(dist.cdf(kval))
        elif bernoulli_method == "sf":
            return float(dist.sf(kval))
        elif bernoulli_method == "icdf":
            if not (0 <= kval <= 1):
                return "Invalid input: probability must be between 0 and 1 for icdf."
            return float(dist.ppf(kval))
        elif bernoulli_method == "isf":
            if not (0 <= kval <= 1):
                return "Invalid input: probability must be between 0 and 1 for isf."
            return float(dist.isf(kval))
        return "Unknown error"

    result = []
    for row in k_list:
        result_row = []
        for val in row:
            out = compute(val)
            if isinstance(out, str):
                return out
            result_row.append(out)
        result.append(result_row)

    # Return scalar if input was scalar
    if not isinstance(k, list) and len(result) == 1 and len(result[0]) == 1:
        return result[0][0]

    return result
