MULTIVARIATE_NORMAL
Overview
The MULTIVARIATE_NORMAL function evaluates probability density, cumulative distribution, and related statistical measures for the multivariate normal distribution (also known as the multivariate Gaussian distribution). This distribution is the generalization of the one-dimensional normal distribution to higher dimensions, making it fundamental in multivariate statistics, machine learning, and quantitative finance for modeling correlated random variables.
This implementation uses SciPy's scipy.stats.multivariate_normal distribution object. For complete documentation, see the official SciPy multivariate_normal reference (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multivariate_normal.html). The source code is available on GitHub.
The multivariate normal distribution is parameterized by a mean vector \mu and a covariance matrix \Sigma. The covariance matrix must be symmetric and positive semi-definite, where diagonal elements represent variances of individual dimensions and off-diagonal elements represent covariances between dimensions. The probability density function (PDF) is defined as:
f(x) = \frac{1}{\sqrt{(2\pi)^k \det(\Sigma)}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)
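where k is the dimensionality (rank of \Sigma), x is the point being evaluated, \mu is the mean vector, and \Sigma is the covariance matrix. For singular covariance matrices, SciPy extends this definition using the pseudo-determinant and pseudo-inverse (see Wikipedia: Multivariate normal distribution - Degenerate case). Note that the Python code on this page calls SciPy with its default allow_singular=False, so a singular covariance matrix is reported as an error message rather than evaluated.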
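The density formula above can be checked numerically. The following is a minimal sketch, not part of the function itself: the helper name mvn_pdf and the sample mean, covariance, and point are illustrative assumptions, and the helper only handles non-singular covariance matrices.

```python
import numpy as np
from scipy.stats import multivariate_normal


def mvn_pdf(x, mu, sigma):
    """Evaluate the PDF formula directly (hypothetical helper, non-singular sigma only)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    k = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), using a linear solve
    # instead of forming the explicit inverse.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)


# Illustrative values, chosen only for this sketch.
mu = [0.0, 0.0]
sigma = [[2.0, 0.5], [0.5, 1.0]]
point = [1.0, -1.0]

by_formula = mvn_pdf(point, mu, sigma)
by_scipy = float(multivariate_normal(mean=mu, cov=sigma).pdf(point))
print(by_formula, by_scipy)  # the two values should agree to rounding error
```

Solving the linear system rather than inverting \Sigma is a common numerical choice; SciPy itself uses an eigendecomposition internally, so small floating-point differences between the two results are expected.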
The function supports multiple computation methods: pdf for probability density, cdf for cumulative distribution, logpdf and logcdf for their logarithmic counterparts (useful for numerical stability with very small probabilities), entropy for the differential entropy of the distribution, and rvs for generating random samples. The entropy does not depend on x, so the function returns the same value once per row of x. For rvs, x is still required because its column count sets the dimensionality. When the mean or covariance is not specified, the function defaults to a zero mean vector and identity covariance matrix, respectively.
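For orientation, the six methods map onto the underlying SciPy object as sketched below. This calls SciPy directly rather than the Excel function; the evaluation point and the random_state seed are arbitrary choices made for the illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Standard bivariate normal: zero mean, identity covariance.
dist = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]])
point = [1.0, 1.0]

density = float(dist.pdf(point))            # probability density at the point
log_density = float(dist.logpdf(point))     # natural log of the density
cumulative = float(dist.cdf(point))         # P(X1 <= 1, X2 <= 1)
log_cumulative = float(dist.logcdf(point))  # natural log of the CDF
entropy = float(dist.entropy())             # differential entropy, independent of the point
samples = dist.rvs(size=5, random_state=0)  # array of shape (5, 2)

print(density, log_density, cumulative, log_cumulative, entropy)
print(samples.shape)
```

SciPy evaluates the CDF by numerical integration, so cdf and logcdf results are accurate only to the integration tolerance, while pdf, logpdf, and entropy are closed-form.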
This example function is provided as-is without any representation of accuracy.
Excel Usage
=MULTIVARIATE_NORMAL(x, mean, cov, mvn_method, size)
x (list[list], required): 2D array of points at which to evaluate the distribution. Each row is a point, each column is a dimension.
mean (list[list], optional, default: null): Mean vector of the distribution as a column vector. Defaults to zero vector if not provided.
cov (list[list], optional, default: null): Covariance matrix of the distribution. Must be square and positive semi-definite. Defaults to identity matrix if not provided.
mvn_method (str, optional, default: "pdf"): The method to compute. Options are pdf, cdf, logpdf, logcdf, entropy, or rvs.
size (int, optional, default: null): Number of random samples to draw when method is rvs.
Returns (list[list]): 2D list of results, or error message string.
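The sketch below shows how the argument shapes relate to each other on the Python side. It assumes that an Excel array such as {0,0;1,1} arrives as a nested list with one inner list per row; that marshaling is an assumption about the calling environment, not something this page specifies.

```python
# Assumed Python-side form of the Excel arrays (one inner list per row).
x = [[0.0, 0.0], [1.0, 1.0]]      # {0,0;1,1}: two points in two dimensions
mean = [[0.0], [0.0]]             # {0;0}: column vector, one row per dimension
cov = [[1.0, 0.0], [0.0, 1.0]]    # {1,0;0,1}: square matrix

n_dim = len(x[0])                 # dimensionality is taken from the columns of x

# The column vector is flattened to a 1D mean before being handed to SciPy.
mean_vec = [float(row[0]) for row in mean]

# Shape rules the inputs must satisfy.
mean_ok = len(mean) == n_dim and all(len(row) == 1 for row in mean)
cov_ok = len(cov) == n_dim and all(len(row) == n_dim for row in cov)
print(n_dim, mean_vec, mean_ok, cov_ok)
```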
Examples
Example 1: PDF of the standard bivariate normal
Inputs:
| x | | mean | cov | | mvn_method |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | pdf |
| 1 | 1 | 0 | 0 | 1 | |
Excel formula:
=MULTIVARIATE_NORMAL({0,0;1,1}, {0;0}, {1,0;0,1}, "pdf")
Expected output:
| Result |
|---|
| 0.1592 |
| 0.0585 |
Example 2: CDF with nonzero mean
Inputs:
| x | | mean | cov | | mvn_method |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0 | cdf |
| 1 | 1 | 1 | 0 | 1 | |
Excel formula:
=MULTIVARIATE_NORMAL({0,0;1,1}, {1;1}, {1,0;0,1}, "cdf")
Expected output:
| Result |
|---|
| 0.02517 |
| 0.25 |
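Because the covariance in this example is the identity matrix, the two components are independent and the joint CDF factorizes into a product of univariate normal CDFs. The sketch below uses that fact as a cross-check; it relies on scipy.stats.norm, which the function on this page does not use.

```python
from scipy.stats import multivariate_normal, norm

mean = [1.0, 1.0]
cov = [[1.0, 0.0], [0.0, 1.0]]
points = [[0.0, 0.0], [1.0, 1.0]]

# Independent components: joint CDF = product of the marginal CDFs.
product_cdf = [
    float(norm.cdf(p[0], loc=mean[0]) * norm.cdf(p[1], loc=mean[1]))
    for p in points
]

# Same quantity from the multivariate distribution (numerically integrated).
dist = multivariate_normal(mean=mean, cov=cov)
joint_cdf = [float(dist.cdf(p)) for p in points]

print(product_cdf)
print(joint_cdf)
```

The factorization holds only for a diagonal covariance matrix; with nonzero off-diagonal entries the joint CDF has no such closed form.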
Example 3: Log-PDF calculation
Inputs:
| x | | mean | cov | | mvn_method |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | logpdf |
| 1 | 1 | 0 | 0 | 1 | |
Excel formula:
=MULTIVARIATE_NORMAL({0,0;1,1}, {0;0}, {1,0;0,1}, "logpdf")
Expected output:
| Result |
|---|
| -1.8379 |
| -2.8379 |
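The practical reason to prefer logpdf shows up far from the mean, where the density underflows double precision. A small sketch, with the point (40, 40) picked arbitrarily to trigger the underflow:

```python
import math

from scipy.stats import multivariate_normal

dist = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]])
far_point = [40.0, 40.0]

density = float(dist.pdf(far_point))         # 0.0: exp(-1601.84) underflows
log_density = float(dist.logpdf(far_point))  # still finite and exact

# For the standard bivariate normal, logpdf = -ln(2*pi) - (x1^2 + x2^2) / 2.
expected_log = -math.log(2.0 * math.pi) - 0.5 * (40.0**2 + 40.0**2)
print(density, log_density, expected_log)
```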
Example 4: Entropy calculation
Inputs:
| x | | mean | cov | | mvn_method |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | entropy |
| 1 | 1 | 0 | 0 | 1 | |
Excel formula:
=MULTIVARIATE_NORMAL({0,0;1,1}, {0;0}, {1,0;0,1}, "entropy")
Expected output:
| Result |
|---|
| 2.8379 |
| 2.8379 |
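The value 2.8379 follows from the closed form for the differential entropy of a k-dimensional normal, h = (k/2)(1 + ln 2\pi) + (1/2) ln det(\Sigma). The sketch below evaluates it with NumPy and compares it to SciPy; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

cov = np.array([[1.0, 0.0], [0.0, 1.0]])
k = cov.shape[0]

# slogdet returns the log-determinant without overflow or underflow.
sign, logdet = np.linalg.slogdet(cov)

closed_form = 0.5 * (k * (1.0 + np.log(2.0 * np.pi)) + logdet)
scipy_value = float(multivariate_normal(mean=np.zeros(k), cov=cov).entropy())
print(closed_form, scipy_value)
```

The mean does not appear in the formula, which is why the result is identical for every row of x.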
Python Code
from scipy.stats import multivariate_normal as scipy_multivariate_normal
import numpy as np
import math

def multivariate_normal(x, mean=None, cov=None, mvn_method='pdf', size=None):
    """
    Computes the PDF, CDF, log-PDF, log-CDF, entropy, or draws random samples from a multivariate normal distribution.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multivariate_normal.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): 2D array of points at which to evaluate the distribution. Each row is a point, each column is a dimension.
        mean (list[list], optional): Mean vector of the distribution as a column vector. Defaults to zero vector if not provided. Default is None.
        cov (list[list], optional): Covariance matrix of the distribution. Must be square and positive semi-definite. Defaults to identity matrix if not provided. Default is None.
        mvn_method (str, optional): The method to compute. Options are pdf, cdf, logpdf, logcdf, entropy, or rvs. Default is 'pdf'.
        size (int, optional): Number of random samples to draw when method is rvs. Default is None.

    Returns:
        list[list]: 2D list of results, or error message string.
    """
    def to2d(val):
        # Wrap a scalar in a 1x1 list so single-cell inputs are handled uniformly
        return [[val]] if not isinstance(val, list) else val

    # Validate x
    x = to2d(x)
    if not isinstance(x, list) or not all(isinstance(row, list) for row in x):
        return "Invalid input: x must be a 2D list of floats."
    if len(x) == 0 or len(x[0]) == 0:
        return "Invalid input: x must be a non-empty 2D list."
    n_dim = len(x[0])

    # Validate mean
    if mean is None:
        mean_vec = [0.0] * n_dim
    else:
        mean = to2d(mean)
        if not isinstance(mean, list) or not all(isinstance(row, list) for row in mean):
            return "Invalid input: mean must be a 2D list (column vector)."
        if len(mean) != n_dim or any(len(row) != 1 for row in mean):
            return "Invalid input: mean must be a column vector with same dimension as x."
        try:
            mean_vec = [float(row[0]) for row in mean]
        except Exception:
            return "Invalid input: mean must contain numeric values."

    # Validate cov
    if cov is None:
        cov_mat = [[float(i == j) for j in range(n_dim)] for i in range(n_dim)]
    else:
        cov = to2d(cov)
        if not isinstance(cov, list) or len(cov) != n_dim or any(len(row) != n_dim for row in cov):
            return "Invalid input: cov must be a square 2D list with shape (n_dim, n_dim)."
        try:
            cov_mat = [[float(val) for val in row] for row in cov]
        except Exception:
            return "Invalid input: cov must contain numeric values."

    # Validate mvn_method
    valid_methods = {'pdf', 'cdf', 'logpdf', 'logcdf', 'entropy', 'rvs'}
    if mvn_method not in valid_methods:
        return f"Invalid method: {mvn_method}. Must be one of {sorted(valid_methods)}."

    # Validate size
    if mvn_method == 'rvs':
        if size is None:
            return "Invalid input: size must be specified for method 'rvs'."
        if not isinstance(size, int) or size <= 0:
            return "Invalid input: size must be a positive integer."

    # Try to create the distribution
    try:
        dist = scipy_multivariate_normal(mean=mean_vec, cov=cov_mat)
    except Exception as e:
        return f"scipy.multivariate_normal error: {e}"

    # Compute result
    try:
        if mvn_method == 'pdf':
            result = [[float(dist.pdf(row))] for row in x]
        elif mvn_method == 'cdf':
            result = [[float(dist.cdf(row))] for row in x]
        elif mvn_method == 'logpdf':
            result = [[float(dist.logpdf(row))] for row in x]
        elif mvn_method == 'logcdf':
            result = [[float(dist.logcdf(row))] for row in x]
        elif mvn_method == 'entropy':
            # Entropy does not depend on x; repeat it once per input row
            ent = float(dist.entropy())
            result = [[ent] for _ in x]
        elif mvn_method == 'rvs':
            samples = dist.rvs(size=size)
            # Ensure samples is 2D list with shape (size, n_dim)
            samples = np.atleast_1d(samples)
            if size == 1:
                samples = samples.reshape(1, -1)
            elif n_dim == 1:
                samples = samples.reshape(-1, 1)
            result = [[float(val) for val in row] for row in samples]
        else:
            return f"Invalid method: {mvn_method}."
    except Exception as e:
        return f"scipy.multivariate_normal error: {e}"

    # Check for invalid values
    for row in result:
        for val in row:
            if isinstance(val, float) and (math.isnan(val) or math.isinf(val)):
                return "Invalid output: result contains NaN or Inf."
    return result