PARETO

Overview

The PARETO function computes various statistical measures for the Pareto distribution, a continuous probability distribution named after the Italian economist Vilfredo Pareto. Originally developed to describe the distribution of wealth in society, the Pareto distribution is a power-law distribution that models phenomena where a small proportion of inputs accounts for a large proportion of outputs—commonly known as the “80-20 rule” or Pareto principle.

This implementation uses SciPy’s scipy.stats.pareto distribution object, which provides the standard statistical methods (density, distribution function, quantiles, and moments) for the Pareto Type I distribution.

The probability density function (PDF) for the Pareto distribution is defined as:

f(x, b) = \frac{b}{x^{b+1}}

for x \geq 1 and shape parameter b > 0 (in standardized form). With location (loc) and scale parameters, the distribution is shifted and scaled: x is replaced by (x - \text{loc}) / \text{scale} and the density is divided by \text{scale}, so the support becomes x \geq \text{loc} + \text{scale}.
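
As a hand check of the density formula, the following minimal sketch evaluates it in closed form, including the loc/scale transformation. The helper name pareto_pdf is hypothetical and is not part of SciPy or of the PARETO function; it assumes the usual convention that the density is zero below the support.

```python
def pareto_pdf(x, b, loc=0.0, scale=1.0):
    """Closed-form Pareto Type I density (illustrative helper, not a SciPy API)."""
    if b <= 0 or scale <= 0:
        raise ValueError("b and scale must be > 0")
    y = (x - loc) / scale          # standardized variable
    if y < 1:                      # below the support x >= loc + scale
        return 0.0
    return b / (scale * y ** (b + 1))


print(pareto_pdf(2, 3))            # 3 / 2**4 = 0.1875
print(pareto_pdf(4, 3, scale=2))   # same standardized point, density halved: 0.09375
print(pareto_pdf(0.5, 3))          # below the support: 0.0
```

The first call matches the value used in Example 1 of this page.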

The cumulative distribution function (CDF) is:

F(x, b) = 1 - \frac{1}{x^b}
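
The CDF can be inverted by hand, which is what the icdf method computes: solving 1 - y^{-b} = p gives y = (1 - p)^{-1/b}. The sketch below illustrates this round trip. The helper names pareto_cdf and pareto_ppf are hypothetical, chosen for illustration only, and returning infinity at p = 1 is an assumption about how to treat the upper endpoint.

```python
import math


def pareto_cdf(x, b, loc=0.0, scale=1.0):
    """P(X <= x) for a Pareto Type I variable (illustrative helper)."""
    y = (x - loc) / scale
    return 0.0 if y < 1 else 1.0 - y ** (-b)


def pareto_ppf(p, b, loc=0.0, scale=1.0):
    """Quantile function: solve 1 - y**(-b) = p for y, then undo the scaling."""
    if not 0 <= p <= 1:
        raise ValueError("p must be in [0, 1]")
    if p == 1:
        return math.inf            # the distribution has no finite upper bound
    return loc + scale * (1.0 - p) ** (-1.0 / b)


p = pareto_cdf(2, 3)               # 1 - 1/2**3 = 0.875
x = pareto_ppf(p, 3)               # round trip back to approximately 2
print(p, math.isclose(x, 2.0))
```

The survival function (sf) is simply 1 - F(x), and isf is the same inversion applied to 1 - p.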

The shape parameter b (also denoted \alpha in statistical literature) is called the tail index or Pareto index. It controls the “heaviness” of the distribution’s tail—smaller values produce heavier tails, meaning more probability mass in extreme values. The mean of the distribution is finite only when b > 1, and the variance exists only when b > 2.
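
The conditions on the mean and variance follow from the standard closed-form moments of the Pareto Type I distribution: the mean is b / (b - 1) and the variance is b / ((b - 1)^2 (b - 2)) in standardized form. The sketch below is an illustration under the assumption that non-finite moments are reported as infinity; the helper names are hypothetical, and SciPy's own return values for those cases may differ.

```python
import math


def pareto_mean(b, loc=0.0, scale=1.0):
    """Mean of a Pareto Type I variable; infinite when b <= 1 (illustrative helper)."""
    if b <= 1:
        return math.inf
    return loc + scale * b / (b - 1)


def pareto_var(b, scale=1.0):
    """Variance; reported as infinite when b <= 2. Location does not affect it."""
    if b <= 2:
        return math.inf
    return scale ** 2 * b / ((b - 1) ** 2 * (b - 2))


for b in (0.5, 1.5, 3.0):
    # Smaller b means a heavier tail: first the variance, then the mean, diverges.
    print(b, pareto_mean(b), pareto_var(b))
```

For b = 3 this gives a mean of 1.5, the value shown in Example 4.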

The Pareto distribution appears in many real-world contexts: income and wealth distributions, city population sizes, file sizes in internet traffic, insurance claim severities, oil reserve valuations, and earthquake magnitudes (via the Gutenberg-Richter law). For more theoretical background, see the Wikipedia article on Pareto distribution.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=PARETO(value, b, loc, scale, pareto_method)
  • value (float, optional, default: null): Input value for the distribution method. For pdf, cdf, sf, the x value. For icdf, isf, the probability (0-1). Not required for mean, median, var, std.
  • b (float, optional, default: 1): Shape parameter of the Pareto distribution. Must be greater than 0.
  • loc (float, optional, default: 0): Location parameter of the distribution.
  • scale (float, optional, default: 1): Scale parameter of the distribution. Must be greater than 0.
  • pareto_method (str, optional, default: “pdf”): The distribution method to compute. One of pdf, cdf, icdf, sf, isf, mean, median, var, std.

Returns (float or str): The distribution result, or an error message string if the input is invalid.

Examples

Example 1: PDF at x=2 with shape b=3

Inputs:

value b loc scale pareto_method
2 3 0 1 pdf

Excel formula:

=PARETO(2, 3, 0, 1, "pdf")

Expected output:

0.1875

Example 2: CDF at x=2 with shape b=3

Inputs:

value b loc scale pareto_method
2 3 0 1 cdf

Excel formula:

=PARETO(2, 3, 0, 1, "cdf")

Expected output:

0.875

Example 3: Inverse CDF (quantile) at probability 0.875

Inputs:

value b loc scale pareto_method
0.875 3 0 1 icdf

Excel formula:

=PARETO(0.875, 3, 0, 1, "icdf")

Expected output:

2

Example 4: Mean of Pareto distribution with shape b=3

Inputs:

b loc scale pareto_method
3 0 1 mean

Excel formula:

=PARETO(, 3, 0, 1, "mean")

Expected output:

1.5

Python Code

from scipy.stats import pareto as scipy_pareto
import math

def pareto(value=None, b=1, loc=0, scale=1, pareto_method='pdf'):
    """
    Pareto (Type I) distribution wrapper supporting multiple methods.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pareto.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        value (float, optional): Input value for the distribution method. For pdf, cdf, sf, the x value. For icdf, isf, the probability (0-1). Not required for mean, median, var, std. Default is None.
        b (float, optional): Shape parameter of the Pareto distribution. Must be greater than 0. Default is 1.
        loc (float, optional): Location parameter of the distribution. Default is 0.
        scale (float, optional): Scale parameter of the distribution. Must be greater than 0. Default is 1.
        pareto_method (str, optional): The distribution method to compute (case-insensitive). Valid options: 'pdf', 'cdf', 'icdf', 'sf', 'isf', 'mean', 'median', 'var', 'std'. Default is 'pdf'.

    Returns:
        float or str: The computed value, or an error message string for invalid input or a non-finite result.
    """
    valid_methods = {'pdf', 'cdf', 'icdf', 'sf', 'isf', 'mean', 'median', 'var', 'std'}

    if not isinstance(pareto_method, str):
        return "Invalid input: pareto_method must be a string."

    method = pareto_method.lower()
    if method not in valid_methods:
        return f"Invalid method: {pareto_method}. Must be one of {', '.join(sorted(valid_methods))}."

    try:
        b = float(b)
        loc = float(loc)
        scale = float(scale)
    except Exception:
        return "Invalid input: b, loc, and scale must be numbers."
    if b <= 0:
        return "Invalid input: b must be > 0."
    if scale <= 0:
        return "Invalid input: scale must be > 0."
    # Freeze the distribution; positional arguments are (shape b, loc, scale).
    dist = scipy_pareto(b, loc, scale)
    # Methods that require value
    if method in ['pdf', 'cdf', 'icdf', 'sf', 'isf']:
        if value is None:
            return f"Invalid input: missing required argument 'value' for method '{method}'."
        try:
            value = float(value)
        except Exception:
            return "Invalid input: value must be a number."
        try:
            if method == 'pdf':
                result = dist.pdf(value)
            elif method == 'cdf':
                result = dist.cdf(value)
            elif method == 'sf':
                result = dist.sf(value)
            elif method == 'isf':
                if not (0 <= value <= 1):
                    return "Invalid input: value (probability) must be between 0 and 1 for isf."
                result = dist.isf(value)
            elif method == 'icdf':
                if not (0 <= value <= 1):
                    return "Invalid input: value (probability) must be between 0 and 1 for icdf."
                result = dist.ppf(value)
        except Exception as e:
            return f"scipy.stats.pareto error: {e}"
        if isinstance(result, float):
            if math.isnan(result):
                return "Result is NaN (not a number)"
            if math.isinf(result):
                return "inf" if result > 0 else "-inf"
        return result
    # Methods that do not require value
    try:
        if method == 'mean':
            result = dist.mean()
        elif method == 'median':
            result = dist.median()
        elif method == 'var':
            result = dist.var()
        elif method == 'std':
            result = dist.std()
    except Exception as e:
        return f"scipy.stats.pareto error: {e}"
    if isinstance(result, float):
        if math.isnan(result):
            return "Result is NaN (not a number)"
        if math.isinf(result):
            return "inf" if result > 0 else "-inf"
    return result
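
For comparison, the same quantities can be obtained by calling SciPy directly, without the wrapper. This is a minimal sketch assuming SciPy is installed; the variable names are illustrative. Note that the wrapper's icdf corresponds to SciPy's ppf method.

```python
import math
from scipy.stats import pareto

# Frozen distribution with shape b=3 in standardized form (loc=0, scale=1).
dist = pareto(3, loc=0, scale=1)

pdf_at_2 = float(dist.pdf(2))        # density at x = 2
cdf_at_2 = float(dist.cdf(2))        # P(X <= 2)
quantile = float(dist.ppf(0.875))    # inverse CDF ("icdf" in the wrapper)
mean = float(dist.mean())

# With loc=1 and scale=2, x = 5 maps to the standardized point (5 - 1) / 2 = 2.
shifted_cdf = float(pareto(3, loc=1, scale=2).cdf(5))

print(pdf_at_2, cdf_at_2, quantile, mean, shifted_cdf)
```

These calls correspond to Examples 1 through 4 above, plus one shifted and scaled case.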
