
EPPS_SINGLETON_2SAMP

Overview

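The EPPS_SINGLETON_2SAMP function performs the Epps-Singleton test of whether two independent samples come from the same distribution, using their empirical characteristic functions. It compares whole distributions rather than only means (as a t-test does) and, unlike the Kolmogorov-Smirnov test, does not assume a continuous distribution, so it applies to both discrete and continuous data. For continuous data it is recommended when each sample has at least 25 observations. In simplified form, the test statistic is based on the squared differences between the two empirical characteristic functions: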

$$ES = n \cdot m \cdot \sum_{j=1}^{k} \left| \phi_x(t_j) - \phi_y(t_j) \right|^2$$

where $\phi_x(t)$ and $\phi_y(t)$ are the empirical characteristic functions of samples $x$ and $y$ evaluated at the points $t_j$, and $n$, $m$ are the sample sizes. For more details, see the scipy.stats.epps_singleton_2samp documentation.
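The empirical characteristic function of a sample is the mean of exp(i·t·x) over its observations. The following is a minimal NumPy sketch of that definition and of the unweighted sum written above; the helper names `ecf` and `unweighted_es_sum` are assumptions of this example, not part of the wrapper. SciPy's implementation also rescales the t values and weights the differences by an estimated covariance, so this sum is illustrative only and will not reproduce the statistic returned by the function.

```python
import numpy as np


def ecf(sample, t):
    """Empirical characteristic function: mean of exp(i * t * x) over the sample."""
    sample = np.asarray(sample, dtype=float)
    return np.mean(np.exp(1j * t * sample))


def unweighted_es_sum(x, y, t_points=(0.4, 0.8)):
    """n * m * sum_j |phi_x(t_j) - phi_y(t_j)|^2 (hypothetical helper, illustration only)."""
    n, m = len(x), len(y)
    return n * m * sum(abs(ecf(x, t) - ecf(y, t)) ** 2 for t in t_points)


x = [1, 2, 3, 4, 5]
y = [2, 3, 4, 5, 6]

print(unweighted_es_sum(x, x))  # identical samples: the differences vanish
print(unweighted_es_sum(x, y))  # shifted sample: a positive value
```

A larger value indicates that the two empirical characteristic functions differ more at the chosen points.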

This wrapper exposes only the most commonly used parameters: x, y, and optionally t (points for evaluation). Parameters related to axis, NaN handling, and broadcasting are omitted for Excel compatibility. This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=EPPS_SINGLETON_2SAMP(x, y, [t])
  • x (2D list, required): First sample, as a column or matrix. Must have at least five rows.
  • y (2D list, required): Second sample, as a column or matrix. Must have at least five rows.
  • t (2D list, optional, default=[[0.4, 0.8]]): Points where the empirical characteristic function is evaluated.

The function returns a single-row 2D array: [statistic, pvalue] (both floats), or an error message (string) if the input is invalid.
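As a rough illustration of how Excel ranges map onto the 2D-list arguments, the sketch below flattens a column, a row, and a matrix row by row, the same way the Python code on this page does. The helper name `flatten_range` is hypothetical and used only for this example.

```python
def flatten_range(rng):
    """Flatten a 2D list (rows of cells) into a flat list of floats, row by row."""
    return [float(cell) for row in rng for cell in row]


column = [[1], [2], [3], [4], [5]]   # Excel array {1;2;3;4;5}
row = [[0.5, 1.0]]                   # Excel array {0.5,1.0}
matrix = [[1, 2], [3, 4], [5, 6]]    # a 3x2 range

print(flatten_range(column))  # [1.0, 2.0, 3.0, 4.0, 5.0]
print(flatten_range(row))     # [0.5, 1.0]
print(flatten_range(matrix))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Because the five-row check is applied before flattening, a sample passed as a single row of five values would be rejected, while the same values in a column are accepted.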

Examples

Example 1: Basic Case

Inputs:

| x   | y   |
|-----|-----|
| 1.0 | 2.0 |
| 2.0 | 3.0 |
| 3.0 | 4.0 |
| 4.0 | 5.0 |
| 5.0 | 6.0 |

Excel formula:

=EPPS_SINGLETON_2SAMP({1;2;3;4;5}, {2;3;4;5;6})

Expected output:

| statistic | pvalue |
|-----------|--------|
| 0.914     | 0.923  |

Example 2: With Custom t

Inputs:

| x   | y   | t   |     |
|-----|-----|-----|-----|
| 1.0 | 2.0 | 0.5 | 1.0 |
| 2.0 | 3.0 |     |     |
| 3.0 | 4.0 |     |     |
| 4.0 | 5.0 |     |     |
| 5.0 | 6.0 |     |     |

Excel formula:

=EPPS_SINGLETON_2SAMP({1;2;3;4;5}, {2;3;4;5;6}, {0.5,1.0})

Expected output:

| statistic | pvalue |
|-----------|--------|
| 0.909     | 0.923  |

Example 3: Different Samples

Inputs:

| x    | y    |
|------|------|
| 10.0 | 15.0 |
| 20.0 | 25.0 |
| 30.0 | 35.0 |
| 40.0 | 45.0 |
| 50.0 | 55.0 |

Excel formula:

=EPPS_SINGLETON_2SAMP({10;20;30;40;50}, {15;25;35;45;55})

Expected output:

| statistic | pvalue |
|-----------|--------|
| 0.402     | 0.982  |

Example 4: Larger Samples

Inputs:

| x    | y    |
|------|------|
| 1.0  | 2.0  |
| 2.0  | 3.0  |
| 3.0  | 4.0  |
| 4.0  | 5.0  |
| 5.0  | 6.0  |
| 6.0  | 7.0  |
| 7.0  | 8.0  |
| 8.0  | 9.0  |
| 9.0  | 10.0 |
| 10.0 | 11.0 |

Excel formula:

=EPPS_SINGLETON_2SAMP({1;2;3;4;5;6;7;8;9;10}, {2;3;4;5;6;7;8;9;10;11})

Expected output:

| statistic | pvalue |
|-----------|--------|
| 0.959     | 0.916  |
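The examples above can be cross-checked by calling scipy.stats.epps_singleton_2samp directly on the flattened inputs. This is a minimal sketch that assumes SciPy is installed; the `examples` and `results` dictionaries are names chosen for this illustration, and the printed values depend on the installed SciPy version.

```python
import math

from scipy.stats import epps_singleton_2samp

# Flattened inputs from the four examples: (x, y, t)
examples = {
    "Example 1": ([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], (0.4, 0.8)),
    "Example 2": ([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], (0.5, 1.0)),
    "Example 3": ([10, 20, 30, 40, 50], [15, 25, 35, 45, 55], (0.4, 0.8)),
    "Example 4": (list(range(1, 11)), list(range(2, 12)), (0.4, 0.8)),
}

results = {}
for name, (x, y, t) in examples.items():
    res = epps_singleton_2samp(x, y, t=t)
    results[name] = (float(res.statistic), float(res.pvalue))
    print(f"{name}: statistic={res.statistic:.3f}, pvalue={res.pvalue:.3f}")
```

All of these samples are well below the 25 observations recommended for continuous data, so the p-values are best read as a check on the wrapper rather than as meaningful test results.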

Python Code

from scipy.stats import epps_singleton_2samp as scipy_epps_singleton_2samp
from typing import List, Optional, Union


def epps_singleton_2samp(x: List[List[float]], y: List[List[float]], t: Optional[List[List[float]]] = None) -> Union[List[List[float]], str]:
    """
    Computes the Epps-Singleton test statistic and p-value for two samples.

    Args:
        x: 2D list of float values. First sample, must have at least five observations.
        y: 2D list of float values. Second sample, must have at least five observations.
        t: Optional 2D list of float values. Points where the empirical characteristic function is evaluated. Default is [[0.4, 0.8]].

    Returns:
        2D list with one row: [statistic, pvalue], or an error message (str) if input is invalid.

    This example function is provided as-is without any representation of accuracy.
    """
    # Validate x and y are 2D lists with at least five rows
    if not (isinstance(x, list) and all(isinstance(row, list) for row in x) and len(x) >= 5):
        return "Invalid input: x must be a 2D list with at least five rows."
    if not (isinstance(y, list) and all(isinstance(row, list) for row in y) and len(y) >= 5):
        return "Invalid input: y must be a 2D list with at least five rows."

    # Flatten x and y
    try:
        x_flat = [float(item) for row in x for item in row]
        y_flat = [float(item) for row in y for item in row]
    except Exception:
        return "Invalid input: x and y must contain only numeric values."
    if len(x_flat) < 5 or len(y_flat) < 5:
        return "Invalid input: each sample must contain at least five values."

    # Validate t
    if t is not None:
        if not (isinstance(t, list) and all(isinstance(row, list) for row in t)):
            return "Invalid input: t must be a 2D list of floats."
        try:
            t_flat = [float(item) for row in t for item in row]
        except Exception:
            return "Invalid input: t must contain only numeric values."
        if len(t_flat) == 0:
            return "Invalid input: t must contain at least one value."
    else:
        t_flat = [0.4, 0.8]

    # Call scipy.stats.epps_singleton_2samp
    try:
        result = scipy_epps_singleton_2samp(x_flat, y_flat, t=t_flat)
        stat = float(result.statistic)
        pvalue = float(result.pvalue)
    except Exception as e:
        return f"scipy.stats.epps_singleton_2samp error: {e}"

    # Check for nan/inf
    if any([isinstance(val, float) and (val != val or val in [float('inf'), float('-inf')]) for val in [stat, pvalue]]):
        return "Invalid result: statistic or pvalue is nan or inf."

    return [[stat, pvalue]]

Example Workbook

Link to Workbook
