Autocorrelation And Stationarity Tests
Overview
Autocorrelation and stationarity are central ideas in time-series analysis because they describe how observations depend on prior values and whether those relationships remain stable over time. In practical modeling, many forecasting and inference methods assume some form of weak stationarity, so analysts need reliable diagnostics before fitting ARIMA-style models or interpreting residuals. This category focuses on tools that quantify serial dependence, test unit-root behavior, and evaluate whether remaining autocorrelation is statistically meaningful. For background, see Autocorrelation, Stationary process, and Unit root.
The unifying concepts are lag structure, covariance/correlation decomposition, and hypothesis testing under dependence. Correlation-based diagnostics summarize dependence at lag k using normalized measures such as \rho_k, while covariance-based diagnostics preserve scale in \gamma_k. Stationarity tests then formalize competing hypotheses about persistence: some tests take a unit root as the null, and others take stationarity as the null, so they are best interpreted jointly. Portmanteau statistics aggregate evidence across multiple lags, often using forms like
Q = n(n+2)\sum_{k=1}^{m}\frac{\rho_k^2}{n-k},
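to check whether residual serial correlation remains. Here n is the sample size, m is the number of lags tested, and \rho_k is the sample autocorrelation at lag k. Under the null hypothesis of no serial correlation, Q is approximately chi-square distributed with m degrees of freedom (reduced by the number of fitted ARMA parameters when the test is applied to model residuals).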
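The portmanteau formula above can be sketched in plain Python. This is a minimal illustration, not the statsmodels implementation: the helper names `sample_acf` and `ljung_box_q` are hypothetical, the autocorrelations use the unadjusted (n) denominator, and p-values are omitted because they need a chi-square distribution function.

```python
def sample_acf(x, nlags):
    """Sample autocorrelations rho_0..rho_nlags with the unadjusted (n) denominator."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(nlags + 1)
    ]


def ljung_box_q(x, m):
    """Cumulative Ljung-Box Q statistics for lags 1..m (hypothetical helper)."""
    n = len(x)
    rho = sample_acf(x, m)
    q_values = []
    running = 0.0  # running sum of rho_k^2 / (n - k)
    for k in range(1, m + 1):
        running += rho[k] ** 2 / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


# Alternating series, the same input as ACF Example 3 in this document.
series = [1, 0, 1, 0, 1, 0, 1, 0, 1]
q = ljung_box_q(series, 2)
print(q)  # roughly 9.778 and 18.21, in line with the q_stat column of that example
```

Each entry of `q` is cumulative: the value at lag k aggregates evidence from lags 1 through k, which is why the statistics grow with the lag.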
These functions are implemented with statsmodels, especially statsmodels.tsa.stattools, a widely used Python toolkit for econometrics and statistical time-series modeling. The library provides consistent APIs for autocorrelation diagnostics, unit-root testing, and lag-selection options, which makes it suitable for both exploratory workflows and production-grade model validation.
For univariate dependence structure, ACF, ACOVF, and PACF provide complementary views of lag dynamics. ACF reports normalized serial correlation across lags, ACOVF reports scale-preserving autocovariance, and PACF isolates direct lag-k effects after controlling intermediate lags. Together, they support AR/MA order identification, residual checking, and feature engineering for lagged predictors. Q_STAT extends this by converting an autocorrelation sequence into Ljung-Box statistics and p-values to test whether correlation structure remains jointly significant.
For lead-lag relationships between two series, CCF and CCOVF quantify cross-series dependence across offsets. CCOVF measures raw co-movement and is useful when magnitude is important, while CCF divides by the product of the two series' standard deviations so that effect strength can be compared on a common scale. These tools are commonly used in transfer-function modeling, signal alignment, and exploratory checks of whether one process systematically leads or follows another.
For stationarity and unit-root diagnostics, ADFULLER, KPSS, RURTEST, and ZIVOT_ANDREWS cover complementary assumptions. ADFULLER tests a unit-root null against stationarity alternatives, whereas KPSS inverts the null to stationarity, making the pair especially useful for triangulation. RURTEST adds a range-based unit-root diagnostic, and ZIVOT_ANDREWS allows one unknown structural break when testing persistence. In practice, analysts compare outcomes across these tests rather than relying on a single p-value, especially when trend shifts or regime changes may distort standard stationarity conclusions.
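Because ADF and KPSS have opposite null hypotheses, their p-values can be combined into a small decision table. The sketch below is one common heuristic, not a rule from statsmodels; the function name `triangulate`, the labels it returns, and the 0.05 threshold are all assumptions made for illustration.

```python
def triangulate(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into a rough verdict (hypothetical helper).

    ADF null: unit root. KPSS null: stationarity.
    """
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"      # both tests point toward stationarity
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "unit root"       # both tests point toward a unit root
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflicting"     # both nulls rejected; check trend or breaks
    return "inconclusive"        # neither null rejected; often low power


# Illustrative p-values only; they do not come from a single real series.
print(triangulate(adf_p=0.01, kpss_p=0.10))
print(triangulate(adf_p=0.60, kpss_p=0.01))
```

The "conflicting" and "inconclusive" outcomes are where a break-aware test such as ZIVOT_ANDREWS, or a different deterministic-term choice, is usually worth trying.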
ACF
This function estimates the autocorrelation function (ACF) of a univariate time series for lag values from 0 up to a selected maximum lag.
The lag-k autocorrelation is the normalized covariance between observations separated by k periods:
\rho_k = \frac{\operatorname{Cov}(x_t, x_{t-k})}{\operatorname{Var}(x_t)}
Optional confidence intervals and Ljung-Box diagnostics can be included to help assess whether observed autocorrelations are statistically significant.
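To make the definition and the Bartlett intervals concrete, the following sketch computes the sample ACF and approximate intervals using only the standard library. It is an illustration under stated assumptions rather than the statsmodels code path: the helper name `acf_with_bartlett` is hypothetical, the unadjusted (n) denominator is used, and the standard error at lag k is taken as sqrt((1 + 2 * sum of squared autocorrelations at lags below k) / n).

```python
from statistics import NormalDist


def acf_with_bartlett(x, nlags, alpha=0.05):
    """Return (lag, acf, conf_low, conf_high) rows (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    rho = [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(nlags + 1)
    ]

    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal quantile
    rows = [(0, 1.0, 1.0, 1.0)]              # lag 0 is exactly 1 with no uncertainty
    cum = 0.0                                # sum of rho_j^2 for lags j < k
    for k in range(1, nlags + 1):
        se = ((1 + 2 * cum) / n) ** 0.5
        rows.append((k, rho[k], rho[k] - z * se, rho[k] + z * se))
        cum += rho[k] ** 2
    return rows


# Same input as ACF Example 2 in this document.
for row in acf_with_bartlett([2, 1, 2, 1, 2, 1, 2, 1], nlags=4):
    print(row)
```

On this input the lag-1 row comes out near (-0.875, -1.568, -0.182), which agrees with the expected output listed for Example 2. The intervals widen with the lag because each standard error accumulates the squared autocorrelations before it.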
Excel Usage
=ACF(x, adjusted, nlags, qstat, fft, alpha, bartlett_confint, missing)
x (list[list], required): Time-series observations as a 2D range.
adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance normalization.
nlags (int, optional, default: null): Maximum lag to compute; null uses statsmodels default.
qstat (bool, optional, default: false): Return Ljung-Box Q statistics and p-values for lags above zero.
fft (bool, optional, default: true): Use FFT-based computation for improved speed on long series.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.
bartlett_confint (bool, optional, default: true): Use Bartlett formula for confidence interval standard errors.
missing (str, optional, default: "none"): Missing-data handling mode.
Returns (list[list]): 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.
Example 1: ACF for a short increasing series
Inputs:
| x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing |
|---|---|---|---|---|---|---|---|
| {1,2,3,4,5,6} | false | 3 | false | true | | true | none |
Excel formula:
=ACF({1,2,3,4,5,6}, FALSE, 3, FALSE, TRUE, , TRUE, "none")
Expected output:
| lag | acf | conf_low | conf_high | q_stat | p_value |
|---|---|---|---|---|---|
| 0 | 1 | | | | |
| 1 | 0.5 | | | | |
| 2 | 0.0571429 | | | | |
| 3 | -0.271429 | | | | |
Example 2: ACF with confidence intervals
Inputs:
| x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing |
|---|---|---|---|---|---|---|---|
| {2,1,2,1,2,1,2,1} | false | 4 | false | true | 0.05 | true | none |
Excel formula:
=ACF({2,1,2,1,2,1,2,1}, FALSE, 4, FALSE, TRUE, 0.05, TRUE, "none")
Expected output:
| lag | acf | conf_low | conf_high | q_stat | p_value |
|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | | |
| 1 | -0.875 | -1.56795 | -0.182048 | | |
| 2 | 0.75 | -0.35248 | 1.85248 | | |
| 3 | -0.625 | -1.95002 | 0.700016 | | |
| 4 | 0.5 | -0.959729 | 1.95973 | | |
Example 3: ACF with Ljung-Box statistics
Inputs:
| x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing |
|---|---|---|---|---|---|---|---|
| {1,0,1,0,1,0,1,0,1} | false | 4 | true | true | | true | none |
Excel formula:
=ACF({1,0,1,0,1,0,1,0,1}, FALSE, 4, TRUE, TRUE, , TRUE, "none")
Expected output:
| lag | acf | conf_low | conf_high | q_stat | p_value |
|---|---|---|---|---|---|
| 0 | 1 | | | | |
| 1 | -0.888889 | | | 9.77778 | 0.00176634 |
| 2 | 0.772222 | | | 18.2115 | 0.000111023 |
| 3 | -0.666667 | | | 25.5449 | 0.0000118766 |
| 4 | 0.544444 | | | 31.414 | 0.00000252021 |
Example 4: ACF using adjusted denominator
Inputs:
| x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing |
|---|---|---|---|---|---|---|---|
| {3,4,6,8,7,5,4,6} | true | 3 | false | false | | true | none |
Excel formula:
=ACF({3,4,6,8,7,5,4,6}, TRUE, 3, FALSE, FALSE, , TRUE, "none")
Expected output:
| lag | acf | conf_low | conf_high | q_stat | p_value |
|---|---|---|---|---|---|
| 0 | 1 | | | | |
| 1 | 0.423181 | | | | |
| 2 | -0.505241 | | | | |
| 3 | -0.909434 | | | | |
Python Code
import numpy as np
from statsmodels.tsa.stattools import acf as sm_acf

def acf(x, adjusted=False, nlags=None, qstat=False, fft=True, alpha=None, bartlett_confint=True, missing='none'):
    """
    Compute autocorrelation values across lags with optional confidence intervals and Ljung-Box statistics.
    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acf.html
    This example function is provided as-is without any representation of accuracy.
    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance normalization. Default is False.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        qstat (bool, optional): Return Ljung-Box Q statistics and p-values for lags above zero. Default is False.
        fft (bool, optional): Use FFT-based computation for improved speed on long series. Default is True.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.
        bartlett_confint (bool, optional): Use Bartlett formula for confidence interval standard errors. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.
    Returns:
        list[list]: 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"
        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"
        result = sm_acf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            nlags=nlags,
            qstat=qstat,
            fft=fft,
            alpha=alpha,
            bartlett_confint=bartlett_confint,
            missing=missing,
        )
        acf_vals = None
        confint = None
        q_vals = None
        p_vals = None
        if isinstance(result, tuple):
            if len(result) == 4:
                acf_vals, confint, q_vals, p_vals = result
            elif len(result) == 3:
                acf_vals, q_vals, p_vals = result
            elif len(result) == 2:
                acf_vals, confint = result
            else:
                return "Error: Unexpected output format from statsmodels acf"
        else:
            acf_vals = result
        acf_arr = np.asarray(acf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None
        q_arr = np.asarray(q_vals, dtype=float) if q_vals is not None else None
        p_arr = np.asarray(p_vals, dtype=float) if p_vals is not None else None
        table = []
        for lag in range(len(acf_arr)):
            low = ""
            high = ""
            q_stat_val = ""
            p_val = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            if q_arr is not None and lag > 0 and (lag - 1) < len(q_arr):
                q_stat_val = float(q_arr[lag - 1])
            if p_arr is not None and lag > 0 and (lag - 1) < len(p_arr):
                p_val = float(p_arr[lag - 1])
            table.append([lag, float(acf_arr[lag]), low, high, q_stat_val, p_val])
        return table
    except Exception as e:
        return f"Error: {str(e)}"
ACOVF
This function computes the autocovariance function (ACOVF), which describes covariance between a series and lagged versions of itself.
For lag k, autocovariance is defined as:
\gamma_k = \operatorname{Cov}(x_t, x_{t-k})
The output is useful for understanding serial dependence magnitude before normalization into autocorrelation.
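As a rough illustration of the definition, the sample autocovariance can be computed directly with nested sums. This sketch does not reproduce the FFT path or the missing-data handling of statsmodels; the name `autocovariance` is hypothetical, and its `adjusted`, `demean`, and `nlag` arguments are meant only to mirror the options described in this section.

```python
def autocovariance(x, adjusted=False, demean=True, nlag=None):
    """Direct-sum sample autocovariance for lags 0..nlag (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n if demean else 0.0
    dev = [v - mean for v in x]
    max_lag = n - 1 if nlag is None else min(nlag, n - 1)

    out = []
    for k in range(max_lag + 1):
        total = sum(dev[t] * dev[t + k] for t in range(n - k))
        # adjusted=True divides by the number of overlapping pairs (n - k)
        out.append(total / ((n - k) if adjusted else n))
    return out


print(autocovariance([1, 2, 3, 4, 5, 6]))                          # input of Example 1
print(autocovariance([2, 2, 3, 3, 4, 4, 5], demean=False, nlag=3)) # input of Example 3
```

With the unadjusted denominator the estimates shrink toward zero at long lags; the adjusted form removes that shrinkage at the cost of noisier values when only a few pairs overlap.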
Excel Usage
=ACOVF(x, adjusted, demean, fft, missing, nlag)
x (list[list], required): Time-series observations as a 2D range.
adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance estimation.
demean (bool, optional, default: true): Subtract sample mean before covariance estimation.
fft (bool, optional, default: true): Use FFT-based convolution.
missing (str, optional, default: "none"): Missing-data handling mode.
nlag (int, optional, default: null): Maximum lag to return; null returns full available range.
Returns (list[list]): 2D table with columns lag and acovf.
Example 1: Autocovariance with default settings
Inputs:
| x | adjusted | demean | fft | missing | nlag |
|---|---|---|---|---|---|
| {1,2,3,4,5,6} | false | true | true | none | |
Excel formula:
=ACOVF({1,2,3,4,5,6}, FALSE, TRUE, TRUE, "none", )
Expected output:
| lag | acovf |
|---|---|
| 0 | 2.91667 |
| 1 | 1.45833 |
| 2 | 0.166667 |
| 3 | -0.791667 |
| 4 | -1.25 |
| 5 | -1.04167 |
Example 2: Autocovariance with adjusted denominator
Inputs:
| x | adjusted | demean | fft | missing | nlag |
|---|---|---|---|---|---|
| {3,5,4,6,5,7,6} | true | true | true | none | 4 |
Excel formula:
=ACOVF({3,5,4,6,5,7,6}, TRUE, TRUE, TRUE, "none", 4)
Expected output:
| lag | acovf |
|---|---|
| 0 | 1.55102 |
| 1 | 0.115646 |
| 2 | 0.791837 |
| 3 | -0.80102 |
| 4 | -0.312925 |
Example 3: Autocovariance without demeaning
Inputs:
| x | adjusted | demean | fft | missing | nlag |
|---|---|---|---|---|---|
| {2,2,3,3,4,4,5} | false | false | false | none | 3 |
Excel formula:
=ACOVF({2,2,3,3,4,4,5}, FALSE, FALSE, FALSE, "none", 3)
Expected output:
| lag | acovf |
|---|---|
| 0 | 11.8571 |
| 1 | 9.57143 |
| 2 | 8 |
| 3 | 5.85714 |
Example 4: Autocovariance limited to selected lags
Inputs:
| x | adjusted | demean | fft | missing | nlag |
|---|---|---|---|---|---|
| {1,4,2,5,3,6,4,7} | false | true | true | none | 2 |
Excel formula:
=ACOVF({1,4,2,5,3,6,4,7}, FALSE, TRUE, TRUE, "none", 2)
Expected output:
| lag | acovf |
|---|---|
| 0 | 3.5 |
| 1 | -0.625 |
| 2 | 2 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import acovf as sm_acovf

def acovf(x, adjusted=False, demean=True, fft=True, missing='none', nlag=None):
    """
    Estimate autocovariance values of a time series across lags.
    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acovf.html
    This example function is provided as-is without any representation of accuracy.
    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is False.
        demean (bool, optional): Subtract sample mean before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.
        nlag (int, optional): Maximum lag to return; null returns full available range. Default is None.
    Returns:
        list[list]: 2D table with columns lag and acovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"
        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"
        acovf_vals = sm_acovf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
            missing=missing,
            nlag=nlag,
        )
        arr = np.asarray(acovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"
ADFULLER
This function applies the Augmented Dickey-Fuller (ADF) test to evaluate whether a univariate time series contains a unit root.
The null hypothesis is that the series is non-stationary with a unit root, while the alternative is stationarity.
It returns the test statistic, p-value, lag usage, sample size, and critical values used for interpretation.
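One way to read the returned key-value table is to compare the test statistic with a critical value: the unit-root null is rejected only when the statistic is more negative than the chosen critical value. The helper below is a hypothetical convenience, not part of statsmodels; it assumes the row labels `adf_stat` and `critical_5%` used in the expected outputs of this section.

```python
def adf_decision(rows, level="5%"):
    """Interpret a key-value ADF result table (hypothetical helper)."""
    table = dict(rows)  # each row is a [key, value] pair
    stat = table["adf_stat"]
    crit = table[f"critical_{level}"]
    # The ADF test is left-tailed: reject only if the statistic is below the critical value.
    if stat < crit:
        return "reject unit root"
    return "fail to reject unit root"


# Values copied from the expected output of ADFULLER Example 1 in this document.
example_rows = [
    ["adf_stat", -2.1749],
    ["p_value", 0.21548],
    ["usedlag", 3],
    ["nobs", 6],
    ["critical_1%", -5.35426],
    ["critical_5%", -3.64624],
    ["critical_10%", -2.9012],
    ["icbest", -37.0965],
]
print(adf_decision(example_rows))
```

Here -2.1749 is not below -3.64624, so the unit-root null is not rejected at the 5% level, which matches the p-value of about 0.215 in the same table.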
Excel Usage
=ADFULLER(x, maxlag, regression, autolag, store, regresults)
x (list[list], required): Time-series observations as a 2D range.
maxlag (int, optional, default: null): Maximum lag included in the test regression; null uses default rule.
regression (str, optional, default: "c"): Deterministic terms included in the test regression (c, ct, ctt, or n).
autolag (str, optional, default: "AIC"): Automatic lag-selection criterion (AIC, BIC, t-stat, or none).
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.
regresults (bool, optional, default: false): Return full regression results when available.
Returns (list[list]): 2D key-value table summarizing ADF statistics and critical values.
Example 1: ADF test with default regression and autolag
Inputs:
| x | maxlag | regression | autolag | store | regresults |
|---|---|---|---|---|---|
| {1,1.2,1.1,1.3,1.25,1.35,1.3,1.4,1.38,1.45} | | c | AIC | false | false |
Excel formula:
=ADFULLER({1,1.2,1.1,1.3,1.25,1.35,1.3,1.4,1.38,1.45}, , "c", "AIC", FALSE, FALSE)
Expected output:
| key | value |
|---|---|
| adf_stat | -2.1749 |
| p_value | 0.21548 |
| usedlag | 3 |
| nobs | 6 |
| critical_1% | -5.35426 |
| critical_5% | -3.64624 |
| critical_10% | -2.9012 |
| icbest | -37.0965 |
Example 2: ADF test with constant and trend regression
Inputs:
| x | maxlag | regression | autolag | store | regresults |
|---|---|---|---|---|---|
| {2,2.1,2.05,2.2,2.15,2.25,2.3,2.35,2.32,2.4} | 2 | ct | BIC | false | false |
Excel formula:
=ADFULLER({2,2.1,2.05,2.2,2.15,2.25,2.3,2.35,2.32,2.4}, 2, "ct", "BIC", FALSE, FALSE)
Expected output:
| key | value |
|---|---|
| adf_stat | -5.66528 |
| p_value | 0.0000107685 |
| usedlag | 0 |
| nobs | 9 |
| critical_1% | -5.49966 |
| critical_5% | -4.07211 |
| critical_10% | -3.4935 |
| icbest | -26.0036 |
Example 3: ADF test using fixed lag count
Inputs:
| x | maxlag | regression | autolag | store | regresults |
|---|---|---|---|---|---|
| {3,2.9,3.1,3,3.2,3.1,3.3,3.25,3.35,3.3} | 1 | c | none | false | false |
Excel formula:
=ADFULLER({3,2.9,3.1,3,3.2,3.1,3.3,3.25,3.35,3.3}, 1, "c", "none", FALSE, FALSE)
Expected output:
| key | value |
|---|---|
| adf_stat | -1.29158 |
| p_value | 0.633016 |
| usedlag | 1 |
| nobs | 8 |
| critical_1% | -4.66519 |
| critical_5% | -3.36719 |
| critical_10% | -2.80296 |
Example 4: ADF test without deterministic terms
Inputs:
| x | maxlag | regression | autolag | store | regresults |
|---|---|---|---|---|---|
| {1,0.95,1.02,0.98,1.01,0.99,1.03,1,1.04,1.01} | 1 | n | AIC | false | false |
Excel formula:
=ADFULLER({1,0.95,1.02,0.98,1.01,0.99,1.03,1,1.04,1.01}, 1, "n", "AIC", FALSE, FALSE)
Expected output:
| key | value |
|---|---|
| adf_stat | 2.85276 |
| p_value | 0.999591 |
| usedlag | 1 |
| nobs | 8 |
| critical_1% | -2.90189 |
| critical_5% | -1.96617 |
| critical_10% | -1.57649 |
| icbest | -46.7118 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import adfuller as sm_adfuller

def adfuller(x, maxlag=None, regression='c', autolag='AIC', store=False, regresults=False):
    """
    Run the Augmented Dickey-Fuller unit-root test for stationarity diagnostics.
    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.adfuller.html
    This example function is provided as-is without any representation of accuracy.
    Args:
        x (list[list]): Time-series observations as a 2D range.
        maxlag (int, optional): Maximum lag included in the test regression; null uses default rule. Default is None.
        regression (str, optional): Deterministic terms included in the test regression (c, ct, ctt, or n). Default is 'c'.
        autolag (str, optional): Automatic lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.
        regresults (bool, optional): Return full regression results when available. Default is False.
    Returns:
        list[list]: 2D key-value table summarizing ADF statistics and critical values.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct", "ctt", "n"):
            return "Error: regression must be one of 'c', 'ct', 'ctt', or 'n'"
        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"
        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"
        result = sm_adfuller(
            np.asarray(series, dtype=float),
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
            store=store,
            regresults=regresults,
        )
        adf_stat = float(result[0])
        p_value = float(result[1])
        usedlag = int(result[2])
        nobs_used = int(result[3])
        crit_values = result[4]
        rows = [
            ["adf_stat", adf_stat],
            ["p_value", p_value],
            ["usedlag", usedlag],
            ["nobs", nobs_used],
        ]
        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])
        if len(result) > 5 and isinstance(result[5], (int, float, np.floating)):
            rows.append(["icbest", float(result[5])])
        return rows
    except Exception as e:
        return f"Error: {str(e)}"
CCF
This function estimates the cross-correlation function (CCF) between two univariate time series, reporting how strongly one series is linearly related to lagged values of another.
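For lag k, the cross-correlation compares x_{t+k} with y_t, dividing the lag-k cross-covariance by the product of the two series' standard deviations. With the adjusted (n-k) denominator, estimates at long lags rest on few overlapping pairs and can fall outside [-1, 1], as the last row of Example 1 shows.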
Optional confidence intervals can be returned to assess whether cross-correlations differ materially from zero.
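The lag convention is easy to get backwards, so a direct-sum sketch may help. It pairs x_{t+k} with y_t as described above and divides by the product of the population standard deviations. The function name `cross_correlation` is hypothetical, and the code is an illustration that skips the FFT path and confidence intervals.

```python
def cross_correlation(x, y, adjusted=True, nlags=None):
    """Cross-correlation of x_{t+k} with y_t for lags 0..nlags-1 (hypothetical helper)."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xd = [v - mean_x for v in x]
    yd = [v - mean_y for v in y]
    # Population standard deviations (denominator n)
    sx = (sum(d * d for d in xd) / n) ** 0.5
    sy = (sum(d * d for d in yd) / n) ** 0.5

    count = n if nlags is None else min(nlags, n)
    out = []
    for k in range(count):
        total = sum(xd[t + k] * yd[t] for t in range(n - k))
        cov = total / ((n - k) if adjusted else n)
        out.append(cov / (sx * sy))
    return out


# Same inputs as CCF Example 1 in this document.
r = cross_correlation([1, 2, 3, 4, 5, 6], [1, 1, 2, 3, 5, 8], nlags=5)
print(r)
```

On these inputs the first two values come out near 0.939 and 0.360, in line with the expected output of Example 1. Swapping the roles of x and y in the inner sum would give a different lag-1 value, which is a quick way to check which direction a given implementation uses.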
Excel Usage
=CCF(x, y, adjusted, fft, nlags, alpha)
x (list[list], required): First time series as a 2D range.
y (list[list], required): Second time series as a 2D range.
adjusted (bool, optional, default: true): Use denominator n-k instead of n for normalization.
fft (bool, optional, default: true): Use FFT-based convolution.
nlags (int, optional, default: null): Number of lags to compute; null uses statsmodels default.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.
Returns (list[list]): 2D table with columns lag, ccf, conf_low, and conf_high.
Example 1: Cross-correlation with default options
Inputs:
| x | y | adjusted | fft | nlags | alpha |
|---|---|---|---|---|---|
| {1,2,3,4,5,6} | {1,1,2,3,5,8} | true | true | 5 | |
Excel formula:
=CCF({1,2,3,4,5,6}, {1,1,2,3,5,8}, TRUE, TRUE, 5, )
Expected output:
| lag | ccf | conf_low | conf_high |
|---|---|---|---|
| 0 | 0.938953 | | |
| 1 | 0.359932 | | |
| 2 | -0.166273 | | |
| 3 | -0.625969 | | |
| 4 | -1.09545 | | |
Example 2: Cross-correlation with confidence intervals
Inputs:
| x | y | adjusted | fft | nlags | alpha |
|---|---|---|---|---|---|
| {2,1,2,1,2,1,2,1} | {1,2,1,2,1,2,1,2} | true | true | 6 | 0.05 |
Excel formula:
=CCF({2,1,2,1,2,1,2,1}, {1,2,1,2,1,2,1,2}, TRUE, TRUE, 6, 0.05)
Expected output:
| lag | ccf | conf_low | conf_high |
|---|---|---|---|
| 0 | -1 | -1.69295 | -0.307048 |
| 1 | 1 | 0.307048 | 1.69295 |
| 2 | -1 | -1.69295 | -0.307048 |
| 3 | 1 | 0.307048 | 1.69295 |
| 4 | -1 | -1.69295 | -0.307048 |
| 5 | 1 | 0.307048 | 1.69295 |
Example 3: Cross-correlation without FFT
Inputs:
| x | y | adjusted | fft | nlags | alpha |
|---|---|---|---|---|---|
| {3,5,4,6,5,7,6} | {2,4,3,5,4,6,5} | false | false | 5 | |
Excel formula:
=CCF({3,5,4,6,5,7,6}, {2,4,3,5,4,6,5}, FALSE, FALSE, 5, )
Expected output:
| lag | ccf | conf_low | conf_high |
|---|---|---|---|
| 0 | 1 | | |
| 1 | 0.0639098 | | |
| 2 | 0.364662 | | |
| 3 | -0.295113 | | |
| 4 | -0.0864662 | | |
Example 4: Cross-correlation using a small lag count
Inputs:
| x | y | adjusted | fft | nlags | alpha |
|---|---|---|---|---|---|
| {1,3,2,4,3,5,4} | {2,1,3,2,4,3,5} | true | true | 3 | |
Excel formula:
=CCF({1,3,2,4,3,5,4}, {2,1,3,2,4,3,5}, TRUE, TRUE, 3, )
Expected output:
| lag | ccf | conf_low | conf_high |
|---|---|---|---|
| 0 | 0.289474 | | |
| 1 | 0.508772 | | |
| 2 | -0.160526 | | |
Python Code
import numpy as np
from statsmodels.tsa.stattools import ccf as sm_ccf

def ccf(x, y, adjusted=True, fft=True, nlags=None, alpha=None):
    """
    Compute cross-correlation between two time series across nonnegative lags.
    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccf.html
    This example function is provided as-is without any representation of accuracy.
    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n for normalization. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        nlags (int, optional): Number of lags to compute; null uses statsmodels default. Default is None.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.
    Returns:
        list[list]: 2D table with columns lag, ccf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)
        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"
        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"
        result = sm_ccf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            fft=fft,
            nlags=nlags,
            alpha=alpha,
        )
        if isinstance(result, tuple):
            ccf_vals, confint = result
        else:
            ccf_vals = result
            confint = None
        ccf_arr = np.asarray(ccf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None
        table = []
        for lag in range(len(ccf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(ccf_arr[lag]), low, high])
        return table
    except Exception as e:
        return f"Error: {str(e)}"
CCOVF
This function computes the cross-covariance function between two univariate series, quantifying linear co-movement across lag offsets.
For lag k, the cross-covariance is:
\gamma_{xy}(k) = \operatorname{Cov}(x_{t+k}, y_t)
The result helps identify lagged lead-lag structure before normalization into cross-correlation.
Excel Usage
=CCOVF(x, y, adjusted, demean, fft)
x (list[list], required): First time series as a 2D range.
y (list[list], required): Second time series as a 2D range.
adjusted (bool, optional, default: true): Use denominator n-k instead of n in covariance estimation.
demean (bool, optional, default: true): Subtract sample means from both series before covariance estimation.
fft (bool, optional, default: true): Use FFT-based convolution.
Returns (list[list]): 2D table with columns lag and ccovf.
Example 1: Cross-covariance with default options
Inputs:
| x | y | adjusted | demean | fft |
|---|---|---|---|---|
| {1,2,3,4,5,6} | {2,3,4,5,6,7} | true | true | true |
Excel formula:
=CCOVF({1,2,3,4,5,6}, {2,3,4,5,6,7}, TRUE, TRUE, TRUE)
Expected output:
| lag | ccovf |
|---|---|
| 0 | 2.91667 |
| 1 | 1.75 |
| 2 | 0.25 |
| 3 | -1.58333 |
| 4 | -3.75 |
| 5 | -6.25 |
Example 2: Cross-covariance without demeaning
Inputs:
| x | y | adjusted | demean | fft |
|---|---|---|---|---|
| {3,5,4,6,5,7} | {1,2,2,3,3,4} | true | false | true |
Excel formula:
=CCOVF({3,5,4,6,5,7}, {1,2,2,3,3,4}, TRUE, FALSE, TRUE)
Expected output:
| lag | ccovf |
|---|---|
| 0 | 13.6667 |
| 1 | 12.2 |
| 2 | 11.75 |
| 3 | 10 |
| 4 | 9.5 |
| 5 | 7 |
Example 3: Cross-covariance without FFT
Inputs:
| x | y | adjusted | demean | fft |
|---|---|---|---|---|
| {2,4,3,5,4,6,5} | {1,3,2,4,3,5,4} | false | true | false |
Excel formula:
=CCOVF({2,4,3,5,4,6,5}, {1,3,2,4,3,5,4}, FALSE, TRUE, FALSE)
Expected output:
| lag | ccovf |
|---|---|
| 0 | 1.55102 |
| 1 | 0.0991254 |
| 2 | 0.565598 |
| 3 | -0.457726 |
| 4 | -0.134111 |
| 5 | -0.586006 |
| 6 | -0.262391 |
Example 4: Cross-covariance with unadjusted denominator
Inputs:
| x | y | adjusted | demean | fft |
|---|---|---|---|---|
| {1,3,2,4,3,5,4,6} | {2,1,3,2,4,3,5,4} | false | true | true |
Excel formula:
=CCOVF({1,3,2,4,3,5,4,6}, {2,1,3,2,4,3,5,4}, FALSE, TRUE, TRUE)
Expected output:
| lag | ccovf |
|---|---|
| 0 | 0.75 |
| 1 | 1.3125 |
| 2 | -0.0625 |
| 3 | 0.3125 |
| 4 | -0.625 |
| 5 | -0.3125 |
| 6 | -0.6875 |
| 7 | -0.3125 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import ccovf as sm_ccovf

def ccovf(x, y, adjusted=True, demean=True, fft=True):
    """
    Estimate cross-covariance values between two time series across lags.
    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccovf.html
    This example function is provided as-is without any representation of accuracy.
    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is True.
        demean (bool, optional): Subtract sample means from both series before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
    Returns:
        list[list]: 2D table with columns lag and ccovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)
        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"
        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"
        ccovf_vals = sm_ccovf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
        )
        arr = np.asarray(ccovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"
KPSS
This function applies the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for stationarity.
Unlike the ADF test, the KPSS null hypothesis is stationarity (around a level or trend), and the alternative is a unit root.
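The output includes the KPSS statistic, p-value, selected lag truncation, and reference critical values. The p-value is interpolated from the tabulated critical values, so it is reported only within the range 0.01 to 0.1; a value at either boundary means the true p-value lies at or beyond that bound.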
Excel Usage
=KPSS(x, regression, nlags, store)
x (list[list], required): Time-series observations as a 2D range.
regression (str, optional, default: "c"): Null hypothesis type, c for level-stationary or ct for trend-stationary.
nlags (str, optional, default: "auto"): Lag selection mode (auto or legacy) or an integer provided as text.
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.
Returns (list[list]): 2D key-value table summarizing KPSS statistics and critical values.
Example 1: KPSS with automatic lag selection and level stationarity null
Inputs:
| x | regression | nlags | store |
|---|---|---|---|
| {1,1.1,1.05,1.08,1.03,1.09,1.04,1.1,1.06,1.11} | c | auto | false |
Excel formula:
=KPSS({1,1.1,1.05,1.08,1.03,1.09,1.04,1.1,1.06,1.11}, "c", "auto", FALSE)
Expected output:
| key | value |
|---|---|
| kpss_stat | 0.5 |
| p_value | 0.0416667 |
| lags | 9 |
| critical_10% | 0.347 |
| critical_5% | 0.463 |
| critical_2.5% | 0.574 |
| critical_1% | 0.739 |
Example 2: KPSS with legacy lag rule and trend stationarity null
Inputs:
| x | regression | nlags | store |
|---|---|---|---|
| {2,2.2,2.35,2.5,2.65,2.8,2.95,3.1,3.25,3.4} | ct | legacy | false |
Excel formula:
=KPSS({2,2.2,2.35,2.5,2.65,2.8,2.95,3.1,3.25,3.4}, "ct", "legacy", FALSE)
Expected output:
| key | value |
|---|---|
| kpss_stat | 0.358798 |
| p_value | 0.01 |
| lags | 7 |
| critical_10% | 0.119 |
| critical_5% | 0.146 |
| critical_2.5% | 0.176 |
| critical_1% | 0.216 |
Example 3: KPSS with manually provided lag count
Inputs:
| x | regression | nlags | store |
|---|---|---|---|
| {3,3.1,3,3.15,3.05,3.2,3.1,3.25,3.15,3.3} | c | 2 | false |
Excel formula:
=KPSS({3,3.1,3,3.15,3.05,3.2,3.1,3.25,3.15,3.3}, "c", 2, FALSE)
Expected output:
| key | value |
|---|---|
| kpss_stat | 0.440047 |
| p_value | 0.0598935 |
| lags | 2 |
| critical_10% | 0.347 |
| critical_5% | 0.463 |
| critical_2.5% | 0.574 |
| critical_1% | 0.739 |
Example 4: KPSS on low-variance stationary-looking series
Inputs:
| Parameter | Value |
|---|---|
| x | 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02 |
| regression | c |
| nlags | auto |
| store | false |
Excel formula:
=KPSS({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02}, "c", "auto", FALSE)
Expected output:
| Result | |
|---|---|
| kpss_stat | 0.326271 |
| p_value | 0.1 |
| lags | 6 |
| critical_10% | 0.347 |
| critical_5% | 0.463 |
| critical_2.5% | 0.574 |
| critical_1% | 0.739 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import kpss as sm_kpss


def kpss(x, regression='c', nlags='auto', store=False):
    """
    Run the KPSS stationarity test under level or trend null hypotheses.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.kpss.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        regression (str, optional): Null hypothesis type, c for level-stationary or ct for trend-stationary. Default is 'c'.
        nlags (str, optional): Lag selection mode (auto or legacy) or an integer provided as text. Default is 'auto'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing KPSS statistics and critical values.
    """
    try:
        def to1d(values):
            # Flatten a scalar, list, or 2D range and skip non-numeric cells.
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct"):
            return "Error: regression must be 'c' or 'ct'"

        if nlags in ("auto", "legacy"):
            nlags_arg = nlags
        else:
            try:
                parsed = int(float(nlags))
            except (TypeError, ValueError):
                return "Error: nlags must be 'auto', 'legacy', or an integer"
            if parsed < 0:
                return "Error: nlags integer must be nonnegative"
            nlags_arg = parsed

        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"

        result = sm_kpss(
            np.asarray(series, dtype=float),
            regression=regression,
            nlags=nlags_arg,
            store=store,
        )
        kpss_stat = float(result[0])
        p_value = float(result[1])
        if store:
            # With store=True the tuple is (stat, p-value, critical values, results store),
            # and the lag count is carried on the results-store object.
            crit_values = result[2]
            used_lags = int(result[3].lags)
        else:
            used_lags = int(result[2])
            crit_values = result[3]

        rows = [
            ["kpss_stat", kpss_stat],
            ["p_value", p_value],
            ["lags", used_lags],
        ]
        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])
        return rows
    except Exception as e:
        return f"Error: {str(e)}"
PACF
This function estimates the partial autocorrelation function (PACF), which measures the direct correlation between x_t and x_{t-k} after controlling for intermediate lags.
In autoregressive model identification, PACF helps indicate candidate model order by highlighting lags with substantial direct dependence.
The PACF at lag k can be interpreted as the final coefficient in an AR(k) regression.
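The "final coefficient of an AR(k) fit" reading can be sketched with the Durbin-Levinson recursion, which solves the Yule-Walker equations for successive orders and keeps the last coefficient at each step. This is a minimal pure-Python illustration, not the statsmodels code: the function names `sample_acf` and `pacf_durbin_levinson` are hypothetical, and the use of biased sample autocorrelations is an assumption chosen because it reproduces the ldbiased values shown in Example 4.

```python
def sample_acf(x, nlags):
    """Sample autocorrelations rho_0..rho_nlags around the sample mean."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(nlags + 1)
    ]


def pacf_durbin_levinson(x, nlags):
    """PACF via Durbin-Levinson: the last AR(k) coefficient for k = 1..nlags."""
    rho = sample_acf(x, nlags)
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit, phi_prev[j] = phi_{k-1, j+1}
    for k in range(1, nlags + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        pacf.append(phi_kk)
    return pacf


# Input of Example 4 (method "ldbiased", nlags 3).
values = pacf_durbin_levinson([3, 2, 4, 3, 5, 4, 6, 5], nlags=3)
print(values)
```

Other estimators listed under `method` (OLS, Burg, adjusted Yule-Walker) generally give different numbers on short series, which is why the examples can differ noticeably from one method to the next.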
Excel Usage
=PACF(x, nlags, method, alpha)
x (list[list], required): Time-series observations as a 2D range.
nlags (int, optional, default: null): Maximum lag to compute; null uses statsmodels default.
method (str, optional, default: "ywadjusted"): Estimation method for PACF.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.
Returns (list[list]): 2D table with columns lag, pacf, conf_low, and conf_high.
Example 1: PACF using Yule-Walker adjusted method
Inputs:
| Parameter | Value |
|---|---|
| x | 1, 2, 3, 4, 5, 4, 3, 2 |
| nlags | 4 |
| method | ywadjusted |
| alpha | |
Excel formula:
=PACF({1,2,3,4,5,4,3,2}, 4, "ywadjusted", )
Expected output:
| lag | pacf | conf_low | conf_high |
|---|---|---|---|
| 0 | 1 | ||
| 1 | 0.571429 | ||
| 2 | -0.649832 | ||
| 3 | -0.832527 | ||
| 4 | -1.57327 |
Example 2: PACF with confidence intervals
Inputs:
| Parameter | Value |
|---|---|
| x | 2, 1, 2, 1, 2, 1, 2, 1, 2 |
| nlags | 4 |
| method | burg |
| alpha | 0.05 |
Excel formula:
=PACF({2,1,2,1,2,1,2,1,2}, 4, "burg", 0.05)
Expected output:
| lag | pacf | conf_low | conf_high |
|---|---|---|---|
| 0 | 1 | 1 | 1 |
| 1 | -0.97561 | -1.62893 | -0.322288 |
| 2 | 1 | 0.346679 | 1.65332 |
| 3 | 3.14684e-16 | -0.653321 | 0.653321 |
| 4 | 2.5924e-16 | -0.653321 | 0.653321 |
Example 3: PACF using OLS adjusted estimator
Inputs:
| Parameter | Value |
|---|---|
| x | 5, 4, 6, 5, 7, 6, 8, 7, 9 |
| nlags | 3 |
| method | ols-adjusted |
| alpha | |
Excel formula:
=PACF({5,4,6,5,7,6,8,7,9}, 3, "ols-adjusted", )
Expected output:
| lag | pacf | conf_low | conf_high |
|---|---|---|---|
| 0 | 1 | ||
| 1 | 0.5625 | ||
| 2 | 1.28571 | ||
| 3 | -0.5 |
Example 4: PACF using Levinson-Durbin biased estimator
Inputs:
| Parameter | Value |
|---|---|
| x | 3, 2, 4, 3, 5, 4, 6, 5 |
| nlags | 3 |
| method | ldbiased |
| alpha | |
Excel formula:
=PACF({3,2,4,3,5,4,6,5}, 3, "ldbiased", )
Expected output:
| lag | pacf | conf_low | conf_high |
|---|---|---|---|
| 0 | 1 | ||
| 1 | 0.25 | ||
| 2 | 0.288889 | ||
| 3 | -0.346983 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import pacf as sm_pacf


def pacf(x, nlags=None, method='ywadjusted', alpha=None):
    """
    Compute partial autocorrelation values across lags for lag-order diagnostics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.pacf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        method (str, optional): Estimation method for PACF. Default is 'ywadjusted'.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.

    Returns:
        list[list]: 2D table with columns lag, pacf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        valid_methods = (
            "yw", "ywadjusted", "ols", "ols-inefficient", "ols-adjusted",
            "ywm", "ywmle", "ld", "ldadjusted", "ldb", "ldbiased", "burg"
        )
        if method not in valid_methods:
            return "Error: method is not a supported PACF estimator"

        series = to1d(x)
        if len(series) < 3:
            return "Error: x must contain at least three numeric values"

        result = sm_pacf(np.asarray(series, dtype=float), nlags=nlags, method=method, alpha=alpha)
        if isinstance(result, tuple):
            pacf_vals, confint = result
        else:
            pacf_vals = result
            confint = None

        pacf_arr = np.asarray(pacf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None

        table = []
        for lag in range(len(pacf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(pacf_arr[lag]), low, high])
        return table
    except Exception as e:
        return f"Error: {str(e)}"
Q_STAT
This function computes the Ljung-Box portmanteau statistic from a sequence of autocorrelation estimates.
The statistic aggregates squared autocorrelations across lags to test whether serial correlation remains in a process:
Q = n(n+2) \sum_{k=1}^{m} \frac{\rho_k^2}{n-k}
It returns the cumulative Q-statistic and associated p-value at each lag.
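The formula can be evaluated directly without any library, which is a useful cross-check on the tabulated examples. The sketch below is illustrative only: `ljung_box` and `chi2_sf` are hypothetical helper names, the degrees of freedom are assumed to equal the lag index (no adjustment for fitted model parameters), and the chi-square tail probability uses the standard closed form for integer degrees of freedom instead of a statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of half-integer gamma terms.
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def ljung_box(acf_values, nobs):
    """Cumulative Ljung-Box Q and p-value for lags 1..m (lag zero excluded)."""
    rows, running = [], 0.0
    for k, rho in enumerate(acf_values, start=1):
        running += rho * rho / (nobs - k)
        q = nobs * (nobs + 2) * running
        rows.append((k, q, chi2_sf(q, k)))
    return rows


# Input of Example 1.
rows = ljung_box([0.2, 0.1, 0.05], nobs=60)
for lag, q, p in rows:
    print(lag, round(q, 5), round(p, 6))
```

When the autocorrelations come from model residuals, the degrees of freedom are often reduced by the number of estimated parameters; that adjustment is outside what this sketch covers.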
Excel Usage
=Q_STAT(x, nobs)
x (list[list], required): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
nobs (int, required): Total number of observations in the underlying sample.
Returns (list[list]): 2D table with columns lag, q_stat, and p_value.
Example 1: Ljung-Box statistics from short ACF sequence
Inputs:
| Parameter | Value |
|---|---|
| x | 0.2, 0.1, 0.05 |
| nobs | 60 |
Excel formula:
=Q_STAT({0.2,0.1,0.05}, 60)
Expected output:
| lag | q_stat | p_value |
|---|---|---|
| 1 | 2.52203 | 0.112266 |
| 2 | 3.16341 | 0.205624 |
| 3 | 3.32657 | 0.343962 |
Example 2: Ljung-Box statistics from moderate ACF sequence
Inputs:
| Parameter | Value |
|---|---|
| x | 0.35, 0.22, 0.11, 0.04 |
| nobs | 120 |
Excel formula:
=Q_STAT({0.35,0.22,0.11,0.04}, 120)
Expected output:
| lag | q_stat | p_value |
|---|---|---|
| 1 | 15.0706 | 0.000103564 |
| 2 | 21.0755 | 0.0000265167 |
| 3 | 22.5895 | 0.0000491731 |
| 4 | 22.7915 | 0.000139371 |
Example 3: Ljung-Box statistics with alternating autocorrelation signs
Inputs:
| Parameter | Value |
|---|---|
| x | 0.25, -0.15, 0.1, -0.05 |
| nobs | 80 |
Excel formula:
=Q_STAT({0.25,-0.15,0.1,-0.05}, 80)
Expected output:
| lag | q_stat | p_value |
|---|---|---|
| 1 | 5.18987 | 0.0227189 |
| 2 | 7.08218 | 0.0289817 |
| 3 | 7.93413 | 0.0473929 |
| 4 | 8.14992 | 0.0862383 |
Example 4: Ljung-Box statistics from five-lag autocorrelation input
Inputs:
| Parameter | Value |
|---|---|
| x | 0.18, 0.12, 0.09, 0.04, 0.02 |
| nobs | 150 |
Excel formula:
=Q_STAT({0.18,0.12,0.09,0.04,0.02}, 150)
Expected output:
| lag | q_stat | p_value |
|---|---|---|
| 1 | 4.95785 | 0.0259724 |
| 2 | 7.17623 | 0.0276504 |
| 3 | 8.43256 | 0.0378689 |
| 4 | 8.68242 | 0.0695466 |
| 5 | 8.74532 | 0.119664 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import q_stat as sm_q_stat


def q_stat(x, nobs):
    """
    Compute Ljung-Box Q statistics and p-values from autocorrelation coefficients.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.q_stat.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
        nobs (int): Total number of observations in the underlying sample.

    Returns:
        list[list]: 2D table with columns lag, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if nobs <= 1:
            return "Error: nobs must be greater than 1"

        acf_vals = to1d(x)
        if len(acf_vals) == 0:
            return "Error: x must contain at least one numeric autocorrelation value"

        q_vals, p_vals = sm_q_stat(np.asarray(acf_vals, dtype=float), nobs=nobs)
        q_arr = np.asarray(q_vals, dtype=float)
        p_arr = np.asarray(p_vals, dtype=float)
        return [[lag + 1, float(q_arr[lag]), float(p_arr[lag])] for lag in range(len(q_arr))]
    except Exception as e:
        return f"Error: {str(e)}"
RURTEST
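This function applies the Range Unit-Root (RUR) test, a range-based diagnostic built on how often a series sets a new running maximum or minimum. The statsmodels documentation describes the null hypothesis as stationarity; in the examples below, however, the stationary-looking series yields the small p-value (0.05) and the trending series yield 0.95, so confirm the direction of the test before interpreting a result.
It was proposed as an alternative unit-root diagnostic that is robust to nonlinearities, error distributions, structural breaks, and outliers.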
The output includes the RUR statistic, p-value, and critical values used for interpretation.
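As a rough illustration of the range-based statistic, the sketch below counts how many observations extend the running range (a new maximum or a new minimum) and divides by the square root of the sample size. This is an assumption about the computation, not a copy of the library code, and `rur_statistic` is a hypothetical name; it does reproduce the rur_stat values reported in Examples 2 and 4.

```python
import math


def rur_statistic(x):
    """Number of range expansions after the first observation, scaled by sqrt(n)."""
    count = 0
    lo = hi = x[0]
    for v in x[1:]:
        if v > hi:        # new running maximum
            hi = v
            count += 1
        elif v < lo:      # new running minimum
            lo = v
            count += 1
    return count / math.sqrt(len(x))


# Input of Example 2: values hover around 5, so the range rarely expands.
flat = [5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02,
        4.99, 5.01, 5, 5.03, 4.97, 5, 5.01, 4.99, 5.02, 5,
        4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5]

# Input of Example 4: strictly increasing, so every step sets a new maximum.
growth = [3, 3.05, 3.1, 3.16, 3.2, 3.25, 3.31, 3.36, 3.4, 3.45,
          3.5, 3.55, 3.6, 3.66, 3.7, 3.75, 3.81, 3.86, 3.9, 3.95,
          4, 4.05, 4.1, 4.16, 4.2, 4.25, 4.31, 4.36, 4.4, 4.45]

flat_stat = rur_statistic(flat)
growth_stat = rur_statistic(growth)
print(flat_stat, growth_stat)
```

A series that keeps breaking its own records produces a large statistic, while one that stays inside an early-established band produces a small one, which is the contrast the examples illustrate.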
Excel Usage
=RURTEST(x, store)
x (list[list], required): Time-series observations as a 2D range.
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.
Returns (list[list]): 2D key-value table summarizing RUR test outputs.
Example 1: Range unit-root test on mildly trending data
Inputs:
| Parameter | Value |
|---|---|
| x | 1, 1.05, 1.02, 1.08, 1.06, 1.1, 1.09, 1.12, 1.1, 1.14, 1.16, 1.18, 1.2, 1.19, 1.22, 1.24, 1.23, 1.27, 1.29, 1.31, 1.3, 1.33, 1.36, 1.35, 1.38, 1.4, 1.42, 1.41, 1.44, 1.46 |
| store | false |
Excel formula:
=RURTEST({1,1.05,1.02,1.08,1.06,1.1,1.09,1.12,1.1,1.14,1.16,1.18,1.2,1.19,1.22,1.24,1.23,1.27,1.29,1.31,1.3,1.33,1.36,1.35,1.38,1.4,1.42,1.41,1.44,1.46}, FALSE)
Expected output:
| Result | |
|---|---|
| rur_stat | 3.65148 |
| p_value | 0.95 |
| critical_10% | 1.09624 |
| critical_5% | 0.94492 |
| critical_2.5% | 0.83556 |
| critical_1% | 0.68962 |
Example 2: Range unit-root test on level-stationary-looking data
Inputs:
| Parameter | Value |
|---|---|
| x | 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.97, 5, 5.01, 4.99, 5.02, 5, 4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5 |
| store | false |
Excel formula:
=RURTEST({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.97,5,5.01,4.99,5.02,5,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.98,5}, FALSE)
Expected output:
| Result | |
|---|---|
| rur_stat | 0.912871 |
| p_value | 0.05 |
| critical_10% | 1.09624 |
| critical_5% | 0.94492 |
| critical_2.5% | 0.83556 |
| critical_1% | 0.68962 |
Example 3: Range unit-root test on higher variance observations
Inputs:
| Parameter | Value |
|---|---|
| x | 2, 2.4, 2.1, 2.5, 2.2, 2.6, 2.3, 2.7, 2.4, 2.8, 2.35, 2.9, 2.45, 2.95, 2.5, 3, 2.55, 3.05, 2.6, 3.1, 2.65, 3.15, 2.7, 3.2, 2.75, 3.25, 2.8, 3.3, 2.85, 3.35 |
| store | false |
Excel formula:
=RURTEST({2,2.4,2.1,2.5,2.2,2.6,2.3,2.7,2.4,2.8,2.35,2.9,2.45,2.95,2.5,3,2.55,3.05,2.6,3.1,2.65,3.15,2.7,3.2,2.75,3.25,2.8,3.3,2.85,3.35}, FALSE)
Expected output:
| Result | |
|---|---|
| rur_stat | 2.73861 |
| p_value | 0.95 |
| critical_10% | 1.09624 |
| critical_5% | 0.94492 |
| critical_2.5% | 0.83556 |
| critical_1% | 0.68962 |
Example 4: Range unit-root test on smooth growth sequence
Inputs:
| Parameter | Value |
|---|---|
| x | 3, 3.05, 3.1, 3.16, 3.2, 3.25, 3.31, 3.36, 3.4, 3.45, 3.5, 3.55, 3.6, 3.66, 3.7, 3.75, 3.81, 3.86, 3.9, 3.95, 4, 4.05, 4.1, 4.16, 4.2, 4.25, 4.31, 4.36, 4.4, 4.45 |
| store | false |
Excel formula:
=RURTEST({3,3.05,3.1,3.16,3.2,3.25,3.31,3.36,3.4,3.45,3.5,3.55,3.6,3.66,3.7,3.75,3.81,3.86,3.9,3.95,4,4.05,4.1,4.16,4.2,4.25,4.31,4.36,4.4,4.45}, FALSE)
Expected output:
| Result | |
|---|---|
| rur_stat | 5.29465 |
| p_value | 0.95 |
| critical_10% | 1.09624 |
| critical_5% | 0.94492 |
| critical_2.5% | 0.83556 |
| critical_1% | 0.68962 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import range_unit_root_test as sm_range_unit_root_test


def rurtest(x, store=False):
    """
    Run the range unit-root test as an alternative stationarity diagnostic.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.range_unit_root_test.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing RUR test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        series = to1d(x)
        if len(series) < 25:
            return "Error: x must contain at least 25 numeric values"

        result = sm_range_unit_root_test(np.asarray(series, dtype=float), store=store)
        rur_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]

        rows = [
            ["rur_stat", rur_stat],
            ["p_value", p_value],
        ]
        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])
        return rows
    except Exception as e:
        return f"Error: {str(e)}"
ZIVOT_ANDREWS
This function performs the Zivot-Andrews unit-root test, which extends unit-root diagnostics by allowing a single unknown structural break in the series.
The test evaluates a unit-root null against alternatives with break-adjusted deterministic components.
It returns the test statistic, p-value, critical values, selected lag, and estimated break index.
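To show how the trim fraction and the break enter the procedure, the sketch below lists the candidate break positions left after trimming and builds textbook-style level-shift and trend-shift regressors for one candidate. Everything here is an assumption for illustration: the helper names `candidate_breaks` and `break_regressors`, the zero-based indexing, and the truncation of trim times n to an integer may not match what statsmodels does internally.

```python
def candidate_breaks(nobs, trim=0.15):
    """Zero-based positions searched for a break after trimming both ends."""
    cut = int(nobs * trim)  # assumed rounding rule: truncate toward zero
    return list(range(cut, nobs - cut))


def break_regressors(nobs, break_index):
    """Level-shift and trend-shift dummies for one candidate break.

    level[t] is 1 after the break and 0 before it (used with regression "c"),
    slope[t] counts periods elapsed since the break (used with regression "t"),
    and a "ct" specification would include both columns.
    """
    level = [1.0 if t > break_index else 0.0 for t in range(nobs)]
    slope = [float(t - break_index) if t > break_index else 0.0 for t in range(nobs)]
    return level, slope


positions = candidate_breaks(15, trim=0.15)
level, slope = break_regressors(6, break_index=2)
print(positions)
print(level, slope)
```

The test then fits one augmented regression per candidate position and reports the break at which the unit-root t-statistic is most negative, so a wider trim means fewer candidates and no breaks near the ends of the sample.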
Excel Usage
=ZIVOT_ANDREWS(x, trim, maxlag, regression, autolag)
x (list[list], required): Time-series observations as a 2D range.
trim (float, optional, default: 0.15): Fraction of observations trimmed from each end when searching for break date.
maxlag (int, optional, default: null): Maximum lag included in candidate regressions; null uses default rule.
regression (str, optional, default: "c"): Deterministic specification, c, t, or ct.
autolag (str, optional, default: "AIC"): Lag-selection criterion (AIC, BIC, t-stat, or none).
Returns (list[list]): 2D key-value table summarizing Zivot-Andrews test outputs.
Example 1: Zivot-Andrews with default options
Inputs:
| Parameter | Value |
|---|---|
| x | 1, 1.1, 1.2, 1.25, 1.3, 1.35, 1.33, 1.4, 1.45, 1.5, 1.55, 1.58, 1.6, 1.65, 1.7 |
| trim | 0.15 |
| maxlag | |
| regression | c |
| autolag | AIC |
Excel formula:
=ZIVOT_ANDREWS({1,1.1,1.2,1.25,1.3,1.35,1.33,1.4,1.45,1.5,1.55,1.58,1.6,1.65,1.7}, 0.15, , "c", "AIC")
Expected output:
| Result | |
|---|---|
| za_stat | NaN |
| p_value | NaN |
| base_lag | 4 |
| break_index | 4 |
| critical_1% | -5.27644 |
| critical_5% | -4.81067 |
| critical_10% | -4.56618 |
Example 2: Zivot-Andrews using trend-only regression
Inputs:
| Parameter | Value |
|---|---|
| x | 2, 2.04, 2.08, 2.12, 2.16, 2.2, 2.24, 2.28, 2.32, 2.36, 2.4, 2.44, 2.48, 2.52, 2.56, 2.6, 2.66, 2.72, 2.78, 2.84, 2.9, 2.96, 3.02, 3.08, 3.14, 3.2, 3.26, 3.32, 3.38, 3.44 |
| trim | 0.15 |
| maxlag | 1 |
| regression | t |
| autolag | AIC |
Excel formula:
=ZIVOT_ANDREWS({2,2.04,2.08,2.12,2.16,2.2,2.24,2.28,2.32,2.36,2.4,2.44,2.48,2.52,2.56,2.6,2.66,2.72,2.78,2.84,2.9,2.96,3.02,3.08,3.14,3.2,3.26,3.32,3.38,3.44}, 0.15, 1, "t", "AIC")
Expected output:
| Result | |
|---|---|
| za_stat | NaN |
| p_value | NaN |
| base_lag | 1 |
| break_index | 17 |
| critical_1% | -5.03421 |
| critical_5% | -4.4058 |
| critical_10% | -4.13678 |
Example 3: Zivot-Andrews using constant and trend regression
Inputs:
| Parameter | Value |
|---|---|
| x | 3, 3.05, 3.08, 3.12, 3.2, 3.25, 3.3, 3.38, 3.42, 3.5, 3.55, 3.6, 3.66, 3.72, 3.8 |
| trim | 0.15 |
| maxlag | 2 |
| regression | ct |
| autolag | t-stat |
Excel formula:
=ZIVOT_ANDREWS({3,3.05,3.08,3.12,3.2,3.25,3.3,3.38,3.42,3.5,3.55,3.6,3.66,3.72,3.8}, 0.15, 2, "ct", "t-stat")
Expected output:
| Result | |
|---|---|
| za_stat | -5.0866 |
| p_value | 0.0485155 |
| base_lag | 0 |
| break_index | 3 |
| critical_1% | -5.57556 |
| critical_5% | -5.07332 |
| critical_10% | -4.82668 |
Example 4: Zivot-Andrews with fixed max lag and no autolag
Inputs:
| Parameter | Value |
|---|---|
| x | 4, 4.03, 4.07, 4.1, 4.13, 4.17, 4.2, 4.24, 4.27, 4.31, 4.34, 4.38, 4.41, 4.45, 4.48, 4.52, 4.55, 4.6, 4.66, 4.71, 4.77, 4.82, 4.88, 4.93, 4.99, 5.04, 5.1, 5.15, 5.21, 5.26 |
| trim | 0.15 |
| maxlag | 1 |
| regression | c |
| autolag | none |
Excel formula:
=ZIVOT_ANDREWS({4,4.03,4.07,4.1,4.13,4.17,4.2,4.24,4.27,4.31,4.34,4.38,4.41,4.45,4.48,4.52,4.55,4.6,4.66,4.71,4.77,4.82,4.88,4.93,4.99,5.04,5.1,5.15,5.21,5.26}, 0.15, 1, "c", "none")
Expected output:
| Result | |
|---|---|
| za_stat | -1.51838 |
| p_value | 0.999 |
| base_lag | 1 |
| break_index | 17 |
| critical_1% | -5.27644 |
| critical_5% | -4.81067 |
| critical_10% | -4.56618 |
Python Code
import numpy as np
from statsmodels.tsa.stattools import zivot_andrews as sm_zivot_andrews


def zivot_andrews(x, trim=0.15, maxlag=None, regression='c', autolag='AIC'):
    """
    Run the Zivot-Andrews unit-root test allowing one endogenous structural break.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.zivot_andrews.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        trim (float, optional): Fraction of observations trimmed from each end when searching for break date. Default is 0.15.
        maxlag (int, optional): Maximum lag included in candidate regressions; null uses default rule. Default is None.
        regression (str, optional): Deterministic specification, c, t, or ct. Default is 'c'.
        autolag (str, optional): Lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.

    Returns:
        list[list]: 2D key-value table summarizing Zivot-Andrews test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if trim < 0 or trim > 0.333:
            return "Error: trim must be between 0 and 0.333"
        if regression not in ("c", "t", "ct"):
            return "Error: regression must be 'c', 't', or 'ct'"

        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 10:
            return "Error: x must contain at least ten numeric values"

        result = sm_zivot_andrews(
            np.asarray(series, dtype=float),
            trim=trim,
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
        )
        za_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]
        base_lag = int(result[3])
        break_index = int(result[4])

        rows = [
            ["za_stat", za_stat],
            ["p_value", p_value],
            ["base_lag", base_lag],
            ["break_index", break_index],
        ]
        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])
        return rows
    except Exception as e:
        return f"Error: {str(e)}"