Autocorrelation And Stationarity Tests

Overview

Autocorrelation and stationarity are central ideas in time-series analysis because they describe how observations depend on prior values and whether those relationships remain stable over time. In practical modeling, many forecasting and inference methods assume some form of weak stationarity, so analysts need reliable diagnostics before fitting ARIMA-style models or interpreting residuals. This category focuses on tools that quantify serial dependence, test unit-root behavior, and evaluate whether remaining autocorrelation is statistically meaningful. For background, see Autocorrelation, Stationary process, and Unit root.

The unifying concepts are lag structure, the distinction between covariance-scale and correlation-scale measures, and hypothesis testing under dependence. Correlation-based diagnostics summarize dependence at lag k using normalized measures such as \rho_k, while covariance-based diagnostics preserve the scale of the data in \gamma_k. Stationarity tests then formalize competing hypotheses about persistence: some take a unit root as the null and others take stationarity as the null, so they are best interpreted jointly. Portmanteau statistics aggregate evidence across multiple lags, for example the Ljung-Box statistic

Q = n(n+2)\sum_{k=1}^{m}\frac{\rho_k^2}{n-k},

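to check whether residual serial correlation remains. Here n is the number of observations, m is the number of lags tested, and \rho_k is the sample autocorrelation at lag k.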
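
As a minimal sketch of the formula above, the cumulative Q statistic can be computed directly from a list of sample autocorrelations. The helper name ljung_box_q is hypothetical and not part of statsmodels; the inputs are the rounded autocorrelations shown in ACF Example 3 (n = 9).

```python
def ljung_box_q(rho, n):
    """Cumulative Ljung-Box Q statistics for lags 1..m.

    rho: sample autocorrelations at lags 1..m (lag 0 excluded).
    n:   number of observations in the original series.
    """
    q_values = []
    running = 0.0
    for k, r in enumerate(rho, start=1):
        running += r * r / (n - k)          # rho_k^2 / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


# Rounded autocorrelations taken from ACF Example 3 (assumed inputs).
rho = [-0.888889, 0.772222, -0.666667, 0.544444]
for lag, q in enumerate(ljung_box_q(rho, n=9), start=1):
    print(lag, round(q, 4))
```

Under the null of no serial correlation, each Q value is compared with a chi-square distribution whose degrees of freedom equal the number of lags included.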

These functions are implemented with statsmodels, especially statsmodels.tsa.stattools, a widely used Python toolkit for econometrics and statistical time-series modeling. The library provides consistent APIs for autocorrelation diagnostics, unit-root testing, and lag-selection options, which makes it suitable for both exploratory workflows and production-grade model validation.

For univariate dependence structure, ACF, ACOVF, and PACF provide complementary views of lag dynamics. ACF reports normalized serial correlation across lags, ACOVF reports scale-preserving autocovariance, and PACF isolates direct lag-k effects after controlling intermediate lags. Together, they support AR/MA order identification, residual checking, and feature engineering for lagged predictors. Q_STAT extends this by converting an autocorrelation sequence into Ljung-Box statistics and p-values to test whether correlation structure remains jointly significant.

For lead-lag relationships between two series, CCF and CCOVF quantify cross-series dependence across offsets. CCOVF measures raw co-movement and is useful when magnitude is important, while CCF divides by the product of the two series' standard deviations so that effect strength can be compared on a common scale. These tools are commonly used in transfer-function modeling, signal alignment, and exploratory checks of whether one process systematically leads or follows another.

For stationarity and unit-root diagnostics, ADFULLER, KPSS, RURTEST, and ZIVOT_ANDREWS cover complementary assumptions. ADFULLER tests a unit-root null against stationarity alternatives, whereas KPSS inverts the null to stationarity, making the pair especially useful for triangulation. RURTEST adds a range-based unit-root diagnostic, and ZIVOT_ANDREWS allows one unknown structural break when testing persistence. In practice, analysts compare outcomes across these tests rather than relying on a single p-value, especially when trend shifts or regime changes may distort standard stationarity conclusions.
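
Because ADF and KPSS take opposite nulls, their p-values can be read side by side. The sketch below shows one common heuristic for doing so; the function name, the 0.05 threshold, and the wording of the four outcomes are assumptions made for illustration, not output of any library.

```python
def joint_stationarity_reading(adf_p_value, kpss_p_value, level=0.05):
    """Combine ADF and KPSS p-values into a rough qualitative reading.

    ADF null:  the series has a unit root.
    KPSS null: the series is stationary.
    """
    adf_rejects_unit_root = adf_p_value < level
    kpss_rejects_stationarity = kpss_p_value < level

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "both tests point to stationarity"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "both tests point to a unit root"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflict: both nulls rejected"
    return "conflict: neither null rejected"


# Illustrative p-values only; in practice both come from the same series.
print(joint_stationarity_reading(adf_p_value=0.21548, kpss_p_value=0.0416667))
```

The two conflict outcomes are where the break-aware and range-based tests mentioned above are most likely to add information.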

ACF

This function estimates the autocorrelation function (ACF) of a univariate time series for lag values from 0 up to a selected maximum lag.

The lag-k autocorrelation is the normalized covariance between observations separated by k periods:

\rho_k = \frac{\operatorname{Cov}(x_t, x_{t-k})}{\operatorname{Var}(x_t)}

Optional confidence intervals and Ljung-Box diagnostics can be included to help assess whether observed autocorrelations are statistically significant.
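
To make the definition concrete, the following sketch computes sample autocorrelations by hand and attaches Bartlett-style intervals centered on each estimate. It uses only the standard library; the helper names are hypothetical, and the variance rule (1 + 2 * sum of squared earlier autocorrelations) / n is the usual Bartlett approximation, assumed here to mirror what the bartlett_confint option does.

```python
from statistics import NormalDist


def sample_acf(x, nlags):
    """Sample autocorrelations for lags 0..nlags (denominator n at every lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / gamma0
        for k in range(nlags + 1)
    ]


def bartlett_bands(rho, n, alpha=0.05):
    """Intervals centered on each rho_k using Bartlett's variance approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    bands = []
    cumulative = 0.0  # running sum of rho_j^2 for 1 <= j < k
    for k, r in enumerate(rho):
        if k == 0:
            variance = 0.0  # lag 0 is exactly 1 by construction
        else:
            variance = (1 + 2 * cumulative) / n
            cumulative += r * r
        half_width = z * variance ** 0.5
        bands.append((r - half_width, r + half_width))
    return bands


series = [2, 1, 2, 1, 2, 1, 2, 1]
rho = sample_acf(series, nlags=4)
for lag, (r, (low, high)) in enumerate(zip(rho, bartlett_bands(rho, len(series)))):
    print(lag, round(r, 6), round(low, 6), round(high, 6))
```

An autocorrelation is usually treated as significant when its interval excludes zero.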

Excel Usage

=ACF(x, adjusted, nlags, qstat, fft, alpha, bartlett_confint, missing)

x (list[list], required): Time-series observations as a 2D range.
adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance normalization.
nlags (int, optional, default: null): Maximum lag to compute; null uses statsmodels default.
qstat (bool, optional, default: false): Return Ljung-Box Q statistics and p-values for lags above zero.
fft (bool, optional, default: true): Use FFT-based computation for improved speed on long series.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.
bartlett_confint (bool, optional, default: true): Use Bartlett formula for confidence interval standard errors.
missing (str, optional, default: "none"): Missing-data handling mode: none, raise, conservative, or drop.

Returns (list[list]): 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.

Example 1: ACF for a short increasing series

Inputs:

x						adjusted	nlags	qstat	fft	alpha	bartlett_confint	missing
1	2	3	4	5	6	false	3	false	true		true	none

Excel formula:

=ACF({1,2,3,4,5,6}, FALSE, 3, FALSE, TRUE, , TRUE, "none")

Expected output:

Result
0	1
1	0.5
2	0.0571429
3	-0.271429

Example 2: ACF with confidence intervals

Inputs:

x								adjusted	nlags	qstat	fft	alpha	bartlett_confint	missing
2	1	2	1	2	1	2	1	false	4	false	true	0.05	true	none

Excel formula:

=ACF({2,1,2,1,2,1,2,1}, FALSE, 4, FALSE, TRUE, 0.05, TRUE, "none")

Expected output:

Result
lag	acf	conf_low	conf_high
0	1	1	1
1	-0.875	-1.56795	-0.182048
2	0.75	-0.35248	1.85248
3	-0.625	-1.95002	0.700016
4	0.5	-0.959729	1.95973

Example 3: ACF with Ljung-Box statistics

Inputs:

x									adjusted	nlags	qstat	fft	alpha	bartlett_confint	missing
1	0	1	0	1	0	1	0	1	false	4	true	true		true	none

Excel formula:

=ACF({1,0,1,0,1,0,1,0,1}, FALSE, 4, TRUE, TRUE, , TRUE, "none")

Expected output:

Result
lag	acf	q_stat	p_value
0	1
1	-0.888889	9.77778	0.00176634
2	0.772222	18.2115	0.000111023
3	-0.666667	25.5449	0.0000118766
4	0.544444	31.414	0.00000252021

Example 4: ACF using adjusted denominator

Inputs:

x								adjusted	nlags	qstat	fft	alpha	bartlett_confint	missing
3	4	6	8	7	5	4	6	true	3	false	false		true	none

Excel formula:

=ACF({3,4,6,8,7,5,4,6}, TRUE, 3, FALSE, FALSE, , TRUE, "none")

Expected output:

Result
0	1
1	0.423181
2	-0.505241
3	-0.909434

Python Code

import numpy as np
from statsmodels.tsa.stattools import acf as sm_acf

def acf(x, adjusted=False, nlags=None, qstat=False, fft=True, alpha=None, bartlett_confint=True, missing='none'):
    """
    Compute autocorrelation values across lags with optional confidence intervals and Ljung-Box statistics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance normalization. Default is False.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        qstat (bool, optional): Return Ljung-Box Q statistics and p-values for lags above zero. Default is False.
        fft (bool, optional): Use FFT-based computation for improved speed on long series. Default is True.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.
        bartlett_confint (bool, optional): Use Bartlett formula for confidence interval standard errors. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.

    Returns:
        list[list]: 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"

        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"

        result = sm_acf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            nlags=nlags,
            qstat=qstat,
            fft=fft,
            alpha=alpha,
            bartlett_confint=bartlett_confint,
            missing=missing,
        )

        acf_vals = None
        confint = None
        q_vals = None
        p_vals = None

        if isinstance(result, tuple):
            if len(result) == 4:
                acf_vals, confint, q_vals, p_vals = result
            elif len(result) == 3:
                acf_vals, q_vals, p_vals = result
            elif len(result) == 2:
                acf_vals, confint = result
            else:
                return "Error: Unexpected output format from statsmodels acf"
        else:
            acf_vals = result

        acf_arr = np.asarray(acf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None
        q_arr = np.asarray(q_vals, dtype=float) if q_vals is not None else None
        p_arr = np.asarray(p_vals, dtype=float) if p_vals is not None else None

        table = []
        for lag in range(len(acf_arr)):
            low = ""
            high = ""
            q_stat_val = ""
            p_val = ""

            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])

            if q_arr is not None and lag > 0 and (lag - 1) < len(q_arr):
                q_stat_val = float(q_arr[lag - 1])
            if p_arr is not None and lag > 0 and (lag - 1) < len(p_arr):
                p_val = float(p_arr[lag - 1])

            table.append([lag, float(acf_arr[lag]), low, high, q_stat_val, p_val])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

ACOVF

This function computes the autocovariance function (ACOVF), which describes covariance between a series and lagged versions of itself.

For lag k, autocovariance is defined as:

\gamma_k = \operatorname{Cov}(x_t, x_{t-k})

The output is useful for understanding serial dependence magnitude before normalization into autocorrelation.
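
The effect of the adjusted and demean options can be seen in a small hand-rolled estimator. This is an illustrative sketch in pure Python; sample_acovf is a hypothetical helper, not the statsmodels function, and it assumes the two options act as the parameter descriptions below state.

```python
def sample_acovf(x, adjusted=False, demean=True, nlag=None):
    """Sample autocovariance for lags 0..nlag.

    adjusted: divide the lag-k sum by n - k instead of n.
    demean:   subtract the sample mean before forming products.
    """
    n = len(x)
    mean = sum(x) / n if demean else 0.0
    dev = [v - mean for v in x]
    max_lag = n - 1 if nlag is None else nlag
    out = []
    for k in range(max_lag + 1):
        total = sum(dev[t] * dev[t - k] for t in range(k, n))
        out.append(total / (n - k if adjusted else n))
    return out


print(sample_acovf([1, 4, 2, 5, 3, 6, 4, 7], nlag=2))
print(sample_acovf([3, 5, 4, 6, 5, 7, 6], adjusted=True, nlag=4))
```

Dividing by n - k removes the shrinkage toward zero at long lags, at the cost of noisier estimates because fewer products enter each sum.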

Excel Usage

=ACOVF(x, adjusted, demean, fft, missing, nlag)

x (list[list], required): Time-series observations as a 2D range.
adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance estimation.
demean (bool, optional, default: true): Subtract sample mean before covariance estimation.
fft (bool, optional, default: true): Use FFT-based convolution.
missing (str, optional, default: "none"): Missing-data handling mode: none, raise, conservative, or drop.
nlag (int, optional, default: null): Maximum lag to return; null returns full available range.

Returns (list[list]): 2D table with columns lag and acovf.

Example 1: Autocovariance with default settings

Inputs:

x						adjusted	demean	fft	missing	nlag
1	2	3	4	5	6	false	true	true	none

Excel formula:

=ACOVF({1,2,3,4,5,6}, FALSE, TRUE, TRUE, "none", )

Expected output:

Result
0	2.91667
1	1.45833
2	0.166667
3	-0.791667
4	-1.25
5	-1.04167

Example 2: Autocovariance with adjusted denominator

Inputs:

x							adjusted	demean	fft	missing	nlag
3	5	4	6	5	7	6	true	true	true	none	4

Excel formula:

=ACOVF({3,5,4,6,5,7,6}, TRUE, TRUE, TRUE, "none", 4)

Expected output:

Result
0	1.55102
1	0.115646
2	0.791837
3	-0.80102
4	-0.312925

Example 3: Autocovariance without demeaning

Inputs:

x							adjusted	demean	fft	missing	nlag
2	2	3	3	4	4	5	false	false	false	none	3

Excel formula:

=ACOVF({2,2,3,3,4,4,5}, FALSE, FALSE, FALSE, "none", 3)

Expected output:

Result
0	11.8571
1	9.57143
2	8
3	5.85714

Example 4: Autocovariance limited to selected lags

Inputs:

x								adjusted	demean	fft	missing	nlag
1	4	2	5	3	6	4	7	false	true	true	none	2

Excel formula:

=ACOVF({1,4,2,5,3,6,4,7}, FALSE, TRUE, TRUE, "none", 2)

Expected output:

Result
0	3.5
1	-0.625
2	2

Python Code

import numpy as np
from statsmodels.tsa.stattools import acovf as sm_acovf

def acovf(x, adjusted=False, demean=True, fft=True, missing='none', nlag=None):
    """
    Estimate autocovariance values of a time series across lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acovf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is False.
        demean (bool, optional): Subtract sample mean before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.
        nlag (int, optional): Maximum lag to return; null returns full available range. Default is None.

    Returns:
        list[list]: 2D table with columns lag and acovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"

        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"

        acovf_vals = sm_acovf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
            missing=missing,
            nlag=nlag,
        )

        arr = np.asarray(acovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"

ADFULLER

This function applies the Augmented Dickey-Fuller (ADF) test to evaluate whether a univariate time series contains a unit root.

The null hypothesis is that the series is non-stationary with a unit root, while the alternative is stationarity.

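It returns the test statistic, p-value, number of lags used, number of observations used in the test regression, and the critical values used for interpretation. When automatic lag selection is active, the best information-criterion value (icbest) is also reported.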
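
The core of the test is a regression of the first difference on the lagged level. The sketch below computes the plain Dickey-Fuller statistic with a constant and no augmenting lags, using ordinary least squares written out by hand. It is a simplified illustration under those assumptions: dickey_fuller_stat is a hypothetical name, and the full test additionally includes lagged differences and supplies p-values and critical values.

```python
def dickey_fuller_stat(y):
    """t-statistic on gamma in: diff(y)_t = alpha + gamma * y_{t-1} + e_t."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    m = len(diffs)

    mean_x = sum(lagged) / m
    mean_d = sum(diffs) / m
    sxx = sum((v - mean_x) ** 2 for v in lagged)
    sxy = sum((v - mean_x) * (d - mean_d) for v, d in zip(lagged, diffs))

    gamma = sxy / sxx                    # slope on the lagged level
    alpha = mean_d - gamma * mean_x      # intercept

    residuals = [d - (alpha + gamma * v) for v, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in residuals) / (m - 2)  # two estimated parameters
    std_err = (sigma2 / sxx) ** 0.5
    return gamma / std_err


print(dickey_fuller_stat([0, 1, 0, 1, 0.5]))
```

A strongly negative statistic is evidence against a unit root, but it must be compared with Dickey-Fuller critical values rather than a Student t table, because the statistic does not follow a t distribution under the null.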

Excel Usage

=ADFULLER(x, maxlag, regression, autolag, store, regresults)

x (list[list], required): Time-series observations as a 2D range.
maxlag (int, optional, default: null): Maximum lag included in the test regression; null uses default rule.
regression (str, optional, default: "c"): Deterministic terms included in the test regression (c, ct, ctt, or n).
autolag (str, optional, default: "AIC"): Automatic lag-selection criterion (AIC, BIC, t-stat, or none).
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.
regresults (bool, optional, default: false): Return full regression results when available.

Returns (list[list]): 2D key-value table summarizing ADF statistics and critical values.

Example 1: ADF test with default regression and autolag

Inputs:

x										maxlag	regression	autolag	store	regresults
1	1.2	1.1	1.3	1.25	1.35	1.3	1.4	1.38	1.45		c	AIC	false	false

Excel formula:

=ADFULLER({1,1.2,1.1,1.3,1.25,1.35,1.3,1.4,1.38,1.45}, , "c", "AIC", FALSE, FALSE)

Expected output:

Result
adf_stat	-2.1749
p_value	0.21548
usedlag	3
nobs	6
critical_1%	-5.35426
critical_5%	-3.64624
critical_10%	-2.9012
icbest	-37.0965

Example 2: ADF test with constant and trend regression

Inputs:

x										maxlag	regression	autolag	store	regresults
2	2.1	2.05	2.2	2.15	2.25	2.3	2.35	2.32	2.4	2	ct	BIC	false	false

Excel formula:

=ADFULLER({2,2.1,2.05,2.2,2.15,2.25,2.3,2.35,2.32,2.4}, 2, "ct", "BIC", FALSE, FALSE)

Expected output:

Result
adf_stat	-5.66528
p_value	0.0000107685
usedlag	0
nobs	9
critical_1%	-5.49966
critical_5%	-4.07211
critical_10%	-3.4935
icbest	-26.0036

Example 3: ADF test using fixed lag count

Inputs:

x										maxlag	regression	autolag	store	regresults
3	2.9	3.1	3	3.2	3.1	3.3	3.25	3.35	3.3	1	c	none	false	false

Excel formula:

=ADFULLER({3,2.9,3.1,3,3.2,3.1,3.3,3.25,3.35,3.3}, 1, "c", "none", FALSE, FALSE)

Expected output:

Result
adf_stat	-1.29158
p_value	0.633016
usedlag	1
nobs	8
critical_1%	-4.66519
critical_5%	-3.36719
critical_10%	-2.80296

Example 4: ADF test without deterministic terms

Inputs:

x										maxlag	regression	autolag	store	regresults
1	0.95	1.02	0.98	1.01	0.99	1.03	1	1.04	1.01	1	n	AIC	false	false

Excel formula:

=ADFULLER({1,0.95,1.02,0.98,1.01,0.99,1.03,1,1.04,1.01}, 1, "n", "AIC", FALSE, FALSE)

Expected output:

Result
adf_stat	2.85276
p_value	0.999591
usedlag	1
nobs	8
critical_1%	-2.90189
critical_5%	-1.96617
critical_10%	-1.57649
icbest	-46.7118

Python Code

import numpy as np
from statsmodels.tsa.stattools import adfuller as sm_adfuller

def adfuller(x, maxlag=None, regression='c', autolag='AIC', store=False, regresults=False):
    """
    Run the Augmented Dickey-Fuller unit-root test for stationarity diagnostics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.adfuller.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        maxlag (int, optional): Maximum lag included in the test regression; null uses default rule. Default is None.
        regression (str, optional): Deterministic terms included in the test regression (c, ct, ctt, or n). Default is 'c'.
        autolag (str, optional): Automatic lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.
        regresults (bool, optional): Return full regression results when available. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing ADF statistics and critical values.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct", "ctt", "n"):
            return "Error: regression must be one of 'c', 'ct', 'ctt', or 'n'"

        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"

        result = sm_adfuller(
            np.asarray(series, dtype=float),
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
            store=store,
            regresults=regresults,
        )

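        adf_stat = float(result[0])
        p_value = float(result[1])
        rows = [["adf_stat", adf_stat], ["p_value", p_value]]

        if isinstance(result[2], dict):
            # With store=True or regresults=True, statsmodels returns
            # (adf_stat, p_value, critical_values, results_store).
            crit_values = result[2]
        else:
            rows.append(["usedlag", int(result[2])])
            rows.append(["nobs", int(result[3])])
            crit_values = result[4]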

        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        if len(result) > 5 and isinstance(result[5], (int, float, np.floating)):
            rows.append(["icbest", float(result[5])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

CCF

This function estimates the cross-correlation function (CCF) between two univariate time series, reporting how strongly one series is linearly related to lagged values of another.

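For lag k, the cross-correlation compares x_{t+k} with y_t, dividing their cross-covariance by the product of the two series' standard deviations:

\rho_{xy}(k) = \frac{\operatorname{Cov}(x_{t+k}, y_t)}{\sigma_x \sigma_y}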

Optional confidence intervals can be returned to assess whether cross-correlations differ materially from zero.
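
The lag convention is easiest to see in code. The sketch below computes the unadjusted sample cross-correlation for nonnegative lags in pure Python; sample_ccf is a hypothetical helper, and the inputs are the two series from Example 3 below, where y is x shifted down by one unit.

```python
def sample_ccf(x, y, nlags=None):
    """Unadjusted sample cross-correlation of x_{t+k} with y_t for k >= 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sd_x = (sum(d * d for d in dx) / n) ** 0.5
    sd_y = (sum(d * d for d in dy) / n) ** 0.5
    count = n if nlags is None else nlags   # number of lags returned
    return [
        sum(dx[t + k] * dy[t] for t in range(n - k)) / n / (sd_x * sd_y)
        for k in range(count)
    ]


x = [3, 5, 4, 6, 5, 7, 6]
y = [2, 4, 3, 5, 4, 6, 5]
for lag, value in enumerate(sample_ccf(x, y, nlags=5)):
    print(lag, round(value, 6))
```

Because only nonnegative lags are produced, swapping the two arguments is one way to inspect the opposite direction, that is, how y_{t+k} relates to x_t.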

Excel Usage

=CCF(x, y, adjusted, fft, nlags, alpha)

x (list[list], required): First time series as a 2D range.
y (list[list], required): Second time series as a 2D range.
adjusted (bool, optional, default: true): Use denominator n-k instead of n for normalization.
fft (bool, optional, default: true): Use FFT-based convolution.
nlags (int, optional, default: null): Number of lags to compute; null uses statsmodels default.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.

Returns (list[list]): 2D table with columns lag, ccf, conf_low, and conf_high.

Example 1: Cross-correlation with default options

Inputs:

x						y						adjusted	fft	nlags	alpha
1	2	3	4	5	6	1	1	2	3	5	8	true	true	5

Excel formula:

=CCF({1,2,3,4,5,6}, {1,1,2,3,5,8}, TRUE, TRUE, 5, )

Expected output:

Result
0	0.938953
1	0.359932
2	-0.166273
3	-0.625969
4	-1.09545

Example 2: Cross-correlation with confidence intervals

Inputs:

x								y								adjusted	fft	nlags	alpha
2	1	2	1	2	1	2	1	1	2	1	2	1	2	1	2	true	true	6	0.05

Excel formula:

=CCF({2,1,2,1,2,1,2,1}, {1,2,1,2,1,2,1,2}, TRUE, TRUE, 6, 0.05)

Expected output:

Result
lag	ccf	conf_low	conf_high
0	-1	-1.69295	-0.307048
1	1	0.307048	1.69295
2	-1	-1.69295	-0.307048
3	1	0.307048	1.69295
4	-1	-1.69295	-0.307048
5	1	0.307048	1.69295

Example 3: Cross-correlation without FFT

Inputs:

x							y							adjusted	fft	nlags	alpha
3	5	4	6	5	7	6	2	4	3	5	4	6	5	false	false	5

Excel formula:

=CCF({3,5,4,6,5,7,6}, {2,4,3,5,4,6,5}, FALSE, FALSE, 5, )

Expected output:

Result
0	1
1	0.0639098
2	0.364662
3	-0.295113
4	-0.0864662

Example 4: Cross-correlation using a small lag count

Inputs:

x							y							adjusted	fft	nlags	alpha
1	3	2	4	3	5	4	2	1	3	2	4	3	5	true	true	3

Excel formula:

=CCF({1,3,2,4,3,5,4}, {2,1,3,2,4,3,5}, TRUE, TRUE, 3, )

Expected output:

Result
0	0.289474
1	0.508772
2	-0.160526

Python Code

import numpy as np
from statsmodels.tsa.stattools import ccf as sm_ccf

def ccf(x, y, adjusted=True, fft=True, nlags=None, alpha=None):
    """
    Compute cross-correlation between two time series across nonnegative lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n for normalization. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        nlags (int, optional): Number of lags to compute; null uses statsmodels default. Default is None.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.

    Returns:
        list[list]: 2D table with columns lag, ccf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)

        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"

        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"

        result = sm_ccf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            fft=fft,
            nlags=nlags,
            alpha=alpha,
        )

        if isinstance(result, tuple):
            ccf_vals, confint = result
        else:
            ccf_vals = result
            confint = None

        ccf_arr = np.asarray(ccf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None

        table = []
        for lag in range(len(ccf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(ccf_arr[lag]), low, high])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

CCOVF

This function computes the cross-covariance function between two univariate series, quantifying linear co-movement across lag offsets.

For lag k, the cross-covariance is:

\gamma_{xy}(k) = \operatorname{Cov}(x_{t+k}, y_t)

The result helps identify lagged lead-lag structure before normalization into cross-correlation.
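
As a hedged illustration of the definition, the sample cross-covariance can be written as a sum of products of x_{t+k} and y_t divided by either n or n - k. The helper sample_ccovf below is hypothetical and written in pure Python; it assumes the adjusted and demean options behave as described in the parameter list.

```python
def sample_ccovf(x, y, adjusted=True, demean=True):
    """Sample cross-covariance of x_{t+k} with y_t for k = 0..n-1."""
    n = len(x)
    mean_x = sum(x) / n if demean else 0.0
    mean_y = sum(y) / n if demean else 0.0
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    out = []
    for k in range(n):
        # Pair each y_t with the x value k steps later.
        total = sum(dx[t + k] * dy[t] for t in range(n - k))
        out.append(total / (n - k if adjusted else n))
    return out


print(sample_ccovf([3, 5, 4, 6, 5, 7], [1, 2, 2, 3, 3, 4], demean=False))
```

With adjusted set to true, the highest lags rest on very few products (a single one at lag n - 1), so their values should be read with caution.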

Excel Usage

=CCOVF(x, y, adjusted, demean, fft)

x (list[list], required): First time series as a 2D range.
y (list[list], required): Second time series as a 2D range.
adjusted (bool, optional, default: true): Use denominator n-k instead of n in covariance estimation.
demean (bool, optional, default: true): Subtract sample means from both series before covariance estimation.
fft (bool, optional, default: true): Use FFT-based convolution.

Returns (list[list]): 2D table with columns lag and ccovf.

Example 1: Cross-covariance with default options

Inputs:

x						y						adjusted	demean	fft
1	2	3	4	5	6	2	3	4	5	6	7	true	true	true

Excel formula:

=CCOVF({1,2,3,4,5,6}, {2,3,4,5,6,7}, TRUE, TRUE, TRUE)

Expected output:

Result
0	2.91667
1	1.75
2	0.25
3	-1.58333
4	-3.75
5	-6.25

Example 2: Cross-covariance without demeaning

Inputs:

x						y						adjusted	demean	fft
3	5	4	6	5	7	1	2	2	3	3	4	true	false	true

Excel formula:

=CCOVF({3,5,4,6,5,7}, {1,2,2,3,3,4}, TRUE, FALSE, TRUE)

Expected output:

Result
0	13.6667
1	12.2
2	11.75
3	10
4	9.5
5	7

Example 3: Cross-covariance without FFT

Inputs:

x							y							adjusted	demean	fft
2	4	3	5	4	6	5	1	3	2	4	3	5	4	false	true	false

Excel formula:

=CCOVF({2,4,3,5,4,6,5}, {1,3,2,4,3,5,4}, FALSE, TRUE, FALSE)

Expected output:

Result
0	1.55102
1	0.0991254
2	0.565598
3	-0.457726
4	-0.134111
5	-0.586006
6	-0.262391

Example 4: Cross-covariance with unadjusted denominator

Inputs:

x								y								adjusted	demean	fft
1	3	2	4	3	5	4	6	2	1	3	2	4	3	5	4	false	true	true

Excel formula:

=CCOVF({1,3,2,4,3,5,4,6}, {2,1,3,2,4,3,5,4}, FALSE, TRUE, TRUE)

Expected output:

Result
0	0.75
1	1.3125
2	-0.0625
3	0.3125
4	-0.625
5	-0.3125
6	-0.6875
7	-0.3125

Python Code

import numpy as np
from statsmodels.tsa.stattools import ccovf as sm_ccovf

def ccovf(x, y, adjusted=True, demean=True, fft=True):
    """
    Estimate cross-covariance values between two time series across lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccovf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is True.
        demean (bool, optional): Subtract sample means from both series before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.

    Returns:
        list[list]: 2D table with columns lag and ccovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)

        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"

        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"

        ccovf_vals = sm_ccovf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
        )

        arr = np.asarray(ccovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"

KPSS

This function applies the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for stationarity.

Unlike the ADF test, the KPSS null hypothesis is stationarity (around a level or trend), and the alternative is a unit root.

The output includes the KPSS statistic, p-value, selected lag truncation, and reference critical values.
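
As a rough illustration of how the level-stationarity statistic is built, the sketch below divides the scaled sum of squared partial sums of the demeaned series by a long-run variance estimate with Bartlett weights. The helper name kpss_level_stat is hypothetical, and the code is a simplified stand-in under those assumptions rather than the statsmodels implementation. The input is the series from Example 3 with two lags.

```python
def kpss_level_stat(x, lags):
    """KPSS-style statistic for the level-stationarity null (sketch)."""
    n = len(x)
    mean = sum(x) / n
    resid = [v - mean for v in x]

    # Partial sums of the demeaned series.
    partial, running = [], 0.0
    for r in resid:
        running += r
        partial.append(running)
    eta = sum(s * s for s in partial) / n ** 2

    # Long-run variance estimate with Bartlett weights 1 - i / (lags + 1).
    s_hat = sum(r * r for r in resid)
    for i in range(1, lags + 1):
        cross = sum(resid[t] * resid[t - i] for t in range(i, n))
        s_hat += 2.0 * cross * (1.0 - i / (lags + 1.0))
    s_hat /= n

    return eta / s_hat


series = [3, 3.1, 3, 3.15, 3.05, 3.2, 3.1, 3.25, 3.15, 3.3]
stat = kpss_level_stat(series, lags=2)

# KPSS is an upper-tail test: large statistics count against stationarity.
critical = {"10%": 0.347, "5%": 0.463, "2.5%": 0.574, "1%": 0.739}
rejected = [level for level, cv in critical.items() if stat > cv]
print(round(stat, 6), rejected)
```

The statistic should land close to the 0.440047 reported in Example 3. It exceeds only the 10% critical value, which is consistent with a p-value between 0.05 and 0.10.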

Excel Usage

=KPSS(x, regression, nlags, store)

x (list[list], required): Time-series observations as a 2D range.
regression (str, optional, default: “c”): Null hypothesis type, c for level-stationary or ct for trend-stationary.
nlags (str, optional, default: “auto”): Lag selection mode (auto or legacy) or an integer provided as text.
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.

Returns (list[list]): 2D key-value table summarizing KPSS statistics and critical values.

Example 1: KPSS with automatic lag selection and level stationarity null

Inputs:

x										regression	nlags	store
1	1.1	1.05	1.08	1.03	1.09	1.04	1.1	1.06	1.11	c	auto	false

Excel formula:

=KPSS({1,1.1,1.05,1.08,1.03,1.09,1.04,1.1,1.06,1.11}, "c", "auto", FALSE)

Expected output:

Result
kpss_stat	0.5
p_value	0.0416667
lags	9
critical_10%	0.347
critical_5%	0.463
critical_2.5%	0.574
critical_1%	0.739

Example 2: KPSS with legacy lag rule and trend stationarity null

Inputs:

x										regression	nlags	store
2	2.2	2.35	2.5	2.65	2.8	2.95	3.1	3.25	3.4	ct	legacy	false

Excel formula:

=KPSS({2,2.2,2.35,2.5,2.65,2.8,2.95,3.1,3.25,3.4}, "ct", "legacy", FALSE)

Expected output:

Result
kpss_stat	0.358798
p_value	0.01
lags	7
critical_10%	0.119
critical_5%	0.146
critical_2.5%	0.176
critical_1%	0.216

Example 3: KPSS with manually provided lag count

Inputs:

x										regression	nlags	store
3	3.1	3	3.15	3.05	3.2	3.1	3.25	3.15	3.3	c	2	false

Excel formula:

=KPSS({3,3.1,3,3.15,3.05,3.2,3.1,3.25,3.15,3.3}, "c", 2, FALSE)

Expected output:

Result
kpss_stat	0.440047
p_value	0.0598935
lags	2
critical_10%	0.347
critical_5%	0.463
critical_2.5%	0.574
critical_1%	0.739

Example 4: KPSS on low-variance stationary-looking series

Inputs:

x										regression	nlags	store
5	5.02	4.99	5.01	5	5.03	4.98	5.01	5	5.02	c	auto	false

Excel formula:

=KPSS({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02}, "c", "auto", FALSE)

Expected output:

Result
kpss_stat	0.326271
p_value	0.1
lags	6
critical_10%	0.347
critical_5%	0.463
critical_2.5%	0.574
critical_1%	0.739

Python Code

import numpy as np
from statsmodels.tsa.stattools import kpss as sm_kpss

def kpss(x, regression='c', nlags='auto', store=False):
    """
    Run the KPSS stationarity test under level or trend null hypotheses.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.kpss.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        regression (str, optional): Null hypothesis type, c for level-stationary or ct for trend-stationary. Default is 'c'.
        nlags (str, optional): Lag selection mode (auto or legacy) or an integer provided as text. Default is 'auto'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing KPSS statistics and critical values.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct"):
            return "Error: regression must be 'c' or 'ct'"

        if nlags in ("auto", "legacy"):
            nlags_arg = nlags
        else:
            try:
                parsed = int(float(nlags))
            except (TypeError, ValueError):
                return "Error: nlags must be 'auto', 'legacy', or an integer"
            if parsed < 0:
                return "Error: nlags integer must be nonnegative"
            nlags_arg = parsed

        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"

        result = sm_kpss(
            np.asarray(series, dtype=float),
            regression=regression,
            nlags=nlags_arg,
            store=store,
        )

        kpss_stat = float(result[0])
        p_value = float(result[1])
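        if store:
            # With store=True, statsmodels returns (stat, p-value, critical values, ResultsStore)
            # and the lag count is carried on the store object.
            crit_values = result[2]
            used_lags = int(result[3].lags)
        else:
            used_lags = int(result[2])
            crit_values = result[3]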

        rows = [
            ["kpss_stat", kpss_stat],
            ["p_value", p_value],
            ["lags", used_lags],
        ]

        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

PACF

This function estimates the partial autocorrelation function (PACF), which measures the direct correlation between x_t and x_{t-k} after controlling for intermediate lags.

In autoregressive model identification, PACF helps indicate candidate model order by highlighting lags with substantial direct dependence.

The PACF at lag k can be interpreted as the final coefficient in an AR(k) regression.
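
One way to see the "final AR(k) coefficient" interpretation is the Durbin-Levinson recursion, which builds the AR(k) coefficients from the autocorrelations one order at a time and records the last coefficient at each step. The sketch below is a minimal pure-Python version with hypothetical helper names (sample_acf, pacf_durbin_levinson). It assumes the biased sample autocorrelation (denominator n at every lag) and is not the statsmodels code path.

```python
def sample_acf(x, nlags):
    """Biased sample autocorrelations for lags 0..nlags."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0
            for k in range(nlags + 1)]


def pacf_durbin_levinson(rho):
    """Partial autocorrelations from autocorrelations rho[0..m], rho[0] == 1."""
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit, phi_{k-1,1..k-1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den  # last coefficient of the AR(k) fit
        # Update the earlier coefficients to obtain the full AR(k) fit.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


x = [3, 2, 4, 3, 5, 4, 6, 5]
values = pacf_durbin_levinson(sample_acf(x, 3))
print([round(v, 6) for v in values])
```

With the series from Example 4, the recursion should give values close to the ldbiased results listed there (0.25, 0.288889, -0.346983 at lags 1 to 3).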

Excel Usage

=PACF(x, nlags, method, alpha)

x (list[list], required): Time-series observations as a 2D range.
nlags (int, optional, default: null): Maximum lag to compute; null uses statsmodels default.
method (str, optional, default: “ywadjusted”): Estimation method for PACF.
alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.

Returns (list[list]): 2D table with columns lag, pacf, conf_low, and conf_high.

Example 1: PACF using Yule-Walker adjusted method

Inputs:

x								nlags	method	alpha
1	2	3	4	5	4	3	2	4	ywadjusted

Excel formula:

=PACF({1,2,3,4,5,4,3,2}, 4, "ywadjusted", )

Expected output:

Result
0	1
1	0.571429
2	-0.649832
3	-0.832527
4	-1.57327

Example 2: PACF with confidence intervals

Inputs:

x									nlags	method	alpha
2	1	2	1	2	1	2	1	2	4	burg	0.05

Excel formula:

=PACF({2,1,2,1,2,1,2,1,2}, 4, "burg", 0.05)

Expected output:

Result
0	1	1	1
1	-0.97561	-1.62893	-0.322288
2	1	0.346679	1.65332
3	3.14684e-16	-0.653321	0.653321
4	2.5924e-16	-0.653321	0.653321

Example 3: PACF using OLS adjusted estimator

Inputs:

x									nlags	method	alpha
5	4	6	5	7	6	8	7	9	3	ols-adjusted

Excel formula:

=PACF({5,4,6,5,7,6,8,7,9}, 3, "ols-adjusted", )

Expected output:

Result
0	1
1	0.5625
2	1.28571
3	-0.5

Example 4: PACF using Levinson-Durbin biased estimator

Inputs:

x								nlags	method	alpha
3	2	4	3	5	4	6	5	3	ldbiased

Excel formula:

=PACF({3,2,4,3,5,4,6,5}, 3, "ldbiased", )

Expected output:

Result
0	1
1	0.25
2	0.288889
3	-0.346983

Python Code

import numpy as np
from statsmodels.tsa.stattools import pacf as sm_pacf

def pacf(x, nlags=None, method='ywadjusted', alpha=None):
    """
    Compute partial autocorrelation values across lags for lag-order diagnostics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.pacf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        method (str, optional): Estimation method for PACF. Default is 'ywadjusted'.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.

    Returns:
        list[list]: 2D table with columns lag, pacf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        valid_methods = (
            "yw", "ywadjusted", "ols", "ols-inefficient", "ols-adjusted",
            "ywm", "ywmle", "ld", "ldadjusted", "ldb", "ldbiased", "burg"
        )
        if method not in valid_methods:
            return "Error: method is not a supported PACF estimator"

        series = to1d(x)
        if len(series) < 3:
            return "Error: x must contain at least three numeric values"

        result = sm_pacf(np.asarray(series, dtype=float), nlags=nlags, method=method, alpha=alpha)

        if isinstance(result, tuple):
            pacf_vals, confint = result
        else:
            pacf_vals = result
            confint = None

        pacf_arr = np.asarray(pacf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None

        table = []
        for lag in range(len(pacf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(pacf_arr[lag]), low, high])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

Q_STAT

This function computes the Ljung-Box portmanteau statistic from a sequence of autocorrelation estimates.

The statistic aggregates squared autocorrelations across lags to test whether serial correlation remains in a series, typically the residuals of a fitted model:

Q = n(n+2) \sum_{k=1}^{m} \frac{\rho_k^2}{n-k}

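For each lag m from 1 to the number of supplied autocorrelations, it returns the cumulative Q-statistic and its p-value, taken from a chi-square distribution with m degrees of freedom.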
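
The formula can be checked by hand with a short pure-Python sketch. The helper names ljung_box and chi2_sf are hypothetical. The sketch assumes that the p-value at lag m comes from a chi-square distribution with m degrees of freedom, and it evaluates that tail probability through the recurrence Q(a + 1, x) = Q(a, x) + x^a e^{-x} / \Gamma(a + 1) for the regularized upper incomplete gamma function, so no external packages are needed.

```python
import math


def chi2_sf(q, dof):
    """Upper-tail chi-square probability for a positive integer dof."""
    x = q / 2.0
    if dof % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(x))  # Q(1/2, x)
    else:
        a, tail = 1.0, math.exp(-x)             # Q(1, x)
    # Step a up by one until it reaches dof / 2.
    while a < dof / 2.0:
        tail += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return tail


def ljung_box(acf, nobs):
    """Cumulative Ljung-Box Q and p-value for lags 1..len(acf)."""
    rows, q = [], 0.0
    for k, rho in enumerate(acf, start=1):
        q += nobs * (nobs + 2) * rho * rho / (nobs - k)
        rows.append((k, q, chi2_sf(q, k)))
    return rows


rows = ljung_box([0.2, 0.1, 0.05], 60)
for lag, q, p in rows:
    print(lag, round(q, 5), round(p, 6))
```

Run on the inputs of Example 1, the rows should agree with the values listed there to the printed precision.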

Excel Usage

=Q_STAT(x, nobs)

x (list[list], required): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
nobs (int, required): Total number of observations in the underlying sample.

Returns (list[list]): 2D table with columns lag, q_stat, and p_value.

Example 1: Ljung-Box statistics from short ACF sequence

Inputs:

x			nobs
0.2	0.1	0.05	60

Excel formula:

=Q_STAT({0.2,0.1,0.05}, 60)

Expected output:

Result
1	2.52203	0.112266
2	3.16341	0.205624
3	3.32657	0.343962

Example 2: Ljung-Box statistics from moderate ACF sequence

Inputs:

x				nobs
0.35	0.22	0.11	0.04	120

Excel formula:

=Q_STAT({0.35,0.22,0.11,0.04}, 120)

Expected output:

Result
1	15.0706	0.000103564
2	21.0755	0.0000265167
3	22.5895	0.0000491731
4	22.7915	0.000139371

Example 3: Ljung-Box statistics with alternating autocorrelation signs

Inputs:

x				nobs
0.25	-0.15	0.1	-0.05	80

Excel formula:

=Q_STAT({0.25,-0.15,0.1,-0.05}, 80)

Expected output:

Result
1	5.18987	0.0227189
2	7.08218	0.0289817
3	7.93413	0.0473929
4	8.14992	0.0862383

Example 4: Ljung-Box statistics from five-lag autocorrelation input

Inputs:

x					nobs
0.18	0.12	0.09	0.04	0.02	150

Excel formula:

=Q_STAT({0.18,0.12,0.09,0.04,0.02}, 150)

Expected output:

Result
1	4.95785	0.0259724
2	7.17623	0.0276504
3	8.43256	0.0378689
4	8.68242	0.0695466
5	8.74532	0.119664

Python Code

import numpy as np
from statsmodels.tsa.stattools import q_stat as sm_q_stat

def q_stat(x, nobs):
    """
    Compute Ljung-Box Q statistics and p-values from autocorrelation coefficients.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.q_stat.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
        nobs (int): Total number of observations in the underlying sample.

    Returns:
        list[list]: 2D table with columns lag, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if nobs <= 1:
            return "Error: nobs must be greater than 1"

        acf_vals = to1d(x)
        if len(acf_vals) == 0:
            return "Error: x must contain at least one numeric autocorrelation value"

        q_vals, p_vals = sm_q_stat(np.asarray(acf_vals, dtype=float), nobs=nobs)

        q_arr = np.asarray(q_vals, dtype=float)
        p_arr = np.asarray(p_vals, dtype=float)

        return [[lag + 1, float(q_arr[lag]), float(p_arr[lag])] for lag in range(len(q_arr))]
    except Exception as e:
        return f"Error: {str(e)}"

RURTEST

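This function applies the Range Unit-Root (RUR) test, a range-based test of the unit-root null hypothesis against a stationary alternative. Small values of the statistic count against the unit root, which is why the stationary-looking series in the examples below receives the smallest p-value.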

It was proposed by Aparicio, Escribano, and Sipols (2006) as an alternative unit-root diagnostic intended to be robust to nonlinearities, error distributions, structural breaks, and outliers.

The output includes the RUR statistic, p-value, and critical values used for interpretation.
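
The range-based statistic itself is simple to sketch: count how many times a new observation widens the running range (a new maximum or a new minimum) and divide by the square root of the sample size. The helper name rur_statistic is hypothetical, and this is an illustrative reading of the statistic rather than the library code. P-values and critical values still come from the tabulated distribution used by statsmodels.

```python
import math


def rur_statistic(x):
    """Number of running-range expansions, scaled by sqrt(n)."""
    running_max = running_min = x[0]
    expansions = 0
    for value in x[1:]:
        if value > running_max:   # new running maximum
            running_max = value
            expansions += 1
        if value < running_min:   # new running minimum
            running_min = value
            expansions += 1
    return expansions / math.sqrt(len(x))


flat = [5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02,
        4.99, 5.01, 5, 5.03, 4.97, 5, 5.01, 4.99, 5.02, 5,
        4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5]
rising = list(range(30))

flat_stat = rur_statistic(flat)
rising_stat = rur_statistic(rising)
print(round(flat_stat, 6), round(rising_stat, 5))
```

The first series is the one from Example 2, where the range widens only five times. A strictly increasing series sets a new maximum at every step, which gives the same count (29 out of 30 observations) as the smooth growth sequence in Example 4.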

Excel Usage

=RURTEST(x, store)

x (list[list], required): Time-series observations as a 2D range.
store (bool, optional, default: false): Return result storage object in addition to scalar outputs.

Returns (list[list]): 2D key-value table summarizing RUR test outputs.

Example 1: Range unit-root test on mildly trending data

Inputs:

x																														store
1	1.05	1.02	1.08	1.06	1.1	1.09	1.12	1.1	1.14	1.16	1.18	1.2	1.19	1.22	1.24	1.23	1.27	1.29	1.31	1.3	1.33	1.36	1.35	1.38	1.4	1.42	1.41	1.44	1.46	false

Excel formula:

=RURTEST({1,1.05,1.02,1.08,1.06,1.1,1.09,1.12,1.1,1.14,1.16,1.18,1.2,1.19,1.22,1.24,1.23,1.27,1.29,1.31,1.3,1.33,1.36,1.35,1.38,1.4,1.42,1.41,1.44,1.46}, FALSE)

Expected output:

Result
rur_stat	3.65148
p_value	0.95
critical_10%	1.09624
critical_5%	0.94492
critical_2.5%	0.83556
critical_1%	0.68962

Example 2: Range unit-root test on level-stationary-looking data

Inputs:

x																														store
5	5.02	4.99	5.01	5	5.03	4.98	5.01	5	5.02	4.99	5.01	5	5.03	4.97	5	5.01	4.99	5.02	5	4.98	5.01	5	5.02	4.99	5.01	5	5.03	4.98	5	false

Excel formula:

=RURTEST({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.97,5,5.01,4.99,5.02,5,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.98,5}, FALSE)

Expected output:

Result
rur_stat	0.912871
p_value	0.05
critical_10%	1.09624
critical_5%	0.94492
critical_2.5%	0.83556
critical_1%	0.68962

Example 3: Range unit-root test on higher variance observations

Inputs:

x																														store
2	2.4	2.1	2.5	2.2	2.6	2.3	2.7	2.4	2.8	2.35	2.9	2.45	2.95	2.5	3	2.55	3.05	2.6	3.1	2.65	3.15	2.7	3.2	2.75	3.25	2.8	3.3	2.85	3.35	false

Excel formula:

=RURTEST({2,2.4,2.1,2.5,2.2,2.6,2.3,2.7,2.4,2.8,2.35,2.9,2.45,2.95,2.5,3,2.55,3.05,2.6,3.1,2.65,3.15,2.7,3.2,2.75,3.25,2.8,3.3,2.85,3.35}, FALSE)

Expected output:

Result
rur_stat	2.73861
p_value	0.95
critical_10%	1.09624
critical_5%	0.94492
critical_2.5%	0.83556
critical_1%	0.68962

Example 4: Range unit-root test on smooth growth sequence

Inputs:

x																														store
3	3.05	3.1	3.16	3.2	3.25	3.31	3.36	3.4	3.45	3.5	3.55	3.6	3.66	3.7	3.75	3.81	3.86	3.9	3.95	4	4.05	4.1	4.16	4.2	4.25	4.31	4.36	4.4	4.45	false

Excel formula:

=RURTEST({3,3.05,3.1,3.16,3.2,3.25,3.31,3.36,3.4,3.45,3.5,3.55,3.6,3.66,3.7,3.75,3.81,3.86,3.9,3.95,4,4.05,4.1,4.16,4.2,4.25,4.31,4.36,4.4,4.45}, FALSE)

Expected output:

Result
rur_stat	5.29465
p_value	0.95
critical_10%	1.09624
critical_5%	0.94492
critical_2.5%	0.83556
critical_1%	0.68962

Python Code

import numpy as np
from statsmodels.tsa.stattools import range_unit_root_test as sm_range_unit_root_test

def rurtest(x, store=False):
    """
    Run the range unit-root test as an alternative stationarity diagnostic.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.range_unit_root_test.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing RUR test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        series = to1d(x)
        if len(series) < 25:
            return "Error: x must contain at least 25 numeric values"

        result = sm_range_unit_root_test(np.asarray(series, dtype=float), store=store)

        rur_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]

        rows = [
            ["rur_stat", rur_stat],
            ["p_value", p_value],
        ]

        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

ZIVOT_ANDREWS

This function performs the Zivot-Andrews unit-root test, which extends unit-root diagnostics by allowing a single unknown structural break in the series.

The test evaluates a unit-root null against alternatives with break-adjusted deterministic components.

It returns the test statistic, p-value, critical values, selected lag, and estimated break index.
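
Reading the output follows the usual left-tailed unit-root logic: the null is rejected at a given level when the statistic is more negative than that level's critical value. The sketch below uses a hypothetical helper, rejected_levels, on the values from Example 3. Returning an empty list for a NaN statistic is an illustrative choice rather than library behavior.

```python
import math


def rejected_levels(stat, critical_values):
    """Levels at which a left-tailed unit-root null is rejected."""
    if math.isnan(stat):
        # A NaN statistic means the test could not be evaluated;
        # reporting no rejections here is an assumption of this sketch.
        return []
    return [level for level, cv in critical_values.items() if stat < cv]


critical = {"1%": -5.57556, "5%": -5.07332, "10%": -4.82668}
levels = rejected_levels(-5.0866, critical)
print(levels)
```

For Example 3 the statistic is below the 5% and 10% critical values but not the 1% value, which matches a p-value slightly under 0.05.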

Excel Usage

=ZIVOT_ANDREWS(x, trim, maxlag, regression, autolag)

x (list[list], required): Time-series observations as a 2D range.
trim (float, optional, default: 0.15): Fraction of observations trimmed from each end when searching for break date.
maxlag (int, optional, default: null): Maximum lag included in candidate regressions; null uses default rule.
regression (str, optional, default: “c”): Deterministic specification, c, t, or ct.
autolag (str, optional, default: “AIC”): Lag-selection criterion (AIC, BIC, t-stat, or none).

Returns (list[list]): 2D key-value table summarizing Zivot-Andrews test outputs.

Example 1: Zivot-Andrews with default options

Inputs:

x															trim	maxlag	regression	autolag
1	1.1	1.2	1.25	1.3	1.35	1.33	1.4	1.45	1.5	1.55	1.58	1.6	1.65	1.7	0.15		c	AIC

Excel formula:

=ZIVOT_ANDREWS({1,1.1,1.2,1.25,1.3,1.35,1.33,1.4,1.45,1.5,1.55,1.58,1.6,1.65,1.7}, 0.15, , "c", "AIC")

Expected output:

Result
za_stat	NaN
p_value	NaN
base_lag	4
break_index	4
critical_1%	-5.27644
critical_5%	-4.81067
critical_10%	-4.56618

Example 2: Zivot-Andrews using trend-only regression

Inputs:

x																														trim	maxlag	regression	autolag
2	2.04	2.08	2.12	2.16	2.2	2.24	2.28	2.32	2.36	2.4	2.44	2.48	2.52	2.56	2.6	2.66	2.72	2.78	2.84	2.9	2.96	3.02	3.08	3.14	3.2	3.26	3.32	3.38	3.44	0.15	1	t	AIC

Excel formula:

=ZIVOT_ANDREWS({2,2.04,2.08,2.12,2.16,2.2,2.24,2.28,2.32,2.36,2.4,2.44,2.48,2.52,2.56,2.6,2.66,2.72,2.78,2.84,2.9,2.96,3.02,3.08,3.14,3.2,3.26,3.32,3.38,3.44}, 0.15, 1, "t", "AIC")

Expected output:

Result
za_stat	NaN
p_value	NaN
base_lag	1
break_index	17
critical_1%	-5.03421
critical_5%	-4.4058
critical_10%	-4.13678

Example 3: Zivot-Andrews using constant and trend regression

Inputs:

x															trim	maxlag	regression	autolag
3	3.05	3.08	3.12	3.2	3.25	3.3	3.38	3.42	3.5	3.55	3.6	3.66	3.72	3.8	0.15	2	ct	t-stat

Excel formula:

=ZIVOT_ANDREWS({3,3.05,3.08,3.12,3.2,3.25,3.3,3.38,3.42,3.5,3.55,3.6,3.66,3.72,3.8}, 0.15, 2, "ct", "t-stat")

Expected output:

Result
za_stat	-5.0866
p_value	0.0485155
base_lag	0
break_index	3
critical_1%	-5.57556
critical_5%	-5.07332
critical_10%	-4.82668

Example 4: Zivot-Andrews with fixed max lag and no autolag

Inputs:

x																														trim	maxlag	regression	autolag
4	4.03	4.07	4.1	4.13	4.17	4.2	4.24	4.27	4.31	4.34	4.38	4.41	4.45	4.48	4.52	4.55	4.6	4.66	4.71	4.77	4.82	4.88	4.93	4.99	5.04	5.1	5.15	5.21	5.26	0.15	1	c	none

Excel formula:

=ZIVOT_ANDREWS({4,4.03,4.07,4.1,4.13,4.17,4.2,4.24,4.27,4.31,4.34,4.38,4.41,4.45,4.48,4.52,4.55,4.6,4.66,4.71,4.77,4.82,4.88,4.93,4.99,5.04,5.1,5.15,5.21,5.26}, 0.15, 1, "c", "none")

Expected output:

Result
za_stat	-1.51838
p_value	0.999
base_lag	1
break_index	17
critical_1%	-5.27644
critical_5%	-4.81067
critical_10%	-4.56618

Python Code

import numpy as np
from statsmodels.tsa.stattools import zivot_andrews as sm_zivot_andrews

def zivot_andrews(x, trim=0.15, maxlag=None, regression='c', autolag='AIC'):
    """
    Run the Zivot-Andrews unit-root test allowing one endogenous structural break.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.zivot_andrews.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        trim (float, optional): Fraction of observations trimmed from each end when searching for break date. Default is 0.15.
        maxlag (int, optional): Maximum lag included in candidate regressions; null uses default rule. Default is None.
        regression (str, optional): Deterministic specification, c, t, or ct. Default is 'c'.
        autolag (str, optional): Lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.

    Returns:
        list[list]: 2D key-value table summarizing Zivot-Andrews test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if trim < 0 or trim > 0.333:
            return "Error: trim must be between 0 and 0.333"

        if regression not in ("c", "t", "ct"):
            return "Error: regression must be 'c', 't', or 'ct'"

        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 10:
            return "Error: x must contain at least ten numeric values"

        result = sm_zivot_andrews(
            np.asarray(series, dtype=float),
            trim=trim,
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
        )

        za_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]
        base_lag = int(result[3])
        break_index = int(result[4])

        rows = [
            ["za_stat", za_stat],
            ["p_value", p_value],
            ["base_lag", base_lag],
            ["break_index", break_index],
        ]

        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"
