Autocorrelation And Stationarity Tests

Overview

Autocorrelation and stationarity are central ideas in time-series analysis because they describe how observations depend on prior values and whether those relationships remain stable over time. In practical modeling, many forecasting and inference methods assume some form of weak stationarity, so analysts need reliable diagnostics before fitting ARIMA-style models or interpreting residuals. This category focuses on tools that quantify serial dependence, test unit-root behavior, and evaluate whether remaining autocorrelation is statistically meaningful. For background, see Autocorrelation, Stationary process, and Unit root.

The unifying concepts are lag structure, covariance/correlation decomposition, and hypothesis testing under dependence. Correlation-based diagnostics summarize dependence at lag k using normalized measures such as \rho_k, while covariance-based diagnostics preserve scale in \gamma_k. Stationarity tests then formalize competing hypotheses about persistence: some tests take a unit root as the null, and others take stationarity as the null, so they are best interpreted jointly. Portmanteau statistics aggregate evidence across multiple lags, often using forms like

Q = n(n+2)\sum_{k=1}^{m}\frac{\rho_k^2}{n-k},

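to check whether residual serial correlation remains. Here n is the number of observations, m is the number of lags tested, and \rho_k is the sample autocorrelation at lag k.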
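As a worked check of the statistic above, the following sketch computes sample autocorrelations and accumulates Q lag by lag using NumPy only. The helper names `sample_acf` and `ljung_box_q` are illustrative and are not part of statsmodels; the p-value step (a chi-square tail probability) is left out to keep the example dependency-free.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations rho_0..rho_nlags (denominator n at every lag)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[k:], d[:len(d) - k]) / denom for k in range(nlags + 1)])

def ljung_box_q(x, m):
    """Cumulative Ljung-Box Q statistics for lags 1..m."""
    n = len(x)
    rho = sample_acf(x, m)[1:]          # drop lag 0
    lags = np.arange(1, m + 1)
    return n * (n + 2) * np.cumsum(rho ** 2 / (n - lags))

q = ljung_box_q([1, 0, 1, 0, 1, 0, 1, 0, 1], 4)
print(np.round(q, 4))
```

For this alternating series the cumulative values should agree with the q_stat column shown in ACF Example 3.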

These functions are implemented with statsmodels, especially statsmodels.tsa.stattools, a widely used Python toolkit for econometrics and statistical time-series modeling. The library provides consistent APIs for autocorrelation diagnostics, unit-root testing, and lag-selection options, which makes it suitable for both exploratory workflows and production-grade model validation.

For univariate dependence structure, ACF, ACOVF, and PACF provide complementary views of lag dynamics. ACF reports normalized serial correlation across lags, ACOVF reports scale-preserving autocovariance, and PACF isolates direct lag-k effects after controlling intermediate lags. Together, they support AR/MA order identification, residual checking, and feature engineering for lagged predictors. Q_STAT extends this by converting an autocorrelation sequence into Ljung-Box statistics and p-values to test whether correlation structure remains jointly significant.
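To make the idea of "controlling intermediate lags" concrete, here is a minimal sketch of the Durbin-Levinson recursion, which turns a sequence of autocorrelations into partial autocorrelations. The function name `pacf_from_acf` is hypothetical, and statsmodels offers several PACF estimation methods, so its output need not match this recursion exactly.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations for lags 0..m from autocorrelations rho[0..m]."""
    m = len(rho) - 1
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1} from the previous step
    for k in range(1, m + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Theoretical AR(1) autocorrelations with coefficient 0.5: rho_k = 0.5 ** k
print(pacf_from_acf([1.0, 0.5, 0.25, 0.125]))
```

For an AR(1) process the partial autocorrelation cuts off after lag 1, which is the pattern analysts look for when choosing an AR order.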

For lead-lag relationships between two series, CCF and CCOVF quantify cross-series dependence across offsets. CCOVF measures raw co-movement and is useful when magnitude is important, while CCF divides by the product of the two series' standard deviations so that effect strength can be compared on a common scale. These tools are commonly used in transfer-function modeling, signal alignment, and exploratory checks of whether one process systematically leads or follows another.

For stationarity and unit-root diagnostics, ADFULLER, KPSS, RURTEST, and ZIVOT_ANDREWS cover complementary assumptions. ADFULLER tests a unit-root null against stationarity alternatives, whereas KPSS inverts the null to stationarity, making the pair especially useful for triangulation. RURTEST adds a range-based unit-root diagnostic, and ZIVOT_ANDREWS allows one unknown structural break when testing persistence. In practice, analysts compare outcomes across these tests rather than relying on a single p-value, especially when trend shifts or regime changes may distort standard stationarity conclusions.
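Because the ADF and KPSS nulls point in opposite directions, their p-values can be combined into a simple four-way reading. The helper below is a hypothetical sketch of that triangulation, with an assumed 5% threshold and neutral labels; it is a heuristic, not a formal joint test.

```python
def joint_stationarity_verdict(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF (null: unit root) and KPSS (null: stationarity) p-values."""
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if kpss_rejects_stationarity and not adf_rejects_unit_root:
        return "unit root"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflicting: both nulls rejected"
    return "inconclusive: neither null rejected"

print(joint_stationarity_verdict(0.01, 0.10))
print(joint_stationarity_verdict(0.40, 0.01))
```

The two disagreeing outcomes are the cases where a break-aware test such as ZIVOT_ANDREWS, or a closer look at trend terms, is most informative.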

ACF

This function estimates the autocorrelation function (ACF) of a univariate time series for lag values from 0 up to a selected maximum lag.

The lag-k autocorrelation is the normalized covariance between observations separated by k periods:

\rho_k = \frac{\operatorname{Cov}(x_t, x_{t-k})}{\operatorname{Var}(x_t)}

Optional confidence intervals and Ljung-Box diagnostics can be included to help assess whether observed autocorrelations are statistically significant.
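The definition and the Bartlett interval can be sketched directly in NumPy. The function name `acf_with_bartlett` is illustrative only; it assumes the n-denominator estimator and the Bartlett approximation Var(\rho_k) ≈ (1 + 2\sum_{j<k}\rho_j^2)/n for k ≥ 1.

```python
import numpy as np
from statistics import NormalDist

def acf_with_bartlett(x, nlags, alpha=0.05):
    """Return (rho, lower, upper) for lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    rho = np.array([np.dot(d[k:], d[:n - k]) for k in range(nlags + 1)]) / np.dot(d, d)
    # Bartlett variance: 1/n at lag 1, inflated by earlier squared autocorrelations after that.
    var = np.ones(nlags + 1) / n
    var[0] = 0.0                                   # lag 0 is exactly 1
    var[2:] *= 1.0 + 2.0 * np.cumsum(rho[1:-1] ** 2)
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * np.sqrt(var)
    return rho, rho - half_width, rho + half_width

rho, low, high = acf_with_bartlett([2, 1, 2, 1, 2, 1, 2, 1], nlags=4)
for k in range(len(rho)):
    print(k, round(rho[k], 6), round(low[k], 6), round(high[k], 6))
```

With the alternating series used in Example 2 of this section, the printed bounds should line up with the conf_low and conf_high values listed there.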

Excel Usage

=ACF(x, adjusted, nlags, qstat, fft, alpha, bartlett_confint, missing)
  • x (list[list], required): Time-series observations as a 2D range.
  • adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance normalization.
  • nlags (int, optional, default: null): Maximum lag to compute; null uses the statsmodels default, min(10*log10(nobs), nobs-1).
  • qstat (bool, optional, default: false): Return Ljung-Box Q statistics and p-values for lags above zero.
  • fft (bool, optional, default: true): Use FFT-based computation for improved speed on long series.
  • alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.
  • bartlett_confint (bool, optional, default: true): Use Bartlett formula for confidence interval standard errors.
  • missing (str, optional, default: "none"): Missing-data handling mode: none, raise, conservative, or drop.

Returns (list[list]): 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.

Example 1: ACF for a short increasing series

Inputs:

x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing
{1,2,3,4,5,6} | false | 3 | false | true | (blank) | true | none

Excel formula:

=ACF({1,2,3,4,5,6}, FALSE, 3, FALSE, TRUE, , TRUE, "none")

Expected output:

lag | acf
0 | 1
1 | 0.5
2 | 0.0571429
3 | -0.271429

Example 2: ACF with confidence intervals

Inputs:

x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing
{2,1,2,1,2,1,2,1} | false | 4 | false | true | 0.05 | true | none

Excel formula:

=ACF({2,1,2,1,2,1,2,1}, FALSE, 4, FALSE, TRUE, 0.05, TRUE, "none")

Expected output:

lag | acf | conf_low | conf_high
0 | 1 | 1 | 1
1 | -0.875 | -1.56795 | -0.182048
2 | 0.75 | -0.35248 | 1.85248
3 | -0.625 | -1.95002 | 0.700016
4 | 0.5 | -0.959729 | 1.95973

Example 3: ACF with Ljung-Box statistics

Inputs:

x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing
{1,0,1,0,1,0,1,0,1} | false | 4 | true | true | (blank) | true | none

Excel formula:

=ACF({1,0,1,0,1,0,1,0,1}, FALSE, 4, TRUE, TRUE, , TRUE, "none")

Expected output:

lag | acf | q_stat | p_value
0 | 1 | (blank) | (blank)
1 | -0.888889 | 9.77778 | 0.00176634
2 | 0.772222 | 18.2115 | 0.000111023
3 | -0.666667 | 25.5449 | 0.0000118766
4 | 0.544444 | 31.414 | 0.00000252021

Example 4: ACF using adjusted denominator

Inputs:

x | adjusted | nlags | qstat | fft | alpha | bartlett_confint | missing
{3,4,6,8,7,5,4,6} | true | 3 | false | false | (blank) | true | none

Excel formula:

=ACF({3,4,6,8,7,5,4,6}, TRUE, 3, FALSE, FALSE, , TRUE, "none")

Expected output:

lag | acf
0 | 1
1 | 0.423181
2 | -0.505241
3 | -0.909434

Python Code

import numpy as np
from statsmodels.tsa.stattools import acf as sm_acf

def acf(x, adjusted=False, nlags=None, qstat=False, fft=True, alpha=None, bartlett_confint=True, missing='none'):
    """
    Compute autocorrelation values across lags with optional confidence intervals and Ljung-Box statistics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance normalization. Default is False.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        qstat (bool, optional): Return Ljung-Box Q statistics and p-values for lags above zero. Default is False.
        fft (bool, optional): Use FFT-based computation for improved speed on long series. Default is True.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.
        bartlett_confint (bool, optional): Use Bartlett formula for confidence interval standard errors. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.

    Returns:
        list[list]: 2D table with columns lag, acf, conf_low, conf_high, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"

        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"

        result = sm_acf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            nlags=nlags,
            qstat=qstat,
            fft=fft,
            alpha=alpha,
            bartlett_confint=bartlett_confint,
            missing=missing,
        )

        acf_vals = None
        confint = None
        q_vals = None
        p_vals = None

        if isinstance(result, tuple):
            if len(result) == 4:
                acf_vals, confint, q_vals, p_vals = result
            elif len(result) == 3:
                acf_vals, q_vals, p_vals = result
            elif len(result) == 2:
                acf_vals, confint = result
            else:
                return "Error: Unexpected output format from statsmodels acf"
        else:
            acf_vals = result

        acf_arr = np.asarray(acf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None
        q_arr = np.asarray(q_vals, dtype=float) if q_vals is not None else None
        p_arr = np.asarray(p_vals, dtype=float) if p_vals is not None else None

        table = []
        for lag in range(len(acf_arr)):
            low = ""
            high = ""
            q_stat_val = ""
            p_val = ""

            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])

            if q_arr is not None and lag > 0 and (lag - 1) < len(q_arr):
                q_stat_val = float(q_arr[lag - 1])
            if p_arr is not None and lag > 0 and (lag - 1) < len(p_arr):
                p_val = float(p_arr[lag - 1])

            table.append([lag, float(acf_arr[lag]), low, high, q_stat_val, p_val])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

ACOVF

This function computes the autocovariance function (ACOVF), which describes covariance between a series and lagged versions of itself.

For lag k, autocovariance is defined as:

\gamma_k = \operatorname{Cov}(x_t, x_{t-k})

The output is useful for understanding serial dependence magnitude before normalization into autocorrelation.
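A minimal NumPy sketch of the estimator follows, showing how the adjusted and demean options change the calculation. The function name `autocovariance` is illustrative; it uses a direct sum of lagged products rather than the FFT path.

```python
import numpy as np

def autocovariance(x, adjusted=False, demean=True):
    """Sample autocovariance for lags 0..n-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean() if demean else x
    # Sum of lagged products at each lag k.
    sums = np.array([np.dot(d[k:], d[:n - k]) for k in range(n)])
    # adjusted=True divides by the number of pairs (n - k) instead of n.
    denom = n - np.arange(n) if adjusted else n
    return sums / denom

print(np.round(autocovariance([1, 2, 3, 4, 5, 6]), 5))
print(np.round(autocovariance([2, 2, 3, 3, 4, 4, 5], demean=False)[:4], 5))
```

The two printed rows correspond to the inputs of Examples 1 and 3 in this section, so they can be compared against the expected outputs listed there.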

Excel Usage

=ACOVF(x, adjusted, demean, fft, missing, nlag)
  • x (list[list], required): Time-series observations as a 2D range.
  • adjusted (bool, optional, default: false): Use denominator n-k instead of n in covariance estimation.
  • demean (bool, optional, default: true): Subtract sample mean before covariance estimation.
  • fft (bool, optional, default: true): Use FFT-based convolution.
  • missing (str, optional, default: "none"): Missing-data handling mode: none, raise, conservative, or drop.
  • nlag (int, optional, default: null): Maximum lag to return; null returns full available range.

Returns (list[list]): 2D table with columns lag and acovf.

Example 1: Autocovariance with default settings

Inputs:

x | adjusted | demean | fft | missing | nlag
{1,2,3,4,5,6} | false | true | true | none | (blank)

Excel formula:

=ACOVF({1,2,3,4,5,6}, FALSE, TRUE, TRUE, "none", )

Expected output:

Result
0 2.91667
1 1.45833
2 0.166667
3 -0.791667
4 -1.25
5 -1.04167

Example 2: Autocovariance with adjusted denominator

Inputs:

x | adjusted | demean | fft | missing | nlag
{3,5,4,6,5,7,6} | true | true | true | none | 4

Excel formula:

=ACOVF({3,5,4,6,5,7,6}, TRUE, TRUE, TRUE, "none", 4)

Expected output:

Result
0 1.55102
1 0.115646
2 0.791837
3 -0.80102
4 -0.312925

Example 3: Autocovariance without demeaning

Inputs:

x | adjusted | demean | fft | missing | nlag
{2,2,3,3,4,4,5} | false | false | false | none | 3

Excel formula:

=ACOVF({2,2,3,3,4,4,5}, FALSE, FALSE, FALSE, "none", 3)

Expected output:

Result
0 11.8571
1 9.57143
2 8
3 5.85714

Example 4: Autocovariance limited to selected lags

Inputs:

x | adjusted | demean | fft | missing | nlag
{1,4,2,5,3,6,4,7} | false | true | true | none | 2

Excel formula:

=ACOVF({1,4,2,5,3,6,4,7}, FALSE, TRUE, TRUE, "none", 2)

Expected output:

Result
0 3.5
1 -0.625
2 2

Python Code

import numpy as np
from statsmodels.tsa.stattools import acovf as sm_acovf

def acovf(x, adjusted=False, demean=True, fft=True, missing='none', nlag=None):
    """
    Estimate autocovariance values of a time series across lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acovf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is False.
        demean (bool, optional): Subtract sample mean before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        missing (str, optional): Missing-data handling mode. Default is 'none'.
        nlag (int, optional): Maximum lag to return; null returns full available range. Default is None.

    Returns:
        list[list]: 2D table with columns lag and acovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if missing not in ("none", "raise", "conservative", "drop"):
            return "Error: missing must be one of 'none', 'raise', 'conservative', or 'drop'"

        series = to1d(x)
        if len(series) < 2:
            return "Error: x must contain at least two numeric values"

        acovf_vals = sm_acovf(
            np.asarray(series, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
            missing=missing,
            nlag=nlag,
        )

        arr = np.asarray(acovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"

ADFULLER

This function applies the Augmented Dickey-Fuller (ADF) test to evaluate whether a univariate time series contains a unit root.

The null hypothesis is that the series is non-stationary with a unit root, while the alternative is stationarity.

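It returns the test statistic, p-value, number of lags used, number of observations used in the regression, and critical values for interpretation. When automatic lag selection is enabled, the best information-criterion value (icbest) is also reported.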
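To show what the test statistic measures, here is a minimal sketch of the un-augmented Dickey-Fuller regression with a constant, dx_t = c + \gamma x_{t-1} + e_t, fitted by ordinary least squares. The function name `dickey_fuller_stat` and the simulated series are assumptions for illustration; this is not the statsmodels implementation, which also adds lagged differences and supplies p-values. The statistic must be compared with Dickey-Fuller critical values, not with Student t tables.

```python
import numpy as np

def dickey_fuller_stat(x):
    """t-statistic on gamma in dx_t = c + gamma * x_{t-1} + e_t (no lagged differences)."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ beta
    sigma2 = resid @ resid / (len(dx) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)              # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
random_walk = np.cumsum(shocks)                        # unit-root process
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]              # stationary AR(1)

print("random walk:", dickey_fuller_stat(random_walk))
print("AR(1):      ", dickey_fuller_stat(ar1))
```

The stationary AR(1) series should produce a strongly negative statistic, while the random walk should stay much closer to zero, which is why large negative values count as evidence against a unit root.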

Excel Usage

=ADFULLER(x, maxlag, regression, autolag, store, regresults)
  • x (list[list], required): Time-series observations as a 2D range.
  • maxlag (int, optional, default: null): Maximum lag included in the test regression; null uses the statsmodels default of 12*(nobs/100)^(1/4).
  • regression (str, optional, default: “c”): Deterministic terms included in the test regression (c, ct, ctt, or n).
  • autolag (str, optional, default: “AIC”): Automatic lag-selection criterion (AIC, BIC, t-stat, or none).
  • store (bool, optional, default: false): Return result storage object in addition to scalar outputs.
  • regresults (bool, optional, default: false): Return full regression results when available.

Returns (list[list]): 2D key-value table summarizing ADF statistics and critical values.

Example 1: ADF test with default regression and autolag

Inputs:

x | maxlag | regression | autolag | store | regresults
{1,1.2,1.1,1.3,1.25,1.35,1.3,1.4,1.38,1.45} | (blank) | c | AIC | false | false

Excel formula:

=ADFULLER({1,1.2,1.1,1.3,1.25,1.35,1.3,1.4,1.38,1.45}, , "c", "AIC", FALSE, FALSE)

Expected output:

Result
adf_stat -2.1749
p_value 0.21548
usedlag 3
nobs 6
critical_1% -5.35426
critical_5% -3.64624
critical_10% -2.9012
icbest -37.0965

Example 2: ADF test with constant and trend regression

Inputs:

x | maxlag | regression | autolag | store | regresults
{2,2.1,2.05,2.2,2.15,2.25,2.3,2.35,2.32,2.4} | 2 | ct | BIC | false | false

Excel formula:

=ADFULLER({2,2.1,2.05,2.2,2.15,2.25,2.3,2.35,2.32,2.4}, 2, "ct", "BIC", FALSE, FALSE)

Expected output:

Result
adf_stat -5.66528
p_value 0.0000107685
usedlag 0
nobs 9
critical_1% -5.49966
critical_5% -4.07211
critical_10% -3.4935
icbest -26.0036

Example 3: ADF test using fixed lag count

Inputs:

x | maxlag | regression | autolag | store | regresults
{3,2.9,3.1,3,3.2,3.1,3.3,3.25,3.35,3.3} | 1 | c | none | false | false

Excel formula:

=ADFULLER({3,2.9,3.1,3,3.2,3.1,3.3,3.25,3.35,3.3}, 1, "c", "none", FALSE, FALSE)

Expected output:

Result
adf_stat -1.29158
p_value 0.633016
usedlag 1
nobs 8
critical_1% -4.66519
critical_5% -3.36719
critical_10% -2.80296

Example 4: ADF test without deterministic terms

Inputs:

x | maxlag | regression | autolag | store | regresults
{1,0.95,1.02,0.98,1.01,0.99,1.03,1,1.04,1.01} | 1 | n | AIC | false | false

Excel formula:

=ADFULLER({1,0.95,1.02,0.98,1.01,0.99,1.03,1,1.04,1.01}, 1, "n", "AIC", FALSE, FALSE)

Expected output:

Result
adf_stat 2.85276
p_value 0.999591
usedlag 1
nobs 8
critical_1% -2.90189
critical_5% -1.96617
critical_10% -1.57649
icbest -46.7118

Python Code

import numpy as np
from statsmodels.tsa.stattools import adfuller as sm_adfuller

def adfuller(x, maxlag=None, regression='c', autolag='AIC', store=False, regresults=False):
    """
    Run the Augmented Dickey-Fuller unit-root test for stationarity diagnostics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.adfuller.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        maxlag (int, optional): Maximum lag included in the test regression; null uses default rule. Default is None.
        regression (str, optional): Deterministic terms included in the test regression (c, ct, ctt, or n). Default is 'c'.
        autolag (str, optional): Automatic lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.
        regresults (bool, optional): Return full regression results when available. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing ADF statistics and critical values.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct", "ctt", "n"):
            return "Error: regression must be one of 'c', 'ct', 'ctt', or 'n'"

        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"

        result = sm_adfuller(
            np.asarray(series, dtype=float),
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
            store=store,
            regresults=regresults,
        )

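        adf_stat = float(result[0])
        p_value = float(result[1])

        if isinstance(result[2], dict):
            # With regresults=True, statsmodels returns
            # (statistic, p-value, critical values, results store)
            # instead of the usual tuple, so read lag and nobs from the store.
            crit_values = result[2]
            res_store = result[3] if len(result) > 3 else None
            usedlag = getattr(res_store, "usedlag", None)
            nobs_used = getattr(res_store, "nobs", None)
        else:
            usedlag = result[2]
            nobs_used = result[3]
            crit_values = result[4]

        rows = [
            ["adf_stat", adf_stat],
            ["p_value", p_value],
            ["usedlag", int(usedlag) if usedlag is not None else ""],
            ["nobs", int(nobs_used) if nobs_used is not None else ""],
        ]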

        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        if len(result) > 5 and isinstance(result[5], (int, float, np.floating)):
            rows.append(["icbest", float(result[5])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

CCF

This function estimates the cross-correlation function (CCF) between two univariate time series, reporting how strongly one series is linearly related to lagged values of another.

For lag k, the cross-correlation compares x_{t+k} with y_t after normalization by series variability.

Optional confidence intervals can be returned to assess whether cross-correlations differ materially from zero.
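The lag convention matters when reading lead-lag results, so the sketch below builds two series with a known two-step offset and checks where the cross-correlation peaks. The function name `cross_correlation` and the simulated data are illustrative; the sketch uses the unadjusted (n-denominator) estimator and pairs x_{t+k} with y_t, as described above.

```python
import numpy as np

def cross_correlation(x, y, nlags):
    """Cross-correlation of x_{t+k} with y_t for k = 0..nlags-1 (denominator n)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    dx, dy = x - x.mean(), y - y.mean()
    scale = n * x.std() * y.std()       # normalize by both standard deviations
    return np.array([np.dot(dx[k:], dy[:n - k]) for k in range(nlags)]) / scale

rng = np.random.default_rng(1)
base = rng.standard_normal(202)
y = base[2:]      # y_t = base_{t+2}
x = base[:200]    # x_t = base_t, so x_{t+2} equals y_t
r = cross_correlation(x, y, 6)
best_lag = int(np.argmax(r))
print(best_lag, np.round(r, 3))
```

Here y shows each value two steps before x does, so the peak is expected at k = 2; swapping the arguments would move the relationship to negative lags, which this nonnegative-lag function does not report.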

Excel Usage

=CCF(x, y, adjusted, fft, nlags, alpha)
  • x (list[list], required): First time series as a 2D range.
  • y (list[list], required): Second time series as a 2D range.
  • adjusted (bool, optional, default: true): Use denominator n-k instead of n for normalization.
  • fft (bool, optional, default: true): Use FFT-based convolution.
  • nlags (int, optional, default: null): Number of lags to compute; null returns one value per observation (len(x) lags).
  • alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.

Returns (list[list]): 2D table with columns lag, ccf, conf_low, and conf_high.

Example 1: Cross-correlation with default options

Inputs:

x | y | adjusted | fft | nlags | alpha
{1,2,3,4,5,6} | {1,1,2,3,5,8} | true | true | 5 | (blank)

Excel formula:

=CCF({1,2,3,4,5,6}, {1,1,2,3,5,8}, TRUE, TRUE, 5, )

Expected output:

lag | ccf
0 | 0.938953
1 | 0.359932
2 | -0.166273
3 | -0.625969
4 | -1.09545

Example 2: Cross-correlation with confidence intervals

Inputs:

x | y | adjusted | fft | nlags | alpha
{2,1,2,1,2,1,2,1} | {1,2,1,2,1,2,1,2} | true | true | 6 | 0.05

Excel formula:

=CCF({2,1,2,1,2,1,2,1}, {1,2,1,2,1,2,1,2}, TRUE, TRUE, 6, 0.05)

Expected output:

lag | ccf | conf_low | conf_high
0 | -1 | -1.69295 | -0.307048
1 | 1 | 0.307048 | 1.69295
2 | -1 | -1.69295 | -0.307048
3 | 1 | 0.307048 | 1.69295
4 | -1 | -1.69295 | -0.307048
5 | 1 | 0.307048 | 1.69295

Example 3: Cross-correlation without FFT

Inputs:

x | y | adjusted | fft | nlags | alpha
{3,5,4,6,5,7,6} | {2,4,3,5,4,6,5} | false | false | 5 | (blank)

Excel formula:

=CCF({3,5,4,6,5,7,6}, {2,4,3,5,4,6,5}, FALSE, FALSE, 5, )

Expected output:

lag | ccf
0 | 1
1 | 0.0639098
2 | 0.364662
3 | -0.295113
4 | -0.0864662

Example 4: Cross-correlation using a small lag count

Inputs:

x | y | adjusted | fft | nlags | alpha
{1,3,2,4,3,5,4} | {2,1,3,2,4,3,5} | true | true | 3 | (blank)

Excel formula:

=CCF({1,3,2,4,3,5,4}, {2,1,3,2,4,3,5}, TRUE, TRUE, 3, )

Expected output:

lag | ccf
0 | 0.289474
1 | 0.508772
2 | -0.160526

Python Code

import numpy as np
from statsmodels.tsa.stattools import ccf as sm_ccf

def ccf(x, y, adjusted=True, fft=True, nlags=None, alpha=None):
    """
    Compute cross-correlation between two time series across nonnegative lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n for normalization. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.
        nlags (int, optional): Number of lags to compute; null uses statsmodels default. Default is None.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.

    Returns:
        list[list]: 2D table with columns lag, ccf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)

        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"

        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"

        result = sm_ccf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            fft=fft,
            nlags=nlags,
            alpha=alpha,
        )

        if isinstance(result, tuple):
            ccf_vals, confint = result
        else:
            ccf_vals = result
            confint = None

        ccf_arr = np.asarray(ccf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None

        table = []
        for lag in range(len(ccf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(ccf_arr[lag]), low, high])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

CCOVF

This function computes the cross-covariance function between two univariate series, quantifying linear co-movement across lag offsets.

For lag k, the cross-covariance is:

\gamma_{xy}(k) = \operatorname{Cov}(x_{t+k}, y_t)

The result helps identify lead-lag structure before normalization into cross-correlation.
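The formula can be written out in a few lines of NumPy. In this sketch the function name `cross_covariance` is illustrative, and the sum is computed directly rather than through the FFT path; with adjusted=True each lag is divided by the number of available pairs, n - k, so long lags rest on very few products.

```python
import numpy as np

def cross_covariance(x, y, adjusted=True, demean=True):
    """Sample Cov(x_{t+k}, y_t) for k = 0..n-1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if demean:
        x, y = x - x.mean(), y - y.mean()
    # Pair x shifted forward by k with y at the original positions.
    sums = np.array([np.dot(x[k:], y[:n - k]) for k in range(n)])
    return sums / (n - np.arange(n)) if adjusted else sums / n

print(np.round(cross_covariance([1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7]), 5))
print(np.round(cross_covariance([3, 5, 4, 6, 5, 7], [1, 2, 2, 3, 3, 4], demean=False), 4))
```

The two calls use the inputs of Examples 1 and 2 in this section, so the printed values can be compared with the expected outputs listed there.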

Excel Usage

=CCOVF(x, y, adjusted, demean, fft)
  • x (list[list], required): First time series as a 2D range.
  • y (list[list], required): Second time series as a 2D range.
  • adjusted (bool, optional, default: true): Use denominator n-k instead of n in covariance estimation.
  • demean (bool, optional, default: true): Subtract sample means from both series before covariance estimation.
  • fft (bool, optional, default: true): Use FFT-based convolution.

Returns (list[list]): 2D table with columns lag and ccovf.

Example 1: Cross-covariance with default options

Inputs:

x | y | adjusted | demean | fft
{1,2,3,4,5,6} | {2,3,4,5,6,7} | true | true | true

Excel formula:

=CCOVF({1,2,3,4,5,6}, {2,3,4,5,6,7}, TRUE, TRUE, TRUE)

Expected output:

Result
0 2.91667
1 1.75
2 0.25
3 -1.58333
4 -3.75
5 -6.25

Example 2: Cross-covariance without demeaning

Inputs:

x | y | adjusted | demean | fft
{3,5,4,6,5,7} | {1,2,2,3,3,4} | true | false | true

Excel formula:

=CCOVF({3,5,4,6,5,7}, {1,2,2,3,3,4}, TRUE, FALSE, TRUE)

Expected output:

Result
0 13.6667
1 12.2
2 11.75
3 10
4 9.5
5 7

Example 3: Cross-covariance without FFT

Inputs:

x | y | adjusted | demean | fft
{2,4,3,5,4,6,5} | {1,3,2,4,3,5,4} | false | true | false

Excel formula:

=CCOVF({2,4,3,5,4,6,5}, {1,3,2,4,3,5,4}, FALSE, TRUE, FALSE)

Expected output:

Result
0 1.55102
1 0.0991254
2 0.565598
3 -0.457726
4 -0.134111
5 -0.586006
6 -0.262391

Example 4: Cross-covariance with unadjusted denominator

Inputs:

x | y | adjusted | demean | fft
{1,3,2,4,3,5,4,6} | {2,1,3,2,4,3,5,4} | false | true | true

Excel formula:

=CCOVF({1,3,2,4,3,5,4,6}, {2,1,3,2,4,3,5,4}, FALSE, TRUE, TRUE)

Expected output:

Result
0 0.75
1 1.3125
2 -0.0625
3 0.3125
4 -0.625
5 -0.3125
6 -0.6875
7 -0.3125

Python Code

import numpy as np
from statsmodels.tsa.stattools import ccovf as sm_ccovf

def ccovf(x, y, adjusted=True, demean=True, fft=True):
    """
    Estimate cross-covariance values between two time series across lags.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.ccovf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): First time series as a 2D range.
        y (list[list]): Second time series as a 2D range.
        adjusted (bool, optional): Use denominator n-k instead of n in covariance estimation. Default is True.
        demean (bool, optional): Subtract sample means from both series before covariance estimation. Default is True.
        fft (bool, optional): Use FFT-based convolution. Default is True.

    Returns:
        list[list]: 2D table with columns lag and ccovf.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        x_vals = to1d(x)
        y_vals = to1d(y)

        if len(x_vals) < 2 or len(y_vals) < 2:
            return "Error: x and y must each contain at least two numeric values"

        if len(x_vals) != len(y_vals):
            return "Error: x and y must have the same number of numeric values"

        ccovf_vals = sm_ccovf(
            np.asarray(x_vals, dtype=float),
            np.asarray(y_vals, dtype=float),
            adjusted=adjusted,
            demean=demean,
            fft=fft,
        )

        arr = np.asarray(ccovf_vals, dtype=float)
        return [[lag, float(arr[lag])] for lag in range(len(arr))]
    except Exception as e:
        return f"Error: {str(e)}"

KPSS

This function applies the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for stationarity.

Unlike the ADF test, the KPSS null hypothesis is stationarity (around a level or trend), and the alternative is a unit root.

The output includes the KPSS statistic, p-value, selected lag truncation, and reference critical values.
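For the level-stationarity case, the statistic can be sketched as the sum of squared partial sums of the demeaned series, scaled by n^2 and by a long-run variance estimate with Bartlett weights. The function name `kpss_level_stat` is illustrative, the lag truncation is passed in explicitly, and the p-value lookup against tabulated critical values is omitted.

```python
import numpy as np

def kpss_level_stat(x, nlags):
    """KPSS statistic under a level-stationarity null with a fixed lag truncation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    e = x - x.mean()                    # residuals from a constant-only fit
    s = np.cumsum(e)                    # partial sums of residuals
    eta = np.sum(s ** 2) / n ** 2
    # Long-run variance: lag-0 term plus Bartlett-weighted autocovariance terms.
    lrv = np.dot(e, e)
    for lag in range(1, nlags + 1):
        weight = 1.0 - lag / (nlags + 1.0)
        lrv += 2.0 * weight * np.dot(e[lag:], e[:-lag])
    lrv /= n
    return eta / lrv

series = [3, 3.1, 3, 3.15, 3.05, 3.2, 3.1, 3.25, 3.15, 3.3]
print(round(kpss_level_stat(series, 2), 6))
```

The series and lag count are those of Example 3 in this section, so the printed value can be compared with the kpss_stat reported there.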

Excel Usage

=KPSS(x, regression, nlags, store)
  • x (list[list], required): Time-series observations as a 2D range.
  • regression (str, optional, default: “c”): Null hypothesis type, c for level-stationary or ct for trend-stationary.
  • nlags (str, optional, default: “auto”): Lag selection mode (auto or legacy), or a nonnegative integer lag count given as a number or as text.
  • store (bool, optional, default: false): Return result storage object in addition to scalar outputs.

Returns (list[list]): 2D key-value table summarizing KPSS statistics and critical values.
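For the legacy lag mode, a rule commonly attributed to Schwert (1989) sets the truncation lag from the sample size alone. The sketch below assumes the form ceil(12 (n/100)^(1/4)), capped at n - 1. The name legacy_lags is hypothetical, and the exact rounding used by the library is an assumption that happens to agree with Example 2 below (n = 10, lags = 7).

```python
import math

def legacy_lags(nobs):
    """Sample-size-based truncation lag (assumed form of the 'legacy' rule)."""
    lags = math.ceil(12.0 * (nobs / 100.0) ** 0.25)
    # The lag count is capped so it stays below the number of observations.
    return min(lags, nobs - 1)

print(legacy_lags(10))
```

Because the rule ignores the data, very short series receive a lag count close to the sample size, which is worth keeping in mind when reading results such as those in the examples.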

Example 1: KPSS with automatic lag selection and level stationarity null

Inputs:

x: 1, 1.1, 1.05, 1.08, 1.03, 1.09, 1.04, 1.1, 1.06, 1.11
regression: c
nlags: auto
store: false

Excel formula:

=KPSS({1,1.1,1.05,1.08,1.03,1.09,1.04,1.1,1.06,1.11}, "c", "auto", FALSE)

Expected output:

Result
kpss_stat 0.5
p_value 0.0416667
lags 9
critical_10% 0.347
critical_5% 0.463
critical_2.5% 0.574
critical_1% 0.739

Example 2: KPSS with legacy lag rule and trend stationarity null

Inputs:

x: 2, 2.2, 2.35, 2.5, 2.65, 2.8, 2.95, 3.1, 3.25, 3.4
regression: ct
nlags: legacy
store: false

Excel formula:

=KPSS({2,2.2,2.35,2.5,2.65,2.8,2.95,3.1,3.25,3.4}, "ct", "legacy", FALSE)

Expected output:

Result
kpss_stat 0.358798
p_value 0.01
lags 7
critical_10% 0.119
critical_5% 0.146
critical_2.5% 0.176
critical_1% 0.216

Example 3: KPSS with manually provided lag count

Inputs:

x: 3, 3.1, 3, 3.15, 3.05, 3.2, 3.1, 3.25, 3.15, 3.3
regression: c
nlags: 2
store: false

Excel formula:

=KPSS({3,3.1,3,3.15,3.05,3.2,3.1,3.25,3.15,3.3}, "c", 2, FALSE)

Expected output:

Result
kpss_stat 0.440047
p_value 0.0598935
lags 2
critical_10% 0.347
critical_5% 0.463
critical_2.5% 0.574
critical_1% 0.739

Example 4: KPSS on low-variance stationary-looking series

Inputs:

x: 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02
regression: c
nlags: auto
store: false

Excel formula:

=KPSS({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02}, "c", "auto", FALSE)

Expected output:

Result
kpss_stat 0.326271
p_value 0.1
lags 6
critical_10% 0.347
critical_5% 0.463
critical_2.5% 0.574
critical_1% 0.739

Python Code

import numpy as np
from statsmodels.tsa.stattools import kpss as sm_kpss

def kpss(x, regression='c', nlags='auto', store=False):
    """
    Run the KPSS stationarity test under level or trend null hypotheses.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.kpss.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        regression (str, optional): Null hypothesis type, c for level-stationary or ct for trend-stationary. Default is 'c'.
        nlags (str, optional): Lag selection mode (auto or legacy), or a nonnegative integer lag count given as a number or as text. Default is 'auto'.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing KPSS statistics and critical values.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if regression not in ("c", "ct"):
            return "Error: regression must be 'c' or 'ct'"

        if nlags in ("auto", "legacy"):
            nlags_arg = nlags
        else:
            try:
                parsed = int(float(nlags))
            except (TypeError, ValueError):
                return "Error: nlags must be 'auto', 'legacy', or an integer"
            if parsed < 0:
                return "Error: nlags integer must be nonnegative"
            nlags_arg = parsed

        series = to1d(x)
        if len(series) < 4:
            return "Error: x must contain at least four numeric values"

        result = sm_kpss(
            np.asarray(series, dtype=float),
            regression=regression,
            nlags=nlags_arg,
            store=store,
        )

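        kpss_stat = float(result[0])
        p_value = float(result[1])
        if store:
            # With store=True, statsmodels returns (stat, p_value, crit, store_obj);
            # the selected lag count is then read from the store object.
            crit_values = result[2]
            used_lags = int(result[3].lags)
        else:
            used_lags = int(result[2])
            crit_values = result[3]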

        rows = [
            ["kpss_stat", kpss_stat],
            ["p_value", p_value],
            ["lags", used_lags],
        ]

        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

PACF

This function estimates the partial autocorrelation function (PACF), which measures the direct correlation between x_t and x_{t-k} after controlling for intermediate lags.

In autoregressive model identification, PACF helps indicate candidate model order by highlighting lags with substantial direct dependence.

The PACF at lag k can be interpreted as the final coefficient in an AR(k) regression.
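A sketch of that interpretation, using the Durbin-Levinson recursion to obtain the last AR(k) coefficient from a sequence of autocorrelations. This is the Yule-Walker view of the PACF, so it will not in general match the OLS or Burg estimators offered by the method argument. The name pacf_from_acf is a hypothetical helper for illustration.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations at lags 1..m from autocorrelations at lags 1..m.

    rho[k - 1] holds the autocorrelation at lag k (lag zero is omitted).
    """
    pacf = []
    prev = []  # coefficients of the fitted AR(k - 1) model
    for k in range(1, len(rho) + 1):
        num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
        phi_kk = num / den  # last coefficient of the AR(k) model
        # Update the first k - 1 coefficients, then append the new last one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Autocorrelations 0.5**k are the pattern of an AR(1) process with coefficient 0.5.
print(pacf_from_acf([0.5, 0.25, 0.125]))
```

For this input only the first partial autocorrelation is nonzero, which is the cutoff pattern used to pick an AR order.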

Excel Usage

=PACF(x, nlags, method, alpha)
  • x (list[list], required): Time-series observations as a 2D range.
  • nlags (int, optional, default: null): Maximum lag to compute; null uses statsmodels default.
  • method (str, optional, default: “ywadjusted”): Estimation method for PACF.
  • alpha (float, optional, default: null): Significance level for confidence intervals; null disables intervals.

Returns (list[list]): 2D table with columns lag, pacf, conf_low, and conf_high.
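When alpha is supplied, the conf_low and conf_high values in Example 2 below are consistent with a normal-approximation band of half-width z/sqrt(n) centred on each estimate, where z is the standard normal quantile at 1 - alpha/2. A minimal sketch under that assumption, using only the standard library; pacf_band is a hypothetical helper, not a statsmodels function.

```python
import math
from statistics import NormalDist

def pacf_band(estimate, nobs, alpha=0.05):
    """Return (low, high) for a PACF estimate, assuming standard error 1/sqrt(nobs)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z / math.sqrt(nobs)
    return estimate - half_width, estimate + half_width

# Lag-1 estimate from Example 2, which uses nine observations.
low, high = pacf_band(-0.97561, nobs=9, alpha=0.05)
print(round(low, 5), round(high, 5))
```

Estimates whose band excludes zero are the usual candidates when choosing an AR order. With very short samples the band is wide, so few lags qualify.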

Example 1: PACF using Yule-Walker adjusted method

Inputs:

x: 1, 2, 3, 4, 5, 4, 3, 2
nlags: 4
method: ywadjusted
alpha: (blank)

Excel formula:

=PACF({1,2,3,4,5,4,3,2}, 4, "ywadjusted", )

Expected output:

Result
0 1
1 0.571429
2 -0.649832
3 -0.832527
4 -1.57327

Example 2: PACF with confidence intervals

Inputs:

x: 2, 1, 2, 1, 2, 1, 2, 1, 2
nlags: 4
method: burg
alpha: 0.05

Excel formula:

=PACF({2,1,2,1,2,1,2,1,2}, 4, "burg", 0.05)

Expected output:

Result
0 1 1 1
1 -0.97561 -1.62893 -0.322288
2 1 0.346679 1.65332
3 3.14684e-16 -0.653321 0.653321
4 2.5924e-16 -0.653321 0.653321

Example 3: PACF using OLS adjusted estimator

Inputs:

x: 5, 4, 6, 5, 7, 6, 8, 7, 9
nlags: 3
method: ols-adjusted
alpha: (blank)

Excel formula:

=PACF({5,4,6,5,7,6,8,7,9}, 3, "ols-adjusted", )

Expected output:

Result
0 1
1 0.5625
2 1.28571
3 -0.5

Example 4: PACF using Levinson-Durbin biased estimator

Inputs:

x: 3, 2, 4, 3, 5, 4, 6, 5
nlags: 3
method: ldbiased
alpha: (blank)

Excel formula:

=PACF({3,2,4,3,5,4,6,5}, 3, "ldbiased", )

Expected output:

Result
0 1
1 0.25
2 0.288889
3 -0.346983

Python Code

import numpy as np
from statsmodels.tsa.stattools import pacf as sm_pacf

def pacf(x, nlags=None, method='ywadjusted', alpha=None):
    """
    Compute partial autocorrelation values across lags for lag-order diagnostics.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.pacf.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        nlags (int, optional): Maximum lag to compute; null uses statsmodels default. Default is None.
        method (str, optional): Estimation method for PACF. Default is 'ywadjusted'.
        alpha (float, optional): Significance level for confidence intervals; null disables intervals. Default is None.

    Returns:
        list[list]: 2D table with columns lag, pacf, conf_low, and conf_high.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        valid_methods = (
            "yw", "ywadjusted", "ols", "ols-inefficient", "ols-adjusted",
            "ywm", "ywmle", "ld", "ldadjusted", "ldb", "ldbiased", "burg"
        )
        if method not in valid_methods:
            return "Error: method is not a supported PACF estimator"

        series = to1d(x)
        if len(series) < 3:
            return "Error: x must contain at least three numeric values"

        result = sm_pacf(np.asarray(series, dtype=float), nlags=nlags, method=method, alpha=alpha)

        if isinstance(result, tuple):
            pacf_vals, confint = result
        else:
            pacf_vals = result
            confint = None

        pacf_arr = np.asarray(pacf_vals, dtype=float)
        conf_arr = np.asarray(confint, dtype=float) if confint is not None else None

        table = []
        for lag in range(len(pacf_arr)):
            low = ""
            high = ""
            if conf_arr is not None and lag < len(conf_arr):
                low = float(conf_arr[lag][0])
                high = float(conf_arr[lag][1])
            table.append([lag, float(pacf_arr[lag]), low, high])

        return table
    except Exception as e:
        return f"Error: {str(e)}"

Q_STAT

This function computes the Ljung-Box portmanteau statistic from a sequence of autocorrelation estimates.

The statistic aggregates squared autocorrelations across lags to test whether serial correlation remains in a process:

Q = n(n+2) \sum_{k=1}^{m} \frac{\rho_k^2}{n-k}

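Here n is the number of observations (nobs), \rho_k is the autocorrelation at lag k, and m is the largest lag included. For each lag m, the function returns the cumulative Q-statistic over lags 1 through m and the associated p-value from a chi-square distribution with m degrees of freedom.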
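The formula can be evaluated by hand with the standard library alone. The sketch below accumulates the sum lag by lag and takes each p-value from a chi-square upper tail with as many degrees of freedom as lags included. The names chi2_sf and ljung_box are hypothetical helpers written for illustration; the library wrapper further down remains the reference path.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        a, tail = 1.0, math.exp(-half)             # closed form for df = 2
    # Raise df in steps of 2: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while 2 * a < df:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return tail

def ljung_box(acf_vals, nobs):
    """Return [lag, Q, p_value] rows for autocorrelations at lags 1..m."""
    rows, running = [], 0.0
    for k, rho in enumerate(acf_vals, start=1):
        running += rho * rho / (nobs - k)
        q = nobs * (nobs + 2) * running
        rows.append([k, q, chi2_sf(q, k)])
    return rows

# Inputs from Example 1 below.
for row in ljung_box([0.2, 0.1, 0.05], 60):
    print(row)
```

Because Q is cumulative, it can only grow with the lag, while the p-value may rise again when later autocorrelations are small relative to the added degrees of freedom.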

Excel Usage

=Q_STAT(x, nobs)
  • x (list[list], required): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
  • nobs (int, required): Total number of observations in the underlying sample.

Returns (list[list]): 2D table with columns lag, q_stat, and p_value.

Example 1: Ljung-Box statistics from short ACF sequence

Inputs:

x: 0.2, 0.1, 0.05
nobs: 60

Excel formula:

=Q_STAT({0.2,0.1,0.05}, 60)

Expected output:

Result
1 2.52203 0.112266
2 3.16341 0.205624
3 3.32657 0.343962

Example 2: Ljung-Box statistics from moderate ACF sequence

Inputs:

x: 0.35, 0.22, 0.11, 0.04
nobs: 120

Excel formula:

=Q_STAT({0.35,0.22,0.11,0.04}, 120)

Expected output:

Result
1 15.0706 0.000103564
2 21.0755 0.0000265167
3 22.5895 0.0000491731
4 22.7915 0.000139371

Example 3: Ljung-Box statistics with alternating autocorrelation signs

Inputs:

x: 0.25, -0.15, 0.1, -0.05
nobs: 80

Excel formula:

=Q_STAT({0.25,-0.15,0.1,-0.05}, 80)

Expected output:

Result
1 5.18987 0.0227189
2 7.08218 0.0289817
3 7.93413 0.0473929
4 8.14992 0.0862383

Example 4: Ljung-Box statistics from five-lag autocorrelation input

Inputs:

x: 0.18, 0.12, 0.09, 0.04, 0.02
nobs: 150

Excel formula:

=Q_STAT({0.18,0.12,0.09,0.04,0.02}, 150)

Expected output:

Result
1 4.95785 0.0259724
2 7.17623 0.0276504
3 8.43256 0.0378689
4 8.68242 0.0695466
5 8.74532 0.119664

Python Code

import numpy as np
from statsmodels.tsa.stattools import q_stat as sm_q_stat

def q_stat(x, nobs):
    """
    Compute Ljung-Box Q statistics and p-values from autocorrelation coefficients.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.q_stat.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Autocorrelation coefficients as a 2D range, typically excluding lag zero.
        nobs (int): Total number of observations in the underlying sample.

    Returns:
        list[list]: 2D table with columns lag, q_stat, and p_value.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if nobs <= 1:
            return "Error: nobs must be greater than 1"

        acf_vals = to1d(x)
        if len(acf_vals) == 0:
            return "Error: x must contain at least one numeric autocorrelation value"

        q_vals, p_vals = sm_q_stat(np.asarray(acf_vals, dtype=float), nobs=nobs)

        q_arr = np.asarray(q_vals, dtype=float)
        p_arr = np.asarray(p_vals, dtype=float)

        return [[lag + 1, float(q_arr[lag]), float(p_arr[lag])] for lag in range(len(q_arr))]
    except Exception as e:
        return f"Error: {str(e)}"

RURTEST

This function applies the Range Unit-Root (RUR) test, which tests a stationarity null hypothesis using a range-based statistic.

It is designed as an alternative unit-root diagnostic intended to be robust to nonlinearities, error distributions, structural breaks, and outliers, the features emphasized in its original development by Aparicio, Escribano, and Sipols (2006).

The output includes the RUR statistic, p-value, and critical values used for interpretation.
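One construction that reproduces the statistics in the examples below is to count how many times the running range of the series widens (a new maximum or a new minimum) and divide by the square root of the sample size. The sketch treats that as an assumption about the computation rather than a copy of library code. The name rur_statistic is hypothetical.

```python
import math

def rur_statistic(series):
    """Count how often the running range widens, scaled by sqrt(n)."""
    running_max = running_min = series[0]
    expansions = 0
    for value in series[1:]:
        if value > running_max:
            running_max = value
            expansions += 1
        if value < running_min:
            running_min = value
            expansions += 1
    return expansions / math.sqrt(len(series))

# A strictly increasing series of 30 points sets a new maximum 29 times.
print(round(rur_statistic(list(range(30))), 5))
```

A series that keeps setting new highs or lows, as a trend or random walk does, produces a large statistic, while a series that settles into a fixed band produces a small one. This matches the examples, where the steadily growing series in Example 4 scores 5.29465 and the flat series in Example 2 scores 0.912871.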

Excel Usage

=RURTEST(x, store)
  • x (list[list], required): Time-series observations as a 2D range.
  • store (bool, optional, default: false): Return result storage object in addition to scalar outputs.

Returns (list[list]): 2D key-value table summarizing RUR test outputs.

Example 1: Range unit-root test on mildly trending data

Inputs:

x: 1, 1.05, 1.02, 1.08, 1.06, 1.1, 1.09, 1.12, 1.1, 1.14, 1.16, 1.18, 1.2, 1.19, 1.22, 1.24, 1.23, 1.27, 1.29, 1.31, 1.3, 1.33, 1.36, 1.35, 1.38, 1.4, 1.42, 1.41, 1.44, 1.46
store: false

Excel formula:

=RURTEST({1,1.05,1.02,1.08,1.06,1.1,1.09,1.12,1.1,1.14,1.16,1.18,1.2,1.19,1.22,1.24,1.23,1.27,1.29,1.31,1.3,1.33,1.36,1.35,1.38,1.4,1.42,1.41,1.44,1.46}, FALSE)

Expected output:

Result
rur_stat 3.65148
p_value 0.95
critical_10% 1.09624
critical_5% 0.94492
critical_2.5% 0.83556
critical_1% 0.68962

Example 2: Range unit-root test on level-stationary-looking data

Inputs:

x: 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.97, 5, 5.01, 4.99, 5.02, 5, 4.98, 5.01, 5, 5.02, 4.99, 5.01, 5, 5.03, 4.98, 5
store: false

Excel formula:

=RURTEST({5,5.02,4.99,5.01,5,5.03,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.97,5,5.01,4.99,5.02,5,4.98,5.01,5,5.02,4.99,5.01,5,5.03,4.98,5}, FALSE)

Expected output:

Result
rur_stat 0.912871
p_value 0.05
critical_10% 1.09624
critical_5% 0.94492
critical_2.5% 0.83556
critical_1% 0.68962

Example 3: Range unit-root test on higher variance observations

Inputs:

x: 2, 2.4, 2.1, 2.5, 2.2, 2.6, 2.3, 2.7, 2.4, 2.8, 2.35, 2.9, 2.45, 2.95, 2.5, 3, 2.55, 3.05, 2.6, 3.1, 2.65, 3.15, 2.7, 3.2, 2.75, 3.25, 2.8, 3.3, 2.85, 3.35
store: false

Excel formula:

=RURTEST({2,2.4,2.1,2.5,2.2,2.6,2.3,2.7,2.4,2.8,2.35,2.9,2.45,2.95,2.5,3,2.55,3.05,2.6,3.1,2.65,3.15,2.7,3.2,2.75,3.25,2.8,3.3,2.85,3.35}, FALSE)

Expected output:

Result
rur_stat 2.73861
p_value 0.95
critical_10% 1.09624
critical_5% 0.94492
critical_2.5% 0.83556
critical_1% 0.68962

Example 4: Range unit-root test on smooth growth sequence

Inputs:

x: 3, 3.05, 3.1, 3.16, 3.2, 3.25, 3.31, 3.36, 3.4, 3.45, 3.5, 3.55, 3.6, 3.66, 3.7, 3.75, 3.81, 3.86, 3.9, 3.95, 4, 4.05, 4.1, 4.16, 4.2, 4.25, 4.31, 4.36, 4.4, 4.45
store: false

Excel formula:

=RURTEST({3,3.05,3.1,3.16,3.2,3.25,3.31,3.36,3.4,3.45,3.5,3.55,3.6,3.66,3.7,3.75,3.81,3.86,3.9,3.95,4,4.05,4.1,4.16,4.2,4.25,4.31,4.36,4.4,4.45}, FALSE)

Expected output:

Result
rur_stat 5.29465
p_value 0.95
critical_10% 1.09624
critical_5% 0.94492
critical_2.5% 0.83556
critical_1% 0.68962

Python Code

import numpy as np
from statsmodels.tsa.stattools import range_unit_root_test as sm_range_unit_root_test

def rurtest(x, store=False):
    """
    Run the range unit-root test as an alternative stationarity diagnostic.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.range_unit_root_test.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        store (bool, optional): Return result storage object in addition to scalar outputs. Default is False.

    Returns:
        list[list]: 2D key-value table summarizing RUR test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        series = to1d(x)
        if len(series) < 25:
            return "Error: x must contain at least 25 numeric values"

        result = sm_range_unit_root_test(np.asarray(series, dtype=float), store=store)

        rur_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]

        rows = [
            ["rur_stat", rur_stat],
            ["p_value", p_value],
        ]

        if isinstance(crit_values, dict):
            for key in ("10%", "5%", "2.5%", "1%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"

ZIVOT_ANDREWS

This function performs the Zivot-Andrews unit-root test, which extends unit-root diagnostics by allowing a single unknown structural break in the series.

The test evaluates a unit-root null against alternatives with break-adjusted deterministic components.

It returns the test statistic, p-value, critical values, selected lag, and estimated break index.
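The output table can be read by comparing za_stat with each critical value. The sketch below assumes the usual lower-tail convention for unit-root tests, where more negative statistics are stronger evidence against the unit-root null. The name za_decision is a hypothetical helper that takes the 2D table this function returns and treats a NaN statistic, as in Examples 1 and 2 below, as undetermined.

```python
import math

def za_decision(rows):
    """Map each critical level to True (reject unit root), False, or None if undefined."""
    table = dict(rows)
    stat = table["za_stat"]
    decisions = {}
    for key, critical in table.items():
        if not key.startswith("critical_"):
            continue
        level = key[len("critical_"):]
        # Lower-tail test: reject when the statistic is below the critical value.
        decisions[level] = None if math.isnan(stat) else stat < critical
    return decisions

# Output table from Example 3 below.
example_rows = [
    ["za_stat", -5.0866],
    ["p_value", 0.0485155],
    ["base_lag", 0],
    ["break_index", 3],
    ["critical_1%", -5.57556],
    ["critical_5%", -5.07332],
    ["critical_10%", -4.82668],
]
print(za_decision(example_rows))
```

For Example 3 the statistic falls between the 1% and 5% critical values, which agrees with its reported p-value of about 0.049.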

Excel Usage

=ZIVOT_ANDREWS(x, trim, maxlag, regression, autolag)
  • x (list[list], required): Time-series observations as a 2D range.
  • trim (float, optional, default: 0.15): Fraction of observations trimmed from each end when searching for break date.
  • maxlag (int, optional, default: null): Maximum lag included in candidate regressions; null uses default rule.
  • regression (str, optional, default: “c”): Deterministic specification, c, t, or ct.
  • autolag (str, optional, default: “AIC”): Lag-selection criterion (AIC, BIC, t-stat, or none).

Returns (list[list]): 2D key-value table summarizing Zivot-Andrews test outputs.

Example 1: Zivot-Andrews with default options

Inputs:

x: 1, 1.1, 1.2, 1.25, 1.3, 1.35, 1.33, 1.4, 1.45, 1.5, 1.55, 1.58, 1.6, 1.65, 1.7
trim: 0.15
maxlag: (blank)
regression: c
autolag: AIC

Excel formula:

=ZIVOT_ANDREWS({1,1.1,1.2,1.25,1.3,1.35,1.33,1.4,1.45,1.5,1.55,1.58,1.6,1.65,1.7}, 0.15, , "c", "AIC")

Expected output:

Result
za_stat NaN
p_value NaN
base_lag 4
break_index 4
critical_1% -5.27644
critical_5% -4.81067
critical_10% -4.56618

Example 2: Zivot-Andrews using trend-only regression

Inputs:

x: 2, 2.04, 2.08, 2.12, 2.16, 2.2, 2.24, 2.28, 2.32, 2.36, 2.4, 2.44, 2.48, 2.52, 2.56, 2.6, 2.66, 2.72, 2.78, 2.84, 2.9, 2.96, 3.02, 3.08, 3.14, 3.2, 3.26, 3.32, 3.38, 3.44
trim: 0.15
maxlag: 1
regression: t
autolag: AIC

Excel formula:

=ZIVOT_ANDREWS({2,2.04,2.08,2.12,2.16,2.2,2.24,2.28,2.32,2.36,2.4,2.44,2.48,2.52,2.56,2.6,2.66,2.72,2.78,2.84,2.9,2.96,3.02,3.08,3.14,3.2,3.26,3.32,3.38,3.44}, 0.15, 1, "t", "AIC")

Expected output:

Result
za_stat NaN
p_value NaN
base_lag 1
break_index 17
critical_1% -5.03421
critical_5% -4.4058
critical_10% -4.13678

Example 3: Zivot-Andrews using constant and trend regression

Inputs:

x: 3, 3.05, 3.08, 3.12, 3.2, 3.25, 3.3, 3.38, 3.42, 3.5, 3.55, 3.6, 3.66, 3.72, 3.8
trim: 0.15
maxlag: 2
regression: ct
autolag: t-stat

Excel formula:

=ZIVOT_ANDREWS({3,3.05,3.08,3.12,3.2,3.25,3.3,3.38,3.42,3.5,3.55,3.6,3.66,3.72,3.8}, 0.15, 2, "ct", "t-stat")

Expected output:

Result
za_stat -5.0866
p_value 0.0485155
base_lag 0
break_index 3
critical_1% -5.57556
critical_5% -5.07332
critical_10% -4.82668

Example 4: Zivot-Andrews with fixed max lag and no autolag

Inputs:

x: 4, 4.03, 4.07, 4.1, 4.13, 4.17, 4.2, 4.24, 4.27, 4.31, 4.34, 4.38, 4.41, 4.45, 4.48, 4.52, 4.55, 4.6, 4.66, 4.71, 4.77, 4.82, 4.88, 4.93, 4.99, 5.04, 5.1, 5.15, 5.21, 5.26
trim: 0.15
maxlag: 1
regression: c
autolag: none

Excel formula:

=ZIVOT_ANDREWS({4,4.03,4.07,4.1,4.13,4.17,4.2,4.24,4.27,4.31,4.34,4.38,4.41,4.45,4.48,4.52,4.55,4.6,4.66,4.71,4.77,4.82,4.88,4.93,4.99,5.04,5.1,5.15,5.21,5.26}, 0.15, 1, "c", "none")

Expected output:

Result
za_stat -1.51838
p_value 0.999
base_lag 1
break_index 17
critical_1% -5.27644
critical_5% -4.81067
critical_10% -4.56618

Python Code

import numpy as np
from statsmodels.tsa.stattools import zivot_andrews as sm_zivot_andrews

def zivot_andrews(x, trim=0.15, maxlag=None, regression='c', autolag='AIC'):
    """
    Run the Zivot-Andrews unit-root test allowing one endogenous structural break.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.zivot_andrews.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        trim (float, optional): Fraction of observations trimmed from each end when searching for break date. Default is 0.15.
        maxlag (int, optional): Maximum lag included in candidate regressions; null uses default rule. Default is None.
        regression (str, optional): Deterministic specification, c, t, or ct. Default is 'c'.
        autolag (str, optional): Lag-selection criterion (AIC, BIC, t-stat, or none). Default is 'AIC'.

    Returns:
        list[list]: 2D key-value table summarizing Zivot-Andrews test outputs.
    """
    try:
        def to1d(values):
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]

            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        if trim < 0 or trim > 0.333:
            return "Error: trim must be between 0 and 0.333"

        if regression not in ("c", "t", "ct"):
            return "Error: regression must be 'c', 't', or 'ct'"

        if autolag in ("none", "None", "", None):
            autolag_arg = None
        elif autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 10:
            return "Error: x must contain at least ten numeric values"

        result = sm_zivot_andrews(
            np.asarray(series, dtype=float),
            trim=trim,
            maxlag=maxlag,
            regression=regression,
            autolag=autolag_arg,
        )

        za_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]
        base_lag = int(result[3])
        break_index = int(result[4])

        rows = [
            ["za_stat", za_stat],
            ["p_value", p_value],
            ["base_lag", base_lag],
            ["break_index", break_index],
        ]

        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])

        return rows
    except Exception as e:
        return f"Error: {str(e)}"
