Skip to main content

Data Analysis Toolpak

AppSource

Data Analysis Toolpak adds the STAT custom function namespace to Excel. It provides Python-backed statistical calculations for probability distributions, descriptive statistics, hypothesis tests, correlation and association testing, regression models, Bayesian analysis, multivariate analysis, survival models, and time-series workflows.

The add-in is designed for spreadsheet-first analysis: keep observations in Excel ranges, call calculations with formulas such as =STAT.DESCRIBE(A2:A101), and use the taskpane when you want guided function discovery, parameter entry, or a static result instead of a live formula.

Warning: Functions are provided as-is for academic and non-commercial use with no warranty whatsoever as per the terms of use. Never rely on results for clinical, financial, safety, compliance, or commercial decisions. Always verify assumptions, inputs, methods, and results independently.

Quick Start

  1. Install Data Analysis Toolpak using your organization's add-in deployment process or a sideloaded manifest, then open Excel.
  2. On the Home ribbon, choose Data Analysis Toolpak to open the taskpane.
  3. Search or browse functions, select a function, fill the required parameters, and insert either a formula or the calculated result.
  4. To type formulas manually, use the STAT namespace: =STAT.FUNCTION_NAME(argument1, argument2, ...).

Use Formulas Directly

Enter functions in cells with the STAT namespace. For example:

=STAT.NORM(1.96, 0, 1, "cdf")

Use normal Excel cell references and ranges for parameters. Table-like parameters use Excel ranges such as A2:C25, which are passed to Python internally as two-dimensional lists. Many functions return scalar values. Functions that produce multiple statistics, model summaries, or tables may spill into adjacent cells or return a structured result, depending on the underlying Python function.

Insert From The Taskpane

Use the taskpane when you want to search the function catalog, inspect parameter descriptions, choose option values, or select worksheet ranges through the UI. The taskpane can insert a live custom function formula or calculate the current result and write the static value into the selected cell.

Select a function Figure 1: Use the taskpane to search and browse the statistical function catalog.

Insert a function Figure 2: Fill parameters and choose to insert a live formula or a static result.

Some functions support insertion as Excel's native =PY(...) formula. Use this path when the taskpane offers it and you want the workbook to use Microsoft Python in Excel rather than the add-in's local Pyodide runtime. The Function Reference table marks native =PY(...) support per function from the add-in metadata.

Insert a py function Figure 3: Some functions support direct insertion as a native Excel =PY formula.

Function Areas

Data Analysis Toolpak includes Python-backed functions across these statistical areas. Use the taskpane search box to find the specific calculation, inspect parameter descriptions, and open detailed function documentation when available.

AreaWhat it covers
Probability distributionsContinuous, discrete, and multivariate distributions, including normal, t, chi-square, F, beta, binomial, Poisson, gamma, lognormal, multinomial, Dirichlet, Wishart, and related distribution methods.
Summary and frequency statisticsDescriptive statistics, geometric and harmonic means, moments, skewness, kurtosis, modes, percentiles, frequency tables, binned statistics, and reliability metrics.
Hypothesis testsOne-sample, paired-sample, independent-sample, association, correlation, normality, variance, goodness-of-fit, ANOVA, and post-hoc testing workflows.
Regression and statistical modelsOLS, WLS, GLS, robust regression, quantile regression, GLM, count models, discrete choice models, mixed effects, survival models, and diagnostics.
Bayesian analysisConjugate-prior updates, credible intervals, posterior summaries, Dirichlet-multinomial calculations, tail probabilities, and weighted posterior summaries.
Multivariate analysisMANOVA, canonical correlation, factor analysis, PCA, multidimensional scaling, random correlation matrices, and multivariate distribution helpers.
Time seriesAutocorrelation, stationarity tests, moving averages, decomposition, spectral analysis, exponential smoothing, ARIMA/SARIMAX forecasts, and order selection.

Common Use Cases

1. Descriptive Statistics For A Metric

Problem: You have a column of sales, response times, or measurements and need a quick statistical profile.

Formula: =STAT.DESCRIBE(A2:A9)

Use when: You want count, central tendency, spread, and shape statistics from one range before moving into charts, tests, or models.

RowValueFormula or actionOutput
212.4Select A2:A9 in the taskpane for STAT.DESCRIBECount, min, max, mean, variance, skewness, kurtosis
310.9Or type =STAT.DESCRIBE(A2:A9)Same spilled summary output
414.8Leave ddof blank for the default degrees-of-freedom behaviorUses metadata default ddof=0
513.2Set bias if you need biased or unbiased shape estimatesAdjusted shape statistics

2. Compare Two Independent Groups

Problem: You ran an A/B test and need to know whether the average metric differs between two independent groups.

Formula: =STAT.TTEST_IND(A2:A8, B2:B8, FALSE, "two-sided")

Use when: You have two independent samples and want a t-test. Set equal_var to FALSE for Welch's t-test when variances may differ.

ObservationVariant AVariant BFormula or actionOutput
118.220.1Select A2:A8 as a, B2:B8 as bTest statistic and p-value
217.922.4Set equal_var to FALSEWelch-style comparison
319.121.8Set ttest_alternative to two-sidedTwo-sided significance test
418.720.9Insert formula for live recalculationUpdates when new observations arrive

3. Probability Distribution Lookup

Problem: You need the probability that a standard normal variable is less than or equal to 1.96.

Formula: =STAT.NORM(1.96, 0, 1, "cdf")

Use when: You need distribution values such as probability density, cumulative probability, inverse CDF, survival probability, mean, variance, standard deviation, or median. Most distribution wrappers use a method parameter such as norm_method, binom_mode, or poisson_mode.

CaseValueParametersFormula or actionOutput
Standard normal CDF1.96loc=0, scale=1, norm_method="cdf"=STAT.NORM(1.96,0,1,"cdf")About 0.975
Normal PDF0loc=0, scale=1, norm_method="pdf"=STAT.NORM(0,0,1,"pdf")About 0.399
Poisson probability3mu=2.5, poisson_mode="pmf"=STAT.POISSON_DIST(3,2.5,"pmf")Probability of exactly 3 events
Binomial probability7n=10, p=0.6, binom_mode="pmf"=STAT.BINOM(7,10,0.6,"pmf")Probability of 7 successes

4. Regression Model From Worksheet Ranges

Problem: You have a response column and one or more predictor columns and want an ordinary least squares model summary.

Formula: =STAT.OLS_REGRESSION(D2:D21, A2:C21)

Use when: You need a model estimate and diagnostics from tabular data already in Excel. Keep y as a single response column and x as one or more predictor columns with matching row counts.

RowPredictor 1Predictor 2ResponseFormula or actionOutput
24.17.232.5Select D2:D21 as y and A2:C21 as xRegression coefficients and model statistics
35.08.136.0Leave fit_intercept blank to use the defaultIncludes an intercept by default
45.87.938.4Set alpha if you need a different confidence levelConfidence intervals adjust
56.29.443.1Insert result instead of formula for a frozen report tableStatic model summary

5. Time-Series Stationarity And Forecasting

Problem: You have monthly demand data and need to check stationarity before forecasting.

Formula: =STAT.ADFULLER(B2:B61)

Use when: You want to diagnose a series before selecting a model. Pair stationarity diagnostics such as ADFULLER or KPSS with forecasting helpers such as SES_FORECAST, HOLT_FORECAST, HW_FORECAST, ARIMA_FORECAST, or SARIMAX_FORECAST.

MonthDemandFormula or actionOutput
Jan120=STAT.ADFULLER(B2:B61)ADF statistic, p-value, and related diagnostics
Feb128=STAT.KPSS(B2:B61)KPSS statistic, p-value, lags, and critical values
Mar131=STAT.HW_FORECAST(B2:B61,6,"add","add",12)Six-step seasonal forecast
Apr125Insert a static result after finalizing assumptionsStable forecast output for a report

6. Troubleshooting Optional Parameters And Table Shapes

Problem: A formula returns an error string, #VALUE!, or output that does not match expectations.

Formula/action: Reopen the function in the taskpane, check required parameters, inspect option names, and verify each range shape.

Use when: A model, test, or distribution wrapper expects a specific table layout, enum value, or non-empty numeric range.

SymptomLikely causeFormula or actionOutput
Error: ...Invalid option text such as a misspelled method valueChoose the value from the taskpane dropdown or type the documented option exactlyFunction recalculates
#VALUE!Input range is empty, ragged, or has unexpected textConfirm ranges are rectangular and numeric where requiredValid range passed to Python
Model returns an error stringResponse and predictor ranges have different row countsAlign y, x, and group ranges before rerunningModel summary output
Result changes too slowlyLarge model or first Python runtime loadInsert static result after checking the outputStable worksheet value

Input And Output Notes

  • Numeric parameters use workbook numbers. Units are not converted automatically.
  • Range parameters with type list[list] should be rectangular Excel ranges. A single row or single column is still passed as a two-dimensional list.
  • Optional parameters are shown in square brackets in signatures. Defaults are shown in the Function Reference table and in the taskpane metadata.
  • Boolean parameters accept Excel TRUE/FALSE values or taskpane controls where available.
  • For option parameters, choose the value from the taskpane dropdown or type the documented option text exactly as shown.
  • Functions built on SciPy, statsmodels, Pingouin, scikit-learn, lifelines, or related Python packages inherit the assumptions and limitations of those libraries.
  • Functions execute through the local Pyodide runtime unless inserted through Excel's native =PY(...) path.

FAQ

Does Data Analysis Toolpak send workbook data to Boardflare?

No. Data Analysis Toolpak calculations run inside Excel using a local Python/Pyodide runtime, and supported calculations can also be inserted as native Excel =PY(...) formulas. Review your organization's Microsoft 365 Python in Excel policy before using the native =PY(...) path.

Why does the first calculation take longer?

The add-in may need to initialize the Python runtime and load packages such as SciPy, statsmodels, Pingouin, scikit-learn, or lifelines the first time a function is used. Later calculations are usually faster once the runtime is warm.

Why do some functions not support native =PY insertion?

Native =PY(...) support is controlled by whether Python in Excel supports the required packages or modules in its execution environment.

What should I do with high-stakes analyses?

DO NOT RELY ON DATA ANALYSIS TOOLPAK FOR HIGH-STAKES ANALYSES. Use it for exploratory, educational, or non-commercial workbook analysis only.

Each function has a link to its Python source code or detailed documentation where available, and users are encouraged to review that source before relying on the result, even for non-critical calculations. Since Data Analysis Toolpak functions wrap established Python packages, it is important to validate the relevant source code, documentation, assumptions, and limitations of the underlying package being called. Neither the wrapper functions nor the underlying Python libraries have been validated for accuracy, reliability, or fitness for any purpose. They are provided "as is" with no warranties of any kind. Wrapper code and underlying Python libraries may be updated, changed, or removed at any time without notice. Use of this add-in is solely at the user's own risk.