Skip to Content

PEARSONR

Overview

The PEARSONR function calculates the Pearson correlation coefficient and the associated p-value for testing non-correlation between two datasets. The Pearson correlation coefficient measures the linear relationship between two datasets, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear correlation. The p-value tests the null hypothesis that the distributions underlying the samples are uncorrelated and normally distributed. For more details, see the scipy.stats.pearsonr documentation.

Compared to Excel, the CORREL and PEARSON functions both calculate the Pearson correlation coefficient, but do not provide the associated p-value. To obtain a p-value in Excel, users must perform additional manual calculations or use the Data Analysis ToolPak. The PEARSONR function in this package provides both the correlation coefficient and the p-value in a single call.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=PEARSONR(x, y)
  • x (2D list, required): Table or array of values (numeric). Must have at least two rows.
  • y (2D list, required): Table or array of values (numeric). Must have the same number of rows as x.

The function returns a 2D list with one row and two columns: [correlation, p_value], or an error message (string) if the input is invalid.

Examples

Example 1: Perfect Positive Correlation

Inputs:

xy
12
24
36
48

Excel formula:

=PEARSONR({1;2;3;4}, {2;4;6;8})

Expected output:

Correlationp_value
1.00.0

Example 2: Perfect Negative Correlation

Inputs:

xy
18
26
34
42

Excel formula:

=PEARSONR({1;2;3;4}, {8;6;4;2})

Expected output:

Correlationp_value
-1.00.0

Example 3: No Correlation

Inputs:

xy
15
24
33
42
51

Excel formula:

=PEARSONR({1;2;3;4;5}, {5;4;3;2;1})

Expected output:

Correlationp_value
-1.00.0

Example 4: Real Data Example

Inputs:

xy
110
29
32.5
46
54
63
72

Excel formula:

=PEARSONR({1;2;3;4;5;6;7}, {10;9;2.5;6;4;3;2})

Expected output:

Correlationp_value
-0.82850388358842770.021280260007523342

Python Code

from scipy.stats import pearsonr as scipy_pearsonr def pearsonr(x, y): """ Calculate the Pearson correlation coefficient and p-value for two datasets. Args: x: 2D list of values (numeric), must have at least two rows. y: 2D list of values (numeric), must have the same number of rows as x. Returns: 2D list [[correlation, p_value]] or an error message (str) if input is invalid. This example function is provided as-is without any representation of accuracy. """ # Ensure x and y are lists and have at least two rows if not isinstance(x, list) or not isinstance(y, list): return "Invalid input: x and y must be 2D lists." # If input is a scalar, wrap in 2D list if not isinstance(x[0], list): x = [[v] for v in x] if isinstance(x, list) else [[x]] if not isinstance(y[0], list): y = [[v] for v in y] if isinstance(y, list) else [[y]] if len(x) < 2 or len(y) < 2: return "Invalid input: x and y must have at least two rows." if len(x) != len(y): return "Invalid input: x and y must have the same number of rows." try: x_flat = [float(row[0]) for row in x] y_flat = [float(row[0]) for row in y] except Exception: return "Invalid input: x and y must contain numeric values." try: result = scipy_pearsonr(x_flat, y_flat) corr = float(result.statistic) pval = float(result.pvalue) # Disallow nan/inf if any([c is None or isinstance(c, str) or c != c or abs(c) == float('inf') for c in [corr, pval]]): return "Invalid result: correlation or p-value is not finite." return [[corr, pval]] except Exception as e: return f"scipy.stats.pearsonr error: {e}"

Example Workbook

Link to Workbook

Last updated on