Skip to Content

PEARSONR

Overview

The PEARSONR function calculates the Pearson correlation coefficient and the associated p-value for testing non-correlation between two datasets. The Pearson correlation coefficient measures the linear relationship between two datasets, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear correlation. The p-value tests the null hypothesis that the distributions underlying the samples are uncorrelated and normally distributed. For more details, see the scipy.stats.pearsonr documentation .

Compared to Excel, the CORREL and PEARSON functions both calculate the Pearson correlation coefficient, but do not provide the associated p-value. To obtain a p-value in Excel, users must perform additional manual calculations or use the Data Analysis ToolPak. The PEARSONR function in this package provides both the correlation coefficient and the p-value in a single call.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=PEARSONR(x, y)
  • x (2D list, required): Table or array of values (numeric). Must have at least two rows.
  • y (2D list, required): Table or array of values (numeric). Must have the same number of rows as x.

The function returns a 2D list with one row and two columns: [correlation, p_value], or an error message (string) if the input is invalid.

Examples

Example 1: Perfect Positive Correlation

Inputs:

xy
12
24
36
48

Excel formula:

=PEARSONR({1;2;3;4}, {2;4;6;8})

Expected output:

Correlationp_value
1.00.0

Example 2: Perfect Negative Correlation

Inputs:

xy
18
26
34
42

Excel formula:

=PEARSONR({1;2;3;4}, {8;6;4;2})

Expected output:

Correlationp_value
-1.00.0

Example 3: No Correlation

Inputs:

xy
15
24
33
42
51

Excel formula:

=PEARSONR({1;2;3;4;5}, {5;4;3;2;1})

Expected output:

Correlationp_value
-1.00.0

Example 4: Real Data Example

Inputs:

xy
110
29
32.5
46
54
63
72

Excel formula:

=PEARSONR({1;2;3;4;5;6;7}, {10;9;2.5;6;4;3;2})

Expected output:

Correlationp_value
-0.82850388358842770.021280260007523342

Python Code

from scipy.stats import pearsonr as scipy_pearsonr from typing import Union, List def pearsonr(x: Union[List[List[float]], float], y: Union[List[List[float]], float]) -> Union[List[List[float]], str]: """ Calculate the Pearson correlation coefficient and p-value for two datasets. Args: x: 2D list of values (numeric), must have at least two rows. y: 2D list of values (numeric), must have the same number of rows as x. Returns: 2D list [[correlation, p_value]] or an error message (str) if input is invalid. This example function is provided as-is without any representation of accuracy. """ # Ensure x and y are lists and have at least two rows if not isinstance(x, list) or not isinstance(y, list): return "Invalid input: x and y must be 2D lists." # If input is a scalar, wrap in 2D list if not isinstance(x[0], list): x = [[v] for v in x] if isinstance(x, list) else [[x]] if not isinstance(y[0], list): y = [[v] for v in y] if isinstance(y, list) else [[y]] if len(x) < 2 or len(y) < 2: return "Invalid input: x and y must have at least two rows." if len(x) != len(y): return "Invalid input: x and y must have the same number of rows." try: x_flat = [float(row[0]) for row in x] y_flat = [float(row[0]) for row in y] except Exception: return "Invalid input: x and y must contain numeric values." try: result = scipy_pearsonr(x_flat, y_flat) corr = float(result.statistic) pval = float(result.pvalue) # Disallow nan/inf if any([c is None or isinstance(c, str) or c != c or abs(c) == float('inf') for c in [corr, pval]]): return "Invalid result: correlation or p-value is not finite." return [[corr, pval]] except Exception as e: return f"scipy.stats.pearsonr error: {e}"

Example Workbook

Link to Workbook 

Last updated on