PEARSONR

Overview

The PEARSONR function calculates the Pearson correlation coefficient and the associated p-value for testing non-correlation between two datasets. The Pearson correlation coefficient measures the linear relationship between two datasets, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear correlation. The p-value tests the null hypothesis that the distributions underlying the samples are uncorrelated and normally distributed. For more details, see the scipy.stats.pearsonr documentation .

Compared to Excel, the CORREL and PEARSON functions both calculate the Pearson correlation coefficient, but do not provide the associated p-value. To obtain a p-value in Excel, users must perform additional manual calculations or use the Data Analysis ToolPak. The PEARSONR function in this package provides both the correlation coefficient and the p-value in a single call.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:


=PEARSONR(x, y)

x (2D list, required): Table or array of values (numeric). Must have at least two rows.
y (2D list, required): Table or array of values (numeric). Must have the same number of rows as x.

The function returns a 2D list with one row and two columns: [correlation, p_value], or an error message (string) if the input is invalid.

Examples

Example 1: Perfect Positive Correlation

Inputs:

x	y
1	2
2	4
3	6
4	8

Excel formula:


=PEARSONR({1;2;3;4}, {2;4;6;8})

Expected output:

Correlation	p_value
1.0	0.0

Example 2: Perfect Negative Correlation

Inputs:

x	y
1	8
2	6
3	4
4	2

Excel formula:


=PEARSONR({1;2;3;4}, {8;6;4;2})

Expected output:

Correlation	p_value
-1.0	0.0

Example 3: No Correlation

Inputs:

x	y
1	5
2	4
3	3
4	2
5	1

Excel formula:


=PEARSONR({1;2;3;4;5}, {5;4;3;2;1})

Expected output:

Correlation	p_value
-1.0	0.0

Example 4: Real Data Example

Inputs:

x	y
1	10
2	9
3	2.5
4	6
5	4
6	3
7	2

Excel formula:


=PEARSONR({1;2;3;4;5;6;7}, {10;9;2.5;6;4;3;2})

Expected output:

Correlation	p_value
-0.8285038835884277	0.021280260007523342

Python Code


from scipy.stats import pearsonr as scipy_pearsonr
 
def pearsonr(x, y):
    """
    Calculate the Pearson correlation coefficient and p-value for two datasets.
 
    Args:
        x: 2D list of values (numeric), must have at least two rows.
        y: 2D list of values (numeric), must have the same number of rows as x.
 
    Returns:
        2D list [[correlation, p_value]] or an error message (str) if input is invalid.
 
    This example function is provided as-is without any representation of accuracy.
    """
    # Ensure x and y are lists and have at least two rows
    if not isinstance(x, list) or not isinstance(y, list):
        return "Invalid input: x and y must be 2D lists."
    # If input is a scalar, wrap in 2D list
    if not isinstance(x[0], list):
        x = [[v] for v in x] if isinstance(x, list) else [[x]]
    if not isinstance(y[0], list):
        y = [[v] for v in y] if isinstance(y, list) else [[y]]
    if len(x) < 2 or len(y) < 2:
        return "Invalid input: x and y must have at least two rows."
    if len(x) != len(y):
        return "Invalid input: x and y must have the same number of rows."
    try:
        x_flat = [float(row[0]) for row in x]
        y_flat = [float(row[0]) for row in y]
    except Exception:
        return "Invalid input: x and y must contain numeric values."
    try:
        result = scipy_pearsonr(x_flat, y_flat)
        corr = float(result.statistic)
        pval = float(result.pvalue)
        # Disallow nan/inf
        if any([c is None or isinstance(c, str) or c != c or abs(c) == float('inf') for c in [corr, pval]]):
            return "Invalid result: correlation or p-value is not finite."
        return [[corr, pval]]
    except Exception as e:
        return f"scipy.stats.pearsonr error: {e}"

Example Workbook

Link to Workbook