How to Use the Correlation Coefficient Calculator
This correlation coefficient calculator finds Pearson r and r² from any dataset. Enter your X values in the left text area and the corresponding Y values in the right text area, one value per line or separated by commas. The number of X and Y values must be equal. Once at least 2 pairs are entered, the calculator computes Pearson r, r², the means of X and Y, and a plain-language interpretation of the correlation strength, and the results update automatically as you type. Note that with exactly 2 pairs, r is always +1 or −1 whenever it is defined, so enter more data for a meaningful result.
Both positive and negative values are supported, and there is no limit on sample size. If all X values are identical or all Y values are identical, r is undefined (division by zero in the formula) and the calculator will display an error. For descriptive statistics about your datasets, see our standard deviation calculator. Once you know the correlation is strong, use our linear regression calculator to find the best-fit equation and make predictions.
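The input rules above can be sketched in Python. This is only an illustration, not the calculator's actual code: the function names `parse_values` and `check_pairs` are hypothetical, and the sketch assumes that any mix of commas, newlines, and spaces is accepted as a separator.

```python
import re

def parse_values(text):
    """Turn raw text-area input into a list of floats.

    Assumption: values are separated by commas, newlines, or other whitespace.
    """
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]

def check_pairs(xs, ys):
    """Raise ValueError if r cannot be computed from the two lists."""
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    if len(xs) < 2:
        raise ValueError("at least 2 pairs are required")
    if len(set(xs)) == 1 or len(set(ys)) == 1:
        # The denominator of the Pearson formula would be zero here.
        raise ValueError("r is undefined when all X or all Y values are identical")

xs = parse_values("1, 2, 3\n4, 5")   # commas and newlines mixed
ys = parse_values("2\n4\n5\n4\n5")   # one value per line
check_pairs(xs, ys)                  # passes silently for valid input
```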
The Pearson Correlation Formula
The Pearson correlation coefficient measures the linear association between two variables:
r = [n·Σxy − Σx·Σy] / √[(n·Σx² − (Σx)²)(n·Σy² − (Σy)²)]
An equivalent formula using means is:
r = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / √[Σ(xᵢ − x̄)² × Σ(yᵢ − ȳ)²]
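As a rough sketch, both forms of the formula can be written in a few lines of Python. The function names `pearson_r_sums` and `pearson_r_means` are made up for this example, and neither function guards against a zero denominator.

```python
import math

def pearson_r_sums(xs, ys):
    """Pearson r from raw sums: n, Σx, Σy, Σxy, Σx², Σy²."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def pearson_r_means(xs, ys):
    """Pearson r from deviations about the means x̄ and ȳ."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return numerator / denominator

# The two forms should agree up to floating-point rounding.
sample_x = [1, 2, 3, 4]
sample_y = [2, 4, 6, 8]   # exactly y = 2x, so r should be 1
print(pearson_r_sums(sample_x, sample_y), pearson_r_means(sample_x, sample_y))
```

The mean-based form tends to be numerically safer when the values are large relative to their spread, because it avoids subtracting two large, nearly equal sums.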
Worked Example
Given five (x, y) pairs: (1, 2), (2, 4), (3, 5), (4, 4), (5, 5):
- n = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, Σy² = 86
- Numerator: 5 × 66 − 15 × 20 = 330 − 300 = 30
- Denominator: √[(5 × 55 − 225)(5 × 86 − 400)] = √[(275 − 225)(430 − 400)] = √[50 × 30] = √1500 ≈ 38.73
- r = 30 / 38.73 ≈ 0.7746
r ≈ 0.77 indicates a strong positive linear relationship, and r² ≈ 0.60 means about 60% of the variance in Y is explained by its linear relationship with X.
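The arithmetic in this worked example can be checked with a short script. The variable names are chosen for this illustration only.

```python
import math

pairs = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

n = len(pairs)
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 5 15 20 66 55 86
print(numerator)                                # 30
print(f"{denominator:.2f}")                     # 38.73
print(f"r = {r:.4f}, r^2 = {r ** 2:.2f}")       # r = 0.7746, r^2 = 0.60
```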
Interpreting the Correlation Coefficient
The value of r tells you two things simultaneously: the direction and the strength of the linear relationship.
- Direction: Positive r means Y tends to increase as X increases; negative r means Y tends to decrease as X increases.
- Strength: |r| close to 1 means the data points fall nearly on a straight line; |r| close to 0 means the data is scattered with no linear pattern.
General guidelines for |r| (these vary by field):
- 0.90 – 1.00: Very strong relationship
- 0.70 – 0.89: Strong relationship
- 0.50 – 0.69: Moderate relationship
- 0.30 – 0.49: Weak relationship
- 0.00 – 0.29: Negligible or no linear relationship
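One possible way to turn these guideline bands into a plain-language label is sketched below. The function name `describe_correlation` and the exact wording of the labels are assumptions for this example, and the cut-offs are simply the guideline values listed above, which vary by field.

```python
def describe_correlation(r):
    """Map r to a plain-language label using the guideline bands for |r|."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    strength = abs(r)
    if strength >= 0.90:
        label = "very strong"
    elif strength >= 0.70:
        label = "strong"
    elif strength >= 0.50:
        label = "moderate"
    elif strength >= 0.30:
        label = "weak"
    else:
        # Direction is not meaningful when the relationship is negligible.
        return "negligible or no linear relationship"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction} relationship"

print(describe_correlation(0.7746))  # strong positive relationship
print(describe_correlation(-0.95))   # very strong negative relationship
print(describe_correlation(0.10))    # negligible or no linear relationship
```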
Note that Pearson r only measures linear relationships. Two variables can have a strong curved (non-linear) relationship and still show r ≈ 0. Always plot your data in a scatter plot before relying on r as the sole measure of association.
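A small illustration of this point, using made-up data: the values below follow y = x² exactly, so Y is completely determined by X, yet the deviations above and below the mean of X cancel and r comes out as zero.

```python
import math

# Hypothetical data: a perfect parabola on X values symmetric around 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
denominator = math.sqrt(
    sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
)
r = numerator / denominator

print(r)  # 0.0 despite the exact curved relationship
```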
The Coefficient of Determination (r²)
Squaring the correlation coefficient gives r², the coefficient of determination. This has a direct interpretation: it is the proportion of variance in Y that is explained by a linear relationship with X.
- r = 0.9 → r² = 0.81 → 81% of variance in Y is explained by X
- r = 0.7 → r² = 0.49 → 49% explained
- r = 0.5 → r² = 0.25 → 25% explained
- r = 0.3 → r² = 0.09 → only 9% explained
The remaining proportion (1 − r²) is attributable to other variables, measurement error, or random variation. r² is a standard output of simple linear regression and is commonly reported alongside the regression equation in scientific literature.
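To make "proportion of variance explained" concrete, the sketch below fits a least-squares line to the five pairs from the worked example and compares r² with 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals around the fitted line and SS_tot is the sum of squared deviations of Y from its mean. The least-squares slope and intercept formulas are standard results assumed here rather than taken from this page, and the variable names are illustrative.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)   # this is SS_tot

# Pearson r from the mean-based formula.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line y = intercept + slope * x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Variation in Y left unexplained by the line.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
explained = 1 - ss_res / s_yy

print(f"r^2 = {r ** 2:.4f}, 1 - SS_res/SS_tot = {explained:.4f}")
```

For simple linear regression with an intercept, the two quantities agree up to floating-point rounding.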
Limitations and Common Mistakes
Pearson r has several important limitations to keep in mind:
- Only linear relationships: A perfectly symmetric U-shaped relationship between X and Y would give r ≈ 0, even though a clear pattern exists. Use a scatter plot to check for non-linearity.
- Sensitive to outliers: A single extreme data point can dramatically change r. Always check your data for errors before interpreting the correlation.
- Correlation ≠ causation: A high r tells you the variables move together, not that one causes the other. See the FAQ below for details.
- Not meaningful for non-numeric data: Pearson r requires continuous (or at least interval-scale) data. For ranked data, use Spearman's ρ instead.
- Sample size matters: With n = 5, even r = 0.8 may not be statistically significant. With n = 1000, even r = 0.1 may be statistically significant. Always report the sample size alongside r.
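Two of the points above, outlier sensitivity and sample size, can be illustrated with a short sketch. The data are invented for the example. The significance check uses the standard t statistic t = r·√(n − 2) / √(1 − r²) with n − 2 degrees of freedom, which assumes roughly bivariate-normal data, and the critical values are approximate two-tailed 5% figures from a t table, hard-coded here rather than computed.

```python
import math

def pearson_r(xs, ys):
    """Pearson r using deviations about the means (no zero-denominator guard)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den

def t_statistic(r, n):
    """t statistic for testing whether the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Outlier sensitivity: five weakly related points, then one extreme point added.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]
r_without = pearson_r(xs, ys)
r_with = pearson_r(xs + [20], ys + [20])
print(f"without outlier: r = {r_without:.2f}, with outlier: r = {r_with:.2f}")

# Sample size: the same kind of test gives opposite verdicts.
# Approximate two-tailed 5% critical t values: 3.182 (df = 3), 1.962 (df = 998).
t_small = t_statistic(0.8, 5)
t_large = t_statistic(0.1, 1000)
print(f"n = 5,    r = 0.8: t = {t_small:.2f}, significant: {abs(t_small) > 3.182}")
print(f"n = 1000, r = 0.1: t = {t_large:.2f}, significant: {abs(t_large) > 1.962}")
```

In this example a single added point moves r from a negligible value to a very strong one, and r = 0.8 with n = 5 falls short of the 5% threshold while r = 0.1 with n = 1000 exceeds it.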
For comparing variability within each dataset, our variance calculator shows the individual spread of X and Y values.
Sources & References
- Pearson's Correlation — NIST/SEMATECH e-Handbook of Statistical Methods
- Correlation Coefficients — Khan Academy