Correlation Coefficient Formula & Calculation

TL;DR
This article explains the correlation coefficient formula — Pearson's r — from its definition through the step-by-step calculation, with a worked numerical example and guidance on interpreting the result. You will understand what r measures, how to compute it by hand, and how to avoid the most common misreading of correlation as causation.
By Bhanzu Team. Last updated on May 12, 2026. 4 min read.

The correlation coefficient formula (Pearson's $r$) measures the strength and direction of a linear relationship between two variables, producing a value between $-1$ and $+1$.

Quick Reference:

Pearson correlation formula: $$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

Range: $-1 \leq r \leq 1$

Interpretation: $r = 1$ (perfect positive), $r = -1$ (perfect negative), $r = 0$ (no linear correlation)

Type: Statistical measure — bivariate analysis

Used in: Statistics, data science, economics, psychology, biology, machine learning

Definition

The Pearson correlation coefficient $r$ quantifies how closely two variables $x$ and $y$ vary together in a linear pattern. A positive $r$ means the variables tend to increase together; a negative $r$ means one tends to decrease as the other increases; $r = 0$ means no linear association.

The formula normalises the joint variation (covariance) by the product of the individual standard deviations, ensuring the result always lies in $[-1, 1]$.
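The normalisation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names and the small data set are made up for the example, and the sample (divide by $n - 1$) versions of covariance and standard deviation are assumed. The $n - 1$ factors cancel, so the population versions would give the same $r$.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """Sample covariance of paired data (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def pearson_from_covariance(xs, ys):
    """Pearson's r: covariance divided by the product of standard deviations."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

# Illustrative data (hypothetical): y is an exact increasing linear function of x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
r = pearson_from_covariance(xs, ys)
print(round(r, 6))  # 1.0
```

Because the points lie exactly on a rising straight line, the result is the upper limit of the range, up to floating-point rounding.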

Variable Key

| Symbol | Meaning |
| --- | --- |
| $r$ | Pearson correlation coefficient |
| $n$ | Number of data pairs $(x_i, y_i)$ |
| $x$ | Values of the first variable |
| $y$ | Values of the second variable |
| $\sum xy$ | Sum of products of each $x$–$y$ pair |
| $\sum x^2$ | Sum of squares of $x$ values |
| $\sum y^2$ | Sum of squares of $y$ values |
| $\sum x$ | Sum of all $x$ values |
| $\sum y$ | Sum of all $y$ values |

Alternative Form Using Means

The correlation coefficient can also be written as:

$$r = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sqrt{\sum(x-\bar{x})^2 \cdot \sum(y-\bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the means of $x$ and $y$ respectively. This form makes the intuition clearer: $r$ measures how much $x$ and $y$ deviate from their means in the same direction at the same time.
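A short derivation sketch shows why this form agrees with the sum-based formula. Expanding the deviations with $\bar{x} = \sum x / n$ and $\bar{y} = \sum y / n$ gives:

$$\sum(x-\bar{x})(y-\bar{y}) = \sum xy - n\bar{x}\bar{y} = \frac{n\sum xy - \sum x \sum y}{n}$$

$$\sum(x-\bar{x})^2 = \sum x^2 - n\bar{x}^2 = \frac{n\sum x^2 - (\sum x)^2}{n}$$

The same expansion applies to $\sum(y-\bar{y})^2$. After substitution, the factor $\frac{1}{n}$ in the numerator cancels against $\sqrt{\frac{1}{n} \cdot \frac{1}{n}} = \frac{1}{n}$ in the denominator, which leaves the formula written in terms of $\sum xy$, $\sum x$, $\sum y$, $\sum x^2$ and $\sum y^2$.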

Interpreting The Correlation Coefficient

| Value of $r$ | Interpretation |
| --- | --- |
| $0.9$ to $1.0$ | Very strong positive correlation |
| $0.7$ to $0.9$ | Strong positive correlation |
| $0.5$ to $0.7$ | Moderate positive correlation |
| $0.3$ to $0.5$ | Weak positive correlation |
| $0$ to $0.3$ | Very weak or no linear correlation |
| Negative values | Same scale, opposite direction |
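The scale above can be turned into a small lookup helper. The sketch below is only one possible reading of the table: the function name `interpret_r` is hypothetical, and because adjacent bands share their endpoints, it assumes that a boundary value such as $0.7$ belongs to the stronger band.

```python
def interpret_r(r):
    """Return the verbal label from the interpretation table for a given r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]; check the calculation")
    size = abs(r)
    # Assumption: each band includes its lower endpoint
    if size >= 0.9:
        strength = "very strong"
    elif size >= 0.7:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    elif size >= 0.3:
        strength = "weak"
    else:
        return "very weak or no linear correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

print(interpret_r(0.77))   # strong positive correlation
print(interpret_r(-0.45))  # weak negative correlation
```

Such cut-offs are conventions rather than fixed rules, so the labels are best treated as rough guidance.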

Origin of Correlation Coefficient Formula

Karl Pearson (1857–1936, UK) developed the correlation coefficient in 1895, building on earlier work by Francis Galton (1822–1911, UK). Galton had noticed that unusually tall parents tend to have tall children, though on average closer to the population mean than the parents. He described this as "regression towards mediocrity"; it is now known as regression to the mean. Pearson formalised the underlying idea of co-variation into a precise numerical measure. Their work helped lay the mathematical foundation of modern statistics.

Worked Example of Correlation Coefficient

Find the correlation coefficient for the following data: $(x, y)$: $(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)$.

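| $x$ | $y$ | $xy$ | $x^2$ | $y^2$ |
| --- | --- | --- | --- | --- |
| 1 | 2 | 2 | 1 | 4 |
| 2 | 4 | 8 | 4 | 16 |
| 3 | 5 | 15 | 9 | 25 |
| 4 | 4 | 16 | 16 | 16 |
| 5 | 5 | 25 | 25 | 25 |
| Σ = 15 | Σ = 20 | Σ = 66 | Σ = 55 | Σ = 86 |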

$n = 5$. Applying the formula:

$$r = \frac{5(66) - (15)(20)}{\sqrt{[5(55) - 15^2][5(86) - 20^2]}}$$

$$= \frac{330 - 300}{\sqrt{[275 - 225][430 - 400]}} = \frac{30}{\sqrt{50 \times 30}} = \frac{30}{\sqrt{1500}} \approx \frac{30}{38.73} \approx 0.77$$

Final answer: $r \approx 0.77$ — strong positive correlation.
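The hand calculation can be checked with a short script. The following is a minimal sketch in plain Python that follows the sum-based formula term by term; the function name `pearson_r` and the variable names are chosen for this example only.

```python
from math import sqrt

def pearson_r(pairs):
    """Pearson's r from the sums n, Σx, Σy, Σxy, Σx² and Σy²."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    sum_y2 = sum(y * y for _, y in pairs)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the worked example
data = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
r = pearson_r(data)
print(round(r, 2))  # 0.77
```

Note that the denominator is zero when either variable is constant, so this sketch raises `ZeroDivisionError` in that case; $r$ is undefined for such data.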

Common Confusions With The Correlation Coefficient Formula

Correlation does not imply causation. A high $r$ between two variables does not mean one causes the other. Ice cream sales and drowning rates are strongly correlated (both rise in summer) — the cause is a third variable (hot weather), not a direct link.

The correlation coefficient measures linear relationships only. Two variables can have a strong non-linear relationship (e.g. quadratic) with $r \approx 0$. Always inspect a scatter plot alongside $r$.
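A small illustration of this point, assuming a made-up data set in which $y = x^2$ exactly: the relationship is perfectly predictable, yet it is symmetric about $x = 0$, so the positive and negative contributions cancel. The helper name `pearson_r_from_means` is hypothetical and uses the mean-deviation form of the formula.

```python
from math import sqrt

def pearson_r_from_means(xs, ys):
    """Pearson's r using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x2 = sum((x - mean_x) ** 2 for x in xs)
    dev_y2 = sum((y - mean_y) ** 2 for y in ys)
    return dev_xy / sqrt(dev_x2 * dev_y2)

# Hypothetical data: y depends on x exactly, but not linearly
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # [4, 1, 0, 1, 4]
r = pearson_r_from_means(xs, ys)
print(r)  # 0.0
```

A scatter plot of these five points would show a clear parabola, which $r$ alone cannot reveal.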

$r$ is not a percentage. $r = 0.7$ does not mean "70% correlated." For $r = 0.7$, the coefficient of determination is $r^2 = 0.49$, which means that $49\%$ of the variation in $y$ is explained by its linear relationship with $x$.
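As a further illustration, take the worked example above, where the numerator was $30$ and the quantity under the square root was $1500$. Squaring the unrounded value of $r$ gives:

$$r^2 = \frac{30^2}{1500} = \frac{900}{1500} = 0.6$$

So about $60\%$ of the variation in $y$ in that data set is accounted for by the linear relationship with $x$. Squaring the rounded value $0.77$ would give roughly $0.59$, so it is safer to square before rounding.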


Frequently Asked Questions

What is the correlation coefficient formula?
The correlation coefficient formula (Pearson's $r$) is $r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$. It measures the strength and direction of the linear relationship between two variables, ranging from $-1$ to $+1$.

What does r = 0 mean?
$r = 0$ means there is no linear correlation between the two variables. They may still have a non-linear relationship, so a value of 0 does not mean the variables are independent.

What is the difference between r and r²?
$r$ is the correlation coefficient; $r^2$ is the coefficient of determination. $r^2$ represents the proportion of the variance in one variable that is explained by its linear relationship with the other.

Can the correlation coefficient be greater than 1?
No. By definition $-1 \leq r \leq 1$. A value outside this range indicates a calculation error.
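A brief sketch of why the bound holds, using the form of $r$ based on deviations from the means. Write $a_i = x_i - \bar{x}$ and $b_i = y_i - \bar{y}$ (notation introduced here for convenience). The Cauchy-Schwarz inequality then gives:

$$\left(\sum a_i b_i\right)^2 \leq \sum a_i^2 \cdot \sum b_i^2 \quad\Longrightarrow\quad r^2 = \frac{\left(\sum a_i b_i\right)^2}{\sum a_i^2 \cdot \sum b_i^2} \leq 1$$

Hence $-1 \leq r \leq 1$, with equality only when the deviations are proportional, that is, when all the data points lie exactly on a straight line.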