Question
The Pearson Correlation Coefficient
The Pearson correlation coefficient (r) is a statistical measure of the correlation, or linear association, between two sets of values. The sample correlation coefficient for two sets of values, x and y, is given by the formula below.
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
- $\bar{x}$ and $\bar{y}$ are the sample means of the two arrays of values
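As a rough illustration, the formula can be written as a short Python function. This is a minimal sketch rather than a reference implementation: the name `pearson_r` and the error handling are assumptions made for this example, and it uses only the standard library.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences.

    `pearson_r` is a hypothetical helper name used only for this sketch.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Numerator: sum of the products of paired deviations from the means.
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    sum_sq_x = sum((xi - x_mean) ** 2 for xi in x)
    sum_sq_y = sum((yi - y_mean) ** 2 for yi in y)
    denominator = math.sqrt(sum_sq_x * sum_sq_y)
    if denominator == 0:
        raise ValueError("r is undefined when either array is constant")
    return numerator / denominator

# Perfectly linear data should give r very close to 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

The function raises an error for a constant array because the denominator is then zero and r is undefined.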
Calculate the Pearson Correlation Coefficient by Hand:
When tackling a formula with this many operations, it is best to break it down into its component parts. Let's split the formula into its separate pieces and complete the math step by step. For the following example calculation, we will determine the correlation coefficient for Mike Smith and Patient 1. Mike Smith is the x array and Patient 1 is the y array.
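First, work through the numerator: $\sum (x_i-\bar{x})(y_i-\bar{y})$
- Step 1: Calculate the mean (average) of each array ($\bar{x}$ and $\bar{y}$).
$\bar{x} = \frac{x_1+x_2+x_3+\dots+x_n}{n}$
$\bar{y} = \frac{y_1+y_2+y_3+\dots+y_n}{n}$
Example:
$\bar{x} = \frac{0+3+3.58+4+3.58+3}{6} = 2.86$
$\bar{y} = \frac{0+1.58+2+2.8+1.58+1}{6} = 1.493$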
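The mean calculation can be checked with a few lines of Python. This is only a sketch: the variable names `mike_smith` and `patient_1` and the helper `mean` are chosen for this example, and the values are the ones used in the worked example.

```python
# Values from the worked example.
mike_smith = [0, 3, 3.58, 4, 3.58, 3]    # x array
patient_1 = [0, 1.58, 2, 2.8, 1.58, 1]   # y array

def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

x_mean = mean(mike_smith)
y_mean = mean(patient_1)

# Round to three decimals to match the precision used in the hand calculation.
print(round(x_mean, 3), round(y_mean, 3))
```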
- Step 2: Calculate the deviations (the difference between each value and the mean) for each gene expression value for both patients.
Example:
| $x_i-\bar{x}$ | $y_i-\bar{y}$ |
| --- | --- |
| $0-2.86 = -2.86$ | $0-1.493 = -1.493$ |
| $3-2.86 = 0.14$ | $1.58-1.493 = 0.087$ |
| $3.58-2.86 = 0.72$ | $2-1.493 = 0.507$ |
| $4-2.86 = 1.14$ | $2.8-1.493 = 1.307$ |
| $3.58-2.86 = 0.72$ | $1.58-1.493 = 0.087$ |
| $3-2.86 = 0.14$ | $1-1.493 = -0.493$ |
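- Step 3: Multiply the paired deviations from the two arrays (these are the products).
$(x_i-\bar{x})(y_i-\bar{y})$
Example (the products are computed from the unrounded deviations, so the last digit can differ slightly from multiplying the rounded values shown):
| $(x_i-\bar{x})(y_i-\bar{y})$ |
| --- |
| $(-2.86)(-1.493) = 4.271$ |
| $(0.14)(0.087) = 0.012$ |
| $(0.72)(0.507) = 0.365$ |
| $(1.14)(1.307) = 1.490$ |
| $(0.72)(0.087) = 0.062$ |
| $(0.14)(-0.493) = -0.069$ |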
- Step 4: Calculate the sum of the products of the deviations (the sum of the values from Step 3).
Example:
$\sum (x_i-\bar{x})(y_i-\bar{y}) = 4.271 + 0.012 + 0.365 + 1.490 + 0.062 + (-0.069) = 6.1308$ (to four decimals, using the unrounded products)
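The numerator steps can be reproduced in Python as a check on the hand arithmetic. This is a sketch under the assumption that the two arrays are plain Python lists; the variable names are illustrative. Because it works with unrounded deviations, individual products may differ in the last digit from values obtained by multiplying rounded numbers.

```python
# Values from the worked example.
x = [0, 3, 3.58, 4, 3.58, 3]       # Mike Smith
y = [0, 1.58, 2, 2.8, 1.58, 1]     # Patient 1

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Deviation of each value from its own array's mean.
x_dev = [xi - x_mean for xi in x]
y_dev = [yi - y_mean for yi in y]

# Product of each pair of deviations.
products = [dx * dy for dx, dy in zip(x_dev, y_dev)]

# The numerator is the sum of those products.
numerator = sum(products)

for dx, dy, p in zip(x_dev, y_dev, products):
    print(f"{dx:7.3f} * {dy:7.3f} = {p:7.3f}")
print(f"numerator = {numerator:.4f}")
```

A useful sanity check is that the deviations in each array sum to zero (up to floating-point error), since they are measured from the mean.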
Now work through the denominator: $\sqrt{\sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2}$
- Step 5: Square each deviation from both arrays.
$(x_i-\bar{x})^2$ and $(y_i-\bar{y})^2$
Example:
| $(x_i-\bar{x})^2$ | $(y_i-\bar{y})^2$ |
| --- | --- |
| $(0-2.86)^2 = (-2.86)^2 = 8.180$ | $(0-1.493)^2 = (-1.493)^2 = 2.230$ |
| $(3-2.86)^2 = 0.14^2 = 0.020$ | $(1.58-1.493)^2 = 0.087^2 = 0.008$ |
| $(3.58-2.86)^2 = 0.72^2 = 0.518$ | $(2-1.493)^2 = 0.507^2 = 0.257$ |
| $(4-2.86)^2 = 1.14^2 = 1.300$ | $(2.8-1.493)^2 = 1.307^2 = 1.707$ |
| $(3.58-2.86)^2 = 0.72^2 = 0.518$ | $(1.58-1.493)^2 = 0.087^2 = 0.008$ |
| $(3-2.86)^2 = 0.14^2 = 0.020$ | $(1-1.493)^2 = (-0.493)^2 = 0.243$ |
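- Step 6: Add up the squared deviations from the previous step, separately for each array.
$\sum (x_i-\bar{x})^2$ and $\sum (y_i-\bar{y})^2$
Example:
| $\sum (x_i-\bar{x})^2$ | $\sum (y_i-\bar{y})^2$ |
| --- | --- |
| $8.180+0.020+0.518+1.300+0.518+0.020 = 10.555$ | $2.230+0.008+0.257+1.707+0.008+0.243 = 4.453$ |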
- Step 7: Multiply the two sums from the previous step.
$\sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2$
Example:
$(10.555)(4.453) \approx 46.997$ (using the unrounded sums, 10.5552 and 4.4525)
- Step 8: Take the square root of the product from the previous step.
$\sqrt{\sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2}$
Example:
$\sqrt{46.997} = 6.855$
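The denominator steps (square, sum, multiply, square root) can be sketched the same way. The variable names here are illustrative, and the code keeps full floating-point precision throughout, so its results agree with the hand calculation only after rounding.

```python
import math

# Values from the worked example.
x = [0, 3, 3.58, 4, 3.58, 3]       # Mike Smith
y = [0, 1.58, 2, 2.8, 1.58, 1]     # Patient 1

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Square each deviation and add the squares up, separately for each array.
sum_sq_x = sum((xi - x_mean) ** 2 for xi in x)
sum_sq_y = sum((yi - y_mean) ** 2 for yi in y)

# Multiply the two sums, then take the square root of the product.
product = sum_sq_x * sum_sq_y
denominator = math.sqrt(product)

print(f"sum of squares (x) = {sum_sq_x:.4f}")
print(f"sum of squares (y) = {sum_sq_y:.4f}")
print(f"product            = {product:.3f}")
print(f"denominator        = {denominator:.3f}")
```

Squaring removes the sign of each deviation, so both sums (and therefore the denominator) are never negative.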
Finally, divide the numerator by the denominator.
- Step 9: Take the sum from Step 4 and divide it by the square root from Step 8.
Example:
$r = \frac{6.1308}{6.855} = 0.894$
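As a final check, the division can be done twice in Python: once with the rounded intermediate values from the hand calculation and once at full precision from the raw data. This is a minimal sketch with illustrative variable names; it is meant to show that rounding the intermediate steps does not change the result at three decimal places for this example.

```python
import math

# Values from the worked example.
x = [0, 3, 3.58, 4, 3.58, 3]       # Mike Smith
y = [0, 1.58, 2, 2.8, 1.58, 1]     # Patient 1

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Full-precision numerator and denominator computed from the raw data.
numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
denominator = math.sqrt(
    sum((xi - x_mean) ** 2 for xi in x) * sum((yi - y_mean) ** 2 for yi in y)
)
r_full = numerator / denominator

# The same division using the rounded values from the hand calculation.
r_hand = 6.1308 / 6.855

print(round(r_full, 3), round(r_hand, 3))
```

A value of r close to 1 indicates a strong positive linear association between the two arrays; r always lies between -1 and 1.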
Calculate the Pearson Correlation Coefficient using Google Sheets:
You can use Google Sheets to work through the same steps as the hand calculation, letting its built-in functions do all of the math for you.