Sunday, 5 December 2010

Reading 11, 2.6 Five steps to test significance of Correlation Coefficient


Level 2, Volume 1, Quantitative Methods, Reading 11, Correlation & Regression
Five steps to test the significance of Correlation Coefficient (CC)


(The references refer to the CFA textbook. The steps are based on my own understanding of the subject.)

2.6 Testing the Significance of the Correlation Coefficient 

In the previous sections I calculated the correlation between BHP Revenue and Dividends per share/Earnings per share. The correlation is apparently not that strong (in the range of 0.5), but it increases significantly, to 0.96, when the outlier from 2009 is removed. Part of considering correlations relates to whether the correlation is spurious (e.g. a correlation between two variables that reflects chance relationships in a particular data set).


A test of significance is used to determine whether there is a statistically significant relationship or whether the relationship is spurious. The following steps are suggested:

Step 1: Obtain correlation coefficient (“CC”), sample/population size and level of confidence 
You could either be requested to calculate the CC or you could be given the CC, sample size/population size and level of confidence.

Step 2: State the hypothesis 
Ho: ρ = 0
Ha: ρ ≠ 0
(ρ is the population correlation coefficient. Because Ha is ρ ≠ 0, with no direction specified, we are using a two-tailed test!)

Step 3: Calculate t test 
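Use the following formula:

$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} $$

where r is the sample correlation coefficient and n is the sample size.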
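The Step 3 calculation can be sketched in Python. This is only a minimal sketch: it assumes the usual test statistic with n – 2 degrees of freedom, r × √(n – 2) / √(1 – r²), and the function name and the input values are my own, not from the CFA text.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for testing Ho: rho = 0, given sample correlation r and sample size n."""
    if n <= 2:
        raise ValueError("need n > 2 so that the degrees of freedom (n - 2) are positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical inputs: r = 0.6 measured on n = 18 observations.
# By hand: 0.6 * sqrt(16) / sqrt(0.64) = 0.6 * 4 / 0.8 = 3.0
print(round(correlation_t_statistic(0.6, 18), 4))
```

The two guards are there because the formula breaks down when n is 2 or less (no degrees of freedom) or when r is exactly 1 or -1 (division by zero).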







Step 4: Obtain the critical value from Student’s t-distribution 

Calculate degrees of freedom = n (sample size) – 2
Determine the level of significance. This is typically set at the 5% and/or 1% level.
Find the corresponding critical value in the t-table, using the two-tailed column.

Step 5: Conclude 

If the absolute value of the t statistic from Step 3 is greater than the critical value from Step 4, reject Ho and accept Ha. Otherwise, Ho cannot be rejected.
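Steps 4 and 5 can be sketched as a table lookup followed by a comparison. The dictionary below holds a few rows of two-tailed critical values as they appear in standard t-tables; the names CRITICAL_T and reject_null, and the t statistic of 4.13, are my own illustrative choices.

```python
# Two-tailed critical values of Student's t, as printed in standard t-tables.
# Only a few rows are included; extend the dictionary from a full table as needed.
CRITICAL_T = {
    # df: {level of significance: critical value}
    1: {0.05: 12.706, 0.01: 63.657},
    2: {0.05: 4.303, 0.01: 9.925},
    3: {0.05: 3.182, 0.01: 5.841},
    4: {0.05: 2.776, 0.01: 4.604},
    5: {0.05: 2.571, 0.01: 4.032},
}


def reject_null(t_stat: float, n: int, alpha: float) -> bool:
    """Return True when Ho (rho = 0) is rejected in a two-tailed test."""
    df = n - 2                          # Step 4: degrees of freedom
    critical = CRITICAL_T[df][alpha]    # Step 4: look up the critical value
    return abs(t_stat) > critical       # Step 5: compare and conclude


# Hypothetical t statistic of 4.13 from a sample of 6 observations (df = 4).
print(reject_null(4.13, 6, 0.05))  # True:  4.13 > 2.776
print(reject_null(4.13, 6, 0.01))  # False: 4.13 < 4.604
```

Taking the absolute value of the t statistic is what makes the comparison two-tailed: a strongly negative correlation is rejected just as a strongly positive one is.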

Practical examples

Example 1: Correlation between BHP Revenue & Earnings per ordinary share

In previous examples the CC was calculated as 0.513509. The sample size is 5.
Based on the relatively low correlation and small sample size, one should expect that the correlation is not statistically significant.

Following the steps

Step 1: Obtain correlation coefficient (“CC”), sample/population size and level of confidence 
r = 0.513509
n = 5
Level of significance to be tested for 5% & 1%

Step 2: State the hypothesis 
Ho: ρ = 0
Ha: ρ ≠ 0
(ρ is the population correlation coefficient. Because Ha is ρ ≠ 0, with no direction specified, we are using a two-tailed test!)

Step 3: Calculate t test 
Use the formula to get the answer. With r = 0.513509 and n = 5:

t = 0.513509 × √3 / √(1 – 0.513509²) ≈ 1.037

Step 4: Obtain the critical value from Student’s t-distribution
Calculate degrees of freedom = n (sample size) – 2
= 3

Determine the level of significance.
5% & 1%

For 5%, being a two-tailed test, critical value = 3.182
For 1%, being a two-tailed test, critical value = 5.841

Step 5: Conclude 
At both the 5% and 1% levels of significance, the t statistic is less than the critical value, so Ho cannot be rejected.
We therefore conclude that there is no statistically significant correlation.

This example also shows the impact of a small sample size and of outliers. I suspect that the correlation would be significant if the sample size were bigger and the outliers were removed.
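The sample-size effect can be illustrated with a rough what-if calculation: hold r at the Example 1 value and ask how many observations would be needed before the t statistic exceeds the 5% two-tailed critical value. This is purely hypothetical, since more data would almost certainly change r as well; the function name is my own and the critical values are the standard table figures for 1 to 15 degrees of freedom.

```python
import math

# Two-tailed 5% critical values of Student's t for df = 1..15 (standard table values).
CRITICAL_T_5PCT = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306,
                   2.262, 2.228, 2.201, 2.179, 2.160, 2.145, 2.131]


def smallest_significant_n(r: float):
    """Smallest n covered by the table at which a fixed r is significant at 5%."""
    for df, critical in enumerate(CRITICAL_T_5PCT, start=1):
        t_stat = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
        if abs(t_stat) > critical:
            return df + 2  # n = df + 2
    return None  # not significant for any sample size covered by the table


# Assumption: r stays at the Example 1 value while the sample grows.
print(smallest_significant_n(0.513509))
```

Under that assumption the search stops at n = 16, so a correlation of about 0.51 would need roughly three times as many observations as the five used here before it could be called significant at the 5% level.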

The impact of removing the outlier is tested in the next example.


Example 2: Correlation between BHP Revenue & Earnings per ordinary share with the outlier removed 
In previous examples the CC was calculated as 0.964825. The sample size is 4.
Although the correlation is now very high, the sample size is even smaller, so it is not obvious in advance whether the correlation is statistically significant.

Following the steps

Step 1: Obtain correlation coefficient (“CC”), sample/population size and level of confidence
r = 0.964825
n = 4
Level of significance to be tested for 5% & 1%

Step 2: State the hypothesis 
Ho: ρ = 0
Ha: ρ ≠ 0
(ρ is the population correlation coefficient. Because Ha is ρ ≠ 0, with no direction specified, we are using a two-tailed test!)

Step 3: Calculate t test 
Use the formula to get the answer.

t = 5.188261

Step 4: Obtain the critical value from Student’s t-distribution 
Calculate degrees of freedom = n (sample size) – 2
= 2
Determine the level of significance.
5% & 1%

For 5%, being a two-tailed test, critical value = 4.303
For 1%, being a two-tailed test, critical value = 9.925

Step 5: Conclude 
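The t statistic (5.188261) is greater than the 5% critical value (4.303) but less than the 1% critical value (9.925). We can therefore conclude that the correlation is statistically significant at the 5% level of significance. We cannot conclude this at the 1% level of significance.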
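Both examples can be re-run end to end with a short script. This is a sketch under my own naming (significance_test is not from the CFA text), using the rounded r values quoted in the examples and the table values for 2 and 3 degrees of freedom.

```python
import math

# Two-tailed critical values for the degrees of freedom used in the two examples.
CRITICAL_T = {
    2: {0.05: 4.303, 0.01: 9.925},
    3: {0.05: 3.182, 0.01: 5.841},
}


def significance_test(r: float, n: int) -> dict:
    """Run the test and report, per level of significance, whether Ho is rejected."""
    df = n - 2
    t_stat = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    decisions = {alpha: abs(t_stat) > crit for alpha, crit in CRITICAL_T[df].items()}
    return {"t": t_stat, "df": df, "reject_h0": decisions}


for label, r, n in [("Example 1", 0.513509, 5), ("Example 2", 0.964825, 4)]:
    result = significance_test(r, n)
    print(label, round(result["t"], 3), result["reject_h0"])
```

With r rounded to six decimals the script gives a t statistic of about 5.19 for Example 2, marginally different from the 5.188261 quoted above, presumably because the original calculation used an unrounded r. The conclusions are unchanged: Ho is not rejected at either level in Example 1, and is rejected at 5% but not at 1% in Example 2.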


