Level 2, Volume 1, Quantitative Methods, Reading 11, Correlation & Regression
Five steps to test the significance of Correlation Coefficient (CC)
2.6 Testing the Significance of the Correlation Coefficient
In the previous sections I calculated the correlation between BHP Revenue and Dividends per share/Earnings per share. The correlation appeared only moderate (in the range of 0.5) but rose sharply, to 0.96, once the outlier from year 2009 was removed. Part of assessing a correlation is asking whether it is spurious (e.g. a correlation between two variables that reflects chance relationships in a particular data set).
A test of significance is used to determine whether there is a statistically significant relationship or whether the relationship is spurious. The following steps are suggested:
Step 1: Obtain correlation coefficient (“CC”), sample size and level of significance
You could either be asked to calculate the CC, or you could be given the CC, the sample size and the level of significance.
Step 2: State the hypothesis
Ho: ρ = 0
Ha: ρ ≠ 0
(Because Ha allows ρ to differ from 0 in either direction, this is a two-tailed test.)
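Step 3: Calculate the t-statistic
Use the following formula:
t = r × √(n – 2) / √(1 – r²)
where r is the sample correlation coefficient and n is the sample size.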
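The Step 3 calculation can be sketched in Python as follows. This is a minimal sketch that assumes the standard form of the statistic, t = r × √(n – 2) / √(1 – r²); the function name `correlation_t_stat` is my own label and does not come from the curriculum.

```python
import math


def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for Ho: rho = 0, assuming t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    if n <= 2:
        # Degrees of freedom are n - 2, so at least 3 observations are needed.
        raise ValueError("need n > 2 so that n - 2 degrees of freedom is positive")
    if not -1.0 < r < 1.0:
        # r = +/-1 would put a zero in the denominator.
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
```

The sign of the result follows the sign of r, so a two-tailed test compares its absolute value with the critical value.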
Step 4: Obtain the critical value from Student’s t-distribution
Calculate degrees of freedom = n (sample size) – 2
Choose the level of significance. This is typically set at the 5% and/or 1% level.
Find the corresponding two-tailed critical value in the t-table.
Step 5: Conclude
If the absolute value of the t-statistic from Step 3 is greater than the critical value from Step 4, reject Ho and accept Ha. Otherwise, do not reject Ho.
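The decision rule in Step 5 can be sketched as a small lookup plus a comparison. This is only an illustration: the table `T_CRITICAL` and the function `reject_null` are hypothetical names, and the table holds just the four two-tailed critical values quoted in the examples on this page rather than a full t-table.

```python
# Two-tailed critical values of Student's t, keyed by
# (degrees of freedom, level of significance).
# Illustrative subset only: the four values used in the examples on this page.
T_CRITICAL = {
    (3, 0.05): 3.182,
    (3, 0.01): 5.841,
    (2, 0.05): 4.303,
    (2, 0.01): 9.925,
}


def reject_null(t_stat: float, n: int, alpha: float) -> bool:
    """Return True when Ho: rho = 0 is rejected in a two-tailed test."""
    df = n - 2  # degrees of freedom for the correlation test
    critical = T_CRITICAL[(df, alpha)]  # KeyError if the pair is not in the subset
    return abs(t_stat) > critical
```

Taking the absolute value is what makes the comparison two-tailed: a strongly negative t-statistic rejects Ho just as a strongly positive one does.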
Practical examples
Example 1: Correlation between BHP Revenue & Earnings per ordinary share
In previous examples the CC was calculated as 0.513509. The sample size is 5.
Based on the relatively low correlation and small sample size, one should expect that the correlation is not statistically significant.
Following the steps
Step 1: Obtain correlation coefficient (“CC”), sample size and level of significance
r = 0.513509
n = 5
Level of significance: to be tested at 5% & 1%
Step 2: State the hypothesis
Ho: ρ = 0
Ha: ρ ≠ 0
(Because Ha allows ρ to differ from 0 in either direction, this is a two-tailed test.)
Step 3: Calculate the t-statistic
Use the formula to get the answer. With r = 0.513509 and n = 5, the t-statistic is approximately 1.04.
Step 4: Obtain the critical value from Student’s t-distribution
Calculate degrees of freedom = n (sample size) – 2
= 3
Determine the level of significance.
5% & 1%
For 5%, being a two-tailed test, critical value = 3.182
For 1%, being a two-tailed test, critical value = 5.841
Step 5: Conclude
For both the 5% & 1% levels of significance, the t-statistic is less than the critical value.
We therefore cannot reject Ho, and conclude that there is no statistically significant correlation.
This example also shows the impact of a small sample size and the impact of outliers. I suspect that the correlation would be significant if the sample size were bigger and outliers were removed.
The impact of removing the outlier is tested in the next example.
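Example 1 can be reproduced with a few lines of Python. This is a sketch that assumes the standard statistic t = r × √(n – 2) / √(1 – r²) and takes the two critical values for 3 degrees of freedom directly from Step 4 above; the variable names are my own.

```python
import math

# Inputs from Example 1.
r, n = 0.513509, 5
df = n - 2  # degrees of freedom

# Assumed standard t-statistic for testing Ho: rho = 0.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r**2)

# Two-tailed critical values for df = 3, as quoted in Step 4.
critical = {0.05: 3.182, 0.01: 5.841}

for alpha, t_crit in critical.items():
    decision = "reject Ho" if abs(t_stat) > t_crit else "do not reject Ho"
    print(f"alpha={alpha:.2f}: t={t_stat:.3f}, critical={t_crit}, {decision}")
```

The t-statistic comes out at roughly 1.04, well below both critical values, which matches the conclusion that the correlation is not statistically significant.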
Example 2: Correlation between BHP Revenue & Earnings per ordinary share with the outlier removed
In previous examples the CC was calculated as 0.964825. The sample size is 4.
The correlation is now high, but with a sample size of only 4 it is not obvious whether it is statistically significant.
Following the steps
Step 1: Obtain correlation coefficient (“CC”), sample size and level of significance
r = 0.964825
n = 4
Level of significance: to be tested at 5% & 1%
Step 2: State the hypothesis
Ho: ρ = 0
Ha: ρ ≠ 0
(Because Ha allows ρ to differ from 0 in either direction, this is a two-tailed test.)
Step 3: Calculate the t-statistic
Use the formula to get the answer.
t = 5.188261
Step 4: Obtain the critical value from Student’s t-distribution
Calculate degrees of freedom = n (sample size) – 2
= 2
Determine the level of significance.
5% & 1%
For 5%, being a two-tailed test, critical value = 4.303
For 1%, being a two-tailed test, critical value = 9.925
Step 5: Conclude
The t-statistic of 5.188261 is greater than 4.303 but less than 9.925. We can therefore conclude that there is a statistically significant correlation at the 5% level of significance. We cannot conclude this at the 1% level of significance.
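Another way to read this result is to ask how large |r| must be before it becomes significant for a given sample size. Rearranging the assumed statistic t = r × √(n – 2) / √(1 – r²) for r gives r = t / √(t² + n – 2). The sketch below applies that rearrangement to the two critical values for 2 degrees of freedom; the helper name `min_significant_r` is hypothetical.

```python
import math


def min_significant_r(t_crit: float, n: int) -> float:
    """Smallest |r| whose t-statistic reaches t_crit, from r = t / sqrt(t^2 + n - 2)."""
    return t_crit / math.sqrt(t_crit**2 + (n - 2))


# Inputs from Example 2.
r, n = 0.964825, 4

# t-statistic from the rounded r; it may differ slightly from the 5.188261
# quoted in the text if that figure was computed from an unrounded r.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

r_needed_5pct = min_significant_r(4.303, n)  # two-tailed 5% critical value, df = 2
r_needed_1pct = min_significant_r(9.925, n)  # two-tailed 1% critical value, df = 2

print(f"t = {t_stat:.2f}")
print(f"|r| needed at 5%: {r_needed_5pct:.3f}")
print(f"|r| needed at 1%: {r_needed_1pct:.3f}")
```

With only 4 observations, |r| has to be about 0.95 to be significant at the 5% level and about 0.99 at the 1% level, so r = 0.964825 clears the first threshold but not the second.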