Saturday, 29 January 2011

Reading 12.i Serial correlation


The LOS reads:
"Discuss the types of heteroskedasticity and the effects of heteroskedasticity and serial correlation on statistical inference"

In the previous post, heteroskedasticity was discussed.

As in the previous post, there are 4 important areas that one should master:
1. What is serial correlation?
2. What are the effects of serial correlation?
3. How do you detect serial correlation?
4. How do you correct for serial correlation?

1. What is serial correlation?
In Reading 11 and earlier in Reading 12 the underlying assumptions of the Linear Regression Model were discussed. Assumption 5 relates to Serial Correlation:

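The summary of the definition was: the error term is uncorrelated across observations. Serial correlation (also called autocorrelation) is a violation of this assumption, meaning the regression errors are correlated across observations.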

There are 2 types of Serial Correlation:

Positive
A positive error for one observation increases the chance of a positive error for another observation. Positive serial correlation also means that a negative error for one observation increases the chance of a negative error for another observation.

Positive serial correlation is the type most often found in business-related fields.

Negative
A positive error for one observation increases the chance of a negative error for another observation, and a negative error for one observation increases the chance of a positive error for another.
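
To see the two types side by side, here is a small sketch in Python. It assumes the errors follow a simple first-order process, e_t = rho * e_(t-1) + noise. That process is my own illustration and not something the reading specifies; the function names and the values of rho are hypothetical.

```python
import random

def simulate_errors(rho, n=500, seed=1):
    """Simulate n errors where each error depends on the previous one.

    Assumption (for illustration only): e_t = rho * e_(t-1) + noise.
    """
    rng = random.Random(seed)
    errors = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        errors.append(rho * errors[-1] + rng.gauss(0, 1))
    return errors

def lag1_correlation(errors):
    """Correlation between each error and the one before it."""
    x = errors[:-1]   # errors from the previous period
    y = errors[1:]    # errors from the current period
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# rho > 0: errors tend to keep their sign (positive serial correlation)
positive = lag1_correlation(simulate_errors(rho=0.8))
# rho < 0: errors tend to flip sign (negative serial correlation)
negative = lag1_correlation(simulate_errors(rho=-0.8))
```

With a positive rho the lag-1 correlation comes out positive, and with a negative rho it comes out negative, which matches the two descriptions above.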



2. What are the effects of serial correlation?
Positive serial correlation will result in standard errors that are too small.
The error terms feed into the calculation of the standard error, and the standard error is the denominator of the t statistic. A standard error that is too small therefore makes the t statistic too large.

Thus, positive serial correlation results in t statistics that are too large, with the effect that we reject null hypotheses that we should have failed to reject (Type I error).
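
A quick numerical sketch of that effect. The coefficient, the two standard errors and the critical value of 2.0 are made-up numbers for illustration only, not figures from the reading.

```python
def t_statistic(coefficient, standard_error, hypothesized_value=0.0):
    """t = (estimated coefficient - hypothesized value) / standard error."""
    return (coefficient - hypothesized_value) / standard_error

coefficient = 0.50        # hypothetical slope estimate
correct_se = 0.30         # hypothetical standard error without the bias
understated_se = 0.20     # hypothetical standard error that is too small
critical_value = 2.0      # rough 5% two-tailed cut-off, assumed for illustration

t_correct = t_statistic(coefficient, correct_se)        # about 1.67
t_inflated = t_statistic(coefficient, understated_se)   # about 2.5

# Same coefficient, different conclusions, purely because of the standard error
reject_with_correct_se = abs(t_correct) > critical_value
reject_with_understated_se = abs(t_inflated) > critical_value
```

With the understated standard error the null hypothesis is rejected, while with the correct one it is not. That is the Type I error risk in miniature.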

3. How do you detect it?
3.1 You view a diagram. The textbook has a pretty nice graph that explains it.

3.2 You calculate the Durbin-Watson (DW) statistic
There is a pretty complex formula in the textbook (that I will study maybe), but there is a much simpler formula if the sample size is larger than about 25 to 30 (which is often the case in business).

The formula is:
DW ≈ 2(1 - r)    (NB: "approximately" is important, as one will see later!)

where r = the correlation coefficient between the residuals (errors) from one period and those from the previous period.
We know that correlation is calculated as a number from (and including) -1 to 1.
Perfect positive correlation: r = 1
Perfect negative correlation: r = -1

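Therefore, using the formula:
Positive serial correlation
DW = 2(1 - 1)
   = 2(0)
   = 0
Therefore, perfect positive serial correlation would have a DW score of 0.

No serial correlation
DW = 2(1 - 0)
   = 2(1)
   = 2
Therefore, no serial correlation would have a DW score of 2.

Negative serial correlation
DW = 2(1 - (-1))
   = 2(2)
   = 4
Therefore, perfect negative serial correlation would have a DW score of 4.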

Summary
Positive serial correlation      No serial correlation      Negative serial correlation
            0                              2                             4
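
For reference, the statistic can also be computed straight from the residuals. The sketch below uses the standard Durbin-Watson definition (the sum of squared changes in the residuals divided by the sum of squared residuals), which I believe is the "complex" textbook formula, alongside the 2(1 - r) shortcut. The residual series are made up, and the function names are my own.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_(t-1))^2) / sum(e_t^2)."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def durbin_watson_approx(r):
    """Large-sample shortcut: DW is roughly 2 * (1 - r)."""
    return 2 * (1 - r)

# Hypothetical residuals that flip sign every period (negative serial correlation)
alternating = [1.0, -1.0] * 20
# Hypothetical residuals that stay on one side for long runs (positive serial correlation)
persistent = [1.0] * 20 + [-1.0] * 20

dw_alternating = durbin_watson(alternating)  # close to 4
dw_persistent = durbin_watson(persistent)    # close to 0
```

The alternating series lands near 4 and the persistent series near 0, in line with the summary above.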

But the problem is that in practice the DW statistic is only APPROXIMATELY 0, 2 or 4.
So how close is close enough? For this we again use a statistics table.

One requires the following information:
- DW statistic
- Sample size
- Level of significance (The textbook only gives 5% significance, so I assume the test will always refer to 5%)
- Number of independent variables

Example
- Correlation coefficient between the residual of one observation and the next = 0.87
- Sample size is 30
- Significance is 5%
- Number of independent variables is 2

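Step 1: Calculate the DW score
DW = 2(1 - 0.87)
   = 0.26

Step 2: Use the DW tables to find the lower (Dl) and upper (Du) critical values
Dl = 1.28
Du = 1.57

The summary will look like this:
0        Dl = 1.28        Du = 1.57        2        4 - Du = 2.43        4 - Dl = 2.72        4

This is important:
Between 0 and 1.28 = Positive serial correlation
Between 1.28 and 1.57 = Test is inconclusive
Between 1.57 and 2.43 = No evidence of serial correlation
Between 2.43 and 2.72 = Test is inconclusive
Between 2.72 and 4 = Negative serial correlation

Step 3: Compare the DW score with the ranges
The DW score of 0.26 lies between 0 and 1.28, so we conclude that there is positive serial correlation.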
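
The decision rule can be sketched as a small Python function. The function name is my own, and how values that fall exactly on a boundary are treated is an assumption on my part. The critical values are the ones from the example (sample size 30, 2 independent variables, 5% significance).

```python
def interpret_dw(dw, d_lower, d_upper):
    """Place a DW statistic in one of the five regions between 0 and 4.

    Boundary handling (which side the exact critical value falls on)
    is an assumption made for this sketch.
    """
    if dw < d_lower:
        return "positive serial correlation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4 - d_upper:
        return "no evidence of serial correlation"
    if dw <= 4 - d_lower:
        return "inconclusive"
    return "negative serial correlation"

r = 0.87                  # correlation between successive residuals (from the example)
dw = 2 * (1 - r)          # approximately 0.26
result = interpret_dw(dw, d_lower=1.28, d_upper=1.57)
```

For the example, `result` is "positive serial correlation", because 0.26 is below the lower critical value of 1.28.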


4. How do you correct for serial correlation?
4.1 You improve the specification of the model (back to the beginning it seems!)

4.2 You use the Hansen method, which adjusts the coefficient standard errors to account for serial correlation.

3 comments:

  1. Good post.
    I would like to point out that a hypothesis is either rejected in a test or we 'fail to reject' it. It's never accepted. That is the correction I wish to suggest.
    Anyway, your posts helped me to review the topic quickly. Thanks.

  2. Thanks for the post, helped me to have a quick review of the topic...
    There is just one thing I cannot find: how to use the "Hansen method".
    Can you please give a hint on this point?
