Saturday, 11 December 2010

Reading 11, 3.4 Coefficient of Determination - Method 2

Level 2, Volume 1, Quantitative Methods, Reading 11, Correlation & Regression

The references refer to the CFA textbook.
3.4 Coefficient of determination (“CD”)

Why CD?
The standard error of estimate (SEE) gives some indication of how certain one can be about a particular prediction of Y using the regression equation.
It does not tell us how well the independent variable explains variation in the dependent variable. The coefficient of determination does this, measuring the fraction of the total variation in the dependent variable that is explained by the independent variable.

There are two methods to calculate CD. Method 1 was discussed in the previous post.


Method 2: For use in a linear regression with one or more independent variables (it is the only option when there is more than ONE)

Total variation = Unexplained variation Plus Explained variation
Therefore
Explained variation = Total variation Minus Unexplained variation



The CD is the fraction of the total variation that is explained by the regression. Thus:

$$\text{CD} = \frac{\text{Explained variation}}{\text{Total variation}}$$

Substituting for explained variation:

$$\text{CD} = \frac{\text{Total variation} - \text{Unexplained variation}}{\text{Total variation}}$$

And, after dividing each term by total variation:

$$\text{CD} = 1 - \frac{\text{Unexplained variation}}{\text{Total variation}}$$

Formulas used
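Total variation is calculated as:

$$\text{Total variation} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

(We use the mean of the actual values, $\bar{Y}$, to calculate total variation)

Unexplained variation is:

$$\text{Unexplained variation} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

(Using the difference between the actual data, $Y_i$, and the expected data from the regression, $\hat{Y}_i$)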
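As a rough illustration of the two sums of squares described above, here is a minimal Python sketch. The function names `total_variation` and `unexplained_variation` are my own (hypothetical), and the figures are the actual and expected earnings per share from the Data and Calculation tables in this post. Because the expected values are rounded to two decimals in the table, the unexplained variation comes out slightly different from the 12441.20 shown there.

```python
def total_variation(actual):
    """Sum of squared differences between each actual value and the mean."""
    mean_y = sum(actual) / len(actual)
    return sum((y - mean_y) ** 2 for y in actual)


def unexplained_variation(actual, expected):
    """Sum of squared differences between actual and expected values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, expected))


# Earnings per share (US cents), 2010 back to 2006, as given in this post
actual_eps = [227.8, 105.4, 274.8, 228.9, 172.4]
# Expected earnings from the regression, rounded as shown in the table
expected_eps = [215.22, 203.65, 245.08, 191.40, 153.94]

print(f"Total variation:       {total_variation(actual_eps):.2f}")
print(f"Unexplained variation: {unexplained_variation(actual_eps, expected_eps):.2f}")
```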


Data
Assume that Revenue is the independent variable (X)

| | 2010 | 2009 | 2008 | 2007 | 2006 |
|---|---|---|---|---|---|
| Revenue US $m | 52798 | 50211 | 59473 | 47473 | 39099 |
| Earnings per ordinary share (diluted) (US cents) | 227.8 | 105.4 | 274.8 | 228.9 | 172.4 |

Calculation

| Year | Revenue US $m (X) | Earnings - Expected based on regression analysis in 3.1 | Earnings - Actual | Total variation (based on the mean) | Unexplained variation |
|---|---|---|---|---|---|
| 2010 | 52798 | 215.22 | 227.80 | 672.88 | 158.18 |
| 2009 | 50211 | 203.65 | 105.40 | 9304.53 | 9653.11 |
| 2008 | 59473 | 245.08 | 274.80 | 5320.24 | 883.11 |
| 2007 | 47473 | 191.40 | 228.90 | 731.16 | 1406.09 |
| 2006 | 39099 | 153.94 | 172.40 | 867.89 | 340.70 |
| Average | 49810.8 | 201.86 | 201.86 | | |
| Sum | | | | 16896.71 | 12441.20 |
Formula

$$\text{CD} = 1 - \frac{12441.20}{16896.71} = 1 - 0.7363 \approx 0.26$$
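The calculation above can be reproduced row by row with a short Python sketch. It assumes the rounded expected earnings shown in the table, so the unexplained variation differs slightly from 12441.20, but the CD still comes to about 0.26. The variable names are illustrative only.

```python
years = [2010, 2009, 2008, 2007, 2006]
actual = [227.8, 105.4, 274.8, 228.9, 172.4]          # EPS, US cents
expected = [215.22, 203.65, 245.08, 191.40, 153.94]   # rounded, from the table

mean_actual = sum(actual) / len(actual)

# One row per year: squared deviation from the mean, and from the expected value
rows = [
    (year, (y - mean_actual) ** 2, (y - y_hat) ** 2)
    for year, y, y_hat in zip(years, actual, expected)
]

total = sum(row[1] for row in rows)
unexplained = sum(row[2] for row in rows)
cd = 1 - unexplained / total

for year, total_part, unexplained_part in rows:
    print(f"{year}  total={total_part:9.2f}  unexplained={unexplained_part:9.2f}")
print(f"CD = 1 - {unexplained:.2f} / {total:.2f} = {cd:.4f}")
```

The printed CD rounds to 0.26, in line with the result above.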


NB: the answer is very similar to that of the first method. This is because only one independent variable is used. If more than one independent variable is used, Method 1 cannot be used.
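To see why the two methods agree with a single independent variable, here is a hedged Python sketch. It assumes that Method 1 from the previous post is the squared correlation coefficient between X and Y (that post is not reproduced here, so treat this as an assumption), and it refits the regression by ordinary least squares from the data in this post rather than reusing the rounded expected values.

```python
import math

revenue = [52798, 50211, 59473, 47473, 39099]   # X, US $m
eps = [227.8, 105.4, 274.8, 228.9, 172.4]       # Y, US cents

n = len(revenue)
mean_x = sum(revenue) / n
mean_y = sum(eps) / n

s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(revenue, eps))
s_xx = sum((x - mean_x) ** 2 for x in revenue)
s_yy = sum((y - mean_y) ** 2 for y in eps)      # total variation

# Assumed Method 1: square the correlation coefficient
r = s_xy / math.sqrt(s_xx * s_yy)
cd_method_1 = r ** 2

# Method 2: 1 minus unexplained variation over total variation,
# using fitted values from a least-squares line
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in revenue]
unexplained = sum((y - f) ** 2 for y, f in zip(eps, fitted))
cd_method_2 = 1 - unexplained / s_yy

print(f"Method 1 (assumed r squared): {cd_method_1:.4f}")
print(f"Method 2:                     {cd_method_2:.4f}")
```

With one independent variable and a least-squares fit, the two figures coincide up to floating-point error; any small difference in hand calculations comes from rounding.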
