Mastering CFA: Quantitative Methods, Introduction to Linear Regression

Wednesday, 8 December 2010

Level 2, Volume 1, Quantitative methods, Reading 11, Correlation & Regression

(This post was updated with Step 3 on Dec 11, 2010)

3.1 Linear regression with one independent variable

Nb - the learning outcome does not require one to calculate linear regression, but only to distinguish between the dependent and independent variables in a linear regression.

It did take me some time to understand the formulas below but I suspect that it makes the rest of the chapter a lot easier if you understand the basic calculation.

Why is linear regression useful?

Linear regression allows us to use one variable to make predictions about another, test hypotheses about the relation between two variables, and quantify the strength of the relationship between the two variables.

The formula

Yi = b₀ + b₁Xi + ei, i = 1, ..., n

This equation states that the dependent variable, Y, is equal to:

The intercept, b₀ (the point where the regression line crosses the y-axis, i.e. the value of Y when X = 0),
Plus a slope coefficient, b₁, times the independent variable, X,
Plus an error term, e. The error term represents the portion of the dependent variable that cannot be explained by the independent variable.
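To make the three parts of the equation concrete, here is a minimal Python sketch that splits a single observation into the part explained by the line and the error term. The function name `decompose` and the numbers passed to it are hypothetical and chosen only for illustration; they are not taken from the reading.

```python
def decompose(y, x, b0, b1):
    """Split one observation into its fitted part and its error term."""
    fitted = b0 + b1 * x   # intercept plus slope times the independent variable
    error = y - fitted     # portion of y the independent variable does not explain
    return fitted, error


# Hypothetical values, for illustration only.
fitted, error = decompose(y=10.0, x=2.0, b0=1.0, b1=3.0)
print(fitted, error)  # 7.0 3.0
```

Adding the two parts back together always returns the observed value of the dependent variable.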

Types of values

You can use either time-series data or cross-sectional data in a linear regression.

Time series = many observations from different time periods for the same variable: denote t = 1, 2, ..., T

Cross-sectional = many observations on x and y for the same time period: denote i = 1, 2, ..., n

Nb: the intercept b₀ and the slope coefficient b₁ are together called the regression coefficients (nb, this excludes xi)

How do we calculate the regression line?

(Link to Excel file on box.net)

Step 1: Calculate slope coefficient b₁

The formula for the slope coefficient is Covariance (X,Y) / Variance (X)

Using data from previous examples, the slope is calculated as follows:

1. Data
Year	2010	2009	2008	2007	2006
X: Revenue US $m	52798	50211	59473	47473	39099
Y: Earnings per ordinary share (diluted) (US cents)	227.8	105.4	274.8	228.9	172.4

2. Calculation
	Year	Revenue $m (X)	Earnings per share, US cents (Y)	Cross product of deviations	Squared deviations of revenue (X)
	2010	52798	227.80	77,487.97	8,923,363.84
	2009	50211	105.40	-38,603.29	160,160.04
	2008	59473	274.80	704,760.87	93,358,108.84
	2007	47473	228.90	-63,214.11	5,465,308.84
	2006	39099	172.40	315,569.63	114,742,659.24
Average		49810.8	201.86

*Covariance*	Sum			996,001.06	222,649,600.80
	(N-1)			4.00
	Answer			249,000.27

*Variance*	Sum Squared deviations				222,649,600.80
	(N-1)				4.00
	Answer				55,662,400.20

	1. Covariance			249,000.27
	2. Variance X			55,662,400.20
	Answer Slope Coefficient b₁ (1/2)			0.004473
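The Step 1 table can be reproduced with a short Python sketch. It uses the revenue and earnings figures from the data table and the sample (N-1) versions of covariance and variance, as the calculation does; the variable names are my own.

```python
# Data from the table: 2010, 2009, 2008, 2007, 2006
revenue = [52798, 50211, 59473, 47473, 39099]   # X, US $m
eps = [227.8, 105.4, 274.8, 228.9, 172.4]       # Y, US cents

n = len(revenue)
mean_x = sum(revenue) / n
mean_y = sum(eps) / n

# Sum of cross products of deviations from the means
cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(revenue, eps))
# Sum of squared deviations of X from its mean
squared_devs = sum((x - mean_x) ** 2 for x in revenue)

covariance = cross_products / (n - 1)
variance_x = squared_devs / (n - 1)
slope = covariance / variance_x

print(f"covariance = {covariance:,.2f}")
print(f"variance   = {variance_x:,.2f}")
print(f"b1         = {slope:.6f}")
```

Because both covariance and variance are divided by the same (N-1), the slope is also simply the sum of cross products divided by the sum of squared deviations.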

Step 2: Calculate intercept b₀

We calculate b₀ based on the fact that in linear regression, the regression line passes through the point corresponding to the means of the dependent and the independent variables.

Using data from above, we calculate b₀ as follows:

		Revenue $m (X)	Earnings per share, US cents (Y)
Mean		49810.8	201.86

Formula		Ȳ = b₀ + b₁X̄, so b₀ = Ȳ − b₁X̄

where	Ȳ =	201.86
	b₁ =	0.004473 (shown rounded; the answer below uses the unrounded slope)
	X̄ =	49810.8

Answer b₀		-20.96370784
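As a check on Step 2, the following Python sketch recomputes the slope and then solves for the intercept from the two means. The helper name `fit_line` is hypothetical. Note that plugging in the rounded slope 0.004473 gives roughly -20.94 instead, so the sketch keeps the slope unrounded.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for a one-variable least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = Covariance(X, Y) / Variance(X); the (N-1) terms cancel.
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    # The line passes through (mean_x, mean_y), so solve for the intercept.
    b0 = mean_y - b1 * mean_x
    return b0, b1


revenue = [52798, 50211, 59473, 47473, 39099]   # X, US $m
eps = [227.8, 105.4, 274.8, 228.9, 172.4]       # Y, US cents

b0, b1 = fit_line(revenue, eps)
print(f"b0 = {b0:.4f}")
print(f"b1 = {b1:.6f}")
```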

Step 3: Calculate Yi based on the regression formula

The regression formula is: Yi = b₀ + b₁Xi + ei

Using this formula without the error term, we can calculate the value of the dependent variable (earnings per share) that the regression line predicts for each year: calculated earnings = b₀ + b₁ × Xi

Year	Actual earnings (for info)	Calculated earnings per regression model (Yi)	b₀ (calculated in Step 2)	b₁ (calculated in Step 1)	Xi (revenue, independent variable)
2010	227.8	215.22	-20.9637	0.004473	52798
2009	105.4	203.65	-20.9637	0.004473	50211
2008	274.8	245.08	-20.9637	0.004473	59473
2007	228.9	191.40	-20.9637	0.004473	47473
2006	172.4	153.94	-20.9637	0.004473	39099
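The calculated earnings in Step 3 can be reproduced with the sketch below, which refits the line from the raw data and then applies b₀ + b₁Xi to each year. The variable names are my own, and the residuals (actual minus calculated earnings) are an extra column not shown in the post; they correspond to the error term ei.

```python
years = [2010, 2009, 2008, 2007, 2006]
revenue = [52798, 50211, 59473, 47473, 39099]   # X, US $m
eps = [227.8, 105.4, 274.8, 228.9, 172.4]       # Y, US cents

n = len(revenue)
mean_x = sum(revenue) / n
mean_y = sum(eps) / n

# Steps 1 and 2: slope and intercept, kept unrounded
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(revenue, eps))
      / sum((x - mean_x) ** 2 for x in revenue))
b0 = mean_y - b1 * mean_x

# Step 3: value predicted by the regression line for each year
fitted = [b0 + b1 * x for x in revenue]
# Error term for each year: actual minus calculated earnings
residuals = [y - f for y, f in zip(eps, fitted)]

for year, actual, calc, resid in zip(years, eps, fitted, residuals):
    print(f"{year}  actual={actual:7.2f}  calculated={calc:7.2f}  error={resid:7.2f}")
```

The errors sum to zero (up to floating-point rounding), which is another way of seeing that the line passes through the point of means.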