Higher Engineering Mathematics, Sixth Edition

(Nancy Kaufman) #1

Chapter 60


Linear regression


60.1 Introduction to linear regression


Regression analysis, usually termed regression, is used
to draw the line of ‘best fit’ through co-ordinates on
a graph. The techniques used enable a mathematical
equation of the straight line form $y = mx + c$ to
be deduced for a given set of co-ordinate values, the
line being such that the sum of the squares of the
deviations of the co-ordinate values from the line is a
minimum, i.e. it is the line of ‘best fit’. When a regression
analysis is made, it is possible to obtain two lines of best fit,
depending on which variable is selected as the dependent
variable and which variable is the independent
variable. For example, in a resistive electrical circuit, the
current flowing is directly proportional to the voltage
applied to the circuit. There are two ways of obtaining
experimental values relating the current and voltage.
Either certain voltages are applied to the circuit and the
current values are measured, in which case the voltage is
the independent variable and the current is the dependent
variable; or the voltage can be adjusted until a desired
value of current is flowing and the value of voltage is
measured, in which case the current is the independent
variable and the voltage is the dependent variable.


60.2 The least-squares regression lines


For a given set of co-ordinate values, $(X_1, Y_1)$,
$(X_2, Y_2)$, ..., $(X_n, Y_n)$, let the $X$-values be the
independent variables and the $Y$-values be the dependent values.
Also let $D_1, \ldots, D_n$ be the vertical distances between the
line shown as $PQ$ in Fig. 60.1 and the points representing
the co-ordinate values. The least-squares regression
line, i.e. the line of best fit, is the line which makes the
value of $D_1^2 + D_2^2 + \cdots + D_n^2$ a minimum value.
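The criterion can be illustrated numerically. The following is a minimal Python sketch, not part of the original text: the function name `sum_squared_deviations` and the data values are hypothetical, chosen only to show how the sum of squared vertical distances compares for two candidate lines.

```python
def sum_squared_deviations(xs, ys, intercept, gradient):
    """Return D_1^2 + ... + D_n^2, where each D_i is the vertical
    distance between the point (x_i, y_i) and the line
    y = intercept + gradient * x."""
    return sum((y - (intercept + gradient * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical co-ordinate values, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

# Two candidate lines through the same points
close_fit = sum_squared_deviations(xs, ys, intercept=0.0, gradient=1.9)
poor_fit = sum_squared_deviations(xs, ys, intercept=1.0, gradient=1.0)

# The line with the smaller sum is the better fit in the least-squares sense
print(close_fit, poor_fit)
```

For these assumed values the first line gives a much smaller sum than the second, so it is the better of the two candidates under the least-squares criterion.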


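[Figure 60.1: the points $(X_1, Y_1)$, $(X_2, Y_2)$, ..., $(X_n, Y_n)$ plotted on $x$ and $y$ axes with the line $PQ$ drawn through them; the distances $D_1$, $D_2$, $D_n$ and $H_3$, $H_4$ are labelled.]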

The equation of the least-squares regression line is
usually written as $Y = a_0 + a_1 X$, where $a_0$ is the
$Y$-axis intercept value and $a_1$ is the gradient of the line
(analogous to $c$ and $m$ in the equation $y = mx + c$). The
values of $a_0$ and $a_1$ to make the sum of the ‘deviations
squared’ a minimum can be obtained from the two
equations:

$$\sum Y = a_0 N + a_1 \sum X \qquad (1)$$

$$\sum (XY) = a_0 \sum X + a_1 \sum X^2 \qquad (2)$$

where $X$ and $Y$ are the co-ordinate values, $N$ is the
number of co-ordinates and $a_0$ and $a_1$ are called the
regression coefficients of $Y$ on $X$. Equations (1) and (2)
are called the normal equations of the regression lines
of $Y$ on $X$. The regression line of $Y$ on $X$ is used to
estimate values of $Y$ for given values of $X$. If the $Y$-values
(vertical-axis) are selected as the independent variables,
the horizontal distances between the line shown as $PQ$
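
As a minimal sketch, normal equations (1) and (2) for the regression of Y on X can be solved for a0 and a1 by ordinary elimination. The Python below is an illustration under stated assumptions, not the text's own worked method: the function name `regression_y_on_x` and the data values are hypothetical.

```python
def regression_y_on_x(xs, ys):
    """Solve the normal equations
        sum(Y)  = a0*N      + a1*sum(X)      (1)
        sum(XY) = a0*sum(X) + a1*sum(X^2)    (2)
    for the regression coefficients a0 (intercept) and a1 (gradient)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Eliminating a0 between (1) and (2) gives
    # a1 * (N*sum(X^2) - sum(X)^2) = N*sum(XY) - sum(X)*sum(Y)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are equal; the gradient is undefined")

    a1 = (n * sum_xy - sum_x * sum_y) / denominator
    a0 = (sum_y - a1 * sum_x) / n  # rearranged from equation (1)
    return a0, a1


# Hypothetical co-ordinate values, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

a0, a1 = regression_y_on_x(xs, ys)
print(f"Y = {a0:.3f} + {a1:.3f} X")
```

With these assumed values the sums are N = 4, sum X = 10, sum Y = 19, sum XY = 57 and sum X^2 = 30, giving a1 = 1.9 and a0 = 0, which satisfy both (1) and (2).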