Basic Statistics

REGRESSION AND CORRELATION

12.2 LINEAR REGRESSION: SINGLE SAMPLE

In this section we first show how to compute a linear regression line and then how to
interpret it.

12.2.1 Least-Squares Regression Line

After plotting the scatter diagram, we would like to fit a straight line or a curve to the
data points. Fitting curves will not be discussed here, except that later we show how
a transformation on one of the variables will sometimes enable us to use a straight
line in fitting data that was originally curved.
A straight line is the simplest to fit and is commonly used. Sometimes a researcher
simply draws a straight line by eye. The difficulty with this approach is that no two
people would draw the same line. We would like to obtain a line that is best in
some sense and that is also the same line other investigators would obtain. The line
with both these attributes is the least-squares regression line.
The equation of the line is
$$\hat{Y} = a + bX$$


Here, $\hat{Y}$ denotes the value of Y on the regression line for a given X. The
coordinates of any point on the line are given as $(X, \hat{Y})$. The slope of the line
is denoted by b and the intercept by a. The numerical value of b can be calculated
using the formula


$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$$

and the numerical value of a can be obtained from

$$a = \bar{Y} - b\bar{X}$$
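As a rough illustration of the slope and intercept calculations, the following Python sketch computes b as the sum of cross-products of deviations divided by the sum of squared X deviations, and a as the mean of Y minus b times the mean of X (the standard least-squares intercept). The function name `least_squares_line` and the four sample points are made up for this sketch; they are not data from Table 12.1.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b*X fitted by least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of cross-products of deviations and sum of squared X deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


# Hypothetical points lying exactly on Y = 1 + 2X (illustration only)
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

Because the illustrative points fall exactly on a line, the fitted intercept and slope reproduce that line; with real data the points scatter around the fitted line instead.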

Before the interpretation of the regression line is discussed, the example given in
Table 12.1 will be used to demonstrate the calculations. Note that almost all statistical
programs will perform these calculations, so this is for illustration purposes. The
calculation of a regression line for a large sample is quite tedious, as is obvious from
Table 12.2. We first calculate the mean weight as $\bar{X} = 1907/10 = 190.7$ and the
mean systolic blood pressure as $\bar{Y} = 1361/10 = 136.1$. Then, for the first row in
Table 12.2, we obtain $(X - \bar{X})^2$ by calculating $(X - 190.7)$, or $(165 - 190.7)$, and
squaring the difference of -25.7 to obtain 660.49. A similar calculation is done for
$(Y - 136.1)^2$ to obtain 4.41. The value in the first row and last column is computed
from $(X - 190.7)(Y - 136.1) = (-25.7)(-2.1) = 53.97$. The last three columns
of Table 12.2 are filled in a similar fashion for rows numbered 2 through 10, and the
sum of each column is then computed.
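The first-row arithmetic can be checked with a few lines of Python. This is only a sketch: the variable names are invented here, and the first-row blood pressure of 134 is inferred from the stated difference Y - 136.1 = -2.1 rather than read from the table itself.

```python
# Means from the totals quoted in the text
x_bar = 1907 / 10   # mean weight, 190.7
y_bar = 1361 / 10   # mean systolic blood pressure, 136.1

# First row: X = 165 is given; Y = 134 is inferred from Y - 136.1 = -2.1
x, y = 165, 134

dx = x - x_bar      # deviation of X from its mean, about -25.7
dy = y - y_bar      # deviation of Y from its mean, about -2.1

# The three derived columns for this row, rounded to two decimals
print(round(dx ** 2, 2))   # 660.49
print(round(dy ** 2, 2))   # 4.41
print(round(dx * dy, 2))   # 53.97
```

Repeating the same three calculations for each remaining row and summing each column gives the totals used to compute the slope.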
From the summation row, we obtain the results we need to compute

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{2097.3}{7224.1} = .2903$$
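The division can be reproduced from the column totals alone, as in the Python sketch below. The variable names are invented for illustration. The intercept line applies the usual least-squares relation, a equals the mean of Y minus b times the mean of X, and its result is this sketch's own continuation of the arithmetic, not a figure quoted from the passage above.

```python
n = 10
sum_x = 1907          # total of the weights
sum_y = 1361          # total of the systolic blood pressures
sum_xy_dev = 2097.3   # sum of (X - X-bar)(Y - Y-bar) from the summation row
sum_xx_dev = 7224.1   # sum of (X - X-bar)^2 from the summation row

x_bar = sum_x / n
y_bar = sum_y / n

b = sum_xy_dev / sum_xx_dev   # slope
a = y_bar - b * x_bar         # intercept (assumed standard formula)

print(f"b = {b:.4f}")   # b = 0.2903
print(f"a = {a:.2f}")   # a = 80.74
```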




