CK-12 Probability and Statistics - Advanced


9.2. Least-Squares Regression http://www.ck12.org


We are able to predict the value of Y for any value of X within a specified range.


Transformations to Achieve Linearity


Sometimes we find that there is a relationship between X and Y, but it is not best summarized by a straight line. When
looking at the scatterplot graphs of correlation patterns, we called these types of relationships curvilinear. While
many relationships are linear, quite a number are not, including learning curves (learning more quickly
at the beginning, followed by a leveling out) and exponential growth (doubling in size with each unit of growth). Below
is an example of a growth curve describing the growth of complex societies.


Since this is not a linear relationship, one might think that we cannot fit a regression line. However, we
can perform something called a transformation to achieve a linear relationship. We commonly use transformations
in everyday life. For example, the Richter scale for measuring earthquake intensity and the practice of describing pay
raises in terms of percentages are both examples of applying transformations to non-linear data.


Let’s take a closer look at logarithms so that we can understand how they are used in non-linear transformations.
Notice that we can write the numbers 10, 100, and 1,000 as \(10 = 10^1\), \(100 = 10^2\), \(1{,}000 = 10^3\), etc. We can also write
the numbers 2, 4, and 8 as \(2 = 2^1\), \(4 = 2^2\), \(8 = 2^3\), etc. All of these equations take the form \(x = c^a\), where \(a\) is the
power to which the base \(c\) must be raised. We call \(a\) the logarithm because it is the power to which the base must
be raised to yield the number. Applying this equation, we find that \(\log_{10} 10 = 1\), \(\log_{10} 100 = 2\), \(\log_{10} 1000 = 3\), etc.,
and \(\log_2 2 = 1\), \(\log_2 4 = 2\), \(\log_2 8 = 3\), etc. Because of these rules, variables that are exponential or multiplicative (in
other words, non-linear models) are linear in their logarithmic form.
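To make the last point concrete, here is a minimal sketch in Python. It is not part of the original text. The function name `log_differences` and the sample values `a = 3.0` and `b = 2.0` are made up for illustration. It checks the logarithms listed above, then shows that the logarithm of an exponential quantity `a * b**x` rises by the same amount for each unit step in `x`, which is what a straight line does.

```python
import math

# The logarithm is the power to which the base must be raised.
for x in (10, 100, 1000):
    print(x, math.log10(x))   # base-10 logarithms of 10, 100, 1000
for x in (2, 4, 8):
    print(x, math.log2(x))    # base-2 logarithms of 2, 4, 8


def log_differences(a, b, xs):
    """Return successive differences of log10(a * b**x) over consecutive xs.

    Hypothetical helper for illustration: if the differences are all equal,
    the logged values lie on a straight line.
    """
    logs = [math.log10(a * b ** x) for x in xs]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Illustrative values only: a quantity that starts at 3 and doubles each step.
diffs = log_differences(a=3.0, b=2.0, xs=range(6))
print(diffs)  # every entry is (approximately) log10(2)
```

The constant step follows from the rule log(a * b^x) = log(a) + x * log(b): the intercept is log(a) and the slope is log(b).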


In order to transform data in the linear regression model, we apply a logarithmic transformation to each point in
the data set. This is most easily done using either the TI-83 calculator or a computer program such as Microsoft
Excel, the Statistical Package for the Social Sciences (SPSS), or the Statistical Analysis System (SAS). This transformation
produces a linear relationship to which we can fit a linear regression line.
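The same procedure can also be sketched in a general-purpose language. The Python example below is an illustration under stated assumptions, not the method prescribed by the text. The helper `least_squares` is a hypothetical name, and the data set is invented (a quantity that starts at 3 and doubles with each unit of X). We take the base-10 logarithm of each Y value, fit an ordinary least-squares line to the transformed points, and then undo the transformation to read off the starting value and growth factor.

```python
import math


def least_squares(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper written for this sketch; returns (slope, intercept).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented, exactly exponential data (assumption for illustration only).
xs = [0, 1, 2, 3, 4, 5]
ys = [3.0 * 2.0 ** x for x in xs]

# Step 1: apply the logarithmic transformation to each point.
log_ys = [math.log10(y) for y in ys]

# Step 2: fit a regression line to (X, log10 Y).
slope, intercept = least_squares(xs, log_ys)

# Step 3: back-transform the coefficients to the original scale.
growth_factor = 10 ** slope      # multiplicative change in Y per unit of X
start_value = 10 ** intercept    # fitted value of Y at X = 0

print(slope, intercept)
print(growth_factor, start_value)
```

With real data the points will not fall exactly on the line, so the back-transformed coefficients are estimates rather than exact values.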


Let’s take a look at an example to help clarify this concept. Say that we were interested in making a case for investing
and examining how much return on investment one would get on $100 over time. Let’s assume that we invested $100
in the year 1900 and this money accrued 5% interest every year. The table below details how much we would have
each decade:


TABLE 9.8: Table of account growth assuming


Year    Investment with 5% Each Year ($)
1900    100
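The balances for each decade can be computed directly from the description above. The Python sketch below assumes the 5% interest is compounded once per year, so the balance after t years is 100 × 1.05^t. The function name `balance` and the choice to stop at 1950 are ours. The printed values are calculated from that assumption, not copied from the table.

```python
def balance(principal, rate, years):
    """Balance after `years` of annual compounding at `rate` (assumed model)."""
    return principal * (1 + rate) ** years


# Assumption: $100 invested in 1900, 5% interest compounded once per year.
balances = {year: balance(100, 0.05, year - 1900)
            for year in range(1900, 1951, 10)}

for year, amount in balances.items():
    print(year, round(amount, 2))
```

Because each year multiplies the balance by the same factor, the growth is exponential, which is why a logarithmic transformation is a natural fit for this example.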