Statistical Analysis for Education and Psychology Researchers


3 An initial regression model based on background information or theoretical
considerations is then fitted to the data and a regression line is estimated. Consider
what sources of information, in the model, contribute to the total variation in the
response variable, i.e., consider overall model fit—Are the explanatory variables
related in any way to the response variable? What proportion of the total variation in
the response variable is explained by the independent variables in the model?
4 Consider the parameter estimates—especially their standard errors and significance
tests and confidence intervals for the intercept and slope. Do not report these at this
stage because the next step is to check the regression assumptions and to evaluate the
fitted regression model. (N.B. In regression analysis assumptions are checked after the
initial model has been fitted because the regression residuals are used.)
5 Regression assumptions are checked by looking at the residuals from the fitted model.
This is called regression diagnostics: residual plots are scrutinized (residuals
are plotted against case numbers and against explanatory variables), and the standard
errors of the fitted coefficients and of the residuals are examined.
6 Alternative regression models are built if necessary (independent variables added or
dropped, polynomial terms fitted, for example β1x1^2 rather than β1x1, and outlier
observations identified) and further regression diagnostics are performed to
evaluate the adequacy of the model and the overall model fit (look at adjusted R^2).
A polynomial model refers to higher powers of x, denoted by the degree of the polynomial,
e.g. x^2 is quadratic and x^3 is cubic.
7 A parsimonious regression model is selected, the three parameters, β0, β1, and σ, are
estimated, and tests of significance and confidence intervals for the intercept and slope
are performed. Caution is required when interpreting the statistical significance of
individual explanatory variables in a multiple regression model when the explanatory
variables are not orthogonal (that is, when they are correlated). In that case tests of
statistical significance can be misleading.
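
The steps above can be sketched in code for the simplest case of one explanatory variable. This is only a minimal illustration under stated assumptions, not the book's own procedure: the data values and the function name fit_simple_regression are invented for the example, and only the Python standard library is used. It fits the line, reports R^2 and adjusted R^2 for overall fit, gives the standard errors of the intercept and slope, and lists the residuals by case number so they can be inspected or plotted.

```python
import math

def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x with basic fit summaries."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx                    # estimate of beta1
    intercept = mean_y - slope * mean_x  # estimate of beta0

    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]

    # Overall fit: proportion of total variation in y explained by x.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum(e ** 2 for e in residuals)
    r2 = 1 - ss_resid / ss_total
    p = 1  # number of explanatory variables in this model
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

    # Estimate of sigma (residual standard deviation) and standard errors.
    sigma = math.sqrt(ss_resid / (n - 2))
    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1 / n + mean_x ** 2 / sxx)

    return {
        "intercept": intercept, "slope": slope, "sigma": sigma,
        "se_intercept": se_intercept, "se_slope": se_slope,
        "r2": r2, "adj_r2": adj_r2, "residuals": residuals,
    }

# Hypothetical data, made up purely for illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

fit = fit_simple_regression(x, y)
print("intercept:", round(fit["intercept"], 3), "SE:", round(fit["se_intercept"], 3))
print("slope:", round(fit["slope"], 3), "SE:", round(fit["se_slope"], 3))
print("R^2:", round(fit["r2"], 4), "adjusted R^2:", round(fit["adj_r2"], 4))

# Residuals listed against case number, the raw material for diagnostics.
for case, e in enumerate(fit["residuals"], start=1):
    print(case, round(e, 3))
```

A pattern in the listed residuals (for example a run of positive values followed by a run of negative values) would suggest trying an alternative model, such as one with a polynomial term, before any results are reported.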


8.2 Linear Regression Analysis

When to Use

To use linear regression, measures for the response variable should be continuous (at
least theoretically) and there should be observations on at least a pair of variables, a
response variable Y and an explanatory variable X. For every value of Y there should be a
corresponding value of X. It is also assumed that the relationship between Y and X is
linear. The size of the correlation, r, between the two variables gives an indication of
how strong the linear relationship is.
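
As a rough illustration of how r reflects linearity, the sketch below computes the Pearson correlation for two small data sets. Both data sets and the function name pearson_r are invented for the example. The first set is exactly linear and gives r close to 1; the second follows a symmetric curve (Y = X^2) and gives r of zero even though Y depends completely on X, which is one reason a scatterplot is worth examining alongside r.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations: every Y has a corresponding X.
x_values = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * xi + 1 for xi in x_values]  # exactly linear in X
y_curved = [xi ** 2 for xi in x_values]     # related to X, but not linearly

r_linear = pearson_r(x_values, y_linear)
r_curved = pearson_r(x_values, y_curved)

print("r for the linear data:", round(r_linear, 4))
print("r for the curved data:", round(r_curved, 4))
```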
Regression analysis is often used by researchers in an exploratory way to discover
relationships between variables, to generate new ideas and concepts, to find important
variables and to identify unusual cases (outliers) in a data set. It can of course be used in
a more formal way to predict certain response values from carefully chosen explanatory
variables but by far the greatest number of applications of regression analysis in
education and psychology are what may be termed exploratory. Exploratory analysis
should not mean ‘blind analysis’ and all regression models should be guided by either


Inferences involving continuous data 255