Statistical Analysis for Education and Psychology Researchers


achievement. Each of the sample means would of course vary to some extent around the
true population mean, and this variability of the sample means (their standard deviation)
is the standard error of the mean. It indicates the size of the error one is likely to
make if any one sample mean is used to estimate the population mean.
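The idea can be checked on a very small scale. The sketch below is only an illustration: the four-score population is hypothetical, and the comparison formula (population standard deviation divided by the square root of the sample size) is a standard result that is not stated in the passage above. It lists every possible sample of size 2 drawn with replacement, takes the mean of each, and measures how much those means vary.

```python
import itertools
import math
import statistics


def sample_means(population, n):
    """Return the mean of every possible sample of size n drawn with replacement."""
    return [statistics.fmean(sample)
            for sample in itertools.product(population, repeat=n)]


# Hypothetical population of four scores (not taken from the text)
population = [2, 4, 6, 8]
n = 2

means = sample_means(population, n)

# Variability of the sample means: the standard error of the mean
se_empirical = statistics.pstdev(means)

# Standard result (an assumption here, not stated in the passage): sigma / sqrt(n)
se_formula = statistics.pstdev(population) / math.sqrt(n)

print(round(se_empirical, 4), round(se_formula, 4))  # 1.5811 1.5811
```

Because all 16 possible samples are enumerated, the two figures agree exactly; with real data only one sample is available, which is why the standard error has to be estimated.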
Like a sample mean, a measurement scale or test also has a standard error, which is
simply called the standard error of the test or scale score. This is often
reported in test manuals. It represents the accuracy of a test score: the larger the standard
error of measurement for a scale, the less accurate the scale is in measuring the construct
of interest. This statistic is provided so that researchers and other test users do not
place undue significance on small differences in test and scale scores. Often one wishes
to state, with a certain degree of confidence, the range within which a score lies; this
range is correctly called a ‘confidence interval’. It is introduced in Chapter 4.
The standard error of measurement (s.e.m.) of a test and the test reliability ($r_{tt}$) are
closely related:

(s.e.m. of test)^2 = (variability of observed scores) × (1 − test reliability)

In notational form, the relationship between s.e.m. and test reliability is

$$(\text{s.e.m.})^2 = \sigma^2 (1 - r_{tt})$$

where:
s.e.m. is the standard error of measurement of the test;
$\sigma^2$ (the Greek letter sigma, squared) is the variance of the total scores on the test
(variance, the variability of scores, is explained in Chapter 3);
$r_{tt}$ is the reliability of the test, which is a correlation coefficient.
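As a minimal sketch of this relationship, the function below takes the square root of both sides to obtain the s.e.m. from the standard deviation of the total scores and the test reliability. The function name, the range check on reliability, and the example values (a standard deviation of 15 and a reliability of 0.91) are illustrative assumptions, not figures from the text.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Return sd * sqrt(1 - reliability), from s.e.m.^2 = variance * (1 - reliability)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    # Assumption for this sketch: reliability is treated as lying between 0 and 1
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical values, not taken from the text
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
print(round(sem, 2))  # 4.5
```

With these assumed values the s.e.m. is 4.5 score points. A perfectly reliable test (reliability of 1) would have an s.e.m. of zero, and a test with zero reliability would have an s.e.m. equal to the standard deviation of the scores.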
The s.e.m. of a test is often more informative than the reliability of a test because test
reliability is likely to change when the test is administered to different groups (groups are
likely to have different variances) whereas s.e.m. is less likely to change from group to
group (Cronbach, 1990).
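The point about groups can be illustrated by rearranging the same relationship to give reliability = 1 − (s.e.m.)^2 / variance. The sketch below holds the s.e.m. fixed at a hypothetical 4.5 points and applies it to two invented groups whose scores have different standard deviations. Treating the s.e.m. as exactly constant across groups is a simplifying assumption; the text claims only that it is less likely to change.

```python
def reliability_from_sem(sem, sd):
    """Rearranged relationship: reliability = 1 - s.e.m.^2 / variance."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return 1.0 - (sem ** 2) / (sd ** 2)


# Hypothetical s.e.m., assumed identical in both groups
sem = 4.5

# Hypothetical standard deviations of total scores in two groups
group_sds = {"heterogeneous group": 15.0, "homogeneous group": 10.0}

reliabilities = {name: reliability_from_sem(sem, sd)
                 for name, sd in group_sds.items()}

for name, r in reliabilities.items():
    print(f"{name}: SD = {group_sds[name]}, reliability = {r:.4f}")
```

Under these assumptions the reliability falls from about 0.91 in the more variable group to about 0.80 in the less variable group, even though the precision of an individual score, as expressed by the s.e.m., is unchanged.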
A note of caution is introduced at this point. Most of what has been stated about
reliability and standard errors so far applies to all raw test scores and to some
standardized test scores. Whenever raw scores are changed or transformed, caution is
required in the interpretation of any statistics that are the results of computations done
with these transformed scores. Further discussion of this topic is delayed until Chapter 5.


Summary

Measurement is the process of representing quantitative variables with numbers. It is an
essential component of educational and psychological research because constructed
variables such as aptitude, motivation, anxiety and knowledge cannot generally be
observed or measured directly. Instead, tests and scales, which are indirect measures of
underlying attributes, predispositions and concepts, are used. The number system used in
quantifying test and scale scores is in itself entirely logical. It is the application and
interpretation of numbers, based on certain measurement assumptions, that is often
questionable; at worst, researchers do not consider the measurement system at all.
Assessment of a test’s validity and reliability and standard error of measurement
enables a researcher to make judgments about the appropriateness and consistency of


Measurement issues 31