not mean they actually are. This kind of validity is called face validity, meaning
that the questions have a surface appearance of validity.
(a) Like gauges, clocks, and rulers, intelligence tests are what kind of instruments?
(b) A ______ test is one that measures what it is supposed to measure.
Answers: (a) Measuring instruments; (b) valid.
In order to evaluate the validity of an intelligence test, it is necessary to compare
test scores with an outside criterion. An outside criterion is a measurement
instrument that is independent of the intelligence test being evaluated. A useful
outside criterion is grade point average. If intelligence means anything at all, then
students with high IQ scores should have high grade point averages. In research,
this relationship is evaluated with a statistical tool called the correlation coefficient,
a measure of the magnitude of the relationship between two variables (see
chapter 2). If the correlation between IQ scores and grade point average is high,
then it seems reasonable to conclude that the intelligence test in question has
validity. The higher the correlation coefficient, the more valid the test is consid-
ered to be.
Other outside criteria that can be used are teacher ratings and evaluations
made by parents.
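
As a rough illustration of the validity check described above, the sketch below computes a correlation coefficient between IQ scores and grade point averages. It assumes the Pearson product-moment form of the coefficient, which the text does not specify, and the six pairs of scores are invented for the example. They are not data from any real test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six students (invented for illustration only).
iq_scores = [95, 102, 110, 119, 125, 130]
grade_point_averages = [2.4, 2.9, 3.0, 3.4, 3.5, 3.9]

r = pearson_r(iq_scores, grade_point_averages)
print(f"Correlation between IQ and GPA: r = {r:.2f}")
```

The coefficient ranges from -1 to +1. With these made-up numbers the students with higher IQ scores also have higher grade point averages, so r comes out close to +1, the pattern that would count as evidence of validity.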
(a) An ______ criterion is a measurement instrument that is independent of the
intelligence test being evaluated.
(b) What statistical tool is used to evaluate the magnitude of the relationship between two
variables?
Answers: (a) outside; (b) The correlation coefficient.
A reliable test is one that gives stable, repeatable results. Let’s say that you
use a certain thermometer to take the temperature of family members when an
illness is suspected. In most cases, the thermometer will be reliable. You can
depend on it.
An intelligence test has to be carefully assessed for reliability. This is also
accomplished with the use of the correlation coefficient. Let’s say that a 100-
question test is split into two versions, Form A and Form B. The original 100
questions are randomly assigned to two forms. Form A has 50 questions. Form
B has 50 questions. The two tests are administered, for example, one week apart
to the same group of children. If Sheila obtains an IQ score of 119 on Form A,
she should obtain a score close to 119 on Form B. However, if she obtains 119
on Form A and 87 on Form B, the reliability of the test is in question. Com-