238 Structures of Personality Traits
context). These examples of imperfect historical fit could eas-
ily be expanded upon. The five factors owe their consolidation
and impact to analyses of large data matrices that did not be-
come possible until the last decades of the twentieth century.
This section starts by setting out the strongest possible
case for PCA, presenting a classical (see Horst, 1965)
rationale for it. Next, it examines the grounds for the magical
number five. It then considers the so-called person-centered
approach as an alternative to PCA in certain contexts.
The Case for Principal Component Analysis
Applying PCA to a scores matrix is the logical consequence
of performing item analysis. In the general case, the aim of
item analysis is to maximize the internal consistency of one
or more scales based on the items; the exception whereby
items are weighted by their predictive validity is outside the
present scope. The basic idea of item analysis may be
expressed as follows: The investigator is aware that each
single item, carefully chosen as it may be, is an imperfect
operationalization of whatever construct it represents. But
the investigator has no better criterion against which to
gauge the validity of the item than the total score on the set
of equivalent items. Item analysis is thus a bootstrapping
operation.
Carrying this basic idea to its logical consequence pro-
ceeds as follows: At the first step, items are weighted accord-
ing to their association with the total score. Discarding items
on that basis would amount to arbitrarily assigning a zero
weight. That may be defensible in extreme cases where it is
evident for substantive reasons—albeit post hoc—that the
item does not belong in the set. In the general case, however,
all items would be retained.
By virtue of assigning weights to the items, however, the
total score has been replaced by a weighted sum. The implicit
rationale is that this weighted sum is a better approximation
of the underlying construct than was the unweighted sum. So
the logical second step would be to assign item weights ac-
cording to their association with the weighted sum. Thus an
iteration procedure has been started, the endpoint of which is
reached when convergence of weights and of weighted sums
occurs. At that point, the weighted total score is the first prin-
cipal component of the item scores (Horst, 1965). If the item
set is multidimensional, more than one principal component
is obtained, but the reasoning is essentially the same.
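The iteration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reproduction of Horst's (1965) procedure: "association" is taken to be the covariance of each item with the current weighted sum, the weights are rescaled to unit length at every step, and the function name, the tolerance, and the simulated data are all hypothetical.

```python
import numpy as np

def iterate_item_weights(scores, tol=1e-10, max_iter=1000):
    """Reweight items by their covariance with the weighted sum until stable."""
    x = scores - scores.mean(axis=0)          # deviation scores (an assumption)
    n_items = x.shape[1]
    w = np.ones(n_items) / np.sqrt(n_items)   # start from the unweighted total
    for _ in range(max_iter):
        total = x @ w                         # current weighted sum per person
        new_w = x.T @ total / len(x)          # covariance of each item with it
        new_w /= np.linalg.norm(new_w)        # fix the scale of the weights
        converged = np.linalg.norm(new_w - w) < tol
        w = new_w
        if converged:
            break
    return w

# Simulated one-dimensional item set (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
trait = rng.normal(size=500)
loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])
scores = trait[:, None] * loadings + rng.normal(scale=0.6, size=(500, 6))

weights = iterate_item_weights(scores)

# Reference solution: leading eigenvector of the item covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(scores, rowvar=False))
first_component = eigenvectors[:, -1]
alignment = abs(weights @ first_component)    # close to 1 if the two agree
```

Under these assumptions the converged weights coincide, up to sign, with the first principal component of the item covariance matrix, which is the endpoint the text describes.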
Thus a particularly strong argument in favor of PCA is
that it is logically inevitable. Also, with the advent of
computer scoring, any practical objections to calculating
weighted sums have disappeared: Rather than applying 10
hand-scoring keys to a 5-D questionnaire (five keys for
positive items and five for negative items), one would put the
item scores on electronic file anyway.
Raw-Scores PCA
The present argument does not specifically favor PCA as
it is usually conceived, namely, PCA of z scores or correlation
matrices. Rather, it refers to raw-scores PCA, with deviation
scores and their covariance matrices, or standardized
scores and their correlations, as special cases. Raw-scores
PCA should be performed on bipolar scores; for example,
scores on a five-point scale should be coded as −2, −1, 0,
1, and 2. We (Hofstee, 1990; Hofstee & Hendriks, 1998;
Hofstee, Ten Berge, & Hendriks, 1998) have argued that a
bipolar representation of personality variables is appropriate,
as they tend to come in pairs of opposites. Thinking in terms
of all-positive numbers is a habit imported from the abilities
and achievement domain, where it does not make sense to
assign a negative score.
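A minimal sketch of raw-scores PCA may make the coding concrete. It assumes that responses arrive coded 1 to 5 and that the analysis decomposes the matrix of mean raw cross-products, with no centering and no standardizing; the function name and the small response matrix are hypothetical.

```python
import numpy as np

def raw_scores_pca(responses, n_points=5):
    """PCA of bipolar raw scores: no centering, no standardizing."""
    midpoint = (n_points + 1) / 2
    bipolar = np.asarray(responses, dtype=float) - midpoint   # 1..5 -> -2..2
    cross_products = bipolar.T @ bipolar / len(bipolar)       # raw second moments
    eigenvalues, eigenvectors = np.linalg.eigh(cross_products)
    order = np.argsort(eigenvalues)[::-1]                     # largest first
    return bipolar, eigenvalues[order], eigenvectors[:, order]

# Five respondents, three items, answered on a 1-5 scale (made-up data).
responses = np.array([[5, 4, 4], [4, 4, 5], [3, 4, 4], [4, 5, 3], [2, 3, 4]])
bipolar, eigenvalues, components = raw_scores_pca(responses)
```

Subtracting the column means before forming the cross-products would give the covariance-based special case, and additionally dividing by the standard deviations would give the correlation-based one.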
Raw-scores PCA implies an absolute-scale interpretation
of the Likert scale, rather than the conventional interval-scale
interpretation. These alternative interpretations have subtle
consequences for our conception of personality. The first of
these concerns the reference point. With relative, interval-
scale scoring, the population mean is the reference point. For
desirable traits, that reference point is at the positive side of
the scale midpoint (0), and vice versa. Thus a person with a
score of .8 on a socialness scale with a population mean of
1.1 (most people being found social) would be said to be
somewhat asocial, albeit in a relative sense; that, however, is
the only interpretation available under interval scaling.
The unthinking adoption of interval scales from the domain
of intelligence and achievement may lead to a bleak view of
humankind, whereby a sizable proportion of the population is
judged more or less deviant. It is cold comfort that the
proportion is a bit less than 50% because the raw-score
distribution is not symmetric. Taking the scale midpoint seriously
solves the problem; it prevents a positive judgment from
being translated into something unfavorable and vice versa,
based on an inappropriate convention.
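The socialness example can be worked through numerically. The sketch below simply contrasts the two reference points; the function name and the labels are hypothetical, and the sign of the score is used as a crude stand-in for the verbal judgment.

```python
def interpret_score(score, population_mean, midpoint=0.0):
    """Return the absolute reading, the relative reading, and the deviation."""
    def label(value):
        if value > 0:
            return "social"
        if value < 0:
            return "asocial"
        return "neutral"

    deviation = score - population_mean       # interval-scale reference point
    return label(score - midpoint), label(deviation), deviation

absolute, relative, deviation = interpret_score(0.8, 1.1)
print(absolute, relative, round(deviation, 2))  # social asocial -0.3
```

The same score of .8 is thus read as social against the scale midpoint and as asocial against the population mean.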
The second way in which absolute and interval scale con-
ceptions differ concerns spread. Using a five-point scale,
most items have standard deviations close to 1, as the prevalent
responses are −1 and 1; thus the difference between
absolute and interval scaling is not dramatic in this respect.
But extremely favorable and unfavorable items obtain
smaller standard deviations. Because standardizing divides each
item by its standard deviation, the effect of standard PCA and
interval scoring procedures is to increase the impact of these
items on the total score. It would seem that this is also an
unintended consequence rather than a deliberate effect.
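The effect on spread can be illustrated with simulated data. In this sketch the response distributions are assumptions chosen for the example: a typical item is answered −1 or 1 equally often, and a very favorable item is answered 1 or 2, mostly 2.

```python
import numpy as np

rng = np.random.default_rng(1)

# Typical item: prevalent responses are -1 and 1, so the SD is close to 1.
typical = rng.choice([-1, 1], size=1000)
# Very favorable item: nearly everyone agrees, so the SD is well below 1.
extreme = rng.choice([1, 2], size=1000, p=[0.25, 0.75])

items = np.column_stack([typical, extreme]).astype(float)
sds = items.std(axis=0)

# Standardizing multiplies each item's deviation scores by 1 / SD,
# so the low-variance item is stretched more than the typical one.
rescale = 1 / sds
```

Here the typical item is left almost unchanged by standardizing, whereas the deviations of the very favorable item are roughly doubled, which is the increase in impact that the text describes.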