trials. To obtain this knowledge, we must reduce the data to a set of measures that carry the
information we need. The questions to be asked refer to the location, or central tendency,
and to the dispersion, or variability, of the distributions along the underlying scale. Mea-
sures of these characteristics will be considered in Sections 2.8 and 2.9. But before going
to those sections we need to set up a notational system that we can use in that discussion.
2.6 Notation
Any discussion of statistical techniques requires a notational system for expressing mathe-
matical operations. You might be surprised to learn that no standard notational system has
been adopted. Although several attempts to formulate a general policy have been made, the
fact remains that no two textbooks use exactly the same notation.
The notational systems commonly used range from the very complex to the very sim-
ple. The more complex systems gain precision at the expense of easy intelligibility,
whereas the simpler systems gain intelligibility at the expense of precision. Because the
loss of precision is usually minor when compared with the gain in comprehension, in this
book we will adopt an extremely simple system of notation.
Notation of Variables
The general rule is that an uppercase letter, often X or Y, will represent a variable as a whole.
The letter and a subscript will then represent an individual value of that variable. Suppose
for example that we have the following five scores on the length of time (in seconds) that
third-grade children can hold their breath: [45, 42, 35, 23, 52]. This set of scores will be
referred to as X. The first number of this set (45) can be referred to as $X_1$, the second (42) as
$X_2$, and so on. When we want to refer to a single score without specifying which one, we
will refer to $X_i$, where i can take on any value between 1 and 5. In practice, the use of
subscripts is often a distraction, and they are generally omitted if no confusion will result.
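As a rough illustration of the subscript convention, the breath-holding scores can be stored in a Python list. This is only a sketch: the helper name `score` is hypothetical and not part of the text. It exists to show that the text's subscripts start at 1, whereas Python list positions start at 0.

```python
# Breath-holding times (in seconds) from the text; X names the variable as a whole.
X = [45, 42, 35, 23, 52]
N = len(X)  # number of scores, here 5


def score(i):
    """Return the i-th score using the text's 1-based subscript.

    Hypothetical helper for illustration only: Python lists are 0-based,
    so the score with subscript i lives at position i - 1.
    """
    if not 1 <= i <= N:
        raise IndexError(f"subscript must be between 1 and {N}")
    return X[i - 1]


first = score(1)   # the first score, 45
second = score(2)  # the second score, 42
print(first, second)
```

In everyday use one would simply index the list directly; the helper is there only to keep the numbering aligned with the notation in the text.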
Summation Notation
One of the most common symbols in statistics is the uppercase Greek letter sigma ($\Sigma$),
which is the standard notation for summation. It is readily translated as “add up, or sum,
what follows.” Thus, $\sum X_i$ is read “sum the $X_i$s.” To be perfectly correct, the notation for
summing all N values of X is $\sum_{i=1}^{N} X_i$, which translates to “sum all of the $X_i$s from $i = 1$ to
$i = N$.” In practice, we seldom need to specify what is to be done this precisely, and in most
cases all subscripts are dropped and the notation for the sum of the $X_i$ is simply $\sum X$.
Several extensions of the simple case of $\sum X$ must be noted and thoroughly understood.
One of these is $\sum X^2$, which is read as “sum the squared values of X” (i.e.,
$45^2 + 42^2 + 35^2 + 23^2 + 52^2 = 8{,}247$). Note that this is quite different from $(\sum X)^2$, which
tells us to sum the Xs and then square the result. This would equal
$(\sum X)^2 = (45 + 42 + 35 + 23 + 52)^2 = (197)^2 = 38{,}809$. The general rule, which always applies,
is to perform operations within parentheses before performing operations outside
parentheses. Thus, for $(\sum X)^2$, we sum the values of X and then we square the result, as opposed to
$\sum X^2$, for which we square the Xs before we sum.
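The difference between the sum of squared scores and the squared sum can be checked with a few lines of Python. This is a minimal sketch using the five breath-holding scores given earlier; the variable names are illustrative choices, not notation from the text.

```python
# Breath-holding times (in seconds) from the text.
X = [45, 42, 35, 23, 52]

# Sum of X: add up the scores.
sum_x = sum(X)

# Sum of squared X: square each score first, then add.
sum_of_squares = sum(x ** 2 for x in X)

# Squared sum of X: add the scores first (inside the parentheses), then square.
square_of_sum = sum(X) ** 2

print(sum_x)           # 197
print(sum_of_squares)  # 8247
print(square_of_sum)   # 38809
```

The placement of the exponent relative to the call to `sum` mirrors the parentheses rule: whatever sits inside the parentheses is evaluated first.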
Another common expression, when data are available on two variables (X and Y), is
$\sum XY$, which means “sum the products of the corresponding values of X and Y.” The use of
these and other terms will be illustrated in the following example.
Imagine a simple experiment in which we record the anxiety scores (X) of five students
and also record the number of days during the last semester that they missed a test because