X² = 4.75 (calculator result), df = 6 – 1 = 5, P-value > 0.25 (Table C) or P-value = 0.45
(calculator).
(Remember: To get X² on the calculator, put the Observed values in L1, the Expected
values in L2, let L3=(L1-L2)^2/L2, then LIST MATH SUM(L3) will be X². The
corresponding probability is then found by DISTR χ²cdf(4.75,100,5). This can also be
done on a TI-84 that has the χ² GOF–Test.)
IV. Because P > 0.25, we fail to reject the null hypothesis. We do not have convincing
evidence that the calculator is failing to simulate a fair die.
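For readers working without a TI calculator, the steps in the note above can be sketched in Python. This is only an illustration using the standard library: the function names chi_square_statistic and chi_square_upper_tail are made up for this sketch, and the tail probability is computed exactly rather than with the calculator's upper bound of 100.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories (the summed L3 list)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), for a positive integer df.

    Builds the regularized upper incomplete gamma function Q(df/2, x/2)
    with the recurrence Q(a + 1, y) = Q(a, y) + y^a e^(-y) / Gamma(a + 1).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-y)                    # Q(1, y)
        term = y * math.exp(-y)             # y^1 e^(-y) / Gamma(2)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(y))         # Q(1/2, y)
        term = 2.0 * math.sqrt(y / math.pi) * math.exp(-y)  # y^(1/2) e^(-y) / Gamma(3/2)
    while a < df / 2.0:
        q += term                           # q is now Q(a + 1, y)
        a += 1.0
        term *= y / a                       # next term in the recurrence
    return q

# Values from the example above: X^2 = 4.75 with df = 5.
p_value = chi_square_upper_tail(4.75, 5)
print(round(p_value, 2))
```

The printed value should agree with the calculator P-value of about 0.45. To reproduce the statistic itself, pass the observed and expected counts to chi_square_statistic.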
Inference for Two-Way Tables
Two-Way Tables (Contingency Tables) Defined
A two-way table, or contingency table, for categorical data is simply a rectangular array of cells. Each
cell contains the frequencies for the joint values of the row and column variables. If the row variable has
r values, then there will be r rows of data in the table. If the column variable has c values, then there will
be c columns of data in the table. Thus, there are r × c cells in the table. (The dimension of the table is
r × c.) The marginal totals are the sums of the observations for each row and each column.
example: A class of 36 students is polled concerning political party preference. The results are
presented in the following two-way table.
The values of the row variable (Gender) are “Male” and “Female.” The values of the column variable
(Political Party Preference) are “Democrat,” “Republican,” and “Independent.” There are r = 2 rows and
c = 3 columns. We refer to this as a 2 × 3 table (the number of rows always comes first). The row
marginal totals are 20 and 16; the column marginal totals are 18, 15, and 3. Note that the row marginal
totals and the column marginal totals must each add up to the total number in the sample (here, 36).
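The bookkeeping described above (dimension and marginal totals) can be sketched in a few lines of Python. Note that the individual cell counts used here are hypothetical placeholders, chosen only so that they reproduce the marginal totals stated in the text; they should not be read as the actual values from the table. The function name marginal_totals is likewise just an illustrative choice.

```python
def marginal_totals(table):
    """Return (row totals, column totals, grand total) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)

# HYPOTHETICAL cell counts (rows: Male, Female; columns: Democrat,
# Republican, Independent), chosen only to match the stated marginal totals.
counts = [
    [11, 7, 2],
    [7, 8, 1],
]

r, c = len(counts), len(counts[0])          # dimension of the table: r x c
row_totals, col_totals, n = marginal_totals(counts)

print(f"{r} x {c} table")
print("row totals:", row_totals)
print("column totals:", col_totals)
print("grand total:", n)

# Both sets of marginal totals must add up to the sample size.
assert sum(row_totals) == sum(col_totals) == n
```

Any other set of cell counts with the same margins would give the same totals, which is why the margins alone do not determine the table.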
In the example above, we had one population of 36 students and two categorical variables (gender
and party preference). In this type of situation, we are interested in whether or not the variables are
independent in the population. That is, does knowledge of one variable provide you with information
about the other variable? Another study might have drawn a simple random sample of 20 males from, say,
the senior class and another simple random sample of 16 females. Now we have two populations rather
than one, but only one categorical variable. Now we might ask if the proportions of Democrats,
Republicans, and Independents in each population are the same. Either way we do it, we end up with the
same contingency table given in the example. We will look at how these differences in design play out in
the next couple of sections.