Basic Statistics

CHI-SQUARE TESTS FOR FREQUENCY TABLES: TWO-BY-TWO TABLES

and using the same reasoning,

$$\frac{B}{90} = \frac{21}{120}$$

If we multiply equation (11.1) by 30, we have

$$A = \frac{30(21)}{120} = 5.25$$

Similarly, the value of B is

$$B = \frac{90(21)}{120} = 15.75$$


The expected values for the two other cells (C and D) can be obtained in a similar
fashion:

$$\frac{C}{30} = \frac{99}{120} \quad \text{or} \quad C = \frac{30(99)}{120} = 24.75$$

and

$$\frac{D}{90} = \frac{99}{120} \quad \text{or} \quad D = \frac{90(99)}{120} = 74.25$$


Note that to compute the expected value for a cell, we multiply the total of the
row the cell is in by the total of the column it is in, and then divide by the total
sample size. For example, for the first cell (a = 11), the row total is 21 for the first
row and the column total is 30 for the first column; we multiplied 21 by 30 and
then divided by 120, the sample total.
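The rule just described can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `expected_frequencies` is hypothetical, and the marginal totals (rows 21 and 99, columns 30 and 90) are the ones quoted in this section.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts: (row total * column total) / sample size."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row totals and column totals must sum to the same n")
    # One expected value per cell, row by row.
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginal totals from the text: rows 21 and 99, columns 30 and 90 (n = 120).
expected = expected_frequencies([21, 99], [30, 90])
(A, B), (C, D) = expected
print(A, B, C, D)  # prints: 5.25 15.75 24.75 74.25
```

The printed values agree with A, B, C, and D as computed above.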
One should also note that the sum of the expected frequencies A and B is 21, the
sum of C and D is 99, the sum of A and C is 30, and the sum of B and D is 90.
That is, the sums of the expected values are the same as the sums of the observed
frequencies. For a frequency table with two rows and columns, we can compute the
expected value of one of the cells and we then obtain each of the three other expected
values by subtraction. For example, if we compute A = 5.25, then we know that B =
21 - 5.25 = 15.75, C = 30 - 5.25 = 24.75, and D = 90 - B = 90 - 15.75 = 74.25.
Thus, knowing the row and column totals and the expected value for one of the cells
allows us to compute all the expected values for a table with two rows and columns.
Also, knowing the row and column totals, if we know one observed frequency, we
can get the other observed frequencies by subtraction.
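The subtraction argument can be checked with a short Python sketch. The helper name `fill_two_by_two` is hypothetical, and the observed first-cell count of 11 is taken from the example quoted earlier in this section; the remaining observed counts are derived here from the marginal totals rather than read from Table 11.6.

```python
def fill_two_by_two(first_cell, row_totals, col_totals):
    """Complete a 2x2 table from its top-left cell and its marginal totals."""
    row1, _row2 = row_totals
    col1, col2 = col_totals
    top_left = first_cell
    top_right = row1 - top_left        # first row must sum to row1
    bottom_left = col1 - top_left      # first column must sum to col1
    bottom_right = col2 - top_right    # second column must sum to col2
    return [[top_left, top_right], [bottom_left, bottom_right]]


rows, cols = (21, 99), (30, 90)

# Expected frequencies from A = 5.25 alone.
expected_table = fill_two_by_two(5.25, rows, cols)
print(expected_table)  # prints: [[5.25, 15.75], [24.75, 74.25]]

# Observed frequencies from the single observed count 11 (assumed first cell).
observed_table = fill_two_by_two(11, rows, cols)
print(observed_table)  # prints: [[11, 10], [19, 80]]
```

In both tables the second row also sums to 99, which is the check that the marginal totals are consistent.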
In Table 11.6, both the observed frequencies and the expected frequencies are
placed in the four cells. The expected frequencies are inside the parentheses. Since
the expected frequencies are what we might expect if H₀ is true, to test whether or
not H₀ is true, we look at these two sets of numbers. If they seem close together,
we decide that H₀ may be true; that is, there is no significant association between
smoking and low vital capacity. If the two sets of numbers are very different, we
decide that H₀ is false, since what was observed is so very different from what had
been anticipated.
Some method is necessary for deciding whether the observed and expected
frequencies are “close together” or “very different.” To make this decision, the statistic
called chi-square, χ², is calculated. For each cell we subtract the expected frequency
