AP Statistics 2017


Chi-Square Test for Independence


A random sample of 400 residents of a large western city is polled to determine their attitudes
concerning the affirmative action admissions policy of the local university. The residents are classified
according to ethnicity (white, black, Asian) and whether or not they favor the affirmative action policy.
The results are presented in the following table.


We are interested in whether or not, in the population of this large city, ethnicity and attitude toward
affirmative action are associated (note that, in this situation, we have one population and two categorical
variables). That is, does knowledge of a person’s ethnicity give us information about that person’s attitude
toward affirmative action? Another way of asking this is, “Are the variables ethnicity and attitude
toward affirmative action independent in the population?” As part of a hypothesis test, the null hypothesis
is that the two variables are independent, and the alternative is that they are not: $H_0$: the variables
ethnicity and attitude toward affirmative action are independent among all residents of this city vs. $H_A$:
the variables are not independent among all residents of this city. Alternatively, we could say $H_0$: the
variables ethnicity and attitude toward affirmative action are not associated among all residents of this
city vs. $H_A$: the variables ethnicity and attitude toward affirmative action are associated among all
residents of this city.
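The test statistic for the independence hypothesis is the same chi-square statistic we saw for the
goodness-of-fit test:

$$\chi^2 = \sum \frac{(O - E)^2}{E},$$

where O is the observed count in a cell, E is the expected count in that cell, and the sum runs over
every cell of the table.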
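As a rough illustration, the chi-square statistic adds up (observed − expected)² / expected over every cell of the table. The sketch below does this in plain Python. The function name `chi_square_statistic` and the 2 × 2 counts are hypothetical, chosen only to keep the arithmetic easy to follow; they are not the survey data described above.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells of a two-way table.

    Both arguments are lists of rows with matching shapes.
    """
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical counts, for illustration only.
observed = [[30, 20],
            [20, 30]]
expected = [[25.0, 25.0],
            [25.0, 25.0]]

stat = chi_square_statistic(observed, expected)
print(stat)  # each cell contributes (5^2)/25 = 1, so the total is 4.0
```

Larger values of the statistic indicate observed counts that sit farther from what independence would predict.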
For a two-way table, the number of degrees of freedom is calculated as (number of rows – 1)(number
of columns – 1) = (r – 1)(c – 1). As with the goodness-of-fit test, we require that we are dealing with a
random sample and that the expected count in each cell be at least 5 (some texts instead require that there
be no empty cells and that at least 80% of the cells have expected counts of at least 5).
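The degrees-of-freedom rule and the two versions of the expected-count condition can be sketched as small helper functions. All names here are hypothetical, the expected counts are made up for illustration, and "no empty cells" is interpreted as "every expected count is greater than 0", which is an assumption rather than something the text spells out.

```python
def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a two-way table given as a list of rows."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


def all_expected_at_least_5(expected):
    """Strict condition: every expected count is at least 5."""
    return all(e >= 5 for row in expected for e in row)


def relaxed_condition(expected):
    """Relaxed condition: no empty cells (assumed to mean expected > 0)
    and at least 80% of cells have expected counts of at least 5."""
    cells = [e for row in expected for e in row]
    no_empty = all(e > 0 for e in cells)
    share_ok = sum(e >= 5 for e in cells) / len(cells)
    return no_empty and share_ok >= 0.8


# Hypothetical expected counts for a 2 x 3 table (not the survey data).
expected = [[12.0, 4.0, 9.0],
            [20.0, 8.0, 15.0]]

df = degrees_of_freedom(expected)          # (2 - 1) * (3 - 1) = 2
strict = all_expected_at_least_5(expected) # False: one cell is 4.0
relaxed = relaxed_condition(expected)      # True: 5 of 6 cells are >= 5
```

A table with two attitude categories and three ethnic groups, like the one described in this section, would have 2 degrees of freedom whichever variable is placed in the rows.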
Calculation of the expected frequencies for chi-square can be labor intensive if there are many cells,
but it is usually done by technology (see the next Calculator Tip for details). However, you should know
how expected frequencies are arrived at.
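If the two variables are independent, the probability of landing in a given cell is the product of its row and column proportions, so the expected count is n × (row total / n) × (column total / n) = (row total)(column total) / n. Here is a minimal sketch of that rule. The function name `expected_counts` and the 3 × 2 table are hypothetical; the counts sum to 400 only to echo the sample size in the text and are not the actual survey results.

```python
def expected_counts(observed):
    """Expected count for each cell under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows: three groups, columns: favor / oppose).
observed = [[120, 80],
            [60, 40],
            [70, 30]]

expected = expected_counts(observed)
for row in expected:
    print(row)
# Row totals are 200, 100, 100 and column totals are 250, 150, so the
# first cell is 200 * 250 / 400 = 125.0.
```

The expected counts always reproduce the original row and column totals, which is a quick way to check a hand calculation.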
Example (calculation of expected frequency): Suppose we are testing for independence of the
variables (ethnicity and opinion) in the previous example. For the two-way table with the given
marginal values, find the expected frequency for the cell marked “Exp.”
