Data Analysis with Microsoft Excel: Updated for Office 2007

Chapter 7 Tables

Goodman-Kruskal gamma

A measure of association used when the row and column values are ordinal variables. Gamma ranges from -1 to 1. A negative value indicates negative association, a positive value indicates positive association, and 0 indicates no association between the variables.

Kendall’s tau-b

Similar to gamma, except that tau-b includes a correction for ties. Used only for ordinal variables.

Stuart’s tau-c

Similar to tau-b, except that it includes a correction for table size. Used only for ordinal variables.

Somers’ D

A modification of the tau-b statistic. Somers’ D is used for ordinal variables in which one variable is used to predict the value of the other variable. Somers’ D (R|C) is used when the column variable is used to predict the value of the row variable. Somers’ D (C|R) is used when the row variable is used to predict the value of the column variable.
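The four ordinal measures defined above can all be built from counts of concordant and discordant pairs of observations. The following Python sketch is illustrative only and is not part of the book's Excel workflow. The function name `ordinal_measures` and the sample table are hypothetical, the formulas are the standard textbook definitions rather than anything confirmed by the add-in the book uses, and the code assumes that rows and columns are already sorted in ordinal order and that the table is not degenerate (for example, not a single row).

```python
from math import sqrt

def ordinal_measures(table):
    """Return gamma, tau-b, tau-c and both Somers' D values.

    `table` is a list of rows of counts; rows and columns are assumed
    to be listed in increasing ordinal order.
    """
    n_rows, n_cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)

    # Count concordant and discordant pairs, visiting each pair once.
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]

    total_pairs = n * (n - 1) / 2
    # Pairs tied on the row variable and on the column variable.
    row_ties = sum(r * (r - 1) / 2 for r in (sum(row) for row in table))
    col_ties = sum(c * (c - 1) / 2 for c in (sum(col) for col in zip(*table)))
    diff = concordant - discordant
    m = min(n_rows, n_cols)

    return {
        "gamma": diff / (concordant + discordant),
        # tau-b corrects for ties on each variable.
        "tau_b": diff / sqrt((total_pairs - row_ties) * (total_pairs - col_ties)),
        # tau-c corrects for table size through m.
        "tau_c": 2 * m * diff / (n * n * (m - 1)),
        # (C|R): the row variable predicts the column variable.
        "somers_d_c_given_r": diff / (total_pairs - row_ties),
        # (R|C): the column variable predicts the row variable.
        "somers_d_r_given_c": diff / (total_pairs - col_ties),
    }

# Hypothetical 2 x 3 table of counts, not data from the book.
measures = ordinal_measures([[2, 1, 0], [0, 1, 2]])
```

Because the two Somers' D values use different denominators, they generally differ for the same table, which is why the direction of prediction has to be stated.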

Because the χ² distribution is a continuous distribution and counts represent discrete values, some statisticians are concerned that the Pearson chi-square statistic is not appropriate. They recommend using the continuity-adjusted chi-square statistic instead. We feel that the Pearson chi-square statistic is more accurate and can be used without adjustment.

Among the other statistics in Table 7-4, the likelihood ratio chi-square statistic is usually close to the Pearson chi-square statistic. Many statisticians prefer using the likelihood ratio chi-square because it is used in log-linear modeling, a topic beyond the scope of this book.

All three test statistics shown in Figure 7-20 are significant at the 5% level. The association between the Calculus Requirement and Department variables ranges from 0.354 to 0.378 for the three measures of association (Phi, Contingency, and Cramer’s V). The final four measures of association (gamma, tau-b, tau-c, and Somers’ D) are used for ordinal data and are not appropriate for nominal data.
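To make the relationship among these statistics concrete, here is a minimal Python sketch that computes the three test statistics and the three nominal measures of association from a table of counts. It is an illustration under stated assumptions, not the book's Excel procedure: the function names and the sample table are hypothetical, the formulas are the standard definitions, and the continuity adjustment is computed only for 2 x 2 tables, where it is conventionally defined. The values from Figure 7-20 are not reproduced here because the underlying counts are not given in this section.

```python
from math import log, sqrt

def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_summary(table):
    """Return test statistics and nominal association measures."""
    expected = expected_counts(table)
    n = sum(sum(row) for row in table)
    n_rows, n_cols = len(table), len(table[0])
    # Pair each observed count with its expected count.
    cells = [(o, e)
             for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row)]

    pearson = sum((o - e) ** 2 / e for o, e in cells)
    # Cells with an observed count of 0 contribute nothing to the sum.
    likelihood_ratio = 2 * sum(o * log(o / e) for o, e in cells if o > 0)
    # Continuity adjustment: shrink each |O - E| by 0.5 (2 x 2 tables only).
    continuity = None
    if n_rows == 2 and n_cols == 2:
        continuity = sum(max(0.0, abs(o - e) - 0.5) ** 2 / e for o, e in cells)

    m = min(n_rows, n_cols)
    return {
        "pearson": pearson,
        "continuity_adjusted": continuity,
        "likelihood_ratio": likelihood_ratio,
        "df": (n_rows - 1) * (n_cols - 1),
        "phi": sqrt(pearson / n),
        "contingency": sqrt(pearson / (pearson + n)),
        "cramers_v": sqrt(pearson / (n * (m - 1))),
    }

# Hypothetical 2 x 2 table of counts, not data from the book.
summary = chi_square_summary([[10, 20], [20, 10]])
```

For this hypothetical table the Pearson and likelihood ratio statistics come out close to each other, while the continuity-adjusted value is smaller. Converting any of them to a p value requires the χ² distribution with `df` degrees of freedom, which this sketch leaves out.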

Validity of the Chi-Square Test with Small Frequencies

One problem you may encounter is that it might not be valid to use the Pearson chi-square test on a table with a large number of sparse cells. A sparse cell is defined as a cell in which the expected count is less than 5. The Pearson chi-square test requires large samples, and this means that cells with small counts can be a problem. You might get by with as many as one-fifth of the expected counts under 5, but if it’s more than that, the p value
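The sparse-cell rule of thumb described above (expected counts under 5 in no more than one-fifth of the cells) can be checked mechanically. The Python sketch below is a hedged illustration: the function name `sparse_cell_report`, its parameter names, and the sample table are hypothetical, and the thresholds are simply the two figures quoted in the text.

```python
def sparse_cell_report(table, threshold=5.0, max_fraction=0.2):
    """Count cells whose expected count is below `threshold`.

    Returns (number of sparse cells, fraction of sparse cells,
    True if the fraction exceeds `max_fraction`).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence for every cell.
    expected = [r * c / n for r in row_totals for c in col_totals]
    sparse = sum(1 for e in expected if e < threshold)
    fraction = sparse / len(expected)
    return sparse, fraction, fraction > max_fraction

# Hypothetical table with several small expected counts.
sparse, fraction, too_sparse = sparse_cell_report([[1, 2], [3, 40]])
```

Note that the check uses expected counts, not observed counts: a cell can hold a small observed count and still have an expected count of 5 or more.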
