Ralph Vince - Portfolio Mathematics



Probability Distributions

Whether you determine your significance levels via a table or calculate
them yourself, you will need two parameters to determine a significance
level. The first of these parameters is, of course, the chi-square statistic
itself. The second is the number of degrees of freedom. Generally, the number
of degrees of freedom is equal to the number of bins minus 1 minus the
number of population parameters that have to be estimated from the sample
statistics. What follows is a small table for converting chi-square
values and degrees of freedom to significance levels:

                  Values of χ²

Degrees of         Significance Level
Freedom        .20     .10     .05     .01

   1           1.6     2.7     3.8     6.6
   2           3.2     4.6     6.0     9.2
   3           4.6     6.3     7.8    11.3
   4           6.0     7.8     9.5    13.3
   5           7.3     9.2    11.1    15.1
  10          13.4    16.0    18.3    23.2
  20          25.0    28.4    31.4    37.6
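
For readers who would rather calculate a significance level than look it up, the following is a minimal sketch, not a method given in the text. It assumes the standard identity that the significance level is the upper-tail probability Q(k/2, χ²/2) of the regularized incomplete gamma function, where k is the number of degrees of freedom. The function names degrees_of_freedom and chi_square_significance are hypothetical, chosen only for this illustration.

```python
import math


def degrees_of_freedom(num_bins, num_estimated_params=0):
    """Bins minus 1 minus the number of estimated population parameters."""
    return num_bins - 1 - num_estimated_params


def chi_square_significance(chi_square, dof, eps=1e-12, max_iter=1000):
    """Upper-tail probability of a chi-square value with `dof` degrees of freedom."""
    if dof < 1:
        raise ValueError("degrees of freedom must be at least 1")
    if chi_square <= 0.0:
        return 1.0
    a, x = dof / 2.0, chi_square / 2.0
    # Common prefactor x^a * e^(-x) / Gamma(a), computed in log space.
    prefactor = math.exp(a * math.log(x) - x - math.lgamma(a))
    if x < a + 1.0:
        # Series for the lower tail P(a, x); the significance level is 1 - P.
        term = total = 1.0 / a
        n = a
        for _ in range(max_iter):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - prefactor * total
    # Continued fraction (modified Lentz method) for the upper tail Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return prefactor * h
```

Calling chi_square_significance(3.8, 1) or chi_square_significance(18.3, 10) should return values close to .05, in line with the corresponding table entries, which are rounded to one decimal place.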

You should be aware that the chi-square test can do a lot more than
is presented here. For instance, you can use the chi-square test on a 2 × 2
contingency table (actually on any N × M contingency table).
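
As an illustration of the contingency-table case, here is a minimal sketch under the usual textbook assumptions, which the text itself does not spell out: the expected count for each cell is its row total times its column total divided by the grand total, and the degrees of freedom are (N - 1)(M - 1). The function name contingency_chi_square and the counts are made up for this example.

```python
def contingency_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an N x M table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column classifications were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, dof


# Illustrative counts only, not data from the text.
example = [[30, 20],
           [20, 30]]
statistic, dof = contingency_chi_square(example)
```

For these made-up counts every expected cell is 25, so the statistic is 4.0 with 1 degree of freedom. Against the table of chi-square values, that exceeds the 3.8 entry at the .05 level but not the 6.6 entry at the .01 level.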
Finally, there is the problem of the arbitrary way we have chosen our
bins as regards both their number and their range. Recall that binning data
involves a certain loss of information about that data, but generally the
profile of the distribution remains relatively the same. If we choose to work
with only three bins, or if we choose to work with 30, we will likely get
somewhat different results. It is often a helpful exercise to bin your data in
several different ways when conducting statistical tests that rely on binned
data. In so doing, you can be rather certain that the results obtained were
not due solely to the arbitrary nature of how you chose your bins.
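
The rebinning exercise just described might be sketched as follows. This is only an illustration under stated assumptions: the sample is simulated from a standard normal distribution, it is tested against that same fully specified distribution (so no parameters are estimated and the degrees of freedom are simply the number of bins minus 1), and the bins are equal-width between -2.5 and 2.5 with open-ended first and last bins. The function name chi_square_for_bins is hypothetical.

```python
import random
from bisect import bisect_right
from statistics import NormalDist


def chi_square_for_bins(data, dist, num_bins, low, high):
    """Chi-square statistic of `data` against `dist` using equal-width bins.

    The first and last bins are open-ended, so every observation lands in a bin.
    """
    width = (high - low) / num_bins
    edges = [low + k * width for k in range(1, num_bins)]  # interior edges only
    observed = [0] * num_bins
    for value in data:
        # Number of interior edges at or below the value gives its bin index.
        observed[bisect_right(edges, value)] += 1
    cdf = [0.0] + [dist.cdf(edge) for edge in edges] + [1.0]
    n = len(data)
    chi_square = 0.0
    for k in range(num_bins):
        expected = n * (cdf[k + 1] - cdf[k])
        chi_square += (observed[k] - expected) ** 2 / expected
    return chi_square


random.seed(0)  # fixed seed so the sketch is repeatable
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
results = {
    bins: chi_square_for_bins(sample, NormalDist(0.0, 1.0), bins, -2.5, 2.5)
    for bins in (3, 10, 30)
}
for bins, statistic in results.items():
    # The distribution is fully specified here, so dof = bins - 1.
    print(bins, "bins:", round(statistic, 2), "with", bins - 1, "degrees of freedom")
```

Each statistic has to be judged against its own number of degrees of freedom, so it is the resulting significance levels, not the raw statistics, that should be compared across binnings.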
In a purely statistical sense, in order for our number of degrees of freedom
to be valid, it is necessary that the number of elements in each of the
expected bins, the E_i's, be at least five. When there is a bin with fewer than
five expected elements in it, theoretically the number of bins should be
reduced until all of the bins have at least five expected elements in them.
Often, when only the lowest and/or highest bin has fewer than five expected
elements in it, the adjustment can be made by making these groups “all less
than” and “all greater than” respectively.
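
The tail adjustment can be sketched as below. This is a minimal illustration, assuming the observed and expected counts are held in two parallel lists ordered from lowest bin to highest. The function name merge_sparse_tail_bins and the sample counts are hypothetical. It folds only the end bins inward, so a sparse bin in the interior would need separate handling.

```python
def merge_sparse_tail_bins(observed, expected, minimum=5.0):
    """Fold tail bins inward until both end bins expect at least `minimum`."""
    observed, expected = list(observed), list(expected)  # work on copies
    # Lowest bins collapse into a single "all less than" bin.
    while len(expected) > 1 and expected[0] < minimum:
        first_obs, first_exp = observed.pop(0), expected.pop(0)
        observed[0] += first_obs
        expected[0] += first_exp
    # Highest bins collapse into a single "all greater than" bin.
    while len(expected) > 1 and expected[-1] < minimum:
        last_obs, last_exp = observed.pop(), expected.pop()
        observed[-1] += last_obs
        expected[-1] += last_exp
    return observed, expected


# Illustrative counts only, not data from the text.
obs = [1, 4, 20, 50, 18, 5, 2]
exp = [2.0, 5.0, 18.0, 50.0, 18.0, 5.0, 2.0]
merged_obs, merged_exp = merge_sparse_tail_bins(obs, exp)
```

Here the seven original bins reduce to five, with each end bin now expecting 7.0 elements. Because the number of bins has changed, the degrees of freedom should be recomputed from the merged bin count.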