Functional Python Programming


Optimizations and Improvements


The cumulative distribution function for χ^2 shows that a value of 19.18 has a
probability of about 0.00387: roughly 4 chances in 1,000 of being random. The
next step is a follow-up study to discover the details of the various defect types and
shifts. We'll need to see which independent variable has the biggest correlation with
defects and continue the analysis.


Instead of following up with this case study, we'll look at a different and
interesting calculation.


Computing the chi-squared threshold


The essence of the χ^2 test is a threshold value based on the number of degrees
of freedom and the level of uncertainty we're willing to entertain in accepting or
rejecting the null hypothesis. Conventionally, we're advised to use a threshold
around 0.05 (1/20) to reject the null hypothesis. We'd like there to be only 1 chance
in 20 that the data is simply random and only appears meaningful. In other words, we'd
like there to be 19 chances in 20 that the data reflects something more than simple
random variation.


The chi-squared values are usually provided in tabular form because the calculation
involves a number of transcendental functions. In some cases, libraries will provide
the χ^2 cumulative distribution function, allowing us to compute a value rather than
look one up in a tabulation of important values.
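To make the tabular approach concrete, here is a minimal sketch of a lookup against the commonly published critical values at the 0.05 level. The names CHI2_CRITICAL_05 and exceeds_threshold are hypothetical, and the six degrees of freedom in the usage line are an assumption; this section doesn't state the degrees of freedom for the 19.18 value.

```python
from typing import Dict

# Standard published chi-squared critical values at the 0.05 level,
# keyed by degrees of freedom. Only the first few rows are included here.
CHI2_CRITICAL_05: Dict[int, float] = {
    1: 3.841,
    2: 5.991,
    3: 7.815,
    4: 9.488,
    5: 11.070,
    6: 12.592,
}


def exceeds_threshold(chi2: float, dof: int) -> bool:
    """Return True if chi2 is above the tabulated 0.05 critical value."""
    return chi2 > CHI2_CRITICAL_05[dof]


# Assumption: six degrees of freedom for the 19.18 value.
print(exceeds_threshold(19.18, 6))  # prints True
```

A table like this only answers the question for the rows and significance levels it happens to contain, which is why computing the distribution function directly is attractive.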


The cumulative distribution function for a χ^2 value, x, and degrees of freedom, k,
is defined as follows:


$$F(x; k) = \frac{\gamma\left(\frac{k}{2}, \frac{x}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}$$

It's common to state the probability of being random as p = 1 − F(χ^2; k). That is,
if p > 0.05, the data can be understood as random; we have no grounds to reject the
null hypothesis.


This requires two calculations: the incomplete gamma function, γ(s, z), and the
complete gamma function, Γ(x). These can involve some fairly complex math.
We'll cut some corners and implement two pretty-good approximations that are
narrowly focused on just this problem. Each of these functions will allow us to look
at functional design issues.
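As a rough preview of how the two pieces fit together, the following sketch approximates the lower incomplete gamma function with one standard power series, γ(s, z) = z^s e^(−z) Σ z^n / (s(s+1)...(s+n)), and leans on the standard library's math.gamma() for the complete gamma function. This is not necessarily the approximation developed in this chapter. The function names, the epsilon cutoff, and the six degrees of freedom in the usage line are all assumptions made for illustration.

```python
import math
from itertools import count, takewhile
from typing import Iterator


def series_terms(s: float, z: float) -> Iterator[float]:
    """Yield z**n / (s * (s+1) * ... * (s+n)) for n = 0, 1, 2, ..."""
    term = 1.0 / s
    for n in count(1):
        yield term
        term *= z / (s + n)


def incomplete_gamma(s: float, z: float, epsilon: float = 1e-15) -> float:
    """Approximate the lower incomplete gamma function for s > 0, z >= 0.

    The terms are positive and eventually shrink toward zero, so we
    stop summing at the first term that is no larger than epsilon.
    """
    total = sum(takewhile(lambda t: t > epsilon, series_terms(s, z)))
    return z ** s * math.exp(-z) * total


def chi2_cdf(x: float, k: int) -> float:
    """Chi-squared CDF: gamma(k/2, x/2) / Gamma(k/2)."""
    return incomplete_gamma(k / 2, x / 2) / math.gamma(k / 2)


# Assumption: six degrees of freedom for the 19.18 value.
p = 1 - chi2_cdf(19.18, 6)
print(f"{p:.5f}")  # prints 0.00387
```

The generator keeps the term-by-term recurrence separate from the decision about when to stop, so the stopping rule can be changed without touching the series itself. The sketch makes no attempt to handle very large arguments, where z ** s or math.exp(-z) can overflow or underflow.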
