Optimizations and Improvements
In many cases, we can make small changes to a Python application to switch from
float values to Fraction or Decimal values. When working with transcendental
functions, this change isn't necessarily beneficial. Transcendental functions, by
definition, involve irrational numbers, which neither a Fraction nor a Decimal
value can represent exactly.
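As a small illustration of this point, the following sketch contrasts exact rational arithmetic with a transcendental-style computation. The variable names are chosen only for this example; the behavior shown relies on the standard fractions and math modules.

```python
from fractions import Fraction
import math

# Rational arithmetic stays exact with Fraction; the float version does not.
float_sum = 0.1 + 0.2
exact_sum = Fraction(1, 10) + Fraction(2, 10)

print(float_sum == 0.3)              # False: binary floats only approximate 0.1 and 0.2
print(exact_sum == Fraction(3, 10))  # True: the rational sum is exact

# math.sqrt() accepts a Fraction argument but converts it to float,
# so the result is a float approximation of an irrational number.
root = math.sqrt(Fraction(1, 2))
print(type(root).__name__, root)
```

Switching the inputs to Fraction does not help here: once a function from the math module is involved, the result is a float again.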
Reducing accuracy based on audience requirements
For some calculations, a fraction value may be more intuitively meaningful than
a floating point value. This is part of presenting statistical results in a way that an
audience can understand and take action on.
For example, the chi-squared test generally involves computing the X^2 comparison
between actual values and expected values. We can then subject this comparison
value to a test against the X^2 cumulative distribution function. When the expected
and actual values have no particular relationship (we can call this a null
relationship), the variation will be random and the X^2 value tends to be small. When we
accept the null hypothesis, then we'll look elsewhere for a relationship. When the
actual values are significantly different from the expected values, we may reject the
null hypothesis. By rejecting the null hypothesis, we can explore further to determine
the precise nature of the relationship.
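The comparison value can be sketched as follows, assuming the usual definition of the statistic: the sum of (actual - expected)^2 / expected over all cells. The function name chi2_statistic and the sample counts are hypothetical, made up purely for illustration.

```python
from fractions import Fraction

def chi2_statistic(actual, expected):
    """Sum of (a - e)**2 / e over paired actual and expected counts.

    Fraction keeps the result exact when the inputs are integers.
    """
    return sum(
        Fraction(a - e) ** 2 / Fraction(e)
        for a, e in zip(actual, expected)
    )

# Illustrative counts only; these are not taken from any real data set.
expected = [20, 20, 20]
matching = chi2_statistic([20, 20, 20], expected)   # no variation at all
differing = chi2_statistic([10, 20, 30], expected)  # 100/20 + 0 + 100/20

print(matching, differing)
```

A value near zero is consistent with the null hypothesis. A larger value needs to be compared against the X^2 cumulative distribution function before any decision is made.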
The decision is often based on the table of the X^2 Cumulative Distribution Function
(CDF) for selected X^2 values and given degrees of freedom. While the tabulated CDF
values are mostly irrational values, we don't usually use more than two or three
decimal places. This is merely a decision-making tool; there's no practical difference
in meaning between 0.049 and 0.05.
A widely used probability threshold for rejecting the null hypothesis is 0.05. As a
Fraction object, this threshold is 1/20; we reject the null hypothesis when the
probability is less than 1/20. When presenting data to an audience, it sometimes helps to
characterize results as fractions. A value like 0.05 is hard to visualize. Describing a
relationship as having 1 chance in 20 can help to characterize the likelihood of
a correlation.
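One possible way to produce this kind of presentation is sketched below. The helper as_odds is a hypothetical name introduced for this example; Fraction and its limit_denominator() method come from the standard fractions module.

```python
from fractions import Fraction

# Building the Fraction from a string gives the exact decimal value.
threshold = Fraction("0.05")

# The float 0.05 is only a binary approximation of 1/20, so we ask for the
# closest fraction with a small denominator to recover the readable form.
from_float = Fraction(0.05).limit_denominator(100)

def as_odds(p):
    """Describe a probability as 'n chance(s) in d' (hypothetical helper)."""
    noun = "chance" if p.numerator == 1 else "chances"
    return f"{p.numerator} {noun} in {p.denominator}"

print(threshold, from_float)
print(as_odds(threshold))
```

The upper bound of 100 on the denominator is an arbitrary choice for this sketch; a smaller bound gives rounder, easier-to-read fractions at the cost of accuracy.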
Case study – making a chi-squared decision
We'll look at a common statistical decision. The decision is described in detail at
http://www.itl.nist.gov/div898/handbook/prc/section4/prc45.htm.