Introduction to Probability and Statistics for Engineers and Scientists


7.2 Maximum Likelihood Estimators


However, because proofreader 1 found $n_1$ of the $N$ errors in the manuscript, it is reasonable
to suppose that $p_1$ is also approximately equal to $n_1/N$. Equating this to $\hat{p}_1$ gives that

$$\frac{n_{1,2}}{n_2} \approx \frac{n_1}{N}$$

or

$$N \approx \frac{n_1 n_2}{n_{1,2}}$$

Because the preceding estimate is symmetric in $n_1$ and $n_2$, it follows that it is the same
no matter which proofreader is designated as proofreader 1.
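The estimator and its symmetry can be checked with a minimal Python sketch. The function name `estimate_total_errors` and the counts used below are hypothetical, chosen only for illustration; they do not come from the text.

```python
def estimate_total_errors(n1: int, n2: int, n12: int) -> float:
    """Estimate the total number of errors N as n1 * n2 / n12.

    n1  -- number of errors found by proofreader 1
    n2  -- number of errors found by proofreader 2
    n12 -- number of errors found by both proofreaders
    """
    if n12 <= 0:
        # The ratio is undefined when the two proofreaders share no errors.
        raise ValueError("n12 must be positive")
    return n1 * n2 / n12


# Hypothetical counts: proofreader 1 finds 30 errors, proofreader 2 finds 20,
# and 12 errors are found by both.
estimate = estimate_total_errors(30, 20, 12)

# Swapping the roles of the two proofreaders leaves the estimate unchanged.
swapped = estimate_total_errors(20, 30, 12)

print(estimate, swapped)
```

With these illustrative counts both calls return 50.0, which reflects the symmetry of the estimate in $n_1$ and $n_2$.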
An interesting application of the preceding occurred when two teams of researchers
recently announced that they had decoded the human genetic code sequence. As part
of their work both teams estimated that the human genome consisted of approximately
33,000 genes. Because both teams independently arrived at the same number, many
scientists found this number believable. However, most scientists were quite surprised by
this relatively small number of genes; by comparison it is only about twice as many as a
fruit fly has. A closer inspection of the findings, however, indicated that the two groups
only agreed on the existence of about 17,000 genes. (That is, 17,000 genes were found by
both teams.) Thus, based on our preceding estimator, we would estimate that the actual
number of genes, rather than being 33,000, is


$$\frac{n_1 n_2}{n_{1,2}} = \frac{33{,}000 \times 33{,}000}{17{,}000} \approx 64{,}000$$

(Because there is some controversy about whether some of the genes claimed to be found are
actually genes, 64,000 should probably be taken as an upper bound on the actual number
of genes.)
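The arithmetic behind the genome estimate can be reproduced in a few lines of Python. This is only a sketch: the variable names are illustrative, and the counts are the ones quoted in the passage (33,000 genes reported by each team, about 17,000 found by both).

```python
# Counts quoted in the passage above.
n1 = 33_000    # genes reported by the first team
n2 = 33_000    # genes reported by the second team
n12 = 17_000   # genes reported by both teams

# Two-sample estimate of the total number of genes: n1 * n2 / n12.
estimate = n1 * n2 / n12

print(round(estimate))       # estimate rounded to the nearest gene
print(round(estimate, -3))   # estimate rounded to the nearest thousand
```

The unrounded value is slightly above 64,000, which agrees with the approximation given in the text.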
The estimation approach used when there are two proofreaders does not work when
there are $m$ proofreaders, when $m > 2$. For, if for each $i$, we let $\hat{p}_i$ be the fraction of the
errors found by at least one of the other proofreaders $j$, ($j \neq i$), that are also found by $i$,
and then set that equal to $n_i/N$, then the estimate of $N$, namely $n_i/\hat{p}_i$, would differ for different
values of $i$. Moreover, with this approach it is possible that we may have that $\hat{p}_i > \hat{p}_j$
even if proofreader $i$ finds fewer errors than does proofreader $j$. For instance, for $m = 3$,
suppose proofreaders 1 and 2 find exactly the same set of 10 errors whereas proofreader 3
finds 20 errors with only 1 of them in common with the set of errors found by the others.
Then, because proofreader 1 (and 2) found 10 of the 29 errors found by at least one of the
other proofreaders, $\hat{p}_i = 10/29$, $i = 1, 2$. On the other hand, because proofreader 3 only
found 1 of the 10 errors found by the others, $\hat{p}_3 = 1/10$. Therefore, although proofreader
3 found twice the number of errors as did proofreader 1, the estimate of $p_3$ is less than
that of $p_1$. To obtain more reasonable estimates, we could take the preceding values of
