Robert_V._Hogg,_Joseph_W._McKean,_Allen_T._Craig

Some Elementary Statistical Inferences

(d) Obtain the sample mean and standard deviation and on the histogram overlay the normal pdf with these estimates as parameters, using mris=sort(mri) and lines(dnorm(mris,mean(mris),sd(mris))~mris,lty=2). Comment on the fit.

(e) Determine the proportions of the data within 1 and 2 standard deviations of the sample mean and compare these with the empirical rule.

4.1.11. This is a famous data set on the speed of light recorded by the scientist
Simon Newcomb. The data set was obtained at the Carnegie Mellon site given in
Exercise 4.1.10 and it can also be found in the rda file speedlight.rda at the sites
referenced in the Preface. Stigler (1977) presents an informative discussion of this
data set.

(a) Load the rda file and type the command print(speed). As Stigler notes, the data values × 10^{-3} + 24.8 are Newcomb’s data values; hence, negative items can occur. Also, in the unit of the data the “true value” is 33.02. Discuss the data.

(b) Obtain a histogram of the data. Comment on the shape.

(c) On the histogram overlay the default density estimator. Comment on the shape.

(d) Obtain the sample mean and standard deviation and on the histogram overlay the normal pdf with these estimates as parameters. Comment on the fit.

(e) Determine the proportions of the data within 1 and 2 standard deviations of the sample mean and compare these with the empirical rule.
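
Parts (d) and (e) of these exercises can be sketched in R along the following lines. This is only an illustration under stated assumptions: the vector x is simulated stand-in data (not Newcomb's measurements or the mri data), and prop_within is a hypothetical helper name introduced here.

```r
# Simulated stand-in data (an assumption); replace x with the exercise's
# data vector, for example speed after loading speedlight.rda.
set.seed(1)
x <- rnorm(200, mean = 30, sd = 5)

xbar <- mean(x)   # sample mean
s <- sd(x)        # sample standard deviation

# Histogram on the density scale with the fitted normal pdf overlaid
hist(x, freq = FALSE, main = "Histogram with fitted normal pdf")
xs <- sort(x)
lines(dnorm(xs, xbar, s) ~ xs, lty = 2)

# Hypothetical helper: proportion of the data within k sample
# standard deviations of the sample mean
prop_within <- function(x, k) {
  mean(abs(x - mean(x)) <= k * sd(x))
}

# Compare with the empirical rule (roughly 0.68 and 0.95 for normal data)
props <- c(k1 = prop_within(x, 1), k2 = prop_within(x, 2))
print(props)
```

For data that are far from normal, for instance data with a few extreme outliers, these proportions can differ noticeably from the empirical-rule values, which is one way to comment on the fit.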

4.2 Confidence Intervals

Let us continue with the statistical problem that we were discussing in Section
4.1. Recall that the random variable of interest X has density f(x; θ), θ ∈ Ω,
where θ is unknown. In that section, we discussed estimating θ by a statistic
θ̂ = θ̂(X_1, ..., X_n), where X_1, ..., X_n is a sample from the distribution of X. When
the sample is drawn, it is unlikely that the value of θ̂ is the true value of the
parameter. In fact, if θ̂ has a continuous distribution, then P_θ(θ̂ = θ) = 0, where the
notation P_θ denotes that the probability is computed when θ is the true parameter.
What is needed is an estimate of the error of the estimation; i.e., by how much did
θ̂ miss θ? In this section, we embody this estimate of error in terms of a confidence
interval, which we now formally define:

Definition 4.2.1 (Confidence Interval). Let X_1, X_2, ..., X_n be a sample on a random
variable X, where X has pdf f(x; θ), θ ∈ Ω. Let 0 < α < 1 be specified. Let
L = L(X_1, X_2, ..., X_n) and U = U(X_1, X_2, ..., X_n) be two statistics. We say that
the interval (L, U) is a (1 − α)100% confidence interval for θ if

1 − α = P_θ[θ ∈ (L, U)].    (4.2.1)
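
To see what equation (4.2.1) asserts, the coverage probability can be approximated by simulation. The sketch below is not taken from the text: it assumes a normal sample with θ equal to the mean and uses the usual t-interval as (L, U), a specific choice that the definition itself does not require. The values of theta, sigma, n, alpha, and nsims are arbitrary.

```r
set.seed(42)
theta <- 10      # true parameter (assumed for the simulation)
sigma <- 2       # assumed population standard deviation
n <- 20          # sample size
alpha <- 0.05    # nominal confidence is (1 - alpha)100% = 95%
nsims <- 10000   # number of simulated samples

covered <- logical(nsims)
for (i in seq_len(nsims)) {
  x <- rnorm(n, mean = theta, sd = sigma)
  half <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)
  L <- mean(x) - half   # lower endpoint, a statistic
  U <- mean(x) + half   # upper endpoint, a statistic
  covered[i] <- (L < theta) && (theta < U)
}

# Proportion of intervals that trap theta; it should be close to 1 - alpha
coverage <- mean(covered)
print(coverage)
```

Each simulated sample yields a different interval while θ stays fixed, so the probability in (4.2.1) refers to the random endpoints L and U, not to θ.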
