44 K. Li and N.R. Prabhala
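\begin{align}
Y_i \mid E &= X_i\beta + (\epsilon_i \mid Z_i\gamma + \eta_i > 0) \tag{6} \\
&= X_i\beta + \pi(\eta_i \mid Z_i\gamma + \eta_i > 0) + \nu_i. \tag{7}
\end{align}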
Equation (7) follows from the standard result that $\epsilon_i \mid \eta_i = \pi\eta_i + \nu_i$, where $\pi$ is the
coefficient in the regression of $\epsilon_i$ on $\eta_i$, and $\nu_i$ is an orthogonal zero-mean error term.^5
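As a brief sketch of where the expression for $\pi$ in footnote 5 comes from: the coefficient in a regression of $\epsilon_i$ on $\eta_i$ is the ratio of their covariance to the variance of $\eta_i$. The step below assumes that $\eta_i$ has unit variance, the usual probit normalization, which the footnote's expression implies but does not state; $\sigma_\eta$ is notation introduced here for illustration.

```latex
\pi
  = \frac{\operatorname{Cov}(\epsilon_i, \eta_i)}{\operatorname{Var}(\eta_i)}
  = \frac{\rho_{\epsilon\eta}\,\sigma_\epsilon\,\sigma_\eta}{\sigma_\eta^{2}}
  = \rho_{\epsilon\eta}\,\sigma_\epsilon
  \qquad \text{when } \sigma_\eta = 1 .
```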
Given the orthogonality and zero-mean properties of $\nu_i$, we can take expectations of
equation (7) and obtain the regression model
$$E(Y_i \mid E) = X_i\beta + \pi E(\eta_i \mid Z_i\gamma + \eta_i > 0) \tag{8}$$
and a similar model for firms choosing not to announce $E$,
$$E(Y_i \mid NE) = X_i\beta + \pi E(\eta_i \mid Z_i\gamma + \eta_i \le 0). \tag{9}$$
Equations (8) and (9) can be compactly rewritten as
$$E(Y_i \mid C) = X_i\beta + \pi\lambda_C(Z_i\gamma) \tag{10}$$
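where $C \in \{E, NE\}$ and $\lambda_C(\cdot)$ is the conditional expectation of $\eta_i$ given $C$. In particular,
if $\eta$ and $\epsilon$ are bivariate normal, as is standard in the bulk of the applied work,
$\lambda_E(\cdot) = \frac{\phi(\cdot)}{\Phi(\cdot)}$ and $\lambda_{NE}(\cdot) = -\frac{\phi(\cdot)}{1 - \Phi(\cdot)}$ (Greene, 2003, p. 759).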
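The two inverse Mills ratio expressions can be checked numerically. The following is a minimal sketch, assuming that $\eta_i$ is standard normal; the function names, the index value 0.5 standing in for $Z_i\gamma$, the number of draws, and the seed are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def lambda_E(index: float) -> float:
    """E(eta | index + eta > 0) = phi(index) / Phi(index)."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def lambda_NE(index: float) -> float:
    """E(eta | index + eta <= 0) = -phi(index) / (1 - Phi(index))."""
    return -STD_NORMAL.pdf(index) / (1.0 - STD_NORMAL.cdf(index))


def simulated_means(index: float, n_draws: int = 200_000, seed: int = 0):
    """Monte Carlo estimates of the two conditional means of eta."""
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    chose_e = [eta for eta in draws if index + eta > 0]
    chose_ne = [eta for eta in draws if index + eta <= 0]
    return fmean(chose_e), fmean(chose_ne)


if __name__ == "__main__":
    z_gamma = 0.5  # hypothetical value of Z_i * gamma
    sim_e, sim_ne = simulated_means(z_gamma)
    print(f"lambda_E : formula {lambda_E(z_gamma):.4f}, simulated {sim_e:.4f}")
    print(f"lambda_NE: formula {lambda_NE(z_gamma):.4f}, simulated {sim_ne:.4f}")
```

Under these assumptions the simulated conditional means should agree with the closed-form ratios up to sampling error: positive for firms choosing $E$ and negative for firms choosing $NE$.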
A comparison of equations (1) and (10) clarifies why self-selection is an omitted
variable problem. In the population regression in equation (1), regressing outcome $Y$
on $X$ consistently estimates $\beta$. However, in self-selected samples, consistent
estimation requires that we include an additional variable, the inverse Mills ratio $\lambda_C(\cdot)$.
Thus, the process of correction for self-selection can be viewed as including an omitted
variable.
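The omitted-variable interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions, not the estimation procedure used in applied work. All parameter values are hypothetical, the selection index is taken to be `x + u` with $\gamma$ treated as known (in practice it would come from a first-stage probit), and `ols` and `simulate` are helper names introduced here.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=40_000, beta0=1.0, beta1=2.0, pi=0.8, seed=1):
    """Return (naive, corrected) OLS coefficients on the sample choosing E."""
    rng = random.Random(seed)
    naive_rows, corrected_rows, outcomes = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)      # public variable in the outcome equation
        u = rng.gauss(0, 1)      # extra public variable in the selection equation
        eta = rng.gauss(0, 1)    # private information (probit error)
        nu = rng.gauss(0, 0.5)   # orthogonal zero-mean noise
        index = x + u            # stands in for Z_i * gamma (assumed known)
        if index + eta > 0:      # firm chooses E, so its outcome is observed
            eps = pi * eta + nu
            outcomes.append(beta0 + beta1 * x + eps)
            mills = STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)
            naive_rows.append([1.0, x])
            corrected_rows.append([1.0, x, mills])
    return ols(naive_rows, outcomes), ols(corrected_rows, outcomes)


if __name__ == "__main__":
    naive, corrected = simulate()
    print("naive     [const, x]        :", [round(b, 3) for b in naive])
    print("corrected [const, x, mills] :", [round(b, 3) for b in corrected])
```

In this setup the regressor `x` is correlated with the omitted inverse Mills ratio, so the naive slope on `x` should fall below the true value of 2. Adding the Mills ratio as a regressor should bring the slope back close to 2, with the coefficient on the Mills ratio close to the assumed $\pi = 0.8$.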
2.2.2. The omitted variable as private information
In the probit model (3) and (4), $\eta_i$ is the part of $W_i$ not explained by public variables $Z_i$.
Thus, $\eta_i$ can be viewed as the private information driving the corporate financing
decision being modeled. The ex-ante expectation of $\eta_i$ should be zero, and it is so, given
that it has been defined as an error term in the probit model.
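The zero ex-ante expectation is consistent with the choice-specific expectations $\lambda_E$ and $\lambda_{NE}$ in equation (10): weighting each by the probit probability of the corresponding choice gives $\phi - \phi = 0$. The sketch below checks this numerically, assuming that $\eta_i$ is standard normal; the function name and the index values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ex_ante_expectation(index: float) -> float:
    """P(E) * E(eta | E) + P(NE) * E(eta | NE) for a given Z_i * gamma."""
    prob_e = STD_NORMAL.cdf(index)         # probability the firm chooses E
    density = STD_NORMAL.pdf(index)
    lambda_e = density / prob_e            # E(eta | E)
    lambda_ne = -density / (1.0 - prob_e)  # E(eta | NE)
    return prob_e * lambda_e + (1.0 - prob_e) * lambda_ne


if __name__ == "__main__":
    for z_gamma in (-1.0, 0.0, 0.5, 2.0):  # hypothetical index values
        print(z_gamma, ex_ante_expectation(z_gamma))
```

Each printed value should be zero up to floating-point rounding, whatever the value of the index.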
Ex-post, after firm $i$ selects $C \in \{E, NE\}$, the expectations of $\eta_i$ can be updated. The
revised expectation, $E(\eta_i \mid C)$, is thus an updated estimate of the firm's private
information. If we wished to test whether the private information in a firm's choice affected
post-choice outcomes, we would regress outcome $Y$ on $E(\eta_i \mid C)$. But $E(\eta_i \mid C) = \lambda_C(\cdot)$
is the inverse Mills ratio term that we add anyway to adjust for self-selection. Thus,
correcting for self-selection is equivalent to testing for private information. The omitted
variable used to correct for self-selection, $\lambda_C(\cdot)$, is an estimate of the private information
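(^5) Note that $\pi = \rho_{\epsilon\eta}\sigma_\epsilon$, where $\rho_{\epsilon\eta}$ is the correlation between $\epsilon$ and $\eta$, and $\sigma_\epsilon^2$ is the variance of $\epsilon$.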