CAUSAL INFERENCE MODELS

supplementary information that may be available.
Unfortunately, in the social sciences such supple-
mentary information is likely to be of questionable
quality, thereby reducing the degree of faith one
has in whatever causal assertions have been made.
In the causal modeling literature, which is
basically compatible with the so-called structural
equation modeling in econometrics, equation sys-
tems are constructed so as to represent as well as
possible a presumed real-world situation, given
whatever limitations have been imposed in terms
of omitted variables that produce unknown biases,
possibly incorrect functional forms for one’s equa-
tions, measurement errors, or in general what are
termed specification errors in the equations. Since
such limitations are always present, any particular
equation will contain a disturbance term that is
assumed to behave in a certain fashion. One’s
assumptions about such disturbances are both
critical for one’s inferences and also (for the most
part) inherently untestable with the data at hand.
This in turn means that such inferences must
always be tentative. One never ‘‘finds’’ effects, for
example, but only infers them on the basis of
findings about covariances and temporal sequences
and a set of untested theoretical assumptions. When
such assumptions are hidden from view, both the
social scientist and the reader may be seriously
misled to the degree that the assumptions are
incorrect.
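The claim that assumptions about disturbances are untestable with the data at hand can be illustrated by a small simulation. The sketch below is not part of the original discussion: the omitted common cause z, the coefficient values, and all variable names are hypothetical choices made only for illustration. It regresses y on x while leaving out z, so that the true disturbance is correlated with x. The estimated slope is then biased, yet the fitted residuals are uncorrelated with x by construction, so nothing in the observed data signals the problem.

```python
import random

def mean(v):
    return sum(v) / len(v)

def covariance(u, v):
    """Sample covariance (dividing by n) of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

rng = random.Random(0)      # fixed seed so the run is repeatable
n = 20000
true_effect = 1.0           # hypothetical causal effect of x on y

# z is a common cause of x and y that the analyst fails to measure.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [true_effect * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

# Least-squares regression of y on x alone; z is absorbed into the disturbance.
slope = covariance(x, y) / covariance(x, x)
intercept = mean(y) - slope * mean(x)
residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# The true disturbance (2*z plus noise) is correlated with x ...
true_disturbance = [yi - true_effect * xi for xi, yi in zip(x, y)]
print("true effect:", true_effect)
print("estimated slope:", round(slope, 3))
print("cov(x, true disturbance):", round(covariance(x, true_disturbance), 3))

# ... but the fitted residuals are orthogonal to x by construction,
# so the data cannot reveal that the assumption has failed.
print("cov(x, residuals):", covariance(x, residuals))
```

Only outside information, such as a measurement of z or a theoretical argument that no such common cause exists, could distinguish the biased estimate from a correct one.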
In the recursive models commonly in use in
sociology, it is assumed that causal influences can
be ordered, such that one may designate an $X_1$ that
does not depend on any of the remaining variables
in the system but, presumably, varies as a result of
exogenous causes that have been ignored in the
theory. A second variable, $X_2$, may then be found
that may depend upon $X_1$ as well as a different set
of exogenous factors, but the assumption is that $X_2$
does not affect $X_1$, either directly or through any
other mechanism. One then builds up the system,
equation by equation, by locating an $X_3$ that may
depend on either or both of $X_1$ or $X_2$, plus still
another set of independent variables (referred to
as exogenous factors), but with the assumption
that neither of the first two X’s is affected by $X_3$.
Adding still more variables in this recursive fash-
ion, and for the time being assuming linear and
additive relationships, one arrives at the system of
equations shown in equation system 1,
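\[
\begin{aligned}
X_1 &= \varepsilon_1 \\
X_2 &= \beta_{21} X_1 + \varepsilon_2 \\
X_3 &= \beta_{31} X_1 + \beta_{32} X_2 + \varepsilon_3 \\
&\;\;\vdots \\
X_k &= \beta_{k1} X_1 + \beta_{k2} X_2 + \beta_{k3} X_3 + \cdots + \beta_{k,k-1} X_{k-1} + \varepsilon_k
\end{aligned}
\tag{1}
\]
in which the disturbance terms are represented by
the $\varepsilon_i$ and where for the sake of simplicity the
constant terms have been omitted.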
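As a rough illustration of how a system of this form can be handled one equation at a time, the following sketch simulates the first three equations and estimates each by ordinary least squares without a constant term. It is an assumption-laden example rather than a procedure given in the text: the coefficient values are hypothetical, the disturbances are drawn independently of one another (an assumption, not something the data could confirm), and the helper names dot and solve_2x2 are introduced here only for illustration.

```python
import random

def dot(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def solve_2x2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] @ [c1, c2] = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - a12 * b2) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return c1, c2

rng = random.Random(1)            # fixed seed so the run is repeatable
n = 20000
b21, b31, b32 = 0.5, 0.3, -0.7    # hypothetical coefficients

# Generate the variables in causal order, each with its own independent
# disturbance and no constant term.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [b21 * a + rng.gauss(0, 1) for a in x1]
x3 = [b31 * a + b32 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Equation for X_2: least squares of X_2 on X_1.
b21_hat = dot(x1, x2) / dot(x1, x1)

# Equation for X_3: least squares of X_3 on X_1 and X_2, via the normal equations.
b31_hat, b32_hat = solve_2x2(
    dot(x1, x1), dot(x1, x2), dot(x2, x2),
    dot(x1, x3), dot(x2, x3),
)

print("b21:", b21, "estimate:", round(b21_hat, 3))
print("b31:", b31, "estimate:", round(b31_hat, 3))
print("b32:", b32, "estimate:", round(b32_hat, 3))
```

Each equation is fitted without reference to the others. This recovers the coefficients here only because the simulated disturbances were made independent of one another and of the earlier variables; with correlated disturbances the same single-equation estimates would generally be biased.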
The essential property of recursive equations
that provides a simple causal interpretation is that
changes made in any given equation may affect
subsequent ones but will not affect any of the prior
equations. Thus, if a mysterious demon were to
change one of the parameters in the equation for
$X_3$, this would undoubtedly affect not only $X_3$ but
also $X_4$, $X_5$, and so on through $X_k$, but could have no effect on
either of the first two equations, which do not
depend on $X_3$ or any of the later variables in the
system. As will be discussed below, this special
property of recursive systems does not hold in the
more general setup involving variables that may be
reciprocally interrelated. Indeed, it is this recur-
sive property that justifies one’s dealing with the
equations separately and sequentially as single
equations. The assumptions required for such a
system are therefore implicit in all data analyses
(e.g., log-linear modeling, analysis of variance, or
comparisons among means) that are typically dis-
cussed in first and second courses in applied
statistics.
Assumptions are always critical in causal analyses
or—what is often not recognized—in any kind of
theoretical interpretation of empirical data. Some
such assumptions are implied by the forms of
one’s equations, in this case linearity and additivity.
Fortunately, these types of assumptions can be
rather simply modified by, for example, introduc-
ing second- or higher-degree terms, log functions,
or interaction terms. It is a mistake to claim, as
some critics have done, that causal modeling requires
one to assume such restrictive functional forms.
Far more important are two other kinds of
assumptions—those about measurement errors
and those concerning the disturbance terms rep-
resenting the effects of all omitted variables. Sim-
ple causal modeling of the type represented by