$p(Z)$ of a random variable $Z$ from a sample space $\mathcal{Z}$ to another distribution $q(Z)$: $\mathrm{KLD}_{p|q} = \sum_{z \in \mathcal{Z}} p(z) \log_2 \frac{p(z)}{q(z)}$. In the subsequent parts of this study, we use the convention $0 \cdot \log 0 = 0$. Note that $\mathrm{KLD}_{p|q}$ is therefore well defined as long as $\forall z \in \mathcal{Z}: q(z) > 0$.
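As a minimal illustration, this quantity, together with the convention above, can be computed for discrete distributions in a few lines of R; the function name kld is ours for this sketch, and p and q are assumed to be probability vectors over the same sample space:

  kld <- function(p, q) {
    stopifnot(length(p) == length(q), all(q[p > 0] > 0))  # well defined only if q > 0 wherever p > 0
    nz <- p > 0                                           # convention: 0 * log 0 = 0
    sum(p[nz] * log2(p[nz] / q[nz]))                      # sum over z of p(z) * log2(p(z) / q(z))
  }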
The Mutual Information (MI) is a special case of the Kullback-Leibler divergence which is used to quantify the information that one random variable $X$ contains about another variable $Y$, and vice versa. Here, the distribution $q$ is set to the independent joint probability of $X$ and $Y$, namely $q(X, Y) := p(X) \cdot p(Y)$, while $p(X, Y)$ is the empirical joint probability of the two random variables, and $p(X)$ and $p(Y)$ are the respective marginals. The MI measures how much the empirical $p(X, Y)$ deviates from the independent case and can thus be regarded as a generalized correlation coefficient. Then:


\[
\mathrm{MI}_{X,Y} = \sum_{x \in X,\, y \in Y} p(x, y) \log_2 \left( \frac{p(x, y)}{p(x) \cdot p(y)} \right) \tag{1}
\]


using the same notation as above.
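For discrete samples, Eq. 1 can be estimated directly from the empirical joint histogram. The following plain-R sketch makes this explicit; it is an illustration (the function name is ours, not part of our package), and continuous data would first have to be binned:

  mutual_information <- function(x, y) {
    pxy <- table(x, y) / length(x)  # empirical joint probability p(x, y)
    px  <- rowSums(pxy)             # marginal p(x)
    py  <- colSums(pxy)             # marginal p(y)
    ind <- outer(px, py)            # independent case q(x, y) = p(x) * p(y)
    nz  <- pxy > 0                  # convention: 0 * log 0 = 0
    sum(pxy[nz] * log2(pxy[nz] / ind[nz]))
  }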
The obvious extension of the MI to address dynamics in time series of stationary processes is a time-delayed MI (TDMI) [5]. To this end, let $X$ and $Y$ be two time series written in vector form. Then, the probability to find a realization $x_n \in X$ at time $n$ is $p(x_n)$. The same holds for $y_n$; one can then compute the MI between the time-lagged $x_{n+n'}$ and $y_n$ with lag $n'$. As previously discussed, however, such a TDMI is inferior to other methods of time series analysis in the detection of potential causal relations [4].
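Using the estimator sketched above, such a TDMI reduces to an MI between shifted copies of the two series; a hypothetical helper (assuming an integer lag smaller than the series length) might read:

  tdmi <- function(x, y, lag) {
    n <- length(x)
    mutual_information(x[(1 + lag):n], y[1:(n - lag)])  # MI between x_{n+lag} and y_n
  }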
The motivation for the TE is to quantify the dependency of one process ($\{x_n\}$) on another ($\{y_n\}$) by a mutual information of conditional probabilities, rather than time-lagged ones. Then, in analogy to Eq. 1, we can express the dependency of one variable $x$ in relation to the other variable $y$ in a time-dependent manner by using:


\[
\mathrm{TE}_{Y \to X} = \sum p\!\left(x_{n+1}, x_n^{(k)}, y_n^{(l)}\right) \log \frac{p\!\left(x_{n+1} \mid x_n^{(k)}, y_n^{(l)}\right)}{p\!\left(x_{n+1} \mid x_n^{(k)}\right)} \tag{2}
\]


The indices $k$ and $l$ represent the time windows (time lags) used for each variable, creating the multi-dimensional probabilities via histogram techniques. We now detect how much information flows from $Y$ to $X$ by checking whether the state of $X$ depends on the history of both variables, expressed in a non-trivial $p(x_{n+1} \mid x_n^{(k)}, y_n^{(l)})$, or only on its own history, which can be quantitatively assessed via $p(x_{n+1} \mid x_n^{(k)})$. Note that, in general, $\mathrm{TE}_{Y \to X} \neq \mathrm{TE}_{X \to Y}$; this asymmetry is what allows the TE to indicate a direction of information flow.
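To make Eq. 2 concrete, the following plain-R sketch gives a plug-in estimate of $\mathrm{TE}_{Y \to X}$ for discrete series with the simplest history lengths $k = l = 1$; this is an illustration under these assumptions, not the implementation used in our package:

  transfer_entropy <- function(x, y) {
    n    <- length(x)
    trip <- table(xf = x[2:n], xp = x[1:(n - 1)], yp = y[1:(n - 1)]) / (n - 1)  # p(x_{n+1}, x_n, y_n)
    pxx  <- apply(trip, c(1, 2), sum)  # p(x_{n+1}, x_n)
    pxy  <- apply(trip, c(2, 3), sum)  # p(x_n, y_n)
    px   <- apply(trip, 2, sum)        # p(x_n)
    te   <- 0
    for (i in seq_len(dim(trip)[1]))
      for (j in seq_len(dim(trip)[2]))
        for (k in seq_len(dim(trip)[3]))
          if (trip[i, j, k] > 0)       # convention: 0 * log 0 = 0
            # p(x+ | x, y) / p(x+ | x) = p(x+, x, y) p(x) / (p(x, y) p(x+, x))
            te <- te + trip[i, j, k] * log(trip[i, j, k] * px[j] / (pxy[j, k] * pxx[i, j]))
    te
  }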


1.2 Statistical Significance via Permutation Tests


To test data for statistical significance under the null hypothesis of independence between the $x$ and $y$ measurements, we employ permutation tests. Our R package TransferEntropyPT features the parallelized calculation of such a null model.
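The idea can be sketched as follows, here as a serial toy version with illustrative names rather than the package's actual interface: permuting $y$ destroys any temporal coupling to $x$ while preserving its marginal distribution, and the p-value is the fraction of surrogates whose TE reaches the observed value:

  te_permutation_test <- function(x, y, n_perm = 1000) {
    te_obs  <- transfer_entropy(x, y)                             # observed TE_{Y -> X}
    te_null <- replicate(n_perm, transfer_entropy(x, sample(y)))  # null model from permuted y
    list(te = te_obs, p_value = mean(te_null >= te_obs))
  }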
