Computational Methods in Systems Biology


A. Lück et al.


of patterns 0 and 64 (which correspond to no methylation and full methylation
of all sites, respectively) in L1 and pattern 64 in all loci, where the difference
between WT and the numerical solution is about 10%, the difference is always
small (<5%), as seen in the insets.
In general, all 16 models show similar performance for all loci and positions
in terms of prediction accuracy. On the large scale the differences are
not visible, and even on the smaller scale they are small, as shown
for mSat in Fig. 7. This is in accordance with the corresponding Kullback-Leibler
divergences


$$
KL = \sum_{j=1}^{4^L} \pi_j(\mathrm{WT}) \, \log\left( \frac{\pi_j(\mathrm{WT})}{\pi_j(\mathrm{pred})} \right) \tag{23}
$$


that we list in Table 2. The difference in KL between the “best” and the “worst”
case is about 0.01. The mean and standard deviation of KL were obtained via
bootstrapping of the wild-type data (10,000 bootstrap samples for each model).
Since no confidence intervals of the parameters are included, this standard
deviation can be regarded as a lower bound. However, even with these lower bounds
the intervals of KL overlap for all models, so that no model can be favored.
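
The computation of KL in Eq. (23) and its bootstrap estimate can be sketched as follows. This is a minimal illustration, not the implementation used for Table 2. It assumes that the wild-type data are available as counts per methylation pattern, that the logarithm is the natural one, and that the predicted distribution is strictly positive wherever the wild-type distribution is. The function names `kl_divergence` and `bootstrap_kl` and the toy numbers are hypothetical.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) as in Eq. (23), using the convention 0 * log(0 / q) = 0.

    Assumes q > 0 wherever p > 0 (otherwise the divergence is infinite).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_j = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def bootstrap_kl(wt_counts, pred, n_boot=10_000, seed=0):
    """Mean and standard deviation of KL under resampling of the WT data.

    Drawing n observations with replacement from the observed patterns is
    equivalent to a multinomial draw with the empirical pattern frequencies.
    Only the WT data are resampled; pred is kept fixed (assumption).
    """
    rng = np.random.default_rng(seed)
    wt_counts = np.asarray(wt_counts)
    n = int(wt_counts.sum())
    p_hat = wt_counts / n
    resampled = rng.multinomial(n, p_hat, size=n_boot)
    kls = np.array([kl_divergence(c / n, pred) for c in resampled])
    return kls.mean(), kls.std(ddof=1)


if __name__ == "__main__":
    # Toy example with 4 patterns (a single CpG site); numbers are made up.
    wt_counts = [40, 10, 10, 40]
    pred = [0.35, 0.15, 0.15, 0.35]
    mean_kl, sd_kl = bootstrap_kl(wt_counts, pred, n_boot=1000)
    print(f"KL = {mean_kl:.4f} +/- {sd_kl:.4f}")
```

Because the predicted distribution is held fixed during resampling, the resulting standard deviation reflects only the sampling variability of the wild-type data, which matches the lower-bound interpretation given above.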


Table 2. Kullback-Leibler divergence KL for the 16 models.

Model  (1,1)            (1,2)            (1,3)            (1,4)
KL     0.1398 ± 0.0134  0.1398 ± 0.0134  0.1398 ± 0.0134  0.1337 ± 0.0127
Model  (2,1)            (2,2)            (2,3)            (2,4)
KL     0.1438 ± 0.0137  0.1439 ± 0.0136  0.1439 ± 0.0137  0.1374 ± 0.0133
Model  (3,1)            (3,2)            (3,3)            (3,4)
KL     0.1399 ± 0.0134  0.1399 ± 0.0134  0.1398 ± 0.0133  0.1337 ± 0.0127
Model  (4,1)            (4,2)            (4,3)            (4,4)
KL     0.1410 ± 0.0137  0.1411 ± 0.0136  0.1409 ± 0.0135  0.1349 ± 0.0130

5 Related Work


In [4], location- and neighbor-dependent models are proposed for single-stranded
DNA methylation data in blood and tumor cells. The (de-)methylation rates
depend on the position of the CpG relative to the 3’ or 5’ end and/or on the
methylation state of the left neighbor only. The dependency is realized by the
introduction of an additional parameter. In our proposed models we use
double-stranded DNA and can therefore include hemi-methylated sites and even
distinguish on which strand the site is methylated. Furthermore, we allow
dependencies on both neighbors by introducing two different dependency parameters.
In contrast, [6] copes with the neighborhood dependency indirectly by allowing
different parameter values for different sites. In order to reduce the dimensionality
