Computational Methods in Systems Biology


132 E. Klinger and J. Hasenauer


General form: A density estimate $K$ is expressed as a sum of normally distributed kernels,

$$K(\theta') = \sum_{i=1}^{n} w_i \, \mathcal{N}\bigl(\theta' \mid \theta_i, \Sigma(P, \theta_i)\bigr),$$

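in which $P = \{(w_i, \theta_i)\}_{i=1}^{n}$ is a population of weighted parameters with weights $w_i \in \mathbb{R}_+$, $\sum_i w_i = 1$, and parameters $\theta_i \in \mathbb{R}^{d_{\mathrm{par}}}$. The kernel $\mathcal{N}(\theta' \mid \theta, \Sigma)$ is a normal density with mean $\theta \in \mathbb{R}^{d_{\mathrm{par}}}$ and covariance matrix $\Sigma \in \mathbb{R}^{d_{\mathrm{par}} \times d_{\mathrm{par}}}$, referred to in the following as the bandwidth, evaluated at the parameter $\theta' \in \mathbb{R}^{d_{\mathrm{par}}}$. Three different strategies were used to determine the bandwidth $\Sigma$.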
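To make the general form concrete, the following minimal sketch evaluates the mixture density $K$ at a single point, given one bandwidth matrix per sample. It assumes NumPy; the function names `normal_pdf` and `kde_density` are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def normal_pdf(x, mean, cov):
    """Density N(x | mean, cov) of a multivariate normal at one point x."""
    d = mean.shape[0]
    diff = x - mean
    # Solve cov @ z = diff rather than inverting cov explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (d * np.log(2.0 * np.pi) + logdet + maha))

def kde_density(x, thetas, weights, bandwidths):
    """K(x) = sum_i w_i N(x | theta_i, Sigma_i), one bandwidth per sample."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for w, theta, sigma in zip(weights, thetas, bandwidths):
        theta = np.asarray(theta, dtype=float)
        sigma = np.asarray(sigma, dtype=float)
        total += w * normal_pdf(x, theta, sigma)
    return float(total)
```

For a global bandwidth, the same matrix would simply be passed for every sample.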


Global bandwidth: The global bandwidth is a scaled covariance of the complete population. The population covariance matrix $\mathrm{Cov}(P)$ is calculated from the population $P = \{(w_i, \theta_i)\}_{i=1}^{n}$, taking into account the sample weights:

$$\mathrm{Cov}(P) = \sum_{i=1}^{n} w_i (\theta_i - \mu)(\theta_i - \mu)^t, \qquad \mu = \sum_{i=1}^{n} w_i \theta_i.$$

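The scaling factor $b_{\mathrm{Silv}}$ is estimated with Silverman’s rule of thumb [21],

$$b_{\mathrm{Silv}} = \left( \frac{4}{n_{\mathrm{eff}} (d_{\mathrm{par}} + 2)} \right)^{1/(d_{\mathrm{par}} + 4)}, \qquad n_{\mathrm{eff}} = \frac{1}{\sum_i w_i^2},$$

in which $d_{\mathrm{par}}$ denotes the parameter dimension, $\{w_i\}_{i=1}^{n}$ the sample weights, and $n_{\mathrm{eff}}$ the effective population size. The kernel bandwidth is then $\Sigma = b_{\mathrm{Silv}} \, \mathrm{Cov}(P)$. The bandwidth thus does not depend on the sample location $\theta$ and is therefore called “global”.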
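The global bandwidth can be sketched in a few lines. The example below assumes NumPy, with `thetas` an array of shape (n, d_par) and `weights` a normalized array of length n; the function names are hypothetical. It follows the text in multiplying the weighted covariance by the Silverman factor itself.

```python
import numpy as np

def weighted_cov(thetas, weights):
    """Weighted population covariance around the weighted mean."""
    mu = weights @ thetas
    centered = thetas - mu
    return (weights[:, None] * centered).T @ centered

def silverman_factor(weights, d_par):
    """Silverman scaling factor based on the effective population size."""
    n_eff = 1.0 / np.sum(weights ** 2)
    return (4.0 / (n_eff * (d_par + 2))) ** (1.0 / (d_par + 4))

def global_bandwidth(thetas, weights):
    """Bandwidth shared by all kernels: scaling factor times Cov(P)."""
    d_par = thetas.shape[1]
    return silverman_factor(weights, d_par) * weighted_cov(thetas, weights)
```

With uniform weights $w_i = 1/n$ the effective population size reduces to $n$, so the factor shrinks as the population grows.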


Local bandwidth: The global bandwidth can be ill-suited for an accurate local approximation [19, 21]. We therefore considered local bandwidths as well. The local bandwidths $\Sigma_{k,\mathrm{nn}}(P, \theta_i)$ are constructed for each sample $\theta_i$ individually as twice the covariance matrix of the $k$ nearest neighbors (in Euclidean distance) of sample $\theta_i$. The overall density $K$ is then given by

$$K(\theta') = \sum_{i=1}^{n} w_i \, \mathcal{N}\bigl(\theta' \mid \theta_i, \Sigma_{k,\mathrm{nn}}(P, \theta_i)\bigr).$$

Similar bandwidths were examined before and were shown to yield good acceptance rates [6].
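A possible construction of the local bandwidths is sketched below, assuming NumPy. The text leaves several details open, so the sketch makes explicit assumptions: the neighbor covariance is unweighted, the sample itself counts among its $k$ nearest neighbors, and the unbiased (k - 1) normalization of `np.cov` is used. The function name `local_bandwidths` is hypothetical.

```python
import numpy as np

def local_bandwidths(thetas, k):
    """One bandwidth per sample: twice the covariance of its k nearest neighbors."""
    n, d_par = thetas.shape
    # Pairwise squared Euclidean distances between all samples.
    diffs = thetas[:, None, :] - thetas[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)
    bandwidths = np.empty((n, d_par, d_par))
    for i in range(n):
        # Assumption: the sample itself (distance 0) is one of the k neighbors.
        neighbors = thetas[np.argsort(dist2[i])[:k]]
        # Assumption: unweighted covariance with the default (k - 1) normalization.
        cov = np.atleast_2d(np.cov(neighbors, rowvar=False))
        bandwidths[i] = 2.0 * cov
    return bandwidths
```

The choice of $k$ must be large enough relative to the parameter dimension for the neighbor covariance to be non-singular.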


Cross-validated bandwidth: Since the scaling factor $b_{\mathrm{Silv}}$ is known to be too large for multimodal distributions [21], we also used cross-validated selection of the scaling factor for the population covariance matrix $\mathrm{Cov}(P)$, according to the following scheme: the largest probed scaling factor is the Silverman scaling factor $b_{\mathrm{Silv}}$. Five-fold cross-validation is used to determine the best of the downscaled factors $b_c = 2^{-e(c)} \, b_{\mathrm{Silv}}$ with $e(c) = c/(2C)$, $c \in \{0, 1, \ldots, C\}$, $C = 4$.
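The selection scheme can be illustrated with the following sketch, assuming NumPy. The selection criterion is not stated in this passage, so the sketch assumes the weighted held-out log-density as the cross-validation score, random fold assignment, and renormalized weights within each training fold. All function names are hypothetical.

```python
import numpy as np

def candidate_factors(b_silv, C=4):
    """Factors b_c = 2**(-c / (2 * C)) * b_silv for c = 0, ..., C."""
    c = np.arange(C + 1)
    return 2.0 ** (-c / (2.0 * C)) * b_silv

def silverman_factor(weights, d_par):
    n_eff = 1.0 / np.sum(weights ** 2)
    return (4.0 / (n_eff * (d_par + 2))) ** (1.0 / (d_par + 4))

def weighted_cov(thetas, weights):
    centered = thetas - weights @ thetas
    return (weights[:, None] * centered).T @ centered

def log_kde(x, thetas, weights, cov):
    """log sum_i w_i N(x | theta_i, cov), computed with log-sum-exp."""
    d_par = thetas.shape[1]
    diffs = x - thetas
    maha = np.einsum("ij,ij->i", diffs, np.linalg.solve(cov, diffs.T).T)
    _, logdet = np.linalg.slogdet(cov)
    terms = np.log(weights) - 0.5 * (d_par * np.log(2.0 * np.pi) + logdet + maha)
    m = terms.max()
    return m + np.log(np.exp(terms - m).sum())

def cross_validated_factor(thetas, weights, n_folds=5, C=4, seed=0):
    """Pick the candidate factor with the best held-out score (assumed criterion)."""
    n, d_par = thetas.shape
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    scores = np.zeros(C + 1)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        w_train = weights[train] / weights[train].sum()
        cov = weighted_cov(thetas[train], w_train)
        factors = candidate_factors(silverman_factor(w_train, d_par), C)
        for c, b_c in enumerate(factors):
            scores[c] += sum(
                weights[i] * log_kde(thetas[i], thetas[train], w_train, b_c * cov)
                for i in held_out
            )
    best_c = int(np.argmax(scores))
    # Apply the selected exponent to the factor of the full population.
    return candidate_factors(silverman_factor(weights, d_par), C)[best_c]
```

With $C = 4$ the probed factors range from $b_{\mathrm{Silv}}$ down to $2^{-1/2} \, b_{\mathrm{Silv}}$, so the search only ever shrinks the Silverman bandwidth.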
