24.8 Exercises
1. Prove that the maximum likelihood estimator of the variance of a Gaussian variable is biased.
2. Regularization for Maximum Likelihood: Consider the following regularized loss minimization:
   \[
   \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1/P_\theta[x_i]\bigr) \;+\; \frac{1}{m}\Bigl(\log(1/\theta) + \log\bigl(1/(1-\theta)\bigr)\Bigr).
   \]
   - Show that the preceding objective is equivalent to the usual empirical error had we added two pseudoexamples to the training set. Conclude that the regularized maximum likelihood estimator would be
     \[
     \hat\theta \;=\; \frac{1}{m+2}\left(1 + \sum_{i=1}^{m} x_i\right).
     \]
   - Derive a high probability bound on $|\hat\theta - \theta^\star|$. Hint: Rewrite this as $|\hat\theta - \mathbb{E}[\hat\theta] + \mathbb{E}[\hat\theta] - \theta^\star|$ and then use the triangle inequality and Hoeffding's inequality.
   - Use this to bound the true risk. Hint: Use the fact that now $\hat\theta \ge \frac{1}{m+2}$ to relate $|\hat\theta - \theta^\star|$ to the relative entropy.
3. Consider a general optimization problem of the form:
   \[
   \max_{c}\;\sum_{y=1}^{k}\nu_y\log(c_y) \quad \text{s.t.} \quad c_y > 0,\ \sum_y c_y = 1,
   \]
   where $\nu\in\mathbb{R}_+^k$ is a vector of nonnegative weights.
   - Verify that the M step of soft $k$-means involves solving such an optimization problem.
   - Let $c^\star = \frac{1}{\sum_y \nu_y}\,\nu$. Show that $c^\star$ is a probability vector.
   - Show that the optimization problem is equivalent to the problem:
     \[
     \min_{c}\; D_{\mathrm{RE}}(c^\star \,\|\, c) \quad \text{s.t.} \quad c_y > 0,\ \sum_y c_y = 1.
     \]
   - Using properties of the relative entropy, conclude that $c^\star$ is the solution to the optimization problem.
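A small numerical sanity check of the claimed solution (a sketch, not a proof; the weights and the random search below are arbitrary choices): normalizing $\nu$ to $c^\star = \nu/\sum_y \nu_y$ should beat every other probability vector on the objective $\sum_y \nu_y \log(c_y)$, which is Gibbs' inequality in disguise.

```python
import math
import random

def objective(nu, c):
    """Weighted log-likelihood  sum_y nu_y * log(c_y)."""
    return sum(n * math.log(p) for n, p in zip(nu, c))

def normalize(v):
    """Scale a nonnegative vector to sum to 1 (a probability vector)."""
    s = sum(v)
    return [x / s for x in v]

random.seed(0)
nu = [3.0, 1.0, 6.0]        # arbitrary nonnegative weights
c_star = normalize(nu)      # candidate maximizer  nu / sum(nu)

best = objective(nu, c_star)
# c_star should dominate random probability vectors on the simplex.
for _ in range(1000):
    c = normalize([random.random() for _ in nu])
    assert objective(nu, c) <= best + 1e-12
print(c_star)
```

This is the same computation the M step of soft $k$-means performs when it re-estimates the mixture weights from the soft assignment counts.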