Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Exercise 2: Expectation-Maximization as Alternating Minimization (10 pts) In this exercise we derive the expectation-maximization (EM) algorithm for minimizing the negative log- likelihood of a
Exercise 2: Expectation-Maximization as Alternating Minimization (10 pts) In this exercise we derive the expectation-maximization (EM) algorithm for minimizing the negative log- likelihood of a mixture model. Let p(x) := p.(x;fk, Ek) be the multivaraite Gaussian density with mean Mke Rd and covariance matrix Ex E Rdxd for k= 1,...,K. Given a finite dataset D = {x1,...,xn} CRd, we fit a mixture of K Gaussians (where K is a hyperparameter that is pre-determined) by solving: K min TEAK,{vk.2x} log #kpx(Xi; fik, Ik), i=1 k=1 where the i-th log term is the log-likelihood of data point x;, according to the mixture density p(x) Ek TkPx(x). We remind that AK := {T E R: Ekk = 1} and each covariance Ex is (symmetric) positive definite, or in notation Ex-0. In practice we usually refrain from applying the projected gradient algorithm (where the projections are expensive anyways). In all questions below, you need to give sufficient details to justify your solution. Simply state the result without justification or derivation will get 0. The questions are self-contained and you do not need anything that is not mentioned so far in this course. 1. (1 pt) Let p ERK. Prove that log P* = min q log (5) Ok PR k k You may use the fact that the KL divergence is nonnegative and attains 0 iff the two densities coincide. 2. (1 pt) Introduce the variables qik and prove (4) is equivalent as: min min qik log (6) TEAK:{H2,5x20} {94.EAK} RPR (X,) 3. (2 pts) We now apply alternating minimization to the equivalent problem (6). Fixing # and all flk and Ek, derive the optimal solution for all dik. Mind the constraint! 4. (2 pts) We now fix all qik, puk, Ek and derive the optimal solution for AK. Mind the constraint! 5. (2 pts) Fix all qik, Ek, 7 and derive the optimal solution for each plk. 6. (2 pts) Fix all qik, Mk, and derive the optimal solution for each Ek ?0. Mind the constraint! [You may assume I is non-degenerate, i.e. invertible, and use Vlog det I = 5-1 and V(A, 2-?) = -2-143-1 for any symmetric A. Note that (A,B) := tr(AB) = tr(BA) for two symmetric matrices.] Exercise 2: Expectation-Maximization as Alternating Minimization (10 pts) In this exercise we derive the expectation-maximization (EM) algorithm for minimizing the negative log- likelihood of a mixture model. Let p(x) := p.(x;fk, Ek) be the multivaraite Gaussian density with mean Mke Rd and covariance matrix Ex E Rdxd for k= 1,...,K. Given a finite dataset D = {x1,...,xn} CRd, we fit a mixture of K Gaussians (where K is a hyperparameter that is pre-determined) by solving: K min TEAK,{vk.2x} log #kpx(Xi; fik, Ik), i=1 k=1 where the i-th log term is the log-likelihood of data point x;, according to the mixture density p(x) Ek TkPx(x). We remind that AK := {T E R: Ekk = 1} and each covariance Ex is (symmetric) positive definite, or in notation Ex-0. In practice we usually refrain from applying the projected gradient algorithm (where the projections are expensive anyways). In all questions below, you need to give sufficient details to justify your solution. Simply state the result without justification or derivation will get 0. The questions are self-contained and you do not need anything that is not mentioned so far in this course. 1. (1 pt) Let p ERK. Prove that log P* = min q log (5) Ok PR k k You may use the fact that the KL divergence is nonnegative and attains 0 iff the two densities coincide. 2. (1 pt) Introduce the variables qik and prove (4) is equivalent as: min min qik log (6) TEAK:{H2,5x20} {94.EAK} RPR (X,) 3. (2 pts) We now apply alternating minimization to the equivalent problem (6). Fixing # and all flk and Ek, derive the optimal solution for all dik. Mind the constraint! 4. (2 pts) We now fix all qik, puk, Ek and derive the optimal solution for AK. Mind the constraint! 5. (2 pts) Fix all qik, Ek, 7 and derive the optimal solution for each plk. 6. (2 pts) Fix all qik, Mk, and derive the optimal solution for each Ek ?0. Mind the constraint! [You may assume I is non-degenerate, i.e. invertible, and use Vlog det I = 5-1 and V(A, 2-?) = -2-143-1 for any symmetric A. Note that (A,B) := tr(AB) = tr(BA) for two symmetric matrices.]
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started