
Question:

(Exercise 18 continued.) Consider again Example 2.10 with \(\mathbf{D}=\operatorname{diag}\left(\lambda_{1}, \ldots, \lambda_{p}\right)\) for some nonnegative model-selection parameter \(\boldsymbol{\lambda} \in \mathbb{R}^{p}\). A Bayesian choice for \(\boldsymbol{\lambda}\) is the maximizer of the marginal likelihood \(g(\boldsymbol{y} \mid \boldsymbol{\lambda})\); that is,

\[ \boldsymbol{\lambda}^{*}=\underset{\boldsymbol{\lambda} \geqslant \mathbf{0}}{\operatorname{argmax}} \iint g\left(\boldsymbol{\beta}, \sigma^{2}, \boldsymbol{y} \mid \boldsymbol{\lambda}\right) \mathrm{d} \boldsymbol{\beta} \mathrm{d} \sigma^{2} \]


where

\[ \ln g\left(\boldsymbol{\beta}, \sigma^{2}, \boldsymbol{y} \mid \boldsymbol{\lambda}\right)=-\frac{\|\boldsymbol{y}-\mathbf{X} \boldsymbol{\beta}\|^{2}+\boldsymbol{\beta}^{\top} \mathbf{D}^{-1} \boldsymbol{\beta}}{2 \sigma^{2}}-\frac{1}{2} \ln |\mathbf{D}|-\frac{n+p}{2} \ln \left(2 \pi \sigma^{2}\right)-\ln \sigma^{2} \]
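One way to read this joint density (a sketch; the factorization below is implied by the formula rather than stated in the exercise) is as the hierarchical model

\[ \left(\boldsymbol{y} \mid \boldsymbol{\beta}, \sigma^{2}\right) \sim \mathscr{N}\left(\mathbf{X} \boldsymbol{\beta}, \sigma^{2} \mathbf{I}_{n}\right), \qquad \left(\boldsymbol{\beta} \mid \sigma^{2}\right) \sim \mathscr{N}\left(\mathbf{0}, \sigma^{2} \mathbf{D}\right), \qquad g\left(\sigma^{2}\right) \propto \frac{1}{\sigma^{2}}, \]

whose log-densities contribute the \(\|\boldsymbol{y}-\mathbf{X} \boldsymbol{\beta}\|^{2}\), \(\boldsymbol{\beta}^{\top} \mathbf{D}^{-1} \boldsymbol{\beta}\), and \(-\ln \sigma^{2}\) terms, respectively, with the two Gaussian normalizing constants combining into the \(\frac{n+p}{2} \ln \left(2 \pi \sigma^{2}\right)\) term.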

To maximize \(g(\boldsymbol{y} \mid \boldsymbol{\lambda})\), one can use the EM algorithm with \(\boldsymbol{\beta}\) and \(\sigma^{2}\) acting as latent variables in the complete-data log-likelihood \(\ln g\left(\boldsymbol{\beta}, \sigma^{2}, \boldsymbol{y} \mid \boldsymbol{\lambda}\right)\). Define

\[ \begin{align*} & \boldsymbol{\Sigma}:=\left(\mathbf{D}^{-1}+\mathbf{X}^{\top} \mathbf{X}\right)^{-1} \tag{6.42}\\ & \overline{\boldsymbol{\beta}}:=\boldsymbol{\Sigma} \mathbf{X}^{\top} \boldsymbol{y} \\ & \widehat{\sigma}^{2}:=\left(\|\boldsymbol{y}\|^{2}-\boldsymbol{y}^{\top} \mathbf{X} \overline{\boldsymbol{\beta}}\right) / n \end{align*} \]

(a) Show that the conditional density of the latent variables \(\boldsymbol{\beta}\) and \(\sigma^{2}\) is such that

\[ \begin{aligned} & \left(\sigma^{-2} \mid \boldsymbol{\lambda}, \boldsymbol{y}\right) \sim \operatorname{Gamma}\left(\frac{n}{2}, \frac{n}{2} \widehat{\sigma}^{2}\right) \\ & \left(\boldsymbol{\beta} \mid \boldsymbol{\lambda}, \sigma^{2}, \boldsymbol{y}\right) \sim \mathscr{N}\left(\overline{\boldsymbol{\beta}}, \sigma^{2} \boldsymbol{\Sigma}\right) \end{aligned} \]
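A sketch of the key step (completing the square in \(\boldsymbol{\beta}\), using \(\boldsymbol{\Sigma}^{-1} \overline{\boldsymbol{\beta}}=\mathbf{X}^{\top} \boldsymbol{y}\)):

\[ \|\boldsymbol{y}-\mathbf{X} \boldsymbol{\beta}\|^{2}+\boldsymbol{\beta}^{\top} \mathbf{D}^{-1} \boldsymbol{\beta}=(\boldsymbol{\beta}-\overline{\boldsymbol{\beta}})^{\top} \boldsymbol{\Sigma}^{-1}(\boldsymbol{\beta}-\overline{\boldsymbol{\beta}})+\|\boldsymbol{y}\|^{2}-\boldsymbol{y}^{\top} \mathbf{X} \overline{\boldsymbol{\beta}}, \]

where the last two terms equal \(n \widehat{\sigma}^{2}\). Given \(\sigma^{2}\), the \(\boldsymbol{\beta}\)-dependent factor is the stated \(\mathscr{N}\left(\overline{\boldsymbol{\beta}}, \sigma^{2} \boldsymbol{\Sigma}\right)\) kernel; integrating it out leaves a factor proportional to \(\left(\sigma^{2}\right)^{-n / 2-1} \mathrm{e}^{-n \widehat{\sigma}^{2} /\left(2 \sigma^{2}\right)}\), which is the stated gamma density for \(\sigma^{-2}\).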

(b) Use Theorem C.2 to show that the expected complete-data log-likelihood is

\[ -\frac{\overline{\boldsymbol{\beta}}^{\top} \mathbf{D}^{-1} \overline{\boldsymbol{\beta}}}{2 \widehat{\sigma}^{2}}-\frac{\operatorname{tr}\left(\mathbf{D}^{-1} \boldsymbol{\Sigma}\right)+\ln |\mathbf{D}|}{2}+c_{1} \]


where \(c_{1}\) is a constant that does not depend on \(\boldsymbol{\lambda}\).
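The two expectations behind this expression (a sketch, taking Theorem C.2 to be the book's result on expectations of quadratic forms under Gaussian laws) are

\[ \mathbb{E}\left[\sigma^{-2} \mid \boldsymbol{\lambda}, \boldsymbol{y}\right]=\frac{n / 2}{n \widehat{\sigma}^{2} / 2}=\frac{1}{\widehat{\sigma}^{2}}, \qquad \mathbb{E}\left[\boldsymbol{\beta}^{\top} \mathbf{D}^{-1} \boldsymbol{\beta} \mid \boldsymbol{\lambda}, \sigma^{2}, \boldsymbol{y}\right]=\overline{\boldsymbol{\beta}}^{\top} \mathbf{D}^{-1} \overline{\boldsymbol{\beta}}+\sigma^{2} \operatorname{tr}\left(\mathbf{D}^{-1} \boldsymbol{\Sigma}\right); \]

taking the inner expectation over \(\boldsymbol{\beta}\) given \(\sigma^{2}\) and then the outer expectation over \(\sigma^{-2}\) yields the display above, with all \(\boldsymbol{\lambda}\)-free terms absorbed into \(c_{1}\).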

(c) Use Theorem A.2 to simplify the expected complete-data log-likelihood and to show that it is maximized at \(\lambda_{i}=\Sigma_{ii}+\left(\bar{\beta}_{i} / \widehat{\sigma}\right)^{2}\) for \(i=1, \ldots, p\). Hence, deduce the following E and M steps in the EM algorithm (a check of the maximizer is sketched after the two steps):
E-step. Given \(\boldsymbol{\lambda}\), update \(\boldsymbol{\Sigma}, \overline{\boldsymbol{\beta}}, \widehat{\sigma}^{2}\) via the formulas (6.42).
M-step. Given \(\boldsymbol{\Sigma}, \overline{\boldsymbol{\beta}}, \widehat{\sigma}^{2}\), update \(\boldsymbol{\lambda}\) via \(\lambda_{i}=\Sigma_{ii}+\left(\bar{\beta}_{i} / \widehat{\sigma}\right)^{2}, i=1, \ldots, p\).
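
A quick verification of the M-step (an added sketch: since \(\mathbf{D}\) is diagonal, \(\operatorname{tr}\left(\mathbf{D}^{-1} \boldsymbol{\Sigma}\right)=\sum_{i} \Sigma_{ii} / \lambda_{i}\) and \(\ln |\mathbf{D}|=\sum_{i} \ln \lambda_{i}\), so the objective separates across coordinates):

\[ \frac{\partial}{\partial \lambda_{i}}\left(-\frac{\bar{\beta}_{i}^{2} / \widehat{\sigma}^{2}+\Sigma_{ii}}{2 \lambda_{i}}-\frac{\ln \lambda_{i}}{2}\right)=\frac{\bar{\beta}_{i}^{2} / \widehat{\sigma}^{2}+\Sigma_{ii}}{2 \lambda_{i}^{2}}-\frac{1}{2 \lambda_{i}}=0 \quad \Longleftrightarrow \quad \lambda_{i}=\Sigma_{ii}+\left(\bar{\beta}_{i} / \widehat{\sigma}\right)^{2}. \]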

(d) Write Python code to compute \(\boldsymbol{\lambda}^{*}\) via the EM algorithm, and use it to select the best polynomial model in Example 2.10. A possible stopping criterion is to terminate the EM iterations when

\[ \ln g\left(\boldsymbol{y} \mid \boldsymbol{\lambda}_{t+1}\right)-\ln g\left(\boldsymbol{y} \mid \boldsymbol{\lambda}_{t}\right)<\varepsilon \]

for some small \(\varepsilon>0\), where the marginal log-likelihood is

\[ \ln g(\boldsymbol{y} \mid \boldsymbol{\lambda})=-\frac{n}{2} \ln \left(n \pi \widehat{\sigma}^{2}\right)-\frac{1}{2} \ln |\mathbf{D}|+\frac{1}{2} \ln |\boldsymbol{\Sigma}|+\ln \Gamma(n / 2) \]
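
A minimal sketch of such code, assuming NumPy and SciPy are available and that `X` is the \(n \times p\) polynomial model matrix from Example 2.10 (the function name `em_lambda`, the starting point \(\boldsymbol{\lambda}=\mathbf{1}\), and the iteration cap are illustrative choices, not prescribed by the exercise):

```python
import numpy as np
from scipy.special import gammaln

def em_lambda(X, y, eps=1e-6, max_iter=10_000):
    """EM iterations for the marginal-likelihood maximizer lambda*."""
    n, p = X.shape
    lam = np.ones(p)            # illustrative starting point: lambda = 1
    ll_old = -np.inf
    for _ in range(max_iter):
        # E-step: formulas (6.42) at the current lambda
        Sigma = np.linalg.inv(np.diag(1.0 / lam) + X.T @ X)
        beta_bar = Sigma @ (X.T @ y)
        sigma2 = (y @ y - y @ (X @ beta_bar)) / n
        # marginal log-likelihood at the current lambda
        ll = (-n / 2 * np.log(n * np.pi * sigma2)
              - 0.5 * np.sum(np.log(lam))          # ln|D| for diagonal D
              + 0.5 * np.linalg.slogdet(Sigma)[1]  # ln|Sigma|
              + gammaln(n / 2))
        if ll - ll_old < eps:   # stopping criterion from part (d)
            break
        ll_old = ll
        # M-step: lambda_i = Sigma_ii + (beta_bar_i / sigma-hat)^2
        lam = np.diag(Sigma) + beta_bar**2 / sigma2
    return lam, beta_bar, sigma2, ll
```

Since each EM iteration cannot decrease \(\ln g(\boldsymbol{y} \mid \boldsymbol{\lambda})\), the increments are nonnegative and the \(\varepsilon\)-criterion eventually fires. For the model-selection task, one reading is that components \(\lambda_{i}\) driven toward zero shrink the corresponding polynomial coefficients toward zero, effectively pruning them from the model.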
