Question:
Recall from equation (7.16) that the decision boundaries of the multi-logit classifier are linear, and that the pre-classifier can be written as a conditional pdf of the form:
\[ g(y \mid \mathbf{W}, \boldsymbol{b}, \boldsymbol{x})=\frac{\exp \left(z_{y+1}\right)}{\sum_{i=1}^{c} \exp \left(z_{i}\right)}, \quad y \in\{0, \ldots, c-1\} \]
where \(\boldsymbol{x}^{\top}=\left[1, \widetilde{\boldsymbol{x}}^{\top}\right]\) and \(\boldsymbol{z}=\mathbf{W} \widetilde{\boldsymbol{x}}+\boldsymbol{b}\).
(a) Show that the linear discriminant pre-classifier in Section 7.4 can also be written as a conditional pdf of the form \(\left(\boldsymbol{\theta}=\left\{\alpha_{y}, \boldsymbol{\Sigma}_{y}, \boldsymbol{\mu}_{y}\right\}_{y=0}^{c-1}\right)\):
\[ g(y \mid \boldsymbol{\theta}, \boldsymbol{x})=\frac{\exp \left(z_{y+1}\right)}{\sum_{i=1}^{c} \exp \left(z_{i}\right)}, \quad y \in\{0, \ldots, c-1\} \]
where \(\boldsymbol{x}^{\top}=\left[1, \widetilde{\boldsymbol{x}}^{\top}\right]\) and \(\boldsymbol{z}=\mathbf{W} \widetilde{\boldsymbol{x}}+\boldsymbol{b}\). Find formulas for the corresponding \(\boldsymbol{b}\) and \(\mathbf{W}\) in terms of the linear discriminant parameters \(\left\{\alpha_{y}, \boldsymbol{\mu}_{y}, \boldsymbol{\Sigma}_{y}\right\}_{y=0}^{c-1}\), where \(\boldsymbol{\Sigma}_{y}=\boldsymbol{\Sigma}\) for all \(y\).
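As a sanity check on part (a), the following sketch (not the book's solution) verifies numerically the standard equal-covariance identity one expects to derive: taking the rows of \(\mathbf{W}\) as \(\boldsymbol{\mu}_y^\top \boldsymbol{\Sigma}^{-1}\) and \(b_y = \ln \alpha_y - \tfrac{1}{2}\boldsymbol{\mu}_y^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_y\) (an assumed result, with the quadratic term \(-\tfrac{1}{2}\widetilde{\boldsymbol{x}}^\top \boldsymbol{\Sigma}^{-1}\widetilde{\boldsymbol{x}}\) canceling across classes), the Gaussian class posteriors coincide with the softmax of the linear scores.

```python
import numpy as np

rng = np.random.default_rng(0)
d, c = 3, 4                                   # feature dimension, number of classes

# Random linear-discriminant parameters with a shared SPD covariance Sigma
alpha = rng.dirichlet(np.ones(c))             # class prior probabilities
mu = rng.normal(size=(c, d))                  # class means
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)               # shared covariance (SPD by construction)
Sinv = np.linalg.inv(Sigma)

x = rng.normal(size=d)                        # a test point x~

# Bayes posterior from Gaussian class densities; the shared normalizing
# constant of N(mu_y, Sigma) cancels, so only the quadratic term matters.
log_post = np.log(alpha) + np.array(
    [-0.5 * (x - mu[y]) @ Sinv @ (x - mu[y]) for y in range(c)]
)
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Softmax of linear scores with the conjectured W and b
W = mu @ Sinv                                 # row y = mu_y^T Sigma^{-1}
b = np.log(alpha) - 0.5 * np.einsum('yi,ij,yj->y', mu, Sinv, mu)
z = W @ x + b
soft = np.exp(z - z.max())
soft /= soft.sum()

print(np.allclose(post, soft))                # the two posteriors agree
```

The term \(-\tfrac{1}{2}\widetilde{\boldsymbol{x}}^\top \boldsymbol{\Sigma}^{-1}\widetilde{\boldsymbol{x}}\) is common to every class, which is why it disappears from the softmax and the scores end up linear in \(\widetilde{\boldsymbol{x}}\).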
(b) Which pre-classifier has the smaller approximation error: the linear discriminant or the multi-logit one? Justify your answer by proving an inequality between the two approximation errors.
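One way the expected argument for part (b) can be sketched (hedged, with \(\ell\) denoting the risk used to define approximation error, a notation assumed here): part (a) shows that every linear discriminant pre-classifier corresponds to some choice of \((\mathbf{W}, \boldsymbol{b})\), so the linear discriminant family is a subset of the multi-logit family, and taking the infimum over the larger family can only decrease the error:

```latex
\[
\inf_{\mathbf{W},\,\boldsymbol{b}} \ell\bigl(g(\cdot \mid \mathbf{W}, \boldsymbol{b}, \cdot)\bigr)
\;\le\;
\inf_{\boldsymbol{\theta}} \ell\bigl(g(\cdot \mid \boldsymbol{\theta}, \cdot)\bigr).
\]
```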
Source: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev, *Data Science and Machine Learning: Mathematical and Statistical Methods*, 1st Edition. ISBN: 9781118710852.