Question:
Recall from equation (7.16) that the decision boundaries of the multi-logit classifier are linear, and that the pre-classifier can be written as a conditional pdf of the form:
\[ g(y \mid \mathbf{W}, \boldsymbol{b}, \boldsymbol{x})=\frac{\exp \left(z_{y+1}\right)}{\sum_{i=1}^{c} \exp \left(z_{i}\right)}, \quad y \in\{0, \ldots, c-1\} \]
where \(\boldsymbol{x}^{\top}=\left[1, \widetilde{\boldsymbol{x}}^{\top}\right]\) and \(\boldsymbol{z}=\mathbf{W} \widetilde{\boldsymbol{x}}+\boldsymbol{b}\).
(a) Show that the linear discriminant pre-classifier in Section 7.4 can also be written as a conditional pdf of the form \(\left(\boldsymbol{\theta}=\left\{\alpha_{y}, \boldsymbol{\Sigma}_{y}, \boldsymbol{\mu}_{y}\right\}_{y=0}^{c-1}\right)\):
\[ g(y \mid \boldsymbol{\theta}, \boldsymbol{x})=\frac{\exp \left(z_{y+1}\right)}{\sum_{i=1}^{c} \exp \left(z_{i}\right)}, \quad y \in\{0, \ldots, c-1\} \]
where \(\boldsymbol{x}^{\top}=\left[1, \widetilde{\boldsymbol{x}}^{\top}\right]\) and \(\boldsymbol{z}=\mathbf{W} \widetilde{\boldsymbol{x}}+\boldsymbol{b}\). Find formulas for the corresponding \(\boldsymbol{b}\) and \(\mathbf{W}\) in terms of the linear discriminant parameters \(\left\{\alpha_{y}, \boldsymbol{\mu}_{y}, \boldsymbol{\Sigma}_{y}\right\}_{y=0}^{c-1}\), where \(\boldsymbol{\Sigma}_{y}=\boldsymbol{\Sigma}\) for all \(y\).
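As a sanity check on part (a), the following sketch (not the book's solution) verifies numerically the standard equal-covariance identity one expects to derive: taking the rows of \(\mathbf{W}\) as \(\boldsymbol{\mu}_y^\top \boldsymbol{\Sigma}^{-1}\) and \(b_y = \ln \alpha_y - \tfrac{1}{2}\boldsymbol{\mu}_y^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_y\) (an assumed result, with the quadratic term \(-\tfrac{1}{2}\widetilde{\boldsymbol{x}}^\top \boldsymbol{\Sigma}^{-1}\widetilde{\boldsymbol{x}}\) canceling across classes), the Gaussian class posteriors coincide with the softmax of the linear scores.

```python
import numpy as np

rng = np.random.default_rng(0)
d, c = 3, 4                                   # feature dimension, number of classes

# Random linear-discriminant parameters with a shared SPD covariance Sigma
alpha = rng.dirichlet(np.ones(c))             # class prior probabilities
mu = rng.normal(size=(c, d))                  # class means
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)               # shared covariance (SPD by construction)
Sinv = np.linalg.inv(Sigma)

x = rng.normal(size=d)                        # a test point x~

# Bayes posterior from Gaussian class densities; the shared normalizing
# constant of N(mu_y, Sigma) cancels, so only the quadratic term matters.
log_post = np.log(alpha) + np.array(
    [-0.5 * (x - mu[y]) @ Sinv @ (x - mu[y]) for y in range(c)]
)
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Softmax of linear scores with the conjectured W and b
W = mu @ Sinv                                 # row y = mu_y^T Sigma^{-1}
b = np.log(alpha) - 0.5 * np.einsum('yi,ij,yj->y', mu, Sinv, mu)
z = W @ x + b
soft = np.exp(z - z.max())
soft /= soft.sum()

print(np.allclose(post, soft))                # the two posteriors agree
```

The term \(-\tfrac{1}{2}\widetilde{\boldsymbol{x}}^\top \boldsymbol{\Sigma}^{-1}\widetilde{\boldsymbol{x}}\) is common to every class, which is why it disappears from the softmax and the scores end up linear in \(\widetilde{\boldsymbol{x}}\).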
(b) Which pre-classifier has the smaller approximation error: the linear discriminant or the multi-logit one? Justify your answer by proving an inequality between the two approximation errors.
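One way the expected argument for part (b) can be sketched (hedged, with \(\ell\) denoting the risk used to define approximation error, a notation assumed here): part (a) shows that every linear discriminant pre-classifier corresponds to some choice of \((\mathbf{W}, \boldsymbol{b})\), so the linear discriminant family is a subset of the multi-logit family, and taking the infimum over the larger family can only decrease the error:

```latex
\[
\inf_{\mathbf{W},\,\boldsymbol{b}} \ell\bigl(g(\cdot \mid \mathbf{W}, \boldsymbol{b}, \cdot)\bigr)
\;\le\;
\inf_{\boldsymbol{\theta}} \ell\bigl(g(\cdot \mid \boldsymbol{\theta}, \cdot)\bigr).
\]
```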
Source: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev, *Data Science and Machine Learning: Mathematical and Statistical Methods*, 1st Edition. ISBN: 9781118710852.