Question:
Consider again Example 9.4, where we used a softmax output function \(S_{L}\) in conjunction with the cross-entropy loss: \(C(\boldsymbol{\theta})=-\ln g_{y+1}(\boldsymbol{x} \mid \boldsymbol{\theta})\). Find formulas for \(\frac{\partial C}{\partial \boldsymbol{g}}\) and \(\frac{\partial \boldsymbol{S}_{L}}{\partial \boldsymbol{z}_{L}}\). Hence, verify that:
\[ \frac{\partial \boldsymbol{S}_{L}}{\partial \boldsymbol{z}_{L}} \frac{\partial C}{\partial \boldsymbol{g}}=\boldsymbol{g}(\boldsymbol{x} \mid \boldsymbol{\theta})-\boldsymbol{e}_{y+1} \tag{333} \]
where \(\boldsymbol{e}_{i}\) is the unit vector with an entry of 1 in the \(i\)-th position and zeros elsewhere.
Step by Step Answer:
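What follows is a sketch of one possible route to the result, not the textbook's own worked solution. Write \(\boldsymbol{g}=\boldsymbol{S}_{L}(\boldsymbol{z}_{L})\) with components \(g_{i}=\exp(z_{L,i})/\sum_{k}\exp(z_{L,k})\). Since \(C=-\ln g_{y+1}\) depends only on the \((y+1)\)-st component of \(\boldsymbol{g}\),
\[ \frac{\partial C}{\partial \boldsymbol{g}}=-\frac{1}{g_{y+1}}\,\boldsymbol{e}_{y+1}. \]
Differentiating the softmax gives \(\partial g_{j}/\partial z_{L,i}=g_{j}(\delta_{ij}-g_{i})\), where \(\delta_{ij}\) is 1 if \(i=j\) and 0 otherwise, so that
\[ \frac{\partial \boldsymbol{S}_{L}}{\partial \boldsymbol{z}_{L}}=\operatorname{diag}(\boldsymbol{g})-\boldsymbol{g}\boldsymbol{g}^{\top}. \]
This matrix is symmetric, so the row/column layout convention does not affect the product. Multiplying the two expressions gives
\[ \left(\operatorname{diag}(\boldsymbol{g})-\boldsymbol{g}\boldsymbol{g}^{\top}\right)\left(-\frac{1}{g_{y+1}}\,\boldsymbol{e}_{y+1}\right)=-\frac{1}{g_{y+1}}\left(g_{y+1}\,\boldsymbol{e}_{y+1}-g_{y+1}\,\boldsymbol{g}\right)=\boldsymbol{g}-\boldsymbol{e}_{y+1}. \]
The identity can also be checked numerically. The Python sketch below uses only the standard library. The values of `z` and `y` and all function names are hypothetical choices made for illustration, and the 0-based Python index `y` plays the role of position \(y+1\) in the formulas.

```python
import math


def softmax(z):
    """Softmax of a list of floats, shifted by max(z) for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_jacobian(g):
    """Matrix diag(g) - g g^T; entry (i, j) equals g_i * (delta_ij - g_j)."""
    n = len(g)
    return [[g[i] * ((1.0 if i == j else 0.0) - g[j]) for j in range(n)]
            for i in range(n)]


def loss_gradient_wrt_g(g, k):
    """Gradient of C = -ln g[k] with respect to g (k is a 0-based index)."""
    return [(-1.0 / g[k]) if i == k else 0.0 for i in range(len(g))]


def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def loss(z, k):
    """Cross-entropy loss C = -ln softmax(z)[k]."""
    return -math.log(softmax(z)[k])


# Hypothetical inputs: pre-activations z_L and a class label y.
z = [0.5, -1.2, 2.0, 0.3]
y = 2  # 0-based index; corresponds to position y+1 in the text

g = softmax(z)

# Left-hand side: (dS_L/dz_L) (dC/dg); right-hand side: g - e_{y+1}.
lhs = matvec(softmax_jacobian(g), loss_gradient_wrt_g(g, y))
rhs = [g[i] - (1.0 if i == y else 0.0) for i in range(len(g))]
max_abs_diff = max(abs(a - b) for a, b in zip(lhs, rhs))

# Independent check: central finite differences of C with respect to z.
h = 1e-6
fd = []
for i in range(len(z)):
    z_plus = z[:i] + [z[i] + h] + z[i + 1:]
    z_minus = z[:i] + [z[i] - h] + z[i + 1:]
    fd.append((loss(z_plus, y) - loss(z_minus, y)) / (2 * h))
max_fd_diff = max(abs(a - b) for a, b in zip(fd, rhs))

print("analytic product vs g - e:", max_abs_diff)
print("finite difference vs g - e:", max_fd_diff)
```

Both printed differences should be close to zero. The first reflects only floating-point rounding, while the second also includes the truncation error of the finite-difference approximation.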
Data Science And Machine Learning Mathematical And Statistical Methods
ISBN: 9781118710852
1st Edition
Authors: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev