Question:
Consider a binary classification problem where the response \(Y\) takes values in \(\{-1,1\}\). Show that the optimal prediction function for the hinge loss \(\operatorname{Loss}(y, \tilde{y})=(1-y \tilde{y})_{+}:=\max \{0,1-y \tilde{y}\}\) is the same as the optimal prediction function \(g^{*}\) for the indicator loss:
\[ g^{*}(\boldsymbol{x})= \begin{cases}1 & \text { if } \quad \mathbb{P}[Y=1 \mid \boldsymbol{X}=\boldsymbol{x}]>1 / 2 \\ -1 & \text { if } \quad \mathbb{P}[Y=1 \mid \boldsymbol{X}=\boldsymbol{x}]<1 / 2\end{cases} \]
That is, show that
\[ \mathbb{E}(1-Y h(\boldsymbol{X}))_{+} \geqslant \mathbb{E}\left(1-Y g^{*}(\boldsymbol{X})\right)_{+} \tag{7.29} \]
for all functions \(h\).
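Before proving (7.29), it can help to check the claim numerically at a single point \(\boldsymbol{x}\). The sketch below is only an illustration, not part of the textbook: the names `hinge_risk`, `bayes_classifier`, `check` and `t_grid` are hypothetical, and `eta` stands for an assumed value of \(\mathbb{P}[Y=1 \mid \boldsymbol{X}=\boldsymbol{x}]\). It compares the conditional expected hinge loss at \(g^{*}(\boldsymbol{x})\) with its smallest value over a grid of candidate predictions.

```python
import numpy as np

def hinge_risk(t, eta):
    """Conditional hinge risk E[(1 - Y t)_+ | X = x] when P[Y = 1 | X = x] = eta."""
    return eta * np.maximum(0.0, 1.0 - t) + (1.0 - eta) * np.maximum(0.0, 1.0 + t)

def bayes_classifier(eta):
    """g*(x) for eta != 1/2: +1 if eta > 1/2, otherwise -1."""
    return 1.0 if eta > 0.5 else -1.0

# Candidate prediction values t = h(x), including values outside [-1, 1].
t_grid = np.linspace(-3.0, 3.0, 601)

def check(eta):
    """Return (smallest risk on the grid, risk at g*(x)) for a given eta."""
    best_on_grid = hinge_risk(t_grid, eta).min()
    at_bayes = hinge_risk(bayes_classifier(eta), eta)
    return best_on_grid, at_bayes

for eta in (0.1, 0.3, 0.7, 0.9):
    best, bayes = check(eta)
    print(f"eta={eta}: grid minimum={best:.4f}, risk at g*={bayes:.4f}")
```

For each tested value of `eta`, the two printed numbers should agree up to floating-point rounding, which is consistent with (7.29) but does not replace a proof.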
Step by Step Answer:
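The following is a sketch of one standard argument, not the textbook's own solution. The symbols \(\eta(\boldsymbol{x})\) and \(R_{\boldsymbol{x}}(t)\) are notation introduced here for convenience.

Write \(\eta(\boldsymbol{x})=\mathbb{P}[Y=1 \mid \boldsymbol{X}=\boldsymbol{x}]\) and fix \(\boldsymbol{x}\). For a prediction value \(t=h(\boldsymbol{x})\), the conditional expected hinge loss is
\[ R_{\boldsymbol{x}}(t):=\mathbb{E}\left[(1-Y t)_{+} \mid \boldsymbol{X}=\boldsymbol{x}\right]=\eta(\boldsymbol{x})(1-t)_{+}+(1-\eta(\boldsymbol{x}))(1+t)_{+} . \]
There are three cases for \(t\):
\[ R_{\boldsymbol{x}}(t)= \begin{cases}\eta(\boldsymbol{x})(1-t) \geqslant 2 \eta(\boldsymbol{x})=R_{\boldsymbol{x}}(-1) & \text { if } \quad t<-1 \\ 1+(1-2 \eta(\boldsymbol{x})) t & \text { if } \quad-1 \leqslant t \leqslant 1 \\ (1-\eta(\boldsymbol{x}))(1+t) \geqslant 2(1-\eta(\boldsymbol{x}))=R_{\boldsymbol{x}}(1) & \text { if } \quad t>1\end{cases} \]
On \([-1,1]\) the function is linear in \(t\), so its minimum is attained at an endpoint: at \(t=1\) if \(\eta(\boldsymbol{x})>1/2\) and at \(t=-1\) if \(\eta(\boldsymbol{x})<1/2\). Combining the three cases,
\[ R_{\boldsymbol{x}}(t) \geqslant \min \left\{R_{\boldsymbol{x}}(-1), R_{\boldsymbol{x}}(1)\right\}=2 \min \{\eta(\boldsymbol{x}), 1-\eta(\boldsymbol{x})\}=R_{\boldsymbol{x}}\left(g^{*}(\boldsymbol{x})\right) \quad \text { for all } t \in \mathbb{R} . \]
If \(\eta(\boldsymbol{x})=1/2\), then \(R_{\boldsymbol{x}}(-1)=R_{\boldsymbol{x}}(1)=1\), so either choice of \(g^{*}(\boldsymbol{x}) \in\{-1,1\}\) attains the minimum. Setting \(t=h(\boldsymbol{X})\) and taking expectations over \(\boldsymbol{X}\) (tower property) gives
\[ \mathbb{E}(1-Y h(\boldsymbol{X}))_{+}=\mathbb{E}\, R_{\boldsymbol{X}}(h(\boldsymbol{X})) \geqslant \mathbb{E}\, R_{\boldsymbol{X}}\left(g^{*}(\boldsymbol{X})\right)=\mathbb{E}\left(1-Y g^{*}(\boldsymbol{X})\right)_{+}, \]
which is (7.29).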
Data Science And Machine Learning Mathematical And Statistical Methods
ISBN: 9781118710852
1st Edition
Authors: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev