Question: Loss functions. Consider the following two loss functions: (1) mean-squared error, $\mathrm{Loss}(T, O) = \frac{1}{2}(T - O)^2$, and (2) cross-entropy for binary classification. Assume the activation function is sigmoid.
a. Show the derivation of the error for the output unit in the backpropagation process under each loss, and compare the two loss functions (e.g., potential problems each might produce).
b. Now, we wish to generalize the cross-entropy loss to the scenario of multiclass classification. The target output $T$ is a one-hot vector of length $K$ (i.e., the number of total classes), and the index of the nonzero element (i.e., the 1) represents the class label. The output $O$ is also a vector of the same length $K$. Show the derivation of the categorical cross-entropy loss and the error of the output unit. (Hint: there are two key steps: (1) normalize the output values by scaling them between 0 and 1, and (2) derive the cross-entropy loss following the definition for the binary case, where the loss can be represented as $\mathrm{Loss}(T, O) = -\left[ T \ln O + (1 - T) \ln (1 - O) \right]$.)
Step by Step Solution
a. For the mean-squared error loss, $\mathrm{Loss} = \frac{1}{2}(T - O)^2$ with $O = \sigma(z)$, where $z$ is the net input to the output unit and $\sigma(z) = 1/(1 + e^{-z})$. By the chain rule, the error $\delta$ for the output unit is
$$\delta = -\frac{\partial \mathrm{Loss}}{\partial z} = (T - O)\,\sigma'(z) = (T - O)\,O\,(1 - O).$$
For the binary cross-entropy loss, $\mathrm{Loss} = -\left[ T \ln O + (1 - T) \ln (1 - O) \right]$, we have
$$\frac{\partial \mathrm{Loss}}{\partial O} = -\frac{T}{O} + \frac{1 - T}{1 - O} = \frac{O - T}{O(1 - O)},$$
and multiplying by $\partial O / \partial z = O(1 - O)$ gives
$$\delta = -\frac{\partial \mathrm{Loss}}{\partial z} = T - O.$$
Comparison: with mean-squared error the error term carries the sigmoid derivative $O(1 - O)$, which approaches 0 when the unit saturates ($O \to 0$ or $O \to 1$). A saturated but wrong output therefore produces a vanishingly small gradient and very slow learning. Cross-entropy cancels this factor, so the error is simply $T - O$, and the gradient stays large whenever the prediction is far from the target.
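The contrast between the two error terms asked for in part (a) can be checked numerically. A minimal sketch (function names are my own, not from the text), comparing both deltas at a saturated, confidently wrong output:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def delta_mse(t, z):
    """Output-unit error under MSE loss: (T - O) * O * (1 - O)."""
    o = sigmoid(z)
    return (t - o) * o * (1.0 - o)

def delta_xent(t, z):
    """Output-unit error under binary cross-entropy loss: T - O."""
    o = sigmoid(z)
    return t - o

# A saturated, wrong unit: target is 1, but the net input is strongly negative.
z, t = -6.0, 1.0
print(delta_mse(t, z))   # tiny: the sigmoid derivative O(1-O) is near 0
print(delta_xent(t, z))  # near 1: the gradient stays large
```

Even though the prediction is badly wrong, the MSE delta is orders of magnitude smaller than the cross-entropy delta, which is the slow-learning problem noted above.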
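b. For part (b), a sketch of the standard derivation, assuming (per the hint's step (1)) that the normalization is the softmax function:

```latex
% Step 1: normalize the raw outputs z_1, ..., z_K with softmax so that
% each O_k lies in (0, 1) and the O_k sum to 1:
O_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}

% Step 2: generalize the binary cross-entropy. With a one-hot target T,
% exactly one T_k equals 1, so the binary form -[T ln O + (1-T) ln(1-O)]
% generalizes to a sum of the "T ln O" terms over all classes:
\mathrm{Loss}(T, O) = -\sum_{k=1}^{K} T_k \ln O_k

% Error of output unit k: using the softmax Jacobian
% \partial O_k / \partial z_j = O_k (\delta_{kj} - O_j),
% the chain rule gives
\delta_k = -\frac{\partial \mathrm{Loss}}{\partial z_k} = T_k - O_k
```

As in the binary case, the normalization's derivative cancels against the loss, leaving the simple error $T_k - O_k$ at each output unit.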

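The categorical result above can likewise be verified with a finite-difference check. A self-contained sketch (helper names are my own), confirming that the numerical gradient of the categorical cross-entropy with respect to each $z_k$ matches $O_k - T_k$:

```python
import math

def softmax(z):
    """Normalize raw outputs to (0, 1), summing to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def categorical_xent(t, o):
    """Loss = -sum_k T_k ln O_k for a one-hot target t."""
    return -sum(tk * math.log(ok) for tk, ok in zip(t, o))

def numeric_grad(t, z, k, eps=1e-6):
    """Central-difference dLoss/dz_k, to compare against O_k - T_k."""
    zp = list(z); zp[k] += eps
    zm = list(z); zm[k] -= eps
    return (categorical_xent(t, softmax(zp)) -
            categorical_xent(t, softmax(zm))) / (2 * eps)

z = [1.0, -0.5, 2.0]
t = [0.0, 0.0, 1.0]   # one-hot target: class 3 is the true label
o = softmax(z)
for k in range(3):
    print(numeric_grad(t, z, k), o[k] - t[k])  # the two columns should agree
```

The gradient $\partial \mathrm{Loss} / \partial z_k = O_k - T_k$ is the negative of the error $\delta_k = T_k - O_k$ derived in part (b), matching the sign convention used there.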