Question
3. a) Derive the update rule for the weights in the output layer of a neural network using the gradient descent rule. Assume that the sigmoid function is used as the activation function, the quadratic loss as the error function, and that L1 regularisation is applied.
b) Assume the network's error function is E0. How is it modified when L2 regularisation is applied? Describe how this type of regularisation works and how it differs from L1 regularisation.
c) Assume that you wish to train a classifier on a large dataset. How would you estimate its generalization performance and optimize its parameters? Briefly describe the procedure that you would follow.
d) Compute the classification rate for the given confusion matrix. Do you think the classification rate is a suitable performance measure in this case? Explain your reasoning and the alternatives.

Confusion matrix (rows: Class 1, Class 2, Class 3 actual; columns: Class 1, Class 2, Class 3 predicted), cell values in the order they appear in the source: 1000; 20; 100 0 10; 50 10 0 10

The four parts carry, respectively, 40%, 20%, 20%, 20% of the marks.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
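For part a), one possible derivation is sketched below. The notation is an assumption, not taken from the question: y_j is the j-th input to the output layer, w_{kj} the weight from that input to output unit k, t_k the target, o_k the output, \eta the learning rate and \lambda the regularisation strength. The L1 term is not differentiable at w_{kj} = 0, so sign(0) = 0 is used there as a subgradient.

```latex
\begin{align*}
net_k &= \sum_j w_{kj}\, y_j, \qquad
o_k = \sigma(net_k) = \frac{1}{1 + e^{-net_k}}, \qquad
\sigma'(net_k) = o_k (1 - o_k) \\
E &= E_0 + \lambda \sum_{k,j} |w_{kj}|, \qquad
E_0 = \tfrac{1}{2} \sum_k (t_k - o_k)^2 \\
\frac{\partial E}{\partial w_{kj}}
 &= \frac{\partial E_0}{\partial o_k}\,
    \frac{\partial o_k}{\partial net_k}\,
    \frac{\partial net_k}{\partial w_{kj}}
    + \lambda\,\operatorname{sign}(w_{kj})
  = -(t_k - o_k)\, o_k (1 - o_k)\, y_j + \lambda\,\operatorname{sign}(w_{kj}) \\
w_{kj} &\leftarrow w_{kj} - \eta\,\frac{\partial E}{\partial w_{kj}}
  = w_{kj} + \eta\,(t_k - o_k)\, o_k (1 - o_k)\, y_j
    - \eta \lambda\,\operatorname{sign}(w_{kj})
\end{align*}
```

The first term of the update is the usual delta rule for a sigmoid output unit; the second moves every non-zero weight towards zero by the constant amount \eta\lambda, regardless of the weight's magnitude.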
Step: 2
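For part b), the standard form of the L2-regularised error can be written as follows. The symbols \lambda (regularisation strength), \eta (learning rate) and the factor 1/2 are conventional assumptions rather than something the question specifies.

```latex
\begin{align*}
E &= E_0 + \frac{\lambda}{2} \sum_i w_i^2 \\
\frac{\partial E}{\partial w_i} &= \frac{\partial E_0}{\partial w_i} + \lambda\, w_i \\
w_i &\leftarrow w_i - \eta \left( \frac{\partial E_0}{\partial w_i} + \lambda\, w_i \right)
     = (1 - \eta \lambda)\, w_i - \eta\, \frac{\partial E_0}{\partial w_i}
\end{align*}
```

Under these assumptions each weight is shrunk by the factor (1 - \eta\lambda) at every step (weight decay), so the penalty is proportional to the weight: large weights are reduced strongly and small weights only slightly, and they rarely reach exactly zero. With L1 regularisation the penalty gradient is \lambda\,\operatorname{sign}(w_i), a constant-size pull towards zero, which tends to drive many weights to exactly zero and so produces sparse solutions.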
Step: 3
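For part d), the classification rate is the sum of the diagonal of the confusion matrix divided by the sum of all its entries. The sketch below shows that computation together with per-class recall and precision, which are common alternatives when the classes are imbalanced. The function names and the matrix `example_cm` are hypothetical and chosen only for illustration; the counts are not the ones from the question. Rows are assumed to be actual classes and columns predicted classes.

```python
def classification_rate(cm):
    """Fraction of all samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_recall(cm):
    """Recall of class i = cm[i][i] / (sum of row i, the actual count)."""
    return [cm[i][i] / sum(cm[i]) if sum(cm[i]) else 0.0
            for i in range(len(cm))]


def per_class_precision(cm):
    """Precision of class c = cm[c][c] / (sum of column c, the predicted count)."""
    n = len(cm)
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]
    return [cm[c][c] / col_sums[c] if col_sums[c] else 0.0
            for c in range(n)]


def unweighted_average_recall(cm):
    """Mean of the per-class recalls; every class counts equally."""
    recalls = per_class_recall(cm)
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced 3-class matrix (NOT the matrix from the question).
example_cm = [
    [90, 6, 4],  # actual class 1: 100 samples
    [8, 2, 0],   # actual class 2: 10 samples
    [6, 0, 4],   # actual class 3: 10 samples
]

if __name__ == "__main__":
    print("classification rate:", classification_rate(example_cm))
    print("recall per class:", per_class_recall(example_cm))
    print("precision per class:", per_class_precision(example_cm))
    print("unweighted average recall:", unweighted_average_recall(example_cm))
```

In this hypothetical example the classification rate looks reasonable because the majority class dominates the total, while the recall of the two small classes is poor. That is the usual argument for reporting per-class recall and precision (or an F1 score, or the unweighted average recall) alongside the classification rate when class sizes differ widely.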