Question
For the training data set

  (x1, y1) = ([2, 1]ᵀ, 3),  (x2, y2) = ([1, 2]ᵀ, 2),  (x3, y3) = ([1, 0]ᵀ, 0),  (x4, y4) = ([0, 1]ᵀ, 1),

we apply the L2 loss as the training objective function:

  minimize  e2(ŵ) = (1/4) Σ_{p=1}^{4} (ŵᵀ x̂_p − y_p)²,

with initial design ŵ⁰ = [0, 0, 0]ᵀ.

a) Compute ŵ² by conducting 2 iterations of the Stochastic Gradient Descent algorithm with adaptive learning rate

  ε_k = (1 − k/K₁) ε₀ + (k/K₁) ε_{K₁},

where ε₀ = 0.1, ε_{K₁} = 0.01, K₁ = 10. In the first iteration, we randomly choose the 2 samples (x2, y2), (x4, y4) to construct the objective function. In the second iteration, we randomly choose the 2 samples (x1, y1), (x3, y3) to construct the objective function.

b) Compute ŵ² by conducting 1 epoch of Mini-Batch Optimization with batch size B = 2 and constant learning rate ε = 0.1.

(Hint: For both SGD and Mini-Batch Optimization, we regard the objective function in the k-th iteration as

  minimize  e2^(k)(ŵ) = (1/m) Σ_{p ∈ I_k} (ŵᵀ x̂_p − y_p)²,

where m is the number of samples involved in the k-th iteration, and I_k is the index set of those m samples.)
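Part (a) can be sketched numerically. This is a minimal sketch, not the graded solution: it assumes the augmentation x̂_p = [1, x_p] (bias-first), and that the first of the two updates uses k = 1 in the learning-rate schedule (the problem does not say whether the schedule starts at k = 0 or k = 1). The gradient of the per-iteration objective (1/m) Σ_{p∈I_k} (ŵᵀx̂_p − y_p)² is (2/m) Σ_{p∈I_k} (ŵᵀx̂_p − y_p) x̂_p.

```python
import numpy as np

# Augmented inputs xhat_p = [1, x_p] (bias-first convention is an assumption).
X = np.array([[1, 2, 1],   # xhat_1
              [1, 1, 2],   # xhat_2
              [1, 1, 0],   # xhat_3
              [1, 0, 1]],  # xhat_4
             dtype=float)
y = np.array([3.0, 2.0, 0.0, 1.0])

eps0, epsK1, K1 = 0.1, 0.01, 10

def lr(k):
    """Adaptive learning rate: eps_k = (1 - k/K1)*eps0 + (k/K1)*eps_{K1}."""
    return (1 - k / K1) * eps0 + (k / K1) * epsK1

def grad(w, idx):
    """Gradient of (1/m) * sum_{p in idx} (w^T xhat_p - y_p)^2."""
    m = len(idx)
    r = X[idx] @ w - y[idx]            # residuals w^T xhat_p - y_p
    return (2.0 / m) * (X[idx].T @ r)

w = np.zeros(3)                        # initial design w^0
# Iteration 1 uses (x2, y2), (x4, y4); iteration 2 uses (x1, y1), (x3, y3),
# as specified in the problem (indices are 0-based below).
for k, idx in [(1, [1, 3]), (2, [0, 2])]:
    w = w - lr(k) * grad(w, idx)

print(w)   # w^2 under the stated assumptions
```

Under these assumptions, ε₁ = 0.091 and ε₂ = 0.082; if the schedule is instead indexed from k = 0, the two rates become 0.1 and 0.091 and the final ŵ² changes accordingly.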
Step by Step Solution
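For part (b), one epoch of mini-batch optimization with B = 2 and constant ε = 0.1 can be sketched as below. The batch partition is an assumption: the problem does not fix the ordering, so this sketch takes the batches in sample order, {(x1, y1), (x2, y2)} then {(x3, y3), (x4, y4)}, and again assumes x̂_p = [1, x_p].

```python
import numpy as np

# Augmented inputs xhat_p = [1, x_p] (bias-first convention is an assumption).
X = np.array([[1, 2, 1],   # xhat_1
              [1, 1, 2],   # xhat_2
              [1, 1, 0],   # xhat_3
              [1, 0, 1]],  # xhat_4
             dtype=float)
y = np.array([3.0, 2.0, 0.0, 1.0])

eps = 0.1                  # constant learning rate
w = np.zeros(3)            # initial design w^0

# One epoch, batch size B = 2, batches taken in sample order (assumed).
for idx in ([0, 1], [2, 3]):
    m = len(idx)
    r = X[idx] @ w - y[idx]                 # residuals for this batch
    w = w - eps * (2.0 / m) * (X[idx].T @ r)

print(w)   # w^2 after one epoch under the stated assumptions
```

With this ordering, the first batch gives ŵ¹ = [0.5, 0.8, 0.7]ᵀ, and the second batch yields the printed ŵ²; a different batch ordering would give a different (equally valid) answer.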