
Question



SoftSVM optimization. In this question you will implement the SGD optimization method for SoftSVM to find a linear predictor with minimal empirical loss. The first dataset is stored in the file bg.txt. The file contains only the feature vectors, one per line. The first 100 lines correspond to positive instances (label +1) and the next 100 lines are negative instances (label -1). The data can be loaded into a matrix with the load function in Matlab/Octave.

(a) Implement SGD for SoftSVM. You may skip the averaging over weight vectors for the output and instead simply output the last iterate. During the optimization, keep track of both the empirical loss and the hinge loss of the current weight vector. Include a printout of your code. (2 marks)

(b) Run the optimization method with various values of the regularization parameter λ ∈ {100, 10, 1, .1, .01, .001} on the data (remember to first add an additional feature with value 1 to each datapoint, so that you are actually training a general linear classifier). For each value, plot the binary loss of the iterates and the hinge loss of the iterates (in separate plots). Include three plots of your choice where you observe distinct behavior (you may run the method several times for each parameter setting and choose). (4 marks)

(c) Discuss the plots. Are the curves monotone? Are they approximately monotone? Why or why not? How does the choice of λ affect the optimization? How would you go about finding a linear predictor of minimal binary loss? (4 marks)

(d) Download the "seeds" data set from the UCI repository: https://archive.ics.uci.edu/ml/datasets/seeds That data is also stored in a text file and can be loaded the same way. It contains 210 instances with three different labels (the last column in the file corresponds to the label).

(e) Train three binary linear predictors. w1 should separate the first class from the other two (i.e. the first 70 instances are labeled +1 and the next 140 instances -1), w2 should separate the second class from the other two, and w3 should separate the third class from the first two classes (i.e. for training w2, label the middle 70 instances positive and the rest negative, and analogously for w3). Report the binary loss that you achieve with w1, w2 and w3 for each of these tasks.

(f) Turn the three linear separators into a multi-class predictor for the three different classes in the seeds dataset using the following rule: y(x) = argmax_{i ∈ {1,2,3}} ⟨w_i, x⟩
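For reference, the objective that this SGD scheme minimizes is the standard regularized SoftSVM objective (supplied here for context; it is not quoted from the question):

    \min_{w}\; \lambda \lVert w \rVert^{2} + \frac{1}{m} \sum_{i=1}^{m} \max\{0,\; 1 - y_i \langle w, x_i \rangle\}

The sum is the empirical hinge loss; replacing each hinge term with the indicator of y_i ⟨w, x_i⟩ ≤ 0 gives the binary (0-1) empirical loss that part (a) asks you to track alongside it.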

Step by Step Solution

The solution proceeds in three steps.

Step: 1

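A minimal Octave/Matlab sketch of part (a) follows, using the standard Pegasos-style update w^(t) = θ^(t)/(λt), with θ accumulating hinge subgradients. As the question allows, no averaging is performed and the last iterate is returned; both losses are recorded at every iteration. The function name softsvm_sgd and its signature are illustrative choices, not taken from the source.

    % softsvm_sgd.m -- SGD for SoftSVM, returning the last iterate (no averaging).
    % X: m-by-d data matrix (bias feature already appended), y: m-by-1 labels in {+1,-1},
    % lambda: regularization parameter, T: number of iterations.
    function [w, bin_loss, hinge_loss] = softsvm_sgd(X, y, lambda, T)
      [m, d] = size(X);
      theta = zeros(d, 1);
      bin_loss = zeros(T, 1);
      hinge_loss = zeros(T, 1);
      for t = 1:T
        w = theta / (lambda * t);           % w^(t) = theta^(t) / (lambda * t)
        i = randi(m);                       % draw one training example uniformly at random
        if y(i) * (X(i,:) * w) < 1          % margin < 1: hinge subgradient is -y_i * x_i
          theta = theta + y(i) * X(i,:)';
        end
        margins = y .* (X * w);             % track both losses of the current iterate
        hinge_loss(t) = mean(max(0, 1 - margins));
        bin_loss(t) = mean(margins <= 0);   % binary (0-1) loss: misclassified fraction
      end
    end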


Step: 2

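A sketch of the part (b) experiment, reusing softsvm_sgd from Step 1; the iteration count T = 5000 is an arbitrary illustrative choice:

    % Load bg.txt, append the constant-1 feature, and sweep the regularization parameter.
    D = load('bg.txt');
    y = [ones(100, 1); -ones(100, 1)];     % first 100 lines are +1, next 100 are -1
    X = [D, ones(size(D, 1), 1)];          % extra feature fixed to 1 (general linear classifier)
    for lambda = [100 10 1 0.1 0.01 0.001]
      [w, bl, hl] = softsvm_sgd(X, y, lambda, 5000);
      figure; plot(bl); xlabel('iteration t'); ylabel('binary loss');
      title(sprintf('binary loss of iterates, lambda = %g', lambda));
      figure; plot(hl); xlabel('iteration t'); ylabel('hinge loss');
      title(sprintf('hinge loss of iterates, lambda = %g', lambda));
    end

On part (c), two general points hold for runs of this kind: the curves are typically not monotone, because each update is driven by a single randomly drawn example and can move the iterate away from the minimizer of the average loss (the decaying step size 1/(λt) dampens, but never removes, this noise); and λ sets both the step size and the strength of shrinkage, so large λ keeps w small while very small λ produces large, noisy steps. One practical way to obtain a predictor of small binary loss is to keep the iterate with the lowest binary loss seen across runs rather than the last iterate.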

Step: 3

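A sketch of parts (e) and (f): three one-vs-rest predictors followed by the argmax rule. The file name seeds_dataset.txt matches the UCI download but may need adjusting to your local copy; λ = 1 and T = 5000 are arbitrary choices. The ±1 labels are derived from the label column, which is equivalent to the first/middle/last-70 labeling in part (e) because the file is ordered by class.

    % One-vs-rest training on the seeds data, then multi-class prediction via argmax.
    S = load('seeds_dataset.txt');
    X = [S(:, 1:end-1), ones(size(S, 1), 1)];  % features plus constant-1 feature
    labels = S(:, end);                        % last column is the class label (1, 2, or 3)
    W = zeros(size(X, 2), 3);
    for c = 1:3
      yc = 2 * (labels == c) - 1;              % +1 for class c, -1 for the other two
      W(:, c) = softsvm_sgd(X, yc, 1, 5000);
      fprintf('binary loss of w%d: %g\n', c, mean(sign(X * W(:, c)) ~= yc));
    end
    [~, yhat] = max(X * W, [], 2);             % y(x) = argmax_{i in {1,2,3}} <w_i, x>
    fprintf('multi-class binary loss: %g\n', mean(yhat ~= labels));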

