Problem 2. In question 1, we explicitly constructed the feature map and found the corresponding kernel to help classify the instances using a linear separator in the feature space. However, in most cases it is hard to manually construct the desired feature map, and the dimensionality of the feature space can be very high, even infinite, which makes explicit computation in the feature space infeasible in practice. In this question we will develop the dual of the primal optimization problem to avoid working in the feature space explicitly.

Suppose we have a sample set $S = \{(x_1, y_1), \dots, (x_n, y_n)\}$ with labels $y_i \in \{-1, +1\}$ and a feature map $\phi$. The soft-margin SVM primal problem

\[
\begin{aligned}
\min_{w,\, b,\, s} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} s_i \\
\text{s.t.} \quad & y_i\left(w^{\top}\phi(x_i) + b\right) \ge 1 - s_i, \quad \forall i = 1, \dots, n, \\
& s_i \ge 0, \quad \forall i = 1, \dots, n,
\end{aligned}
\]

is equivalent to the following dual optimization:

\[
\begin{aligned}
\max_{\alpha} \quad & \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \,\phi(x_i)^{\top}\phi(x_j) \\
\text{s.t.} \quad & 0 \le \alpha_i \le C, \quad \forall i = 1, \dots, n, \\
& \sum_{i=1}^{n} \alpha_i y_i = 0.
\end{aligned}
\]

Recall from the lecture notes that $s_1, \dots, s_n$ are called slack variables. The optimal slack variables have an intuitive geometric interpretation, as shown in Fig. 3. Basically, when $s_i = 0$, the corresponding feature vector $\phi(x_i)$ is correctly classified and lies either on the margin of the separator or on the correct side of the margin. A feature vector with $0 < s_i \le 1$ lies within the margin but is still correctly classified. When $s_i > 1$, the corresponding feature vector is misclassified. Support vectors correspond to the instances with $s_i > 0$ or instances that lie on the margin. The optimal vector $w^*$ can be represented in terms of $\alpha_i^*$, $i = 1, \dots, n$, as

\[
w^* = \sum_{i=1}^{n} \alpha_i^* y_i \,\phi(x_i).
\]

a) Suppose the optimal $\alpha^*$, $s^*$ have been computed. Use the $s_i^*$, $i = 1, \dots, n$, to obtain an upper bound on the number of misclassified instances. (10 points)

b) In the primal optimization of the SVM, what is the role of the coefficient $C$? Briefly explain your answer by considering two extreme cases, i.e., $C \to 0$ and $C \to \infty$. (10 points)

c) Explain how to use the kernel trick to avoid the explicit computation of the feature vectors $\phi(x_i)$. Also, given a new instance $x$, how can we make a prediction on the instance without explicitly computing the feature vector $\phi(x)$? (10 points)
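To illustrate why the dual depends on the data only through inner products $\phi(x_i)^{\top}\phi(x_j)$, the following minimal Python sketch evaluates the dual objective for a given $\alpha$ using a Gram matrix of kernel values; no feature vectors are ever formed. The data, the RBF kernel choice, the helper `rbf_kernel`, and the values in `alpha` are illustrative assumptions, not part of the problem statement.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=0.5):
    # Illustrative kernel: K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2) = phi(X[i])^T phi(Z[j])
    sq_dist = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2 * X @ Z.T
    return np.exp(-gamma * sq_dist)

# Hypothetical sample set S = {(x_i, y_i)} and dual variables alpha (made up for illustration).
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [3.0, 3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.3, 0.7, 0.5, 0.5])   # satisfies 0 <= alpha_i <= C (for C >= 0.7) and sum_i alpha_i y_i = 0

# Dual objective: sum_i alpha_i - 1/2 * sum_i sum_j alpha_i alpha_j y_i y_j K(x_i, x_j),
# computed entirely from kernel values.
K = rbf_kernel(X, X)                      # n x n Gram matrix of feature-space inner products
dual_objective = alpha.sum() - 0.5 * (alpha * y) @ K @ (alpha * y)
print(dual_objective)
```

The point of the sketch is that the objective and constraints reference $\phi$ only through the $n \times n$ Gram matrix, which is why the dual formulation remains tractable even when the feature space is very high-dimensional or infinite-dimensional.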