Answered step by step
Verified Expert Solution
Question
1 Approved Answer
in python with pyGAM package please In this question, we will do a binary classification with multivariate input data. To handle the multivariate nature, we
in python with pyGAM package please
In this question, we will do a binary classification with multivariate input data. To handle the multivariate nature, we will use a generalized additive model. Let XRp represent the input random variable and Y represent the output random variable for Binary classification (note we let Y{0,1} instead of Y{1,1} which we typically did in class, as PyGAM package follows that convention). Let the conditional distributions be as follows: (a) For even j, the jth-coordinate of X is distributed as Xj(Y=1) is a t-distribution with 1 degree of freedom with mean 2 Xj(Y=0) is a t - distribution with 1 degree of freedom with mean 0 . (b) For odd j, the jth-coordinate of X is distributed as Xj(Y=1) is an exponential distribution with =1. Xj](Y=0) is an exponential distribution with =3. and let P(Y=1)=0.5. Details about t-distribution and exponential distribution could be found in the wikipedia links here and here, respectively. You could use np. random.standard.t, numpy.random. exponent ial and np. random.binomial for this question. (a) Let p=10. Repeat the following procedure for 100 trails: Generate n=100 training data samples (x1,y1),,(x100,y100) from the above model. Note that here each xiRp, and for all i,xij represents the jth co-ordinate of the ith training sample, which follows the above generating process. Train a logistic generalized additive model classifier on this training data (you could use LogisticGAM from the pyGAM package). Generate n=100 testing data from the same model. Note that you will know the true labels in this testing data as you generated it. Plot a box-plot of the test error. What is the mean and variance of the test errors? (Here, for each trail, the test error is defined as the number of misclassified samples on the testing data. Also, when running LogisticGAM command, there might be warnings on non-convergence; please feel free to ignore such warnings. Finally, this experiment might take sometime to run (about 10 minutes on a reasonable laptop)) 1 (b) Repeat the above procedure with p=30. Comment on the running time and test error differences form the previous case Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started