Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

in python with pyGAM package please In this question, we will do a binary classification with multivariate input data. To handle the multivariate nature, we

in python with pyGAM package please
image text in transcribed
In this question, we will do a binary classification with multivariate input data. To handle the multivariate nature, we will use a generalized additive model. Let XRp represent the input random variable and Y represent the output random variable for Binary classification (note we let Y{0,1} instead of Y{1,1} which we typically did in class, as PyGAM package follows that convention). Let the conditional distributions be as follows: (a) For even j, the jth-coordinate of X is distributed as Xj(Y=1) is a t-distribution with 1 degree of freedom with mean 2 Xj(Y=0) is a t - distribution with 1 degree of freedom with mean 0 . (b) For odd j, the jth-coordinate of X is distributed as Xj(Y=1) is an exponential distribution with =1. Xj](Y=0) is an exponential distribution with =3. and let P(Y=1)=0.5. Details about t-distribution and exponential distribution could be found in the wikipedia links here and here, respectively. You could use np. random.standard.t, numpy.random. exponent ial and np. random.binomial for this question. (a) Let p=10. Repeat the following procedure for 100 trails: Generate n=100 training data samples (x1,y1),,(x100,y100) from the above model. Note that here each xiRp, and for all i,xij represents the jth co-ordinate of the ith training sample, which follows the above generating process. Train a logistic generalized additive model classifier on this training data (you could use LogisticGAM from the pyGAM package). Generate n=100 testing data from the same model. Note that you will know the true labels in this testing data as you generated it. Plot a box-plot of the test error. What is the mean and variance of the test errors? (Here, for each trail, the test error is defined as the number of misclassified samples on the testing data. Also, when running LogisticGAM command, there might be warnings on non-convergence; please feel free to ignore such warnings. Finally, this experiment might take sometime to run (about 10 minutes on a reasonable laptop)) 1 (b) Repeat the above procedure with p=30. Comment on the running time and test error differences form the previous case

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Concepts

Authors: David Kroenke, David J. Auer

3rd Edition

0131986252, 978-0131986251

More Books

Students also viewed these Databases questions

Question

2. How do they influence my actions?

Answered: 1 week ago