This problem comes from Faraway's Extending the Linear Model with R.
2. (Exercise 2.2 on page 52 of ELMR1; related to Exercise 2.2 on page LIE4? in ELMR2.) The dataset wbca comes from a study of breast cancer in Wisconsin. There are 681 cases of potentially cancerous tumors, of which 238 are actually malignant. Determining whether a tumor is really malignant is traditionally determined by an invasive surgical procedure. The purpose of this study was to determine whether a new procedure, called fine needle aspiration, which draws only a small sample of tissue, could be effective in determining tumor status.

[a] Fit the binomial regression with Class as the response and the other nine variables as predictors. Report the residual deviance and associated degrees of freedom. Can this information be used to determine if this model fits the data? Explain.

[b] Use AIC as the criterion to determine the best subset of variables, via the step() function.

[c] Use the reduced model to predict the outcome for a new patient with predictor variables $x = [1,1,3,2,1,1,4,1,1]^T$, in the same order as in the data set. Give a confidence interval for your prediction.

[d] Suppose that cancer is classified as benign if $p > 0.5$ and malignant if $p < 0.5$; here $p$ denotes the fitted value, or probability, from the fit of the reduced model. Compute the number of errors of both types that will be made if this method is applied to the current data.

[e] Suppose we change the cutoff from 0.5 to 0.9, so that $p > 0.9$ is benign and $p < 0.9$ is malignant. Compute the number of errors in this case. Discuss the difficulty in determining the cutoff.

[f] It is usually misleading to use the same data to fit a model and test its predictive ability. To investigate this, split the data into two parts: assign every third observation to a test set and the remaining two thirds to a training set. Use the training set to determine the model and the test set to assess its predictive performance. Compare the results to those obtained in the previous parts.
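For orientation, here is a minimal R sketch of how the steps of the exercise could be set up. It is one possible setup, not a worked solution. It assumes the faraway package is installed and supplies the wbca data frame with Class coded 1 = benign and 0 = malignant, and with predictor columns in the order Adhes, BNucl, Chrom, Epith, Mitos, NNucl, Thick, UShap, USize. The helper count_errors, the 95% normal-approximation interval, and the choice of rows 3, 6, 9, ... as "every third observation" are illustrative assumptions, not something the exercise prescribes.

```r
# Assumption: the 'faraway' package is installed and provides 'wbca'.
library(faraway)
data(wbca)

# [a] Full binomial (logistic) regression on all nine predictors.
full <- glm(Class ~ ., family = binomial, data = wbca)
deviance(full)      # residual deviance
df.residual(full)   # associated degrees of freedom

# [b] AIC-based selection; with no scope given, step() searches backward.
reduced <- step(full, trace = 0)

# [c] New patient, values assigned in the assumed column order of the data set.
newx <- data.frame(Adhes = 1, BNucl = 1, Chrom = 3, Epith = 2, Mitos = 1,
                   NNucl = 1, Thick = 4, UShap = 1, USize = 1)
pr <- predict(reduced, newdata = newx, se.fit = TRUE)  # link (logit) scale
# Approximate 95% interval on the logit scale, mapped back to probabilities.
ci <- plogis(pr$fit + c(lower = -1.96, estimate = 0, upper = 1.96) * pr$se.fit)
ci

# [d], [e] Hypothetical helper: count both error types at a given cutoff.
count_errors <- function(model, data, cutoff) {
  p <- predict(model, newdata = data, type = "response")  # P(benign)
  called_benign <- p > cutoff
  c(malignant_called_benign = sum(data$Class == 0 & called_benign),
    benign_called_malignant = sum(data$Class == 1 & !called_benign))
}
count_errors(reduced, wbca, 0.5)
count_errors(reduced, wbca, 0.9)

# [f] Every third observation goes to the test set (assumed: rows 3, 6, 9, ...).
test_idx <- seq(3, nrow(wbca), by = 3)
train <- wbca[-test_idx, ]
test  <- wbca[test_idx, ]
reduced_train <- step(glm(Class ~ ., family = binomial, data = train), trace = 0)
count_errors(reduced_train, test, 0.5)
count_errors(reduced_train, test, 0.9)
```

The interval in [c] is built on the logit scale and then back-transformed so that its endpoints stay between 0 and 1. glm() may warn about fitted probabilities numerically equal to 0 or 1 on these data; that is a warning rather than an error.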