Question
Suppose you are developing a classification procedure with 1,200 candidate predictors (features) and only 200 observations of the class labels. To reduce the number of candidate predictors and focus on the more promising ones, you select the 200 predictors with the largest absolute correlation with the observed class labels. You then fit various classification models using this subset of highly correlated predictors, and you would like to select the one model that is expected to perform best on a test sample. You decide to use 10-fold cross-validation to estimate the test set performance of the candidate models on the subset of highly correlated predictors. Do you expect these 10-fold cross-validation estimates to be valid? Explain your answer.
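One way to probe the question empirically is a small simulation. The sketch below is not part of the original problem and rests on several assumptions made only for illustration: the features are pure Gaussian noise unrelated to the labels (so any honest error estimate should sit near 50%), the classifier is a simple nearest-centroid rule, and the helper names (abs_corr, screen, centroid_errors, cv_error) are hypothetical. It compares screening the predictors once on all 200 observations before cross-validation, as described in the question, with repeating the screening inside each training fold. Because the first approach lets the held-out labels influence which predictors are kept, its error estimate tends to be optimistic.

```python
import random

def abs_corr(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def screen(X, y, rows, k):
    """Indices of the k features most correlated with y, using only `rows`."""
    ys = [y[i] for i in rows]
    scores = [(abs_corr([X[i][j] for i in rows], ys), j) for j in range(len(X[0]))]
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def centroid_errors(X, y, train, test, feats):
    """Fit a nearest-centroid rule on `train`; count misclassified `test` rows."""
    cents = {}
    for c in (0, 1):
        members = [i for i in train if y[i] == c]
        cents[c] = [sum(X[i][j] for i in members) / len(members) for j in feats]
    errors = 0
    for i in test:
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in (0, 1)}
        pred = 0 if dist[0] <= dist[1] else 1
        errors += pred != y[i]
    return errors

def cv_error(X, y, k, n_folds=10, screen_inside=True, seed=0):
    """10-fold CV error rate, screening inside each fold or once up front."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    folds = [order[f::n_folds] for f in range(n_folds)]
    # Screening on all rows uses the held-out labels too (the leaky variant).
    global_feats = None if screen_inside else screen(X, y, order, k)
    errors = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in order if i not in held_out]
        feats = screen(X, y, train, k) if screen_inside else global_feats
        errors += centroid_errors(X, y, train, test, feats)
    return errors / len(y)

# Assumed setup: noise features and balanced labels with no real signal.
rng = random.Random(1)
n, p, k = 200, 1200, 200
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]
rng.shuffle(y)

wrong_err = cv_error(X, y, k, screen_inside=False)  # screen first, then CV
right_err = cv_error(X, y, k, screen_inside=True)   # screen within each fold
print(f"screening before CV: {wrong_err:.3f}")
print(f"screening inside CV: {right_err:.3f}")
```

Since the labels here carry no information about the features, a gap between the two printed error rates reflects the selection step alone, not any real predictive signal. The same comparison could be run with any of the candidate classifiers in place of the nearest-centroid rule.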