Question
Problem 3: varying counts of predictors in R
For all possible pairwise combinations of the number of variables associated with the outcome (`nClassVars=2` and `5`) and the number of variables not associated with the outcome (`nNoiseVars=1`, `3` and `10`) -- six pairwise combinations in total -- obtain and present graphically the test errors from random forest, LDA and KNN. Choose the signal magnitude (`deltaClass`) and the training data sample size so that the simulation yields non-trivial results, that is, noticeable variability in the error rates across those six combinations of attribute counts.
Describe the results: what is the impact of increasing the number of attributes associated with the outcome on classifier performance? What about the number of attributes not associated with the outcome: does it affect the classifier error rate? Are the different classification methods affected by these simulation parameters in a similar way?
The example below illustrates the main ideas on a three-dimensional dataset in which two of the three attributes are associated with the outcome:
# How many observations:
nObs <- 1000
# How many predictors are associated with outcome:
nClassVars <- 2
# How many predictors are not:
nNoiseVars <- 1
# To modulate average difference between two classes' predictor values:
deltaClass <- 1
# Simulate training and test datasets with an interaction
# between attribute levels associated with the outcome:
xyzTrain <- matrix(rnorm(nObs*(nClassVars+nNoiseVars)),nrow=nObs,ncol=nClassVars+nNoiseVars)
xyzTest <- matrix(rnorm(10*nObs*(nClassVars+nNoiseVars)),nrow=10*nObs,ncol=nClassVars+nNoiseVars)
classTrain <- 1
classTest <- 1
for ( iTmp in 1:nClassVars ) {
  deltaTrain <- sample(deltaClass*c(-1,1),nObs,replace=TRUE)
  xyzTrain[,iTmp] <- xyzTrain[,iTmp] + deltaTrain
  classTrain <- classTrain * deltaTrain
  deltaTest <- sample(deltaClass*c(-1,1),10*nObs,replace=TRUE)
  xyzTest[,iTmp] <- xyzTest[,iTmp] + deltaTest
  classTest <- classTest * deltaTest
}
classTrain <- factor(classTrain > 0)
table(classTrain)
# plot resulting attribute levels colored by outcome:
pairs(xyzTrain,col=as.numeric(classTrain))
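One possible way to extend the example above to the six combinations is sketched below. It is only a rough outline under stated assumptions, not a reference solution: the helper names `simulateData` and `testErrors`, the training size `nObs = 500`, `deltaClass = 1` and `k = 5` for KNN are all illustrative choices that do not come from the problem statement and would need tuning. The sketch uses `lda()` from MASS, `knn()` from class and, if the package is installed, `randomForest()` from randomForest.

```r
library(MASS)   # lda()
library(class)  # knn()

# Hypothetical helper: simulate one dataset the same way as the example above.
simulateData <- function(nObs, nClassVars, nNoiseVars, deltaClass) {
  x <- matrix(rnorm(nObs * (nClassVars + nNoiseVars)), nrow = nObs)
  cls <- rep(1, nObs)
  for (i in seq_len(nClassVars)) {
    delta <- sample(deltaClass * c(-1, 1), nObs, replace = TRUE)
    x[, i] <- x[, i] + delta
    cls <- cls * delta
  }
  # Fix the level set so predicted and true classes are always comparable.
  list(x = x, y = factor(cls > 0, levels = c(FALSE, TRUE)))
}

# Hypothetical helper: test error of each classifier for one parameter setting.
# nObs, deltaClass and k are assumed defaults, not values from the assignment.
testErrors <- function(nClassVars, nNoiseVars, nObs = 500, deltaClass = 1, k = 5) {
  train <- simulateData(nObs, nClassVars, nNoiseVars, deltaClass)
  test  <- simulateData(10 * nObs, nClassVars, nNoiseVars, deltaClass)
  ldaFit <- lda(train$x, grouping = train$y)
  err <- c(
    LDA = mean(predict(ldaFit, test$x)$class != test$y),
    KNN = mean(knn(train$x, test$x, cl = train$y, k = k) != test$y)
  )
  # Random forest is fitted only when the randomForest package is available.
  if (requireNamespace("randomForest", quietly = TRUE)) {
    rfFit <- randomForest::randomForest(train$x, train$y)
    err["RF"] <- mean(predict(rfFit, newdata = test$x) != test$y)
  }
  err
}

set.seed(1)
# The six pairwise combinations of signal and noise attribute counts.
grid <- expand.grid(nClassVars = c(2, 5), nNoiseVars = c(1, 3, 10))
errs <- t(mapply(testErrors, grid$nClassVars, grid$nNoiseVars))
res <- cbind(grid, errs)
print(res)

# Grouped bar chart: one group per combination, one bar per classifier.
barplot(t(errs), beside = TRUE, legend.text = TRUE,
        names.arg = paste(grid$nClassVars, grid$nNoiseVars, sep = "/"),
        xlab = "nClassVars / nNoiseVars", ylab = "test error")
```

A single simulated dataset per combination gives a noisy estimate, so repeating each setting several times and plotting the spread (for example with `boxplot`) would probably be more informative. Note also that the class label here is the sign of a product of shifts, so the per-class mean of each attribute is the same in both classes; a linear method such as LDA would therefore be expected to stay close to chance regardless of `deltaClass`.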