Question
In this problem, you will develop a model to predict whether a person in the US Census earns more than $50K or not. Consider Income as the target variable and include Age, MaritalStatus, Race, Sex, and WeeklyHours as predictors. We use the Census dataset for this. Use a QDA model. Use the previously created folds for the cross-validation on the training set.
Calculate and show the confusion matrix for both the training and the test set. What is the performance with respect to sensitivity, specificity, and AUC? Create and print the ROC curves. I am using five-fold cross-validation.

qda_model <-
  # specify that the model is a quadratic discriminant analysis
  discrim_quad() %>%
  # note: there are several potential engines for QDA, here we just use the default one
  set_engine("MASS") %>%
  # select the binary classification mode
  set_mode("classification")

# then, let's put everything into a workflow
qda_workflow <- workflow() %>%
  # add the recipe (data preprocessing)
  add_recipe(model_recipe) %>%
  # add the ML model
  add_model(qda_model)

set.seed(...)
control <- control_resamples(save_pred = TRUE,
                             event_level = "second")

qda_fit <-
  qda_workflow %>%
  fit(data = data_train)

# investigate the result
qda_fit

# to get the evaluation metrics for the test data:
qda_final_fit <-
  qda_workflow %>%
  last_fit(data_split) # with the fit function, we train the model on the training data
# note that we use the test data here!

test_predictions_qda <-
  qda_final_fit %>%
  augment()

test_predictions_qda$Income <- as.factor(test_predictions_qda$Income)

# note: you need to select the truth and estimate variables based on the column names of the test object
classification_metrics(data = test_predictions_qda,
                       truth = Income,
                       estimate = .pred_class,
                       `.pred_>50K`, # use the second outcome (Yes) as the level of interest
                       event_level = 'second') # note: the "second" indicates that we use the second class (AHD Yes) as the level of interest

# finally, let's create the confusion matrix and ROC curve
confusionMatrix(data = test_predictions_qda$.pred_class,
                reference = test_predictions_qda[[target_var]],
                positive = positive_class)

two_class_curve_test_qda <- roc_curve(data = test_predictions_qda,
                                      truth = Income,
                                      `.pred_>50K`,
                                      event_level = 'second')

autoplot(two_class_curve_test_qda)
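The posted code builds a `control` object but never passes it to a resampling call, so the five folds are not actually used. Below is a minimal sketch of how the cross-validation step could look with `fit_resamples()`. It runs on a small synthetic data set so that it is self-contained: `toy`, `toy_train`, `toy_folds`, `toy_workflow`, and `toy_metrics` are hypothetical stand-ins for the Census data, `data_train`, the previously created folds, `qda_workflow`, and `classification_metrics`, and the seed is arbitrary.

```r
library(tidymodels)
library(discrim) # provides discrim_quad()

# Synthetic stand-in for the Census data (assumption: two numeric predictors only)
set.seed(1) # arbitrary seed for the toy example
n <- 400
toy <- tibble(
  Age = c(rnorm(n / 2, 35, 8), rnorm(n / 2, 48, 12)),
  WeeklyHours = c(rnorm(n / 2, 36, 6), rnorm(n / 2, 45, 10)),
  Income = factor(rep(c("<=50K", ">50K"), each = n / 2),
                  levels = c("<=50K", ">50K"))
)

toy_split <- initial_split(toy, prop = 0.8, strata = Income)
toy_train <- training(toy_split)

# Five folds created on the training set only
toy_folds <- vfold_cv(toy_train, v = 5, strata = Income)

toy_workflow <- workflow() %>%
  add_formula(Income ~ Age + WeeklyHours) %>%
  add_model(discrim_quad() %>% set_engine("MASS") %>% set_mode("classification"))

toy_metrics <- metric_set(sens, spec, roc_auc)

# Fit the workflow on each fold and keep the out-of-fold predictions
cv_results <- fit_resamples(
  toy_workflow,
  resamples = toy_folds,
  metrics = toy_metrics,
  control = control_resamples(save_pred = TRUE, event_level = "second")
)

# Sensitivity, specificity, and AUC averaged over the five folds
collect_metrics(cv_results)

# Confusion matrix built from the out-of-fold predictions
cv_predictions <- collect_predictions(cv_results)
conf_mat(cv_predictions, truth = Income, estimate = .pred_class)
```

With the objects from the question, the call would presumably be `fit_resamples(qda_workflow, resamples = <your folds>, metrics = classification_metrics, control = control)`, where the folds object is whatever was created earlier.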
How would I create the confusion matrix for the training set?
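One possible approach is to call `augment()` on the fitted workflow with the training data as `new_data`, then pass the result to `conf_mat()` from yardstick. The sketch below shows this on synthetic data so it runs on its own. The names `toy`, `toy_train`, `toy_test`, and `toy_fit` are hypothetical stand-ins for the Census data, `data_train`, the test set, and `qda_fit`, and the `.pred_>50K` column name assumes the Income levels are `<=50K` and `>50K`.

```r
library(tidymodels)
library(discrim) # provides discrim_quad()

# Synthetic stand-in for the Census data (assumption: two numeric predictors only)
set.seed(1) # arbitrary seed for the toy example
n <- 400
toy <- tibble(
  Age = c(rnorm(n / 2, 35, 8), rnorm(n / 2, 48, 12)),
  WeeklyHours = c(rnorm(n / 2, 36, 6), rnorm(n / 2, 45, 10)),
  Income = factor(rep(c("<=50K", ">50K"), each = n / 2),
                  levels = c("<=50K", ">50K"))
)

toy_split <- initial_split(toy, prop = 0.8, strata = Income)
toy_train <- training(toy_split)
toy_test <- testing(toy_split)

toy_fit <- workflow() %>%
  add_formula(Income ~ Age + WeeklyHours) %>%
  add_model(discrim_quad() %>% set_engine("MASS") %>% set_mode("classification")) %>%
  fit(data = toy_train)

# Predict on the training data itself and tabulate truth vs. prediction
train_predictions <- augment(toy_fit, new_data = toy_train)
train_conf_mat <- conf_mat(train_predictions, truth = Income, estimate = .pred_class)
train_conf_mat

# Same steps for the test set, for comparison
test_predictions <- augment(toy_fit, new_data = toy_test)
conf_mat(test_predictions, truth = Income, estimate = .pred_class)

# Sensitivity, specificity, and AUC on the training set,
# treating the second level (">50K") as the event of interest
train_metrics <- metric_set(sens, spec, roc_auc)
train_metrics(train_predictions,
              truth = Income,
              estimate = .pred_class,
              `.pred_>50K`,
              event_level = "second")
```

Applied to the question's objects, this would likely reduce to `augment(qda_fit, new_data = data_train) %>% conf_mat(truth = Income, estimate = .pred_class)`. Training-set numbers tend to look better than test-set numbers because the model has already seen those rows.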