Question
You are provided with a dataset containing some medical history information for 750 patients who might be at risk of cancer. Each patient in our dataset has been biopsied to obtain a direct ground-truth label, so we know each patient's actual cancer status (a binary variable indicating whether the patient has cancer or does not). We want to build classifiers that predict whether a patient likely has cancer from easier-to-get information, so that painful biopsies can be avoided unless they are necessary.
It is known that older patients with a family history of cancer have a higher probability of harboring cancer, so we can use the Age and Famhistory variables in the data as inputs to predict cancer status. A clinical chemist has recently discovered a real-valued biomarker, called Marker in the data file, that she believes can distinguish between patients with and without cancer. We wish to assess whether the new marker does indeed identify patients with and without cancer well.
You are tasked to build and assess the performance of the following classification models:
Decision Tree with maximum depth of [value missing]
Decision Tree with maximum depth of [value missing]
Naive Bayes
K-Nearest Neighbors with [value missing] neighbors
K-Nearest Neighbors with [value missing] neighbors
Support Vector Machines with polynomial kernel
Support Vector Machines with radial basis function kernel
You need to build the above models for each of the following two cases. For each case/model combination, report the model accuracy. Use a test size of [value missing] and set the random state for data splitting to [value missing]. Report your answers in the table below.
First case: Predict cancer status through Age and Family history only.
Second case: Predict cancer status through Age, Family history and the biomarker.
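One way to set up this comparison is sketched below with scikit-learn. It is only a sketch under several assumptions: the tree depths, neighbor counts, test size and random state are not given in the question text, so `DEPTHS`, `NEIGHBORS`, `TEST_SIZE` and `RANDOM_STATE` are placeholder values; Gaussian Naive Bayes is assumed as the Naive Bayes variant; and because the data file is not available here, `synthetic_patients` generates invented stand-in data so that the code runs. Replace that function with the real Age, Famhistory, Marker and cancer-status columns.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder settings (assumptions): the question text does not give these values.
TEST_SIZE = 0.3
RANDOM_STATE = 0
DEPTHS = (2, 5)
NEIGHBORS = (3, 7)


def make_models():
    """Return fresh instances of the seven classifiers named in the task."""
    models = {}
    for depth in DEPTHS:
        models[f"Decision Tree (max_depth={depth})"] = DecisionTreeClassifier(
            max_depth=depth, random_state=RANDOM_STATE
        )
    models["Naive Bayes"] = GaussianNB()  # Gaussian variant is an assumption
    for k in NEIGHBORS:
        models[f"KNN (n_neighbors={k})"] = KNeighborsClassifier(n_neighbors=k)
    models["SVM (polynomial kernel)"] = SVC(kernel="poly")
    models["SVM (RBF kernel)"] = SVC(kernel="rbf")
    return models


def evaluate(X, y):
    """Split once, fit every model on the training part, score on the test part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=TEST_SIZE, random_state=RANDOM_STATE
    )
    scores = {}
    for name, model in make_models().items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


def synthetic_patients(n=750, seed=0):
    """Invented stand-in data, NOT the real patient file."""
    rng = np.random.default_rng(seed)
    age = rng.integers(30, 86, size=n).astype(float)
    fam = rng.integers(0, 2, size=n).astype(float)
    # Made-up relationship: risk rises with age and family history.
    p = 1.0 / (1.0 + np.exp(-(0.04 * (age - 60.0) + 0.8 * fam - 0.4)))
    cancer = (rng.random(n) < p).astype(int)
    # Made-up marker: shifted upward for patients with cancer.
    marker = rng.normal(loc=1.5 * cancer, scale=1.0)
    return age, fam, marker, cancer


age, fam, marker, cancer = synthetic_patients()
cases = {
    "Age & Famhistory": np.column_stack([age, fam]),
    "Age, Famhistory & Marker": np.column_stack([age, fam, marker]),
}
results = {case: evaluate(X, cancer) for case, X in cases.items()}

# One row per model, one accuracy column per case.
for model_name in make_models():
    row = "  ".join(f"{results[case][model_name]:.3f}" for case in cases)
    print(f"{model_name:<32} {row}")
```

Using the same `TEST_SIZE` and `RANDOM_STATE` for both cases means every model sees the same train/test rows with and without Marker, so any accuracy difference comes from the added feature and not from a different split. The numbers printed from the synthetic data say nothing about the real dataset.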
| Model | Model Accuracy: Age & Famhistory | Model Accuracy: Age, Famhistory & Marker |
| --- | --- | --- |
| Decision Tree with maximum depth of [value missing] | | |
| Decision Tree with maximum depth of [value missing] | | |
| Naive Bayes | | |
| K-Nearest Neighbors with [value missing] neighbors | | |
| K-Nearest Neighbors with [value missing] neighbors | | |
| Support Vector Machines with polynomial kernel | | |
| Support Vector Machines with radial basis function kernel | | |
Based on your analysis above, do you think the variable Marker is important in predicting the cancer status? Why?
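To reason about this question, it can help to look at the per-model change in accuracy between the two cases. The helper below is a minimal sketch of that comparison; the function names `marker_gain` and `summarize` are hypothetical, and the accuracies passed in are made-up placeholders that only show the call, not results from the patient dataset.

```python
def marker_gain(acc_without, acc_with):
    """Per-model accuracy change when Marker is added to Age and Famhistory."""
    return {name: acc_with[name] - acc_without[name] for name in acc_without}


def summarize(gains):
    """Return how many models improved and the mean accuracy change."""
    improved = sum(1 for g in gains.values() if g > 0)
    mean_gain = sum(gains.values()) / len(gains)
    return improved, mean_gain


# Made-up placeholder accuracies, NOT results from the patient dataset.
without_marker = {"Naive Bayes": 0.70, "SVM (RBF kernel)": 0.68}
with_marker = {"Naive Bayes": 0.82, "SVM (RBF kernel)": 0.80}

gains = marker_gain(without_marker, with_marker)
improved, mean_gain = summarize(gains)
print(f"{improved} of {len(gains)} models improved; mean change {mean_gain:+.3f}")
```

If accuracy rises consistently across most of the seven models once Marker is included, that supports the claim that Marker carries predictive information beyond Age and Famhistory. A change in only one or two models, or a very small change, is weaker evidence because a single train/test split is noisy.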