Question

Please use R Programming for this question.

Data for Question 1 (breast_cancer_updated data): https://drive.google.com/file/d/1m-zZI1nGd5qBFd5FymhgzxLLhFshwXBW/view?usp=sharing

Question 1: For this problem, you will load the breast_cancer_updated.csv data and perform a straightforward training and evaluation of the Decision Tree algorithm.

a. As a preprocessing step, remove the IDNumber column and exclude rows with NA in the dataset.
b. Apply the Decision Tree algorithm (use rpart) to the data to predict breast cancer and report the accuracy using 10-fold cross validation.
c. Visualize the decision tree.
d. Generate the confusion matrix and comment on it. How does the accuracy from the confusion matrix here compare with the accuracy in b)? Are they the same or different?
e. Generate rules for the decision tree using IF-THEN statements.

Question 2: In this problem you will generate decision trees with a set of parameters. You will be using the storms (Storm tracking data) data, which is part of the dplyr library. This data is a subset of the NOAA Atlantic hurricane database best track data, https://www.nhc.noaa.gov/data/#hurdat. The data includes the positions and attributes of 198 tropical storms, measured every six hours during the lifetime of a storm. As a preprocessing step, view the data and make sure the target variable, which is a string, is converted to a factor.

a. Build a decision tree using the following hyperparameters: maxdepth = 2, minsplit = 5 and minbucket = 3. Make sure you don't pretune your tree using the cp parameter. Be careful to use the right method of training so that you are not automatically tuning the cp parameter, but are controlling the aforementioned parameters specifically. Use cross validation to report your accuracy score. These parameters will result in a relatively small tree.
b. To see how this performed with respect to classes, and whether that differs on train versus test, create a partition, train on the training set and create a confusion matrix for both the train and test partitions. Compare the confusion matrices and report which classes it has problems classifying. Do you think that both are performing similarly, and what does that imply about overfitting for the model?
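
One possible way to approach Question 2 is sketched below. It is not a verified solution. It assumes `status` is the target variable (the question does not name it), picks `lat`, `long`, `wind` and `pressure` as predictors for illustration, and uses a 70/30 split; all three choices are assumptions. The key idea is that caret's `method = "rpart2"` tunes `maxdepth` instead of `cp`, so a one-row `tuneGrid` pins the depth at 2, while `minsplit` and `minbucket` are passed through `rpart.control`. Note that `cp` is left at its rpart default and is simply not tuned.

```r
library(dplyr)   # ships the storms data
library(rpart)
library(caret)

# Assumption: `status` is the target; the predictor set is illustrative only.
dat <- storms %>%
  select(status, lat, long, wind, pressure) %>%
  na.omit() %>%
  mutate(status = factor(as.character(status))) %>%  # string -> factor, unused levels dropped
  as.data.frame()

set.seed(42)

# Part a: 10-fold CV with maxdepth fixed at 2 (no cp tuning).
tree_ctrl <- rpart.control(minsplit = 5, minbucket = 3)
fit_cv <- train(
  status ~ ., data = dat,
  method    = "rpart2",                     # tuning parameter is maxdepth, not cp
  trControl = trainControl(method = "cv", number = 10),
  tuneGrid  = data.frame(maxdepth = 2),
  control   = tree_ctrl
)
cv_accuracy <- fit_cv$results$Accuracy
print(cv_accuracy)

# Part b: train/test partition (70/30 is an assumed split).
idx       <- createDataPartition(dat$status, p = 0.7, list = FALSE)
train_set <- dat[idx, ]
test_set  <- dat[-idx, ]

fit <- train(
  status ~ ., data = train_set,
  method    = "rpart2",
  trControl = trainControl(method = "none"),  # fit once, no resampling
  tuneGrid  = data.frame(maxdepth = 2),
  control   = tree_ctrl
)

# Confusion matrices on both partitions, for a per-class comparison.
cm_train <- confusionMatrix(predict(fit, train_set), train_set$status)
cm_test  <- confusionMatrix(predict(fit, test_set),  test_set$status)
print(cm_train$table)
print(cm_test$table)
```

The same `trainControl` and `confusionMatrix` pattern should carry over to Question 1 after dropping `IDNumber` and calling `na.omit()`. There, `method = "rpart"` (which does tune `cp`) is acceptable, since that question places no restriction on tuning.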
