
Question



Question 5: Programming (40 points): Use a decision tree and a random forest to train on the titanic.csv dataset included in the assignment.

Step 1: Read in Titanic.csv and observe a few samples. Some features are categorical and others are numerical. If some feature values are missing, fill them in using the average of the same feature over the other samples. Take a random 80% of the samples for training and the remaining 20% for testing.

Step 2: Fit a decision tree model using the independent variables 'pclass + sex + age + sibsp' and the dependent variable 'survived'. Plot the full tree. Make sure 'survived' is a qualitative variable taking the value 1 (yes) or 0 (no) in your code. You may see a tree similar to the one shown in the assignment (the actual structure and size of your tree can be different).

Step 3: Use the GridSearchCV() function to find the best value of the parameter max_leaf_nodes to prune the tree. Plot the pruned tree, which should be smaller than the tree you obtained in Step 2.

Step 4: For the pruned tree, report its accuracy on the test set for the following: percent of survivors correctly predicted (on the test set) and percent of fatalities correctly predicted (on the test set).

Step 5: Use the RandomForestClassifier() function to train a random forest using the value of max_leaf_nodes you found in Step 3. You can set n_estimators to 50. Report the accuracy of the random forest on the test set for the following: percent of survivors correctly predicted (on the test set) and percent of fatalities correctly predicted (on the test set). Check whether there is an improvement compared to the single tree obtained in Step 4.
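The five steps can be sketched with pandas and scikit-learn roughly as follows. This is a minimal sketch, not a reference solution. It assumes the columns are named exactly 'pclass', 'sex', 'age', 'sibsp' and 'survived', and that 'sex' holds the strings 'male' and 'female'. The helper names (prepare, fit_pruned_tree, class_rates, run), the candidate range 2 to 30 for max_leaf_nodes, the 5-fold cross-validation and the fixed random_state are illustrative choices that the assignment does not specify.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["pclass", "sex", "age", "sibsp"]  # assumed column names
TARGET = "survived"


def prepare(df):
    """Step 1: encode 'sex', mean-impute missing features, split 80/20."""
    data = df[FEATURES + [TARGET]].dropna(subset=[TARGET]).copy()
    # Assumption: 'sex' is stored as 'male'/'female'; other values become NaN.
    data["sex"] = data["sex"].map({"male": 0, "female": 1})
    # Fill each missing feature value with the mean of that column,
    # computed before the split, as the assignment text describes.
    data[FEATURES] = data[FEATURES].fillna(data[FEATURES].mean())
    X = data[FEATURES]
    y = data[TARGET].astype(int)  # qualitative target: 1 (yes) or 0 (no)
    return train_test_split(X, y, test_size=0.2, random_state=0)


def fit_full_tree(X_train, y_train):
    """Step 2: grow an unrestricted decision tree."""
    return DecisionTreeClassifier(random_state=0).fit(X_train, y_train)


def fit_pruned_tree(X_train, y_train):
    """Step 3: choose max_leaf_nodes by cross-validated grid search."""
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_grid={"max_leaf_nodes": list(range(2, 31))},  # illustrative range
        cv=5,
        scoring="accuracy",
    )
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_["max_leaf_nodes"]


def class_rates(model, X_test, y_test):
    """Steps 4 and 5: fraction of survivors / fatalities predicted correctly."""
    predictions = model.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, predictions, labels=[0, 1]).ravel()
    return {
        "survivors_correct": float(tp / (tp + fn)),
        "fatalities_correct": float(tn / (tn + fp)),
    }


def fit_forest(X_train, y_train, max_leaf_nodes):
    """Step 5: random forest with 50 trees and the tuned max_leaf_nodes."""
    forest = RandomForestClassifier(
        n_estimators=50, max_leaf_nodes=max_leaf_nodes, random_state=0
    )
    return forest.fit(X_train, y_train)


def run(df):
    """Run all five steps and collect the numbers the assignment asks for."""
    X_train, X_test, y_train, y_test = prepare(df)
    full = fit_full_tree(X_train, y_train)
    pruned, best_leaves = fit_pruned_tree(X_train, y_train)
    forest = fit_forest(X_train, y_train, best_leaves)
    return {
        "full_leaves": full.get_n_leaves(),
        "pruned_leaves": pruned.get_n_leaves(),
        "best_max_leaf_nodes": best_leaves,
        "tree": class_rates(pruned, X_test, y_test),
        "forest": class_rates(forest, X_test, y_test),
    }
```

With the assignment's file in the working directory, run(pd.read_csv("titanic.csv")) would return the per-class rates for the pruned tree and the forest, and calling head() on the loaded DataFrame shows a few samples for Step 1. To draw a tree for Steps 2 and 3, a fitted model can be passed to sklearn.tree.plot_tree (which needs matplotlib), for example with feature_names=FEATURES. The two rates are the per-class recalls read off the confusion matrix, so comparing the "tree" and "forest" entries answers the final check in Step 5.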

