Question
Suppose we wish to train a classification tree to predict whether a new smartphone will be profitable on Amazon.com. To build the classification tree, we
Suppose we wish to train a classification tree to predict whether a new smartphone will be profitable on Amazon.com. To build the classification tree, we have a dataset which consists of hundreds of smartphones that have been sold in the past year. A sample of the rows of the dataset is shown below:
Product ID | Large screen (>= 5in) | Expensive (>= $600) | Long battery life | Profitable |
x8395wkk33 | Yes | No | Yes | Yes |
552kx52jlk1 | No | No | Yes | Yes |
35q593jpop2 | Yes | Yes | No | No |
.... | ... | ... | ... | ... |
Suppose that we trained a classification tree to predict the column "Profitable" using the other four columns as the independent variables. If we constructed the tree using the information gain approach, would you expect the resulting tree to overfit the data? Why or why not?
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started