Question
2. Decision Trees (15 marks)
(a) Explain how decision trees can be used to address regression and classification problems. [marks]
(b) Decision trees are trained by maximising the purity of the corresponding partition of the training data. Does the purity of a set depend on the attributes, on the labels, or on both? Briefly describe how you evaluate the purity of the partition associated with a given tree. [marks]
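(c) Consider the following data set

$$D = \{(x_n, y_n) \in \mathbb{R} \times \{\text{one}, \text{two}, \text{three}, \text{four}, \text{five}\}\}_{n}$$

[Table of $(x, y)$ pairs; values missing]

Choose $y \in \{\text{one}, \text{two}, \text{three}, \text{four}, \text{five}\}$ to maximize the purity of $D$. [marks]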
(d) Suppose you have trained a classification tree to predict $Y \in \{\text{yes}, \text{no}\}$. Explain how you can use a Bayes classifier and the estimated conditional probability, $\text{Prob}(Y \mid X)$, to predict the label of a test object.
Hint: In this case, a Bayes classifier based on $\text{Prob}(Y \mid X)$ is

$$\hat{y} = \arg\max_{y \in \{\text{yes}, \text{no}\}} \text{Prob}(Y = y \mid X)$$

[marks]
(e) A standard approach to building a decision tree is to perform iterative binary splits of the input space. How do you find the optimal split at each iteration? Give the formula of the information gain and explain how it depends on a real-valued threshold.
Hint: The information gain is a function of the size and the entropy of a bin before the split, $B$, and of the two bins, $B_1$ and $B_2$, you obtain after splitting $B$. Let $|D|$ be the cardinality of $D$. Then the entropy of $D$ is

$$H(D) = -\sum_{y \in Y} p(y) \log p(y), \qquad p(y) = \frac{1}{|D|} \sum_{(x, y') \in D} \mathbb{1}[y' = y]$$

[marks]
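The quantities named in the hint can be sketched in a few lines of Python. This is a minimal illustration, not a model answer. It assumes base-2 logarithms (the question does not fix the base), a split rule of the form `x <= threshold`, and candidate thresholds taken at midpoints between consecutive distinct `x` values. The function names and the toy data are hypothetical and are not the data set from part (c). The gain is computed as the entropy of the bin before the split minus the size-weighted entropies of the two bins after it.

```python
import math
from collections import Counter


def entropy(labels):
    """H(D) = -sum_y p(y) * log2 p(y), where p(y) is the fraction of labels equal to y."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention: an empty bin contributes no entropy
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(xs, ys, threshold):
    """Entropy before the split minus the size-weighted entropy of the two bins after it."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    after = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(ys) - after


def best_threshold(xs, ys):
    """Scan midpoints between consecutive distinct x values and return (threshold, gain).

    Assumes at least two distinct x values, otherwise there is nothing to split.
    """
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    scored = ((t, information_gain(xs, ys, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Toy data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = ["yes", "yes", "no", "no"]
threshold, gain = best_threshold(xs, ys)
print(threshold, gain)
```

The gain depends on the threshold only through which points fall into each bin, so it is piecewise constant in the threshold and changes only when the threshold crosses a data point. That is why scanning one candidate per gap between sorted `x` values is enough in this sketch.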