Exercise 7.3 Suppose we have a system that observes a person’s TV watching habits in order to

Question:

Exercise 7.3 Suppose we have a system that observes a person’s TV watching habits in order to recommend other TV shows the person may like. Suppose that we have characterized each show by whether it is a comedy, whether it features doctors, whether it features lawyers, and whether it has guns. Suppose we are given the examples of Figure 7.18 about whether the person likes various TV shows. We want to use this data set to learn the value of Likes (i.e., to predict which TV shows the person would like based on the attributes of the TV show).

You may find the AIspace.org applets useful for this assignment. (Before you start, see if you can see the pattern in what shows the person likes.)

(a) Suppose the error is the sum of absolute errors. Give the optimal decision tree with only one node (i.e., with no splits). What is the error of this tree?

(b) Do the same as in part (a), but with the sum-of-squares error.

(c) Suppose the error is the sum of absolute errors. Give the optimal decision tree of depth 2 (i.e., the root node is the only node with children). For each leaf in the tree, give the examples that are filtered to that node. What is the error of this tree?

(d) Do the same as in part (c), but with the sum-of-squares error.

(e) What is the smallest tree that correctly classifies all training examples? Does a top-down decision tree that optimizes the information gain at each step represent the same function?

(f) Give two instances that do not appear in the examples of Figure 7.18 and show how they are classified using the smallest decision tree. Use this to explain the bias inherent in the tree. (How does the bias give you these particular predictions?)

(g) Is this data set linearly separable? Explain why or why not.
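
As a starting point for parts (a) through (d), the sketch below shows one way to compute the error of a single-leaf tree and of a tree with one split under both error measures. It relies on two standard facts: a median of the labels minimizes the sum of absolute errors, and the mean of the labels minimizes the sum-of-squares error. Figure 7.18 is not reproduced here, so the rows in EXAMPLES are hypothetical placeholders, and the names EXAMPLES, FEATURES, best_leaf, and best_split are illustrative assumptions rather than anything defined in the exercise. Replace the placeholder rows with the figure's data before drawing conclusions.

```python
from statistics import mean, median

# Hypothetical placeholder rows, NOT the data of Figure 7.18.
# Each row is (comedy, doctors, lawyers, guns, likes), with 1 = true, 0 = false.
EXAMPLES = [
    (1, 0, 0, 0, 1),
    (1, 1, 0, 0, 1),
    (0, 0, 1, 1, 1),
    (0, 1, 0, 0, 0),
    (0, 0, 0, 1, 0),
    (1, 0, 1, 1, 0),
]
FEATURES = ["comedy", "doctors", "lawyers", "guns"]


def abs_error(labels, prediction):
    """Sum of absolute errors of one point prediction over the labels."""
    return sum(abs(y - prediction) for y in labels)


def sq_error(labels, prediction):
    """Sum-of-squares error of one point prediction over the labels."""
    return sum((y - prediction) ** 2 for y in labels)


def best_leaf(labels, loss):
    """Return (prediction, error) for a leaf; loss is "abs" or "sq"."""
    if not labels:
        return None, 0.0  # an empty leaf contributes no training error
    if loss == "abs":
        prediction = median(labels)  # a median minimizes absolute error
        return prediction, abs_error(labels, prediction)
    prediction = mean(labels)  # the mean minimizes squared error
    return prediction, sq_error(labels, prediction)


def best_split(examples, loss):
    """Try each feature at the root; return (total error, feature index)."""
    results = []
    for i in range(len(FEATURES)):
        total = 0.0
        for value in (0, 1):
            # Labels of the examples filtered to this child of the root.
            labels = [ex[-1] for ex in examples if ex[i] == value]
            total += best_leaf(labels, loss)[1]
        results.append((total, i))
    return min(results)  # ties are broken by the lower feature index


if __name__ == "__main__":
    all_labels = [ex[-1] for ex in EXAMPLES]
    for loss in ("abs", "sq"):
        print(loss, "single leaf:", best_leaf(all_labels, loss))
        error, index = best_split(EXAMPLES, loss)
        print(loss, "best root split:", FEATURES[index], "error:", error)
```

When a leaf holds an equal number of positive and negative examples, every prediction between 0 and 1 gives the same sum of absolute errors, so the median returned here is only one of several optimal answers.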
