Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Two new strains of corona-viruses named Omicron(O), Delta(D) are detected and scientists want to discriminate between them based on the symptoms the patients display including
Two new strains of corona-viruses named Omicron(O), Delta(D) are detected and scientists want to discriminate between them based on the symptoms the patients display including smell {Y,N}, taste { sweet, sour }, headache intensity {4,5}, hair loss {Y,N}. Scientists greedily learn a decision tree using this dataset and split based on the maximum information gain strategy. Table 1: Coronaviruses symptoms sheet (a) Which attribute would the algorithm choose to split at the root of the tree and what is the information gain? (2 Points) (b) Draw the complete decision tree that would be built on this dataset. (8 points) (c) Express the learned concept for Omicron strain of the virus as a set of conjunctive rules. (e.g., if (Smell=Y and Headache Intensity =4 and Hair loss =N and Taste=sweet), then Omicron; else if ... then Delta; ...; else Omicron). (2 points) (d) In the solution for question (c), each conjunction uses up to 4 attributes (i.e. symptoms). Find a set of conjunctive rules for both strains of viruses that uses only 2 attributes per conjunction and still results in zero error on the training set. Represent these simpler sets of conjunctive rules by a decision tree of depth 2. (5 points) (e) Why did the decision tree based on the maximum information gain strategy (from Q a) resulted in a longer tree than the optimal one (from Q d)? (3 points)
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started