5. The following table lists a dataset containing the details of five participants in a heart disease study, and a target feature RISK, which describes their risk of heart disease. Each patient is described in terms of four binary descriptive features: EXERCISE, how regularly do they exercise; SMOKER, do they smoke; OBESE, are they overweight; FAMILY, did any of their parents or siblings suffer from heart disease.

a. As part of the study, researchers have decided to create a predictive model to screen participants based on their risk of heart disease. You have been asked to implement this screening model using a random forest. The three tables below list three bootstrap samples that have been generated from the above dataset. Using these bootstrap samples, create the decision trees that will be in the random forest model (use entropy-based information gain as the feature selection criterion).
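
The feature selection criterion named in part (a) can be sketched in Python as follows. This is a minimal illustration, not a worked answer: it applies entropy-based information gain to the original five-row dataset from this question, because the bootstrap samples are not reproduced here. The names `DATASET`, `entropy`, and `information_gain` are assumptions introduced for this sketch.

```python
from collections import Counter
from math import log2

COLUMNS = ("EXERCISE", "SMOKER", "OBESE", "FAMILY", "RISK")

# The original dataset from the question (IDs 1 to 5), NOT a bootstrap sample.
ROWS = [
    ("daily", "false", "false", "yes", "low"),
    ("weekly", "true", "false", "yes", "high"),
    ("daily", "false", "false", "no", "low"),
    ("rarely", "true", "true", "yes", "high"),
    ("rarely", "true", "true", "no", "high"),
]
DATASET = [dict(zip(COLUMNS, row)) for row in ROWS]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target="RISK"):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row[target] for row in rows if row[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[target] for row in rows]) - remainder


gains = {feature: information_gain(DATASET, feature) for feature in COLUMNS[:-1]}
for feature, gain in gains.items():
    print(f"{feature}: {gain:.3f}")
```

To build each tree in the forest, the same function would be called on the rows of one bootstrap sample, restricted to the features available at that node, and the feature with the highest gain would be chosen for the split.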

b. Assuming the random forest model you have created uses majority voting, what prediction will it return for the following query:
EXERCISE=rarely, SMOKER=false, OBESE=true, FAMILY=yes
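
The majority-voting step in part (b) can be sketched as below. The helper name `majority_vote` and the three votes in `placeholder_votes` are assumptions for illustration only; they are not the predictions of the trees from part (a), which depend on bootstrap samples not shown here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


# Placeholder votes, one per tree. These are hypothetical values,
# not the answer to the query in the question.
placeholder_votes = ["high", "low", "high"]
print(majority_vote(placeholder_votes))  # prints "high"
```

With three trees and two possible RISK levels, a tie cannot occur, so no tie-breaking rule is needed in this setting.
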
ID  EXERCISE  SMOKER  OBESE  FAMILY  RISK
1   daily     false   false  yes     low
2   weekly    true    false  yes     high
3   daily     false   false  no      low
4   rarely    true    true   yes     high
5   rarely    true    true   no      high
