
Question



We want to use the following training set to predict whether it is suitable to play tennis (the Play variable) based on four input variables: Outlook, Temperature, Humidity, and Windy.

Instance  Outlook   Temperature  Humidity  Windy  Play
d1        Rainy     Cool         Normal    Y      NO
d2        Rainy     Mild         High      Y      NO
d3        Overcast  Cool         High      Y      NO
d4        Sunny     Hot          High      Y      NO
d5        Rainy     Mild         High      N      YES
d6        Rainy     Mild         Normal    N      YES
d7        Overcast  Hot          High      N      YES
d8        Sunny     Cool         Normal    N      YES
d9        Sunny     Mild         High      N      NO
d10       Sunny     Hot                    N      NO

Q1: Suppose we want to build a decision tree classifier. We need to determine which attribute should be used first to split the root node.
a) Briefly specify how we can use the Gini index to find the best split.
b) If we use the Windy attribute to split the data set into two subsets (if Windy=Y, place the data point in node 1; if Windy=N, place the data point in node 2), calculate the average weighted Gini index after the split. Please try to be as detailed as possible.

Q2: Suppose we want to classify the following new instance d11 using the KNN method.

Instance  Outlook  Temperature  Humidity  Windy
d11       Sunny    Cool         Normal    N

a) What is the distance between instance d1 and instance d11? If we set k=1, what outcome do we get in classifying this new instance? Why? Please try to be as detailed as possible.
Hint: use nominal distance to calculate the distance between each instance in the training data and instance d11. Find the nearest neighbor of d11.
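One way Q1 b) could be worked through numerically is sketched below. It assumes the transcribed table reads as d1 to d4 having Windy=Y and d5 to d10 having Windy=N; the Windy values for d5 and d7 are inferred from the run of N's in the transcription and are an assumption. The names `gini` and `weighted_gini` are illustrative helpers, not part of the original question.

```python
from collections import Counter

# (Windy, Play) pairs for d1..d10, as read from the transcribed table.
# Assumption: d5 and d7 have Windy = "N" (their cells are garbled in the source).
rows = [
    ("Y", "NO"), ("Y", "NO"), ("Y", "NO"), ("Y", "NO"),      # d1..d4
    ("N", "YES"), ("N", "YES"), ("N", "YES"), ("N", "YES"),  # d5..d8
    ("N", "NO"), ("N", "NO"),                                # d9, d10
]

def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(rows):
    """Average Gini of the child nodes, each weighted by its share of the rows."""
    groups = {}
    for value, label in rows:
        groups.setdefault(value, []).append(label)
    total = len(rows)
    return sum(len(g) / total * gini(g) for g in groups.values())

node1 = [play for windy, play in rows if windy == "Y"]  # Windy = Y
node2 = [play for windy, play in rows if windy == "N"]  # Windy = N

print("Gini(node 1):", gini(node1))
print("Gini(node 2):", gini(node2))
print("Weighted Gini after split:", weighted_gini(rows))
```

Under that reading of the table, node 1 holds 4 NO and is pure (Gini 0), node 2 holds 4 YES and 2 NO (Gini 1 - (4/6)^2 - (2/6)^2 = 4/9), and the weighted average is 0.4 x 0 + 0.6 x 4/9, roughly 0.267. For part a), the same weighted Gini would be computed for each candidate attribute, and the attribute with the lowest value would be preferred for the root split.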
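For Q2, a minimal sketch of 1-nearest-neighbor classification follows. It assumes "nominal distance" means the number of attributes on which two instances differ (a Hamming-style count). The rows are reconstructed from the garbled table, so the Windy values for d5 and d7 are assumptions, and the Humidity value for d10 is missing in the source and is kept as `None` here, which the sketch counts as a mismatch. The helper name `nominal_distance` is hypothetical.

```python
# Training rows as (Outlook, Temperature, Humidity, Windy, Play).
# Assumptions: Windy for d5 and d7 is "N"; Humidity for d10 is missing (None).
train = {
    "d1": ("Rainy", "Cool", "Normal", "Y", "NO"),
    "d2": ("Rainy", "Mild", "High", "Y", "NO"),
    "d3": ("Overcast", "Cool", "High", "Y", "NO"),
    "d4": ("Sunny", "Hot", "High", "Y", "NO"),
    "d5": ("Rainy", "Mild", "High", "N", "YES"),
    "d6": ("Rainy", "Mild", "Normal", "N", "YES"),
    "d7": ("Overcast", "Hot", "High", "N", "YES"),
    "d8": ("Sunny", "Cool", "Normal", "N", "YES"),
    "d9": ("Sunny", "Mild", "High", "N", "NO"),
    "d10": ("Sunny", "Hot", None, "N", "NO"),
}

# New instance to classify: (Outlook, Temperature, Humidity, Windy).
d11 = ("Sunny", "Cool", "Normal", "N")

def nominal_distance(a, b):
    """Count the attributes on which two instances disagree."""
    return sum(x != y for x, y in zip(a, b))

# Distance from every training instance (input attributes only) to d11.
distances = {name: nominal_distance(row[:4], d11) for name, row in train.items()}

# k = 1: take the single closest training instance and copy its Play label.
nearest = min(distances, key=distances.get)
prediction = train[nearest][4]

print("distance(d1, d11) =", distances["d1"])
print("nearest neighbor:", nearest, "-> predicted Play =", prediction)
```

Under these assumptions, d1 and d11 differ on Outlook and Windy, giving a distance of 2, while d8 matches d11 on all four attributes (distance 0), so with k=1 the predicted outcome is YES. The missing Humidity for d10 does not change this, since d10 already differs from d11 on Temperature.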
