
Question


... various problems. Questions with the mark "" are required for graduate students and are bonus for undergraduate students.

Learning problems

Datasets of three problems are provided to you. Please download the data.zip file from the Canvas system. Each problem will have the following files:
- A "problem.mat" file, which will have the examples and attributes. One attribute will be an "id" attribute, which is useful for identifying examples but which you will not use when learning.
- A "problem.info" file, which gives additional information about the problem, such as how the data was generated. This is for your information only and does not affect the implementation in any way.

Programming Requirements

You do not have to implement the algorithms, but you are expected to understand them and know how to use them. You have been provided with initial Python code to read in the data.
- Please use Python 3, not Python 2.
- Define a function for each question, such as def_I_Ia a0.
- Use sklearn as the machine learning library and NumPy to process data. If not mentioned in the question, use default settings.
- Use the last four digits of your student ID as the random state seed for both the data split and the method initialization. (10 points will be deducted if this requirement is not followed.)

1. Decision Tree Learner (60 points)

1) Split each of the three datasets into training and testing subsets randomly by the ratio 80/20.
a. On each dataset, train a decision tree classifier with entropy as the node selection criterion. What is the prediction accuracy of each classifier? What are the height and the number of leaves of each tree?
b. For voting, what is the prediction accuracy of the classifier with gini as the node selection criterion? Which feature provides the highest gini value during the first node selection process?
c. For spam, train the decision tree classifier with entropy as the node selection criterion but with different depths. Plot the accuracy as the depth of the tree is increased from 1 to 50 (the x-axis is the depth of the tree and the y-axis is the accuracy of the model). What do you observe from the graph?
2) Split the volcanoes dataset into training and testing subsets by the ratios 90/10, 70/30, 60/40, and 40/60, and report the accuracies. Which partition shows the highest precision, and why?
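
The workflow the question describes (seeded split, fitting a tree with a chosen criterion, then reading off accuracy, height, and leaf count) might be sketched roughly as follows. This is only an illustration under stated assumptions: the provided reader code and the real voting, spam, and volcanoes data are not shown here, so a synthetic dataset from `make_classification` stands in for them, and `SEED`, `load_stand_in_data`, `fit_tree`, and `depth_sweep` are hypothetical names. With the real data, the "id" attribute would need to be dropped before fitting. The root split index is reported as one plausible way to look at the first node selection, not necessarily the intended answer to part b.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical placeholder for the last four digits of a student ID.
SEED = 1234


def load_stand_in_data():
    """Return synthetic data (an assumption; not one of the provided datasets)."""
    X, y = make_classification(n_samples=500, n_features=10, random_state=SEED)
    return X, y


def fit_tree(X, y, criterion="entropy", test_size=0.2, max_depth=None):
    """Split the data, fit a decision tree, and summarize the fitted model."""
    # The same seed drives both the split and the classifier initialization.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=SEED
    )
    clf = DecisionTreeClassifier(
        criterion=criterion, max_depth=max_depth, random_state=SEED
    )
    clf.fit(X_train, y_train)
    return {
        "model": clf,
        "accuracy": clf.score(X_test, y_test),   # mean accuracy on the test split
        "depth": clf.get_depth(),                # height of the fitted tree
        "n_leaves": clf.get_n_leaves(),          # number of leaf nodes
        "root_feature": int(clf.tree_.feature[0]),  # column index split at the root
    }


def depth_sweep(X, y, depths=range(1, 51)):
    """Return the test accuracy of an entropy tree for each maximum depth."""
    return [fit_tree(X, y, max_depth=d)["accuracy"] for d in depths]


if __name__ == "__main__":
    X, y = load_stand_in_data()

    for criterion in ("entropy", "gini"):
        result = fit_tree(X, y, criterion=criterion)
        print(criterion, result["accuracy"], result["depth"],
              result["n_leaves"], result["root_feature"])

    # Test fractions corresponding to the 90/10, 70/30, 60/40, and 40/60 splits.
    for test_size in (0.1, 0.3, 0.4, 0.6):
        print(test_size, fit_tree(X, y, test_size=test_size)["accuracy"])
```

The list returned by `depth_sweep` can be passed to a plotting library such as matplotlib, with depth on the x-axis and accuracy on the y-axis. Plotting is left out here so that the sketch depends only on scikit-learn.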

