

Question


Posted on Jul 28, 2024


Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features. A tree can be obtained after training. In practice, a predictive decision tree model will incrementally select the best decisions to split on (evaluated based on the entropy principle) to provide an output classification based on our input data. For this assessment, you will be describing a new problem and utilising some machine learning Python modules to create an ID3 predictive decision tree model, along with a visualisation to better understand the classification process.
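To make the entropy principle mentioned above concrete, here is a minimal sketch of how a candidate split could be scored by information gain. The function names (`entropy`, `information_gain`) and the toy labels are illustrative assumptions, not part of the assignment code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy data (hypothetical): six samples, two classes.
parent = [0, 0, 0, 1, 1, 1]
perfect_split = [[0, 0, 0], [1, 1, 1]]     # each child contains one class only
useless_split = [[0, 1], [0, 1], [0, 1]]   # each child is as mixed as the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
print(f"perfect split gain: {gain_perfect:.3f}")
print(f"useless split gain: {gain_useless:.3f}")
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves every child as mixed as the parent gains nothing. The tree prefers the candidate split with the highest gain.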

This Task 3 will measure your ability to 1) study a problem that can be tackled by an artificial intelligence method, e.g., a decision tree; 2) implement the decision tree using the given instructions; 3) visualise and analyse the decision tree. The objective of this assessment is to utilise the pandas and scikit-learn libraries to implement the ID3 decision tree machine learning algorithm to create a classifier that can tackle a specific problem. By the end of this assessment, you should have a better understanding of how decision trees work. Utilising the trained tree visualisation, you should have a reinforced understanding of the core decision tree principles and how the tree's split evaluations operate.

Problem Description

You will be creating a decision tree that will predict wine classes based on provided attributes. Imagine that you are a wine producer compiling data for a study. The data are the results of a chemical analysis of wines grown in the same region in Italy by three different cultivators. There are thirteen different measurements taken for different constituents found in the three types of wine: class_0, class_1, and class_2.

Those thirteen different measurements include: Alcohol, Malic acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315 of diluted wines, and Proline.

Therefore, the model's input parameters can be those thirteen different measurements and the model output should be the wine classes.

Implementation instructions

Assignment dependency installation

You will have to install the following required dependencies:

scikit-learn

pandas

matplotlib

Imports

You will be utilising a number of well-known machine learning Python modules in this task. These modules make the decision tree easier to implement and include numerous learning tools to help boost your understanding.

import pandas as pd

import sklearn

import matplotlib.pyplot as plt

Load the dataset and format the training/testing data

You will be using the following code to load the dataset:

# Load wine dataset

from sklearn.datasets import load_wine

data = load_wine()
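As a quick sanity check after loading, the object returned by `load_wine()` can be inspected to confirm that it matches the problem description (thirteen measurements, three classes). This is a small sketch that uses only attributes of the returned object.

```python
from sklearn.datasets import load_wine

data = load_wine()

# Feature matrix: one row per wine sample, one column per measurement.
print(data.data.shape)

# Names of the thirteen measurements (the model inputs).
print(data.feature_names)

# Names of the wine classes (the model output).
print(data.target_names)
```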

The pandas Python module is a powerful, frequently used data analysis tool in machine learning; it allows you to store and manipulate large datasets easily and integrates well with other machine learning tools and modules.

You will be storing your dataset in a pandas DataFrame. A DataFrame is very similar to a dictionary in standard Python but has many additional useful features. Add our data into this DataFrame by specifying the data keys and corresponding values.
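One possible way to build the DataFrame is sketched below. It uses the feature names as column keys and the measurement matrix as values. The variable name `df` and the extra `target` column are assumptions made for illustration; the assignment may expect a different layout.

```python
import pandas as pd
from sklearn.datasets import load_wine

data = load_wine()

# Keys: the thirteen feature names. Values: the matching measurement columns.
df = pd.DataFrame(data.data, columns=data.feature_names)

# Keep the class label (0, 1 or 2) alongside the measurements.
df["target"] = data.target

print(df.head())
```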

Next, we need to create a training set for training the classifier and a test set to evaluate the quality of the trained classifier. Creating the training and test sets is straightforward for this task: we can simply split the collected data into two groups. For example, if we have collected 100 records, we can use 80 records as the training set, leaving the remaining 20 records as the test set.
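The 80/20 split described above could be done with scikit-learn's `train_test_split`, as in the following sketch. The `random_state` value and the use of `stratify` are assumptions added for reproducibility and balanced classes; the assignment text does not require them.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

data = load_wine()
df = pd.DataFrame(data.data, columns=data.feature_names)
df["target"] = data.target

X = df.drop(columns="target")  # the thirteen measurements
y = df["target"]               # the wine class

# Hold out 20% of the records for testing; the rest is used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(len(X_train), len(X_test))
```

Shuffling before splitting (the default in `train_test_split`) matters here because the wine records are ordered by class, so taking the first 80% of rows would leave some classes under-represented in training.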

Train the decision tree

Once you have correctly formatted your data, you can move on to creating the decision tree. Scikit-learn is a powerful machine learning framework, and you will be utilising its DecisionTreeClassifier class to create and train your decision tree. Create a new DecisionTreeClassifier and pass "entropy" as the criterion, so that splits are evaluated by information gain.
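A minimal end-to-end sketch of this step follows, assuming the same 80/20 split as before. Note that scikit-learn documents its tree implementation as an optimised version of CART rather than a literal ID3; setting `criterion="entropy"` makes it choose splits by information gain, which is the closest built-in equivalent. The output file name `wine_tree.png` and the `random_state` values are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

# Entropy criterion: splits are chosen to maximise information gain.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)

# Fraction of held-out wines that are classified correctly.
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")

# Draw the trained tree; each node shows its split rule and entropy.
fig, ax = plt.subplots(figsize=(14, 8))
plot_tree(
    clf,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    filled=True,
    ax=ax,
)
fig.savefig("wine_tree.png")  # hypothetical file name
```

Reading the saved figure from the root downwards shows which measurement was chosen at each split and how the entropy falls towards zero at the leaves.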
