Question
Decision Trees (DTs) are a non-parametric supervised learning method used for
classification and regression. The goal is to create a model that predicts the value of a
target variable by learning simple decision rules inferred from the data features. A tree
is obtained after training. In practice, a predictive decision tree model
incrementally selects the best feature to split on, evaluated using the entropy
principle, to provide an output classification for the input data. For this
assessment, you will be describing a new problem and utilising some machine learning
Python modules to create an ID3 predictive decision tree model, along with a
visualisation to better understand the classification process.
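The entropy principle mentioned above can be illustrated with a short sketch. It assumes the usual definitions: Shannon entropy in bits, and information gain as the parent's entropy minus the size-weighted entropy of the two children. The function names and the toy labels are hypothetical and are not part of the assignment.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    weight_left = len(left) / len(parent)
    weight_right = len(right) / len(parent)
    children = weight_left * entropy(left) + weight_right * entropy(right)
    return entropy(parent) - children

# Hypothetical toy labels, not taken from the wine data
parent = [0, 0, 0, 1, 1, 1]
perfect_split = information_gain(parent, [0, 0, 0], [1, 1, 1])
useless_split = information_gain(parent, [0, 1], [0, 0, 1, 1])
print(perfect_split, useless_split)
```

The first split separates the two classes completely, so its gain equals the parent's entropy (1 bit). The second leaves each child as mixed as the parent, so its gain is zero up to floating-point rounding. A tree built on this principle prefers the split with the larger gain.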
This task will measure your ability to study a problem that can be tackled by an
artificial intelligence method, e.g. a decision tree; implement the decision tree using
the given instructions; and visualise and analyse the decision tree. The objective of this
assessment is to utilise the pandas and scikit-learn libraries to implement the ID3
decision tree machine learning algorithm to create a classifier that can tackle a specific
problem. By the end of this assessment, you should have a better understanding of how
decision trees work. Utilising the trained tree visualisation, you should have a reinforced
understanding of the core decision tree principles and how the tree's split evaluations
operate.
Problem Description
You will be creating a decision tree that will predict wine classes based on provided attributes.
Imagine that you are a wine producer compiling data for a study. The data is the results of a
chemical analysis of wines grown in the same region in Italy by three different cultivators. There
are thirteen different measurements taken for different constituents found in the three types of
wine: class 0, class 1 and class 2.
Those thirteen measurements are: Alcohol, Malic acid, Ash, Alcalinity of ash,
Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color
intensity, Hue, OD280/OD315 of diluted wines and Proline.
Therefore, the model's input parameters are those thirteen measurements and the
model's output is the wine class.
Implementation instructions
Assignment dependency installation
You will have to install the following required dependencies:
scikit-learn
pandas
matplotlib
Imports
You will be utilising a number of well-known machine learning Python modules in this
task. These modules allow for ease of implementation of our decision tree and include
numerous learning tools to help boost your understanding.
import pandas as pd
import sklearn
import matplotlib.pyplot as plt
Load the dataset and format the training/testing data
You will be using the following code to load the dataset:
# Load wine dataset
from sklearn.datasets import load_wine
data = load_wine()
The pandas Python module is a very powerful, frequently used data analysis tool in all
forms of machine learning; it allows you to store and manipulate large datasets very
easily and is highly compatible/integrated with other machine learning tools/modules.
You will be storing your dataset in a pandas DataFrame. A DataFrame is very similar to
a dictionary in standard Python but has many additional useful features. Add our data
to this DataFrame by specifying the data keys and corresponding values.
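One possible way to build the DataFrame is sketched below. It assumes the `data` object returned by `load_wine()`; the column name `target` for the class label is a hypothetical choice, not something the assignment prescribes.

```python
import pandas as pd
from sklearn.datasets import load_wine

data = load_wine()

# One column per measurement, named after the dataset's feature names
df = pd.DataFrame(data.data, columns=data.feature_names)

# Hypothetical column name for the wine class label (0, 1 or 2)
df["target"] = data.target

print(df.shape)
print(df.head())
```

Keeping the features and the label in one DataFrame makes it easy to inspect the records before splitting them.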
Next, we need to create a training set for training the classifier and a test
set to evaluate the quality of the trained classifier. Creating the training and test sets
is straightforward for this task: we can simply split the collected data into two groups.
For example, we can use a portion of the collected records as the training
set, leaving the remaining records as the test set.
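The split described above might look like the following sketch, which uses scikit-learn's `train_test_split`. The 80/20 proportion, the `random_state` value and the stratified shuffle are assumptions made for illustration; the assignment does not fix them here.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# Assumed proportions: 80% of the records for training, 20% for testing.
# stratify=y keeps the class proportions similar in both groups.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(len(X_train), len(X_test))
```

Shuffling before the split matters for this dataset because its records are stored grouped by class, so taking the first rows as the training set could leave a whole class out of it.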
Train the decision tree
Once you have correctly formatted your data, you can move on to creating the decision
tree. Create a new scikit-learn DecisionTreeClassifier, passing "entropy" as the
criterion so that splits are chosen by information gain. Scikit-learn is a powerful
machine learning framework. You will be utilising the included DecisionTreeClassifier
class to create and train your decision tree.
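A minimal sketch of the training step follows. The split settings and `random_state` values are assumptions carried over for reproducibility, and `export_text` is used here only as a quick text view of the learned rules.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0, stratify=data.target
)

# criterion="entropy" makes the tree pick splits by information gain
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)

# Fraction of test records classified correctly
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")

# Text rendering of the learned decision rules
rules = export_text(clf, feature_names=list(data.feature_names))
print(rules)
```

Note that scikit-learn builds binary trees with an optimised CART-style procedure, so this approximates ID3's entropy-based split selection rather than reproducing ID3 exactly. For a graphical view, `sklearn.tree.plot_tree` can draw the same fitted tree with matplotlib.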