Question
Please complete the code in Python. The decision tree should work for four cases:
i) discrete features, discrete output;
ii) discrete features, real output;
iii) real features, discrete output;
iv) real features, real output.
The decision tree should be able to use the Gini index or information gain as the splitting criterion. The code should also be able to plot/display the decision tree.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from .utils import entropy, information_gain, gini_index
np.random.seed(42)
class DecisionTree():
    def __init__(self, criterion, max_depth):
        """
        Put all information needed to initialize your tree here.
        Inputs:
        > criterion : {"information_gain", "gini_index"}  # criterion won't be used for regression
        > max_depth : the maximum depth the tree can grow to
        """
        pass

    def fit(self, X, y):
        """
        Function to train and construct the decision tree
        Inputs:
        X: pd.DataFrame with rows as samples and columns as features
           (shape of X is N x P, where N is the number of samples and P is the number of features)
        y: pd.Series with rows corresponding to the output variable (shape of y is N)
        """
        pass

    def predict(self, X):
        """
        Function to run the decision tree on data points
        Input:
        X: pd.DataFrame with rows as samples and columns as features
        Output:
        y: pd.Series with rows corresponding to the output variable. The value in a row
           is the prediction for the sample in the corresponding row of X.
        """
        pass

    def plot(self):
        """
        Function to plot the tree

        Output Example:
        ?(X1 > 4)
            Y: ?(X2 > 7)
                Y: Class A
                N: Class B
            N: Class C
        Where Y => Yes and N => No
        """
        pass
utils.py (which also needs to be completed):
def entropy(Y):
    """
    Function to calculate the entropy
    Inputs:
    > Y: pd.Series of labels
    Outputs:
    > Returns the entropy as a float
    """
    pass

def gini_index(Y):
    """
    Function to calculate the Gini index
    Inputs:
    > Y: pd.Series of labels
    Outputs:
    > Returns the Gini index as a float
    """
    pass

def information_gain(Y, attr):
    """
    Function to calculate the information gain
    Inputs:
    > Y: pd.Series of labels
    > attr: pd.Series of the attribute at which the gain should be calculated
    Outputs:
    > Returns the information gain as a float
    """
    pass
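One possible implementation of these three helpers, assuming `pd.Series` inputs as the docstrings state, is the following sketch:

```python
import numpy as np
import pandas as pd

def entropy(Y):
    # Shannon entropy in bits over the empirical label distribution
    p = Y.value_counts(normalize=True)
    return float(-(p * np.log2(p)).sum())

def gini_index(Y):
    # Gini impurity: 1 - sum_k p_k^2
    p = Y.value_counts(normalize=True)
    return float(1.0 - (p ** 2).sum())

def information_gain(Y, attr):
    # entropy(Y) minus the weighted entropy of Y within each attribute value
    weighted = sum((attr == v).mean() * entropy(Y[attr == v])
                   for v in attr.unique())
    return float(entropy(Y) - weighted)
```

For example, a balanced binary `Y` has entropy 1.0 bit and Gini index 0.5, and an attribute that perfectly separates the classes yields an information gain equal to the full entropy.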
""" | |
The current code given is for the Assignment 1. | |
You will be expected to use this to make trees for: | |
> discrete input, discrete output | |
> real input, real output | |
> real input, discrete output | |
> discrete input, real output | |
""" | |
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tree.base import DecisionTree
from metrics import *

np.random.seed(42)
# Test case 1
# Real Input and Real Output
N = 30
P = 5
X = pd.DataFrame(np.random.randn(N, P))
y = pd.Series(np.random.randn(N))

for criteria in ['information_gain', 'gini_index']:
    tree = DecisionTree(criterion=criteria, max_depth=5)  # max_depth added to match the constructor signature
    tree.fit(X, y)
    y_hat = tree.predict(X)
    tree.plot()
    print('Criteria :', criteria)
    print('RMSE: ', rmse(y_hat, y))
    print('MAE: ', mae(y_hat, y))
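The script above imports `rmse` and `mae` from a `metrics` module that is not shown. A minimal sketch of what those two functions might look like (the names come from the import; `pd.Series` inputs are assumed) is:

```python
import numpy as np
import pandas as pd

def rmse(y_hat, y):
    # Root mean squared error between prediction and ground truth
    return float(np.sqrt(((y_hat - y) ** 2).mean()))

def mae(y_hat, y):
    # Mean absolute error between prediction and ground truth
    return float((y_hat - y).abs().mean())
```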