
Question

Task 1: Implementing Random Forest (50 marks)

You are asked to implement random forest for regression. Random forest was discussed in Teaching Week 6. You are not allowed to use any existing implementations of decision trees or random forests in R or any other language, and you must code random forest from first principles.

You should apply your random forest program to the Boston dataset to predict medv. In other words, medv is the label, and the other 13 variables in the dataset are the attributes. Split the dataset randomly into two equal parts, which will serve as the training set and the test set. Use your birthday (in the format MMDD) as the seed for the pseudorandom number generator. The same training set and test set should be used throughout this assignment.

You need to complete the following parts:

(a) Generate B = 100 bootstrapped training sets (BTS) from the training set.
(b) Use each BTS to train a decision tree of height h = 3. Be reminded that you are implementing random forest, so at each node you do not consider all attributes, but only a sample of them.
(c) Find the training MSE and test MSE. Include them in your report.
(d) Repeat the above parts using different values of B and h. In your report, plot the training MSE and test MSE as functions of B and/or h, and discuss your observations.

In the code file, you should leave comments to clearly indicate which of your code snippets deals with which part (among (a), (b), (c) or (d) above). This helps the graders to understand your code more easily. Feel free to include in your report anything else that you find interesting.

Training for a Decision Tree

In Task 1(b), you are asked to train a decision tree of height 3 (see Figure 1 for an example of such a decision tree).
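Before the tree-training steps, the random train/test split and part (a) could be sketched roughly as follows. This is only a minimal illustration in plain Python, not a reference solution: the function names split_in_half and bootstrap_sets are hypothetical, the seed 101 stands in for a made-up birthday (MMDD = 0101), and the sketch works on row indices only, assuming the 506 rows of the Boston dataset are loaded elsewhere.

```python
import random


def split_in_half(n_rows, seed):
    """Randomly split row indices 0..n_rows-1 into two equal-sized halves."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    half = n_rows // 2
    return indices[:half], indices[half:]


def bootstrap_sets(train_idx, B, rng):
    """Part (a): draw B bootstrapped training sets.

    Each set is sampled with replacement from the training indices
    and has the same size as the training set.
    """
    n = len(train_idx)
    return [[train_idx[rng.randrange(n)] for _ in range(n)] for _ in range(B)]


SEED = 101    # hypothetical birthday 0101; replace with your own MMDD
N_ROWS = 506  # number of rows in the Boston dataset

train_idx, test_idx = split_in_half(N_ROWS, SEED)
rng = random.Random(SEED)
bts = bootstrap_sets(train_idx, B=100, rng=rng)
```

Working with indices rather than copies of the rows keeps the same training and test sets available for every later part, as the task requires.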
To begin, you should do the following steps at the root of the decision tree:

(i) When there are p attributes (here p = 13), sample [13/3] attributes without replacement.
(ii) Use the sampled attributes to determine the optimal split at the root of the decision tree. Recall that the optimal split is the split that minimizes the sum of the two RSS (residual sum of squares) at the two child nodes of the root.
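Steps (i) and (ii) above might be sketched as below. This is a hedged illustration under stated assumptions, not the expected solution: the names rss and best_split are hypothetical, candidate thresholds are taken as midpoints between consecutive distinct attribute values (one common choice, not stated in the task), and the number of sampled attributes is passed in as n_sampled because the rounding implied by the brackets in [13/3] is not recoverable from the text. The toy data are made up for illustration.

```python
import random


def rss(values):
    """Residual sum of squares of values around their mean (0 if empty)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y, n_sampled, rng):
    """Steps (i)-(ii): sample attributes, then minimise the total child RSS.

    X is a list of rows (lists of numbers) and y the list of labels.
    Returns (attribute index, threshold, total RSS), or None if no
    sampled attribute has two distinct values to split on.
    """
    p = len(X[0])
    # Step (i): sample attribute indices without replacement.
    candidates = rng.sample(range(p), n_sampled)
    best = None
    for j in candidates:
        values = sorted(set(row[j] for row in X))
        # Assumption: thresholds are midpoints between distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] < t]
            right = [yi for row, yi in zip(X, y) if row[j] >= t]
            # Step (ii): sum of the two RSS at the two child nodes.
            total = rss(left) + rss(right)
            if best is None or total < best[2]:
                best = (j, t, total)
    return best


# Toy data: attribute 0 separates the labels cleanly, attribute 1 does not.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [10.0, 5.0], [11.0, 3.0], [12.0, 4.0]]
y = [1.0, 1.0, 1.0, 9.0, 9.0, 9.0]
split = best_split(X, y, n_sampled=2, rng=random.Random(0))
```

On the toy data both attributes are sampled, and the chosen split is on attribute 0 at threshold 6.5, where both child nodes have zero RSS.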
