Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Aug 29, 2024

1. Move code to a library Jupyter Notebooks are not good for managing code. They are best for visualization and quick iteration. So, we'll move

1. Move code to a library

Jupyter Notebooks are not good for managing code. They are best for visualization and quick iteration. So, we'll move useful code to a library. Later, we can import the library and call its methods from the notebooks. Useful things to move to the library:

1. Data preprocessing: load from csv, clean up data.

2. Dataset split.

3. Metrics and score calculation.

Steps:

1. Create a Python package foloder1 . It's just a subdirectory, with an empty file __init__.py in it.

2. Create features.py in the csc665 subdirectory and implement the following functions:

A. def train_test_split(X, y, test_size, shuffle, random_state=None) : X, y - features and the target variable. test_size - between 0 and 1 - how much to allocate to the test set; the rest goes to the train set. shuffle - if True, shuffle the dataset, otherwise not. random_state, integer; if None, then results are random, otherwise fixed to a given seed. Example: X_train, X_test, y_train, y_test = train_test_split(feat_df, y, 0.3, True, 12)

B. create_categories(df, list_columns) Converts values, in-place, in the columns passed in the list_columns to numerical values. Follow the same approach: "string" -> category -> code. Replace values in df, in-place.

C. X, y = preprocess_ver_1(csv_df) Apply the feature transformation steps to the dataframe, return new X and y for entire dataset. Do not modify the original csv_df . Remove all rows with NA values Convert datetime to a number Convert all strings to numbers. Split the dataframe into X and y and return these. 3. Create metrics.py : A. def mse(y_predicted, y_true) - return Mean-Squared Error. B. def rmse(y_predicted, y_true) - return Root Mean-Squared Error. C. def rsq(y_predicted, y_true) - return .

Copy the in-class notebook, and replace all relevant data processing and feature calculations with your own functions.

Evaluate the score as a function of the number of trees (i.e. n_estimators) used in the random forest. That is, set the number of trees to 1, 5, 10, 20, ..., 200, and measure on both the train and test sets. Plot both train and test scores as the function on the number of trees used in the random forest model. Analyze the result. Is it overfitting / underfitting? How overfitting / underfitting changes with the number of tre

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Oracle Database 11g SQL

Oracle Database 11g SQL

Authors: Jason Price

1st Edition

0071498508, 978-0071498500

More Books

Students also viewed these Databases questions

Question

★★★★★

You have been hired by the U. S. Department of Education to determine why graduation rates vary from 8% to 100% among U. S. colleges and universities. To facilitate your assignment, you are provided...

Answered: 1 week ago

Question

★★★★★

An investment project has annual cash inflows of $5,000, $3,300, $4,500, and $3,700, and a discount rate of 14 percent. What is the discounted payback period for these cash flows if the initial cost...

Answered: 1 week ago

Question

★★★★★

=+ (b) Suppose that 0 p ,, 1, and put , = min{p ,,, 1 -p}. Show that, if E ,, &, diverges, then no discrete probability space can contain independent events

Answered: 1 week ago

Question

★★★★★

Colwyn buys a 10% corporate bond with a current yield of 6%. When he sells the bond 1 year later, the current yield on the bond is 7%. How much did Col make on this investment?

Answered: 1 week ago

Question

★★★★★

Selected data from Gibson Company follow: Required a. Compute the accounts receivable turnover for Year 3. b. Compute the inventory turnover for Year 3. c. Compute the net margin for Year 2. Note:...

Answered: 1 week ago

Question

★★★★★

Disaggregating Return Measures for Analyzing Leverage Selected financial information from Syntex Corporation is reproduced below: 1. NOA turnover (average NOA equals ending NOA) is 2. 2. NOPAT margin...

Answered: 1 week ago

Question

★★★★★

54. The following are production and cost data for two products, A and B, produced in batches of 100 units. Contribution margin per batch Machine set-ups needed per batch Product A $450 Product B...

Answered: 1 week ago

Question

★★★★★

Manager T. C. Downs of Plum Engines, a producer of lawn mowers and leaf blowers, must develop an aggregate plan given the forecast for engine demand shown in the table. The department has a normal...

Answered: 1 week ago

Question

★★★★★

Question 1 (2 points) According to McGregor's X-Y theory, which of the following is a characteristic of Theory Y managers? O 1) They believe that most workers avoid responsibility. O 2) They believe...

Answered: 1 week ago

Question

★★★★★

Systems Analyst role is difficult and challenging. In addition, you likely have a greater appreciation for the significant contributions these professionals make to the results of the organization...

Answered: 1 week ago

Question

★★★★★

Points: 25 Graduation Party Stakeholder Management Plan Stakeholder Management Graduation Project 1. Identify & Analyze Stakeholders: Drawing on the charter you created, create a stakeholder register...

Answered: 1 week ago

Question

★★★★★

Describe how language reflects, builds on, and determines context?

Answered: 1 week ago

Question

★★★★★

Have you ever found yourself in a situation where you are entirely sure that you are using a term precisely (a good restaurant, a fun party, an affordable car) only to have someone wholeheartedly...

Answered: 1 week ago

Question

★★★★★

What connotative meanings does each of the following words have for you: religion, divorce, money, exercise, travel, dancing, parenthood? Why do you have the reaction you do to each word?

Answered: 1 week ago

Previous Question Next Question