
Question


I need help on the following question:

Q2 (Linear Regression) Write Python code in a Jupyter notebook that implements the gradient descent algorithm to train a linear regression model on the Boston housing dataset:

https://towardsdatascience.com/linear-regression-on-boston-housing-dataset-f409b7e4a155

Split the dataset into a training set (70% of samples) and a testing set (30% of samples). Print the root mean squared errors (RMSE) on the training and testing sets in the Jupyter notebook.

This is the code I have so far:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the Boston Housing dataset
boston_dataset = load_boston()

# Convert the dataset into a Pandas dataframe
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
boston['MEDV'] = boston_dataset.target

# Split the dataset into a training set (70% samples) and a testing set (30% samples)
train_set, test_set, train_target, test_target = train_test_split(
    boston.drop("MEDV", axis=1), boston["MEDV"], test_size=0.3, random_state=42)

# Add a column of ones to the dataset (bias term)
train_set = np.c_[np.ones((train_set.shape[0], 1)), train_set]
test_set = np.c_[np.ones((test_set.shape[0], 1)), test_set]

# Initialize the weights and learning rate
weights = np.zeros(train_set.shape[1])
alpha = 0.01

# Define the number of iterations for gradient descent
num_iters = 1000

# Compute the cost function
def compute_cost(X, y, weights):
    y_pred = X.dot(weights)
    cost = (1/2*m) * np.sum(np.power((y_pred - y), 2))
    return cost

# Perform gradient descent
m = train_set.shape[0]
for i in range(num_iters):
    y_pred = train_set.dot(weights)
    weights = weights - (alpha/m) * (train_set.T.dot(y_pred - train_target))

# Compute the root mean squared error (RMSE) on the training set
train_rmse = np.sqrt(mean_squared_error(train_target, train_set.dot(weights)))
print("RMSE on the training set: ", train_rmse)

# Compute the root mean squared error (RMSE) on the testing set
test_rmse = np.sqrt(mean_squared_error(test_target, test_set.dot(weights)))
print("RMSE on the testing set: ", test_rmse)

I'm getting the following error: "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')". How do I fix this? Please let me know, thank you!
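The most likely cause: the Boston features are on very different scales (TAX and B, for example, are in the hundreds), so with alpha = 0.01 the gradient descent updates diverge, the weights overflow to inf/NaN, and mean_squared_error then raises that ValueError. Below is a minimal sketch of one possible fix, not a verified solution: it standardizes the features before running the loop, and it also corrects the 1/(2*m) factor in compute_cost, which is a separate bug unrelated to the crash. Variable names follow the code above.

import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the data as in the original code
boston_dataset = load_boston()
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
boston['MEDV'] = boston_dataset.target

# 70/30 split, as required
train_set, test_set, train_target, test_target = train_test_split(
    boston.drop("MEDV", axis=1).values, boston["MEDV"].values,
    test_size=0.3, random_state=42)

# Standardize the features using training-set statistics only.
# Without this, the raw feature scales make the update steps grow,
# the weights overflow to inf/NaN, and the ValueError appears.
mu = train_set.mean(axis=0)
sigma = train_set.std(axis=0)
train_set = (train_set - mu) / sigma
test_set = (test_set - mu) / sigma

# Add the bias column after scaling
train_set = np.c_[np.ones((train_set.shape[0], 1)), train_set]
test_set = np.c_[np.ones((test_set.shape[0], 1)), test_set]

m = train_set.shape[0]
weights = np.zeros(train_set.shape[1])
alpha = 0.01
num_iters = 1000

def compute_cost(X, y, weights):
    # Note the 1/(2*m) factor; the original (1/2*m) multiplies by m instead
    y_pred = X.dot(weights)
    return (1 / (2 * m)) * np.sum((y_pred - y) ** 2)

for i in range(num_iters):
    y_pred = train_set.dot(weights)
    weights = weights - (alpha / m) * train_set.T.dot(y_pred - train_target)
    if i % 100 == 0:
        # The cost should decrease steadily; if it grows, the learning rate is too high
        print(f"iteration {i}: cost = {compute_cost(train_set, train_target, weights):.2f}")

train_rmse = np.sqrt(mean_squared_error(train_target, train_set.dot(weights)))
test_rmse = np.sqrt(mean_squared_error(test_target, test_set.dot(weights)))
print("RMSE on the training set:", train_rmse)
print("RMSE on the testing set:", test_rmse)

Standardizing keeps one learning rate effective for every feature; an alternative would be a much smaller learning rate on the raw features, which also avoids the overflow but converges far more slowly. One caveat: load_boston was removed from recent scikit-learn releases (1.2 and later), so on a newer install the dataset has to be loaded from another source before this sketch applies.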
