Question

Hello, below is my code for a multivariate linear regression model. I am including a graph with the data points from the dataset and the line that the code plots, shown as a screenshot of what the graph looks like when the code runs. Is there any way to clean up the graph so that it looks more like a regression line? Thank you.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import operator

# Load the diabetes dataset
diabetes = load_diabetes()
data = diabetes['data']
target = diabetes['target']

# Shuffle the dataset (one shared permutation keeps each row paired with its target)
indices = np.random.permutation(data.shape[0])
data, target = data[indices], target[indices]

# Split the data into train, dev, and test sets
train_data = data[:int(0.7 * data.shape[0])]
train_target = target[:int(0.7 * target.shape[0])]
dev_data = data[int(0.7 * data.shape[0]):int(0.85 * data.shape[0])]
dev_target = target[int(0.7 * target.shape[0]):int(0.85 * target.shape[0])]
test_data = data[int(0.85 * data.shape[0]):]
test_target = target[int(0.85 * target.shape[0]):]

# Add a column of ones for the bias term
train_data = np.hstack([np.ones((train_data.shape[0], 1)), train_data])
dev_data = np.hstack([np.ones((dev_data.shape[0], 1)), dev_data])
test_data = np.hstack([np.ones((test_data.shape[0], 1)), test_data])

# Create polynomial features
poly = PolynomialFeatures(degree=2)
train_data_poly = poly.fit_transform(train_data)
test_data_poly = poly.transform(test_data)

model = LinearRegression()  # create the model
model.fit(train_data_poly, train_target)  # fit the model

# Make predictions on the test set
test_predictions = model.predict(test_data_poly)

# Plot the points from the model
plt.scatter(test_target, test_predictions)

# sort the values of test_target before plotting the line
sort_axis = operator.itemgetter(0)
sorted_indices = np.argsort(train_data_poly[:, 0])
train_data_poly = train_data_poly[sorted_indices]
train_target = train_target[sorted_indices]
sorted_zip = zip(train_data_poly, train_target)
train_data_poly, train_target = zip(*sorted_zip)

plt.plot(test_target, test_predictions, color='r')
plt.xlabel("True Values")
plt.ylabel("Predictions")
plt.show()

[Image: screenshot of the plot produced by the code above, not transcribed]
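One likely cause of the messy line is that `plt.plot(test_target, test_predictions)` joins the test points in row order, so the red line zigzags between them, and the sorting step only reorders the training arrays, which are never plotted. Because the model uses many features, there is also no single regression line to draw against one axis. A common alternative is a predicted-versus-actual plot with a `y = x` reference line. The following is a minimal sketch under stated assumptions, not a definitive fix: `plot_predicted_vs_actual` is a hypothetical helper name, `train_test_split` and `make_pipeline` are swapped in for the manual split and the hand-added column of ones, and `random_state=0` is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def plot_predicted_vs_actual(y_true, y_pred, ax=None):
    """Hypothetical helper: scatter predictions against true values
    and draw the y = x reference line."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(y_true, y_pred, alpha=0.6, label="Test points")
    # A straight line needs only two endpoints, so no sorting is required.
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax.plot([lo, hi], [lo, hi], color="r", linestyle="--",
            label="Perfect prediction (y = x)")
    ax.set_xlabel("True Values")
    ax.set_ylabel("Predictions")
    ax.legend()
    return ax


# train_test_split shuffles features and targets together, keeping rows aligned.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# PolynomialFeatures already adds a bias column and LinearRegression fits an
# intercept, so no manual column of ones is added here.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_train, y_train)
test_predictions = model.predict(X_test)

ax = plot_predicted_vs_actual(y_test, test_predictions)
```

Call `plt.show()` afterwards to display the figure. Points close to the dashed line are well predicted, and the vertical distance from the line is the prediction error for that sample.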
