Question:
Implement logistic regression using NumPy. Write your code in the provided logistic_regression.py file. This Python file takes two CSV files as inputs: a training file and a testing file. For example, in the following usage, log_training.csv is the training input and log_testing.csv is the testing input.
$ python3 logistic_regression.py log_training.csv log_testing.csv
Output the accuracy score of the predictions.
The provided logistic_regression.py file is as follows:
import numpy as np
import pandas as pd
import math
import sys
import os

# todo: define necessary functions

def logistic_regression(xtrain, ytrain, xtest, ytest):
    """
    return: Accuracy value
    """
    # todo: fill code here
    return -1

# do not modify this function
def load_data():
    train_filename = sys.argv[1]
    test_filename = sys.argv[2]
    train_feature_matrix = pd.read_csv(train_filename)
    test_feature_matrix = pd.read_csv(test_filename)
    train_feature_matrix = train_feature_matrix.dropna()
    test_feature_matrix = test_feature_matrix.dropna()
    X_TRAIN = train_feature_matrix.iloc[:, :len(train_feature_matrix.columns)-1]
    Y_TRAIN = train_feature_matrix.iloc[:,-1]
    X_TEST = test_feature_matrix.iloc[:, :len(test_feature_matrix.columns)-1]
    Y_TEST = test_feature_matrix.iloc[:,-1]
    return X_TRAIN, Y_TRAIN, X_TEST, Y_TEST

if __name__=="__main__":
    xtrain, ytrain, xtest, ytest = load_data()
    ACCURACY_SCORE = logistic_regression(xtrain, ytrain, xtest, ytest)
    print("ACCURACY score is : ", ACCURACY_SCORE)
- log_testing.csv has the following format:
male,age,education,currentSmoker,cigsPerDay,BPMeds,prevalentStroke,prevalentHyp,diabetes,totChol,sysBP,diaBP,BMI,heartRate,glucose,TenYearCHD
(1,54,3.0,0,0.0,0.0,0,1,0,258.0,146.0,98.5,26.05,60.0,68.0,0)
(0,61,1.0,0,0.0,,0,0,0,218.0,148.0,80.0,37.04,82.0,78.0,0)
There are 2120 lines of data structured this way.
- log_training.csv has the following format:
male,age,education,currentSmoker,cigsPerDay,BPMeds,prevalentStroke,prevalentHyp,diabetes,totChol,sysBP,diaBP,BMI,heartRate,glucose,TenYearCHD
(1,39,4.0,0,0.0,0.0,0,0,0,195.0,106.0,70.0,26.97,80.0,77.0,0)
(0,46,2.0,0,0.0,0.0,0,0,0,250.0,121.0,81.0,28.73,95.0,76.0,0)
There are 2120 lines of data structured this way.
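To make the data layout concrete, here is a small sketch of what load_data() does to rows like the ones quoted above. It assumes the parentheses around each row are only the poster's notation and are not present in the actual CSV files; the names sample, df, clean, features and labels are made up for this illustration. Note that the second test row has an empty BPMeds field, so dropna() discards it.

```python
import io

import pandas as pd

# Header plus the two sample rows quoted for log_testing.csv.
# Assumption: the real file has no parentheses around each row.
sample = io.StringIO(
    "male,age,education,currentSmoker,cigsPerDay,BPMeds,prevalentStroke,"
    "prevalentHyp,diabetes,totChol,sysBP,diaBP,BMI,heartRate,glucose,TenYearCHD\n"
    "1,54,3.0,0,0.0,0.0,0,1,0,258.0,146.0,98.5,26.05,60.0,68.0,0\n"
    "0,61,1.0,0,0.0,,0,0,0,218.0,148.0,80.0,37.04,82.0,78.0,0\n"
)

df = pd.read_csv(sample)

# Same cleaning step as load_data(): drop any row with a missing value.
# The second row has an empty BPMeds field, so it is removed here.
clean = df.dropna()

# Same split as load_data(): every column but the last is a feature,
# and the last column (TenYearCHD) is the label.
features = clean.iloc[:, :-1]
labels = clean.iloc[:, -1]

print("rows before/after dropna:", len(df), len(clean))
print("feature columns:", features.shape[1], "| label column:", labels.name)
```

Because of this cleaning step, the number of rows that reach logistic_regression() can be smaller than the number of lines in each file.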
I have attached an image of the log_training.csv file.
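One possible way to fill in logistic_regression() is sketched below. It is not the official solution, only a minimal version built on assumptions the question does not state: batch gradient descent on the mean cross-entropy loss, features standardized with training-set statistics (columns such as totChol and male are on very different scales), a bias term added as a column of ones, and a 0.5 decision threshold. The helper sigmoid and the keyword arguments lr and n_iter are hypothetical additions with arbitrarily chosen defaults; the four positional arguments match the provided template, so the existing call in the main block still works.

```python
import numpy as np


def sigmoid(z):
    # Clip the input so np.exp does not overflow for large-magnitude values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def logistic_regression(xtrain, ytrain, xtest, ytest, lr=0.1, n_iter=2000):
    """
    return: Accuracy value
    """
    # Accept pandas objects or plain arrays and work with float NumPy arrays.
    X = np.asarray(xtrain, dtype=float)
    y = np.asarray(ytrain, dtype=float)
    Xt = np.asarray(xtest, dtype=float)
    yt = np.asarray(ytest, dtype=float)

    # Assumed preprocessing: standardize using training statistics only,
    # guarding against constant columns (zero standard deviation).
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    X = (X - mean) / std
    Xt = (Xt - mean) / std

    # Prepend a column of ones so the first weight acts as the bias.
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    Xt = np.hstack([np.ones((Xt.shape[0], 1)), Xt])

    # Batch gradient descent on the mean cross-entropy loss.
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)               # predicted probabilities
        grad = X.T @ (p - y) / len(y)    # gradient of the mean loss
        w -= lr * grad

    # Classify test rows at the 0.5 threshold and report the fraction correct.
    pred = (sigmoid(Xt @ w) >= 0.5).astype(float)
    return float(np.mean(pred == yt))
```

The learning rate and iteration count may need tuning on the actual data. Since the sample rows suggest that TenYearCHD is mostly 0, accuracy alone can look high even for a model that rarely predicts 1, so it is worth comparing the result against the share of the majority class.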