
Question

I would like some modifications to this code, with explanations.
(I use logistic regression; dataset: Female Diabetes.)
- The accuracy when training the model on all features is approximately 0.8116883116883117. I tried to improve the model's performance by deleting outliers and by feature selection, but the accuracy was either lower or unchanged. I do not know whether I identified the most influential features correctly.
- I want to improve the model in any way, with a diagram that shows... and a detailed explanation of the steps.
- Also, one of the lab requirements is to present interesting information that I found while performing the lab. What can I provide?
- Please be creative and show me your best; creativity is also one of the lab requirements. How can I achieve it with this model?
*I have attached the requirements and part of the data.*
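On the question of whether the most influential features were identified correctly, one common check for logistic regression is to standardize the features and then rank them by the absolute value of the fitted coefficients. The sketch below illustrates this on synthetic data generated with `make_classification`, since the real CSV is not available here; the `feature_1` ... `feature_8` names are hypothetical placeholders, not the dataset's columns.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 768 rows, 8 features.
X_arr, y = make_classification(
    n_samples=768, n_features=8, n_informative=4, n_redundant=0, random_state=42
)
feature_names = [f"feature_{i}" for i in range(1, 9)]  # hypothetical names
X = pd.DataFrame(X_arr, columns=feature_names)

# Standardizing first puts all coefficients on a comparable scale.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

# One coefficient per feature for a binary problem; rank by magnitude.
coefs = pd.Series(pipe.named_steps["logisticregression"].coef_[0], index=feature_names)
ranking = coefs.abs().sort_values(ascending=False)
print(ranking)
```

The ranking is only a rough guide: correlated features share weight between them, so a small coefficient does not by itself prove that a feature is uninformative.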
-------------------------------------------------
-CODE:
import pandas as pd
# Assuming the dataset is in CSV format
df = pd.read_csv('/path/to/your/female_diabetes.csv')
print(df.head()) # Display the first few rows
print(df.info()) # Get info on data types and non-null counts
print(df.describe()) # Summary statistics
print(df.isnull().sum())
df.fillna(df.mean(), inplace=True)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
features = df.iloc[:, :-1] # All columns except the last one
scaled_features = scaler.fit_transform(features)
df.iloc[:, :-1]= scaled_features
from sklearn.model_selection import train_test_split
X = df.iloc[:, :-1] # Features
y = df.iloc[:,-1] # Labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, roc_auc_score, roc_curve, auc
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:,1]
print("Logistic Regression Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
import matplotlib.pyplot as plt
import seaborn as sns
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix - Logistic Regression')
plt.show()
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area =%0.2f)'% roc_auc)
plt.plot([0,1],[0,1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0,1.0])
plt.ylim([0.0,1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic - Logistic Regression')
plt.legend(loc="lower right")
plt.show()
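Two details of the code above may explain why the accuracy barely moves between experiments. First, `StandardScaler` is fitted on all rows before `train_test_split`, so the test rows influence the scaling. Second, with 768 records and `test_size=0.2` the test set has 154 rows, so 0.8116883116883117 corresponds to 125 correct predictions, and a single extra correct prediction changes the accuracy by only about 0.0065. A possible alternative, sketched here on synthetic data from `make_classification` (an assumption standing in for the real CSV), is to wrap scaling and the model in a pipeline and score it with stratified 5-fold cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 768 rows, 8 features.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=4, n_redundant=0, random_state=42
)

# The scaler is refitted inside each training fold, so held-out rows
# never influence the scaling statistics.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

acc = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
auc_scores = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc")

print("Accuracy per fold:", acc.round(3), "mean:", acc.mean().round(3))
print("ROC AUC per fold:", auc_scores.round(3), "mean:", auc_scores.mean().round(3))
```

Comparing the mean and the spread across folds gives a steadier basis for judging whether outlier removal or feature selection helped than a single 80/20 split does.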
----------------------------------------------------------------------------------------------
Dataset (0):
The Female Diabetes dataset contains 768 records. Each record has 8 features and one label, as described below:
1- Number of times pregnant
2- Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3- Diastolic blood pressure (mm Hg)
4- Triceps skin fold thickness (mm)
5- 2-Hour serum insulin (mu U/ml)
6- Body mass index (weight in kg/(height in m)^2)
7- Diabetes pedigree function
8- Age (years)
9- Class variable (0 = has diabetes, 1 = does not have diabetes)
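A dataset with this shape and these eight features matches the widely used Pima Indians Diabetes data, in which missing measurements are commonly recorded as 0 rather than left blank. If that is the case here (an assumption worth checking against the actual file), `df.isnull().sum()` would report no missing values even though zeros in glucose, blood pressure, skin fold thickness, insulin, and BMI are physiologically implausible. The sketch below uses a tiny made-up sample with hypothetical column names to show one way to count such zeros and replace them with the column median.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample; the column names are hypothetical placeholders.
df = pd.DataFrame({
    "glucose": [148, 85, 0, 89, 137],
    "blood_pressure": [72, 66, 64, 0, 40],
    "insulin": [0, 0, 0, 94, 168],
    "bmi": [33.6, 26.6, 23.3, 28.1, 0.0],
})

# Columns where a value of 0 cannot be a real measurement.
cols = ["glucose", "blood_pressure", "insulin", "bmi"]

# Count the zeros per column before changing anything.
zero_counts = (df[cols] == 0).sum()
print(zero_counts)

# Treat zeros as missing, then fill each column with its own median.
cleaned = df[cols].replace(0, np.nan)
cleaned = cleaned.fillna(cleaned.median())
print(cleaned)
```

In a real workflow the medians would be computed from the training rows only. The zero counts themselves could also serve as one of the "interesting findings" for the lab report.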