
Question


Help with Exercise 2

Exercise 1 for Reference:

Exercise 1: a) Use the machine learning algorithms k-NN and Naïve Bayes to classify multiphase flow patterns, using the database BDOShohamIML.csv, and evaluate the performance. b) Apply parameter optimization to (a) and evaluate the performance. c) Explain the confusion matrix and metrics obtained in (a) and (b), that is, before and after parameter optimization.
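A note on part (a): the transcribed solution further down evaluates each classifier on the same data it was trained on, which inflates the accuracy. A minimal sketch of a fairer evaluation with a held-out test split is below; since BDOShohamIML.csv is not available here, synthetic data stands in for it, and all sizes and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic stand-in for BDOShohamIML.csv: features X, flow-pattern labels y
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Hold out 30% as a test set so accuracy reflects generalization,
# not memorization of the training samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print('Test accuracy:', accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
```

The `stratify=y` argument keeps the class proportions the same in both splits, which matters when flow-pattern classes are imbalanced.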

Code for Exercise 1:

In [14]:
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

data = pd.read_csv('Data_Glioblastoma5Patients_SC.csv')
print('Shape:', data.shape)
data.head()

Shape: (430, 5949)
Out[14]: (first rows of the table: columns A2M, AAAS, AAK1, AAMP, AARS, AARSD1, AASDH, AASDHPPT, ...; 5 rows x 5949 columns)

In [22]:
# k-NN -- part a)
clf1 = KNeighborsClassifier(n_neighbors=3).fit(
    data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
y_pred = clf1.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
confusion_matrix(data.iloc[:, -1:].values, y_pred)

Accuracy score: 0.9506607929515418
Out[22]: a 6x6 confusion matrix (partially garbled in the transcription); its diagonal entries 973, 121, 550, 2768, 136, and 847 dominate, consistent with the 0.95 accuracy.

In [15]:
# k-NN -- part b) parameter grid
leaf_size = list(range(1, 10))
n_neighbors = list(range(1, 5))

In [ ]:
hyperparameters = dict(leaf_size=leaf_size, n_neighbors=n_neighbors)
clf2 = KNeighborsClassifier()
clf3 = GridSearchCV(clf2, hyperparameters, cv=5)
best_model = clf3.fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
print('Best leaf_size:', best_model.best_estimator_.get_params()['leaf_size'])
print('Best n_neighbors:', best_model.best_estimator_.get_params()['n_neighbors'])

In [ ]:
# k-NN with the optimized parameters
clf4 = KNeighborsClassifier(n_neighbors=3, leaf_size=1).fit(
    data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
y_pred = clf4.predict(data.iloc[:, :-1].values)  # fixed: the transcript called clf1.predict here
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
print('Confusion Matrix:')
confusion_matrix(data.iloc[:, -1:].values, y_pred)

In [26]:
# Naive Bayes -- part a)
clf1 = GaussianNB().fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
y_pred = clf1.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
confusion_matrix(data.iloc[:, -1:].values, y_pred)

Accuracy score: 0.6754185022026432
Out[26]:
array([[ 879,    0,    0,  143,    1,   10],
       [   0,  121,    0,    4,    0,    0],
       [   1,    3,  471,  115,    4,    0],
       [ 124,   53,  192, 2228,  240,   68],
       [   0,    0,    0,   11,  129,    0],
       [ 307,    0,    9,  488,   69,    5]], dtype=int64)

In [27]:
# Naive Bayes -- part b)
hyperparameters = {'var_smoothing': np.logspace(0, -9, num=100)}
clf2 = GaussianNB()
clf3 = GridSearchCV(clf2, hyperparameters, cv=5)
best_model = clf3.fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
print('Best var_smoothing:', best_model.best_estimator_.get_params()['var_smoothing'])

Best var_smoothing: 1.873817422860387e-09

In [20]:
# Naive Bayes with the optimized parameter
# (note: the value below is what the transcript shows; it differs from the
# best var_smoothing reported by the grid search above)
clf4 = GaussianNB(var_smoothing=1.2328467394420635e-09).fit(
    data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
y_pred = clf4.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
print('Confusion Matrix:')
confusion_matrix(data.iloc[:, -1:].values, y_pred)

Accuracy score: 0.6755947136563877
Confusion Matrix:
Out[20]:
array([[ 879,    0,    0,  143,    1,   10],
       [   0,  121,    0,    4,    0,    0],
       [   1,    2,  471,  116,    4,    0],
       [ 124,   52,  192, 2229,  240,   68],
       [   0,    0,    0,   11,  129,    0],
       [ 307,    0,    9,  488,   69,    5]], dtype=int64)

c) Explain the confusion matrices and metrics before and after (a) and (b)

The accuracy of k-NN was identical before and after parameter optimization, a high 0.95. The accuracy of Naive Bayes was also very close before and after parameter optimization, around 0.68 in both cases. On this dataset, then, k-NN yields substantially more accurate predictions than Naive Bayes, and parameter tuning barely changes either model's performance. The Naive Bayes confusion matrices show where its errors concentrate: the large off-diagonal entries in the fourth column (143, 115, 488, ...) mean many samples of other classes are misclassified into the fourth class, while the last row's small diagonal entry (5) shows the sixth class is almost never recognized.
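The per-class metrics discussed in (c) can be read directly off a confusion matrix. A small self-contained sketch with hypothetical 3-class labels (not taken from the exercise's data) shows how precision and recall relate to the matrix's columns and rows:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical true vs. predicted labels for a 3-class problem
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 1, 2, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred)
# Row i, column j counts samples of true class i predicted as class j;
# the diagonal holds the correct predictions.
print(cm)  # [[2 1 0] [0 3 0] [1 0 3]]

# Per-class precision = diagonal / column sums,
# per-class recall    = diagonal / row sums
precision = np.diag(cm) / cm.sum(axis=0)  # [0.667 0.75  1.0 ]
recall = np.diag(cm) / cm.sum(axis=1)     # [0.667 1.0   0.75]
print('precision:', precision)
print('recall:   ', recall)

# classification_report computes the same metrics, plus F1, per class
print(classification_report(y_true, y_pred))
```

Applied to the Naive Bayes matrices above, this is how one would quantify, for example, how poorly the last class is recalled despite the overall 0.68 accuracy.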


