Question
This has to be solved using Python in a Jupyter Notebook.
For this assignment you are going to work with the data that was collected and made available by the National Institute of Diabetes and Digestive and Kidney Diseases as part of the Pima Indians Diabetes Database. All patients in this dataset are women of Pima Indian heritage, aged 21 or older.
The following features have been provided to help predict whether a person is diabetic or not:
Pregnancies: Number of times pregnant
Glucose: Plasma glucose concentration over 2 hours in an oral glucose tolerance test
BloodPressure: Diastolic blood pressure (mm Hg)
SkinThickness: Triceps skin fold thickness (mm)
Insulin: 2-hour serum insulin (μU/mL)
BMI: Body mass index (weight in kg / (height in m)^2)
DiabetesPedigreeFunction: Diabetes pedigree function (a function that scores the likelihood of diabetes based on family history)
Age: Age (years)
Outcome: Class variable (0 if non-diabetic, 1 if diabetic)
Build both Decision Tree and kNN models to gain insights about which features can help us predict diabetes. Unlike previous assignments, this one is open-ended. You need to dig into the data to gain insights. The data is clean, so you don't need to worry about data cleaning issues except maybe checking for outliers. Before building your models, explore and visualize the data. Make sure to highlight interesting things that you find during your analysis.
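One possible starting point for the modeling step is sketched below. It is only a sketch under stated assumptions: it uses pandas and scikit-learn, the helper names (`make_placeholder_frame`, `fit_models`) are hypothetical, and the data frame is randomly generated placeholder data that merely shares the assignment's column names, so the printed numbers say nothing about the real dataset. The settings `max_depth=3` and `n_neighbors=5` are arbitrary choices to tune, and the features are standardized before kNN because it is distance-based.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
            "Insulin", "BMI", "DiabetesPedigreeFunction", "Age"]


def make_placeholder_frame(n_rows=200, seed=0):
    """Random stand-in with the assignment's column names (NOT real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Pregnancies": rng.integers(0, 14, n_rows),
        "Glucose": rng.integers(70, 200, n_rows),
        "BloodPressure": rng.integers(40, 110, n_rows),
        "SkinThickness": rng.integers(10, 50, n_rows),
        "Insulin": rng.integers(15, 300, n_rows),
        "BMI": rng.uniform(18.0, 45.0, n_rows),
        "DiabetesPedigreeFunction": rng.uniform(0.1, 2.0, n_rows),
        "Age": rng.integers(21, 80, n_rows),
    })
    # Tie the placeholder label loosely to Glucose so the models can learn something.
    noisy_glucose = df["Glucose"] + rng.normal(0, 20, n_rows)
    df["Outcome"] = (noisy_glucose > 140).astype(int)
    return df


def fit_models(df, seed=0):
    """Fit a decision tree and a scaled kNN; return test accuracies and tree importances."""
    X, y = df[FEATURES], df["Outcome"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    tree = DecisionTreeClassifier(max_depth=3, random_state=seed)
    tree.fit(X_train, y_train)

    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
    knn.fit(X_train, y_train)

    scores = {"tree": tree.score(X_test, y_test),
              "knn": knn.score(X_test, y_test)}
    importances = pd.Series(tree.feature_importances_, index=FEATURES)
    return scores, importances.sort_values(ascending=False)


# In the notebook, load the real file instead of the placeholder, e.g.
# df = pd.read_csv("diabetes.csv")  # hypothetical filename
df = make_placeholder_frame()
scores, importances = fit_models(df)
print(scores)
print(importances)
```

The tree's feature importances give a first hint about which features matter; kNN has no built-in importances, so comparing its accuracy with and without a given feature is one way to get a similar signal.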
[Spreadsheet screenshot: the first rows of the dataset (roughly 48 records), with columns Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, and Outcome. Several entries in BloodPressure, SkinThickness, Insulin, and BMI are 0. The individual values could not be recovered reliably from the image.]
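For the outlier check, zeros in measurements such as blood pressure, skin thickness, insulin, or BMI are worth flagging, since a value of 0 is not physiologically plausible there. A minimal sketch of two checks follows, assuming pandas; the helper names, the list of suspect columns, and the 1.5 × IQR rule are assumptions chosen for illustration, and the small frame holds made-up values, not rows from the assignment's file.

```python
import pandas as pd

# Assumed list of columns where 0 most likely means "not measured".
ZERO_SUSPECT = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def zero_counts(df, columns=ZERO_SUSPECT):
    """Count how many entries equal 0 in each suspect column."""
    return (df[columns] == 0).sum()


def iqr_outlier_counts(df, columns, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] in each column."""
    q1 = df[columns].quantile(0.25)
    q3 = df[columns].quantile(0.75)
    iqr = q3 - q1
    outside = (df[columns] < q1 - k * iqr) | (df[columns] > q3 + k * iqr)
    return outside.sum()


# Made-up illustrative values only.
toy = pd.DataFrame({
    "Glucose": [120, 95, 0, 150, 110, 130],
    "BloodPressure": [70, 0, 64, 80, 72, 0],
    "SkinThickness": [30, 0, 0, 25, 35, 0],
    "Insulin": [0, 0, 90, 0, 150, 0],
    "BMI": [31.0, 26.5, 0.0, 28.0, 40.0, 25.5],
})

print(zero_counts(toy))
print(iqr_outlier_counts(toy, ["Glucose", "BMI"]))
```

How to treat flagged values (leave them, drop the rows, or replace them with a median) is an analysis decision to justify in the notebook; kNN in particular is sensitive to such values because it relies on distances.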