
Question

Please fill the missing values in the balance column with a group mean. The dataset has two kinds of columns. To decide whether a categorical column is suitable for grouping, study the group means of balance for that column; if the group means differ substantially, the column is a good candidate. To decide whether a numerical column is suitable, examine the density plots we have created and, based on their patterns, select the column whose distribution is similar to that of the balance column. If the column finally selected is numerical, create bins first, then use the bin mean to fill the missing values. Please make this decision yourself after checking all candidates, and explain your decision.

region age income long_Month longten internet balance class
2 44 64 3.7 37.45 0 2.014903021 1
3 33 136 4.4 42 0 2.724579503 1
3 52 116 18.15 1300.6 0 3.409496184 0
2 39 78 11.8 487.4 0 2.602689685 0
3 22 19 10.9 504.5 1 2.1690537 1
2 35 76 6.05 239.55 1 3.146305132 0
3 59 166 9.75 449.05 0 2.48490665 0
1 41 72 24.15 1659.7 0 2.803360381 0
2 33 125 4.85 17.25 1 NaN 1
3 35 80 7.1 47.45 0 3.16758253 0
1 38 37 8.55 308.7 0 3.731699451 0
1 38 75 5.1 146.25 1 2.420368129 0
3 57 162 16.15 946.9 0 3.401197382 0
1 29 77 6.7 140.95 1 3.188416617 1
3 30 16 3.75 25.65 1 NaN 1
1 52 120 20.7 1391.05 0 3.091042453 0
3 33 101 5.3 253.35 1 3.286534473 0
3 48 67 15.05 810.45 0 3.305053521 0
3 43 36 12.5 153.75 0 2.890371758 0
2 21 33 2.2 2.2 0 3.701301974 0
2 40 37 8.25 399.15 1 3.33220451 0
1 37 36 10.6 582.6 1 2.90416508 0
1 53 155 21 1519.2 1 3.526360525 0
1 50 140 6.5 247.55 0 3.555348061 0
1 27 55 4.8 54.1 1 NaN 0
2 46 163 33.9 1947.95 1 2.621038824 0
3 35 52 4.25 82.7 1 NaN 1
2 60 211 21.15 1228.7 1 3.993602992 0
1 57 186 9.8 428.25 0 3.583518938 0
1 41 39 6.55 67.8 0 2.983153491 1
2 57 22 41.75 3043.05 0 2.931193752 0
3 41 30 2.5 31.25 0 3.984343667 0
2 28 29 4.25 78 0 2.876385516 0
1 43 76 14.7 897.05 0 2.397895273 0
1 41 74 14.5 963.3 1 2.833213344 0
1 51 63 12.85 585.6 1 2.656756907 0
3 41 36 7.75 361 1 2.917770732 1
3 34 33 2.95 18.9 0 NaN 1
1 36 29 3.25 16.8 1 2.740840024 1
2 34 27 6.3 150.9 1 NaN 1
1 52 49 24.75 1349.05 0 3.102342009 0
3 22 24 7.8 63 1 3.113515309 0
1 26 26 4.85 33.7 0 NaN 0
1 27 47 6.25 330.4 0 2.656756907 0
1 62 27 15.5 967.1 0 3.540959324 0
2 52 30 10.4 447.85 0 2.944438979 0
2 40 127 19.7 909.9 1 2.691243083 0
2 39 137 12.7 676.9 0 2.442347035 1
2 50 80 28.8 1558.1 0 3.056356895 0
3 55 30 10.25 395.85 0 2.756840365 0
2 51 438 29 1815.4 1 3.208825489 0
2 39 79 15.25 653.7 1 3.305053521 1
3 47 63 9.05 373.65 0 1.909542505 0
1 67 51 57.05 4168.25 0 2.957511061 0
3 43 61 3.15 7.9 1 2.788092909 1
1 57 22 8.55 381.5 0 2.983153491 0
2 48 91 24.5 1531.9 0 2.525728644 0
1 68 244 30.25 2186.2 0 3.425889994 0
1 42 80 7.85 196.65 1 2.351375257 0
3 34 83 8.9 379.55 1 1.749199855 0
2 31 21 1.65 3.2 0 NaN 1
3 48 24 11.85 230.7 0 3.555348061 1
2 53 351 5.5 185.3 0 2.957511061 0
2 52 169 14.65 618.15 1 2.656756907 0
2 54 50 17.25 1067.8 1 4.471638793 1
2 35 161 3.4 10.35 1 NaN 1
1 47 212 7.45 320.9 0 3.208825489 0
3 61 53 12.25 631.7 1 2.1690537 0
3 33 73 5.85 216.6 1 3.102342009 0
3 20 17 3.75 12.3 0 NaN 1
2 33 23 10.2 196.3 1 2.674148649 1
3 36 107 11.55 491.55 0 3.068052935 1
2 25 21 8.65 426.6 0 2.197224577 1
3 58 83 19.3 1323.2 0 3.124565145 0
1 20 17 3.05 40.3 1 2.63905733 0
3 25 76 24.05 1536.55 0 3.056356895 0
2 24 19 4 46 0 3.305053521 1
1 61 41 9.6 353.55 0 2.251291799 0
3 39 105 6.6 159.7 0 3.617651945 0
1 54 31 5.85 97 1 NaN 0
2 40 41 25.25 1725.1 0 3.496507561 0
3 50 102 12.6 760.3 1 2.525728644 0
2 22 25 12.05 666 1 1.871802177 0
1 42 68 17.3 997.85 0 2.957511061 0
1 55 79 13.8 668.65 0 3.238678452 0
3 31 28 3.45 13.3 1 3.157000421 1
2 48 64 14.8 708.45 0 2.833213344 0
2 36 38 10.15 538.65 0 2.876385516 0
3 23 37 5.35 144.05 1 2.197224577 0
2 64 98 26.15 1805.1 0 2.420368129 0
1 52 195 23.15 1420.95 0 3.583518938 1
2 35 47 5.2 46.9 0 NaN 1
1 47 65 12.4 744.45 0 3.597312261 0
1 50 150 19.3 930.8 1 3.926911618 1
2 39 106 9.05 568.15 0 2.505525937 0
2 43 33 4.25 30 0 2.788092909 0
3 25 38 13.9 255 0 1.32175584 0
3 32 125 3.8 31.75 1 2.944438979 1
1 37 145 8.6 241.45 1 2.876385516 0
1 44 99 22.05 841.55 0 2.277267285 0
1 34 22 4.85 75.45 1 3.548179572 0
2 60 31 32.25 2349.25 0 3.511545439 0
1 36 25 1.9 3.6 0 1.609437912 0
3 25 57 20.25 923.95 1 2.277267285 0
2 51 41 12.1 492.8 1 2.277267285 0
3 25 20 8.15 209.35 1 NaN 1
3 43 101 13.65 817.65 0 3.032546247 0
1 37 56 10.25 681.95 0 2.374905755 0
2 27 22 17.2 893.3 0 2.583997552 0
1 37 108 21.8 1292 1 2.079441542 0
3 48 115 12.8 785.7 1 2.847812143 0
3 54 53 8.1 225.4 1 4.034240638 0
3 53 242 8.95 396.35 0 3.349904087 0
1 24 20 3.35 7.55 1 2.224623552 0
2 47 123 2.5 9.25 1 2.621038824 0
3 44 47 7.05 182.3 0 2.674148649 0
3 37 48 12.9 669.75 0 2.1690537 0
2 64 13 14.6 921.7 0 2.756840365 0
2 52 78 18.6 1179.05 0 0
3 26 34 6.5 198.7 0 2.772588722 0
1 46 81 17.45 1067 1 2.327277706 0
1 45 86 45.05 3235.9 1 3.555348061 0
3 27 26 12.2 477.5 1 2.224623552 1
3 34 51 4.7 61.5 0 2.803360381 0
2 56 57 6.2 135.2 1 2.351375257 0
1 48 41 4.35 59.15 0 2.079441542 0
2 46 168 14.2 928.8 1 3.657130756 0
2 33 29 4.65 35.45 1 NaN 1
3 58 167 24.3 1642.35 0 2.847812143 0
3 24 29 4.45 43.7 1 NaN 1
1 55 104 15 960.95 0 3.669951444 0
3 22 46 9.35 311.85 1 1.791759469 1
3 26 46 11 350.05 1 1.791759469 0
1 33 51 2.9 12.05 0 NaN 1
2 42 65 14.2 1007.5 1 3.449987546 0
2 33 48 3.15 39.75 0 3.417726684 0
3 34 46 7.1 125.05 1 NaN 0
2 63 43 14.85 808.95 1 2.397895273 0
3 53 110 17.45 1249.8 0 2.691243083 0
1 39 26 14.45 349.9 1 3.135494216 1
3 28 34 15 934.05 0 2.862200881 0
2 54 21 10.6 565.55 0 2.833213344 0
3 33 38 5.05 48.6 1 2.525728644 0
1 21 19 7.9 98.35 1 1.609437912 1
1 39 40 36.25 2553.7 0 3.650658241 0
3 69 58 6.35 118.05 1 2.463853241 0
2 65 128 21.2 1325.05 0 2.674148649 0
3 66 460 89.4 6353.9 1 2.931193752 0

region, category, and class are categorical data, and the rest are numerical data.
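
A possible starting point for the categorical check, as a minimal sketch rather than a full answer: compute the mean of balance within each group of each categorical column and compare how far apart the group means are. The sketch uses only the first few rows of the table above, so its numbers say nothing about the full dataset. It also assumes that the third categorical column is `internet`, because the table has no column named "category"; `categorical`, `group_means`, and `spread` are names made up for illustration.

```python
import io
import pandas as pd

# A handful of rows copied from the table above, for illustration only.
# The full dataset would be loaded the same way.
raw = """region age income long_Month longten internet balance class
2 44 64 3.7 37.45 0 2.014903021 1
3 33 136 4.4 42 0 2.724579503 1
3 52 116 18.15 1300.6 0 3.409496184 0
2 39 78 11.8 487.4 0 2.602689685 0
3 22 19 10.9 504.5 1 2.1690537 1
2 35 76 6.05 239.55 1 3.146305132 0
1 41 72 24.15 1659.7 0 2.803360381 0
2 33 125 4.85 17.25 1 NaN 1
1 38 37 8.55 308.7 0 3.731699451 0
3 30 16 3.75 25.65 1 NaN 1
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Assumption: "internet" is the third categorical column.
categorical = ["region", "internet", "class"]

# Mean of balance per group; NaN balances are skipped by mean().
group_means = {col: df.groupby(col)["balance"].mean() for col in categorical}

# Gap between the largest and smallest group mean for each column.
# A larger gap suggests the column separates balance better.
spread = {col: float(m.max() - m.min()) for col, m in group_means.items()}

for col in categorical:
    print(group_means[col], "\n")
print(spread)
```

On the full data, the column with the clearest separation between group means would be the stronger categorical candidate. What counts as a "big" difference is a judgment call that the explanation should spell out.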

I am not sure how to approach this question. Can someone explain it in detail?

This needs to be done using Python.
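
For the filling step itself, here is a hedged sketch of both variants described in the question, again on a few rows taken from the table. Option A fills from the group mean of a categorical column (`class` is used only as an example, not as the recommended choice). Option B first bins a numerical column and then fills from the bin mean. The choice of `age` and the bin edges `[0, 35, 100]` are assumptions for illustration; the real choice should follow from the density plots.

```python
import numpy as np
import pandas as pd

# Rows taken from the table above (age, class, balance only).
df = pd.DataFrame({
    "age":     [44, 33, 52, 39, 22, 35, 33, 30],
    "class":   [1, 1, 0, 0, 1, 0, 1, 1],
    "balance": [2.014903021, 2.724579503, 3.409496184, 2.602689685,
                2.1690537, 3.146305132, np.nan, np.nan],
})

# Option A: fill with the mean of the row's own categorical group.
# transform("mean") returns one value per row, aligned with df's index.
class_mean = df.groupby("class")["balance"].transform("mean")
by_class = df["balance"].fillna(class_mean)

# Option B: bin a numerical column, then fill with the bin mean.
# The bin edges are hypothetical; intervals are (0, 35] and (35, 100].
age_bin = pd.cut(df["age"], bins=[0, 35, 100], labels=["<=35", ">35"])
bin_mean = df.groupby(age_bin, observed=True)["balance"].transform("mean")
by_age_bin = df["balance"].fillna(bin_mean)

print(pd.DataFrame({"by_class": by_class, "by_age_bin": by_age_bin}))
```

Only the missing entries change; the observed balances are left as they are. If a group or bin contains no observed balance at all, its mean is NaN and those rows stay unfilled, so it is worth checking `isna().sum()` afterwards.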
