Question

SOLVE BY PYTHON PLEASE !!!!

   input_1         input_2      input_3    input_4  input_5  input_6  input_7  input_8  input_9  input_10  output
0  Private School  Physics      Riyadh     Male     $735     18,180   9.2.     9.1      6.0      0.93      39.8
1  Public School   Mathematics  Madinah    Male     $742     2,621    6.6      9.5      7.8      0.95      35.4
2  Public School   Physics      Madinah    Female   $675     17,692   6.1      7.6      7.9      0.92      35.7
3  Public School   Physics      Al Khobar  Female   $841     17,738   9.9      9.1      9.4      0.17      41.1
4  Private School  Chemistry    Hail       Female   $596     10.604   7.7      6.7      9.6      0.17      37.3

A1. [6 marks] Do the following:
1. [4] Identify all the inconsistencies in the data. Resolve the inconsistencies.
2. [2] Replace all missing values in column 'input_1' with the mode of that column.

In [ ]: #Answer for A1.1
In [ ]: #Answer for A1.2

A2. [4 marks] Do the following:
1. [2] Draw a bivariate histogram for 'input_9' and 'input_10' columns using 10 bins.
2. [2] Display the total number of records for every combination of 'input_2' and 'input_3'. Which combination has the highest number of records?

In [ ]: #Answer for A2.1
In [ ]: #Answer for A2.2

A3. [4 marks] Do the following:
1. [2] Transform column 'input_3' as follows:
   'Riyadh', 'Al-Baha', 'Hail' should be transformed as 'Central Region'
   'Jeddah', 'Makkah', 'Madinah', 'Jizan', 'Taif', 'Tabuk' should be transformed as 'Western Region'
   'Dammam', 'Al-Khobar' should be transformed as 'Eastern Region'
2. [2] Transform column 'input_2' using the one-hot encoding method. Do not drop any of the encoded columns.

A4. [4 marks] Do the following:
1. [2] Display the correlation between all the numerical columns from the original data (that is, consider columns input_7, input_8, input_9, input_10 and output).
2. [1] Print the correlation value between 'input_8' and 'output' columns.
3. [1] Write about the strength and direction of the correlation value obtained in A4.2.

In [ ]: #Answer for A4.1
In [ ]: #Answer for A4.2
In [ ]: #Answer for A4.3

A5. [7 marks] Do the following:
1. [2] Find the first two principal components using all the input numerical columns from the original data (that is, consider columns input_7, input_8, input_9, input_10). Add the principal components to that dataframe as pc1 and pc2 columns, respectively.
2. [2] For the second principal component, which column has the minimum coefficient of the linear combination?
3. [3] Construct a scatter plot using the first two principal components of the data. Differentiate the points in the above plot using 'input_4' and 'output' columns. (Note: you should have only one plot/graph.)

Step by Step Solution

There are 3 Steps involved in it

Step: 1
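
A minimal sketch for A1 and A2, assuming the data is loaded into a pandas DataFrame named df. The full dataset is not shown in the question, so the frame below holds only the five preview rows plus one hypothetical row (index 5, a copy of row 4 with input_1 blanked) so that the mode imputation has something to fill. The cleaning rules are assumptions read off the preview: a '$' prefix in input_5, thousands separators in input_6 (with '10.604' taken to mean 10604), a trailing period in '9.2.', and 'Al Khobar' written without the hyphen used in A3.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

# Rows 0-4 are the preview rows from the question.
# Row 5 is HYPOTHETICAL: a copy of row 4 with input_1 missing.
df = pd.DataFrame({
    "input_1": ["Private School", "Public School", "Public School",
                "Public School", "Private School", None],
    "input_2": ["Physics", "Mathematics", "Physics", "Physics",
                "Chemistry", "Chemistry"],
    "input_3": ["Riyadh", "Madinah", "Madinah", "Al Khobar", "Hail", "Hail"],
    "input_4": ["Male", "Male", "Female", "Female", "Female", "Female"],
    "input_5": ["$735", "$742", "$675", "$841", "$596", "$596"],
    "input_6": ["18,180", "2,621", "17,692", "17,738", "10.604", "10.604"],
    "input_7": ["9.2.", "6.6", "6.1", "9.9", "7.7", "7.7"],
    "input_8": [9.1, 9.5, 7.6, 9.1, 6.7, 6.7],
    "input_9": [6.0, 7.8, 7.9, 9.4, 9.6, 9.6],
    "input_10": [0.93, 0.95, 0.92, 0.17, 0.17, 0.17],
    "output": [39.8, 35.4, 35.7, 41.1, 37.3, 37.3],
})

# A1.1: resolve the inconsistencies visible in the preview (assumed rules).
# Strip the currency symbol so input_5 becomes numeric.
df["input_5"] = pd.to_numeric(df["input_5"].str.replace("$", "", regex=False))
# Assumption: both ',' and '.' in input_6 are thousands separators.
df["input_6"] = pd.to_numeric(df["input_6"].str.replace(r"[,.]", "", regex=True))
# Assumption: the trailing '.' in '9.2.' is a typo.
df["input_7"] = pd.to_numeric(df["input_7"].str.rstrip("."))
# Use the hyphenated spelling that the region mapping in A3 expects.
df["input_3"] = df["input_3"].replace({"Al Khobar": "Al-Khobar"})

# A1.2: fill missing input_1 values with the most frequent value.
input_1_mode = df["input_1"].mode()[0]
df["input_1"] = df["input_1"].fillna(input_1_mode)

# A2.1: bivariate (2-D) histogram of input_9 against input_10 with 10 bins per axis.
fig, ax = plt.subplots()
counts, xedges, yedges, image = ax.hist2d(
    df["input_9"].to_numpy(), df["input_10"].to_numpy(), bins=10
)
fig.colorbar(image, ax=ax, label="records")
ax.set_xlabel("input_9")
ax.set_ylabel("input_10")

# A2.2: number of records for every (input_2, input_3) combination.
combo_counts = df.groupby(["input_2", "input_3"]).size().sort_values(ascending=False)
top_combo = combo_counts.idxmax()
print(combo_counts)
print("Most frequent combination:", top_combo)
```

On the real data, check df.dtypes and df[col].unique() for each column first; the full file may contain inconsistencies that the five preview rows do not show, and the top combination printed here reflects the hypothetical row rather than the actual dataset.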

Step: 2
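
A minimal sketch for A3 and A4, again using only the five preview rows from the question (the full dataset is not shown). It assumes the first input_7 value is 9.2 (the trailing period in the preview treated as a typo) and that 'Al Khobar' should match the hyphenated 'Al-Khobar' in the region list. The strength thresholds in describe_correlation (0.3 and 0.7) are one common rule of thumb, not something the question specifies, and the function name itself is a hypothetical helper.

```python
import pandas as pd

# The five preview rows from the question (only the columns needed here).
df = pd.DataFrame({
    "input_2": ["Physics", "Mathematics", "Physics", "Physics", "Chemistry"],
    "input_3": ["Riyadh", "Madinah", "Madinah", "Al Khobar", "Hail"],
    "input_7": [9.2, 6.6, 6.1, 9.9, 7.7],   # 9.2 assumed for the '9.2.' entry
    "input_8": [9.1, 9.5, 7.6, 9.1, 6.7],
    "input_9": [6.0, 7.8, 7.9, 9.4, 9.6],
    "input_10": [0.93, 0.95, 0.92, 0.17, 0.17],
    "output": [39.8, 35.4, 35.7, 41.1, 37.3],
})

# A3.1: map each city to its region, as listed in the question.
regions = {
    "Central Region": ["Riyadh", "Al-Baha", "Hail"],
    "Western Region": ["Jeddah", "Makkah", "Madinah", "Jizan", "Taif", "Tabuk"],
    "Eastern Region": ["Dammam", "Al-Khobar"],
}
city_to_region = {city: region for region, cities in regions.items() for city in cities}
# Assumption: 'Al Khobar' in the data is the same city as 'Al-Khobar' in the list.
df["input_3"] = df["input_3"].replace({"Al Khobar": "Al-Khobar"}).map(city_to_region)

# A3.2: one-hot encode input_2, keeping every encoded column (no drop_first).
encoded = pd.get_dummies(df, columns=["input_2"])

# A4.1: Pearson correlation between the numerical columns.
numeric_cols = ["input_7", "input_8", "input_9", "input_10", "output"]
corr = df[numeric_cols].corr()
print(corr)

# A4.2: the single correlation value between input_8 and output.
r = corr.loc["input_8", "output"]
print(r)


# A4.3: describe direction and strength (thresholds are an assumed convention).
def describe_correlation(value):
    direction = "positive" if value > 0 else "negative" if value < 0 else "no"
    size = abs(value)
    strength = "weak" if size < 0.3 else "moderate" if size < 0.7 else "strong"
    return f"{strength} {direction} correlation (r = {value:.3f})"


summary = describe_correlation(r)
print(summary)
```

A city missing from city_to_region maps to NaN, so df["input_3"].isna().sum() is a quick way to confirm that every city in the real data was covered.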

Step: 3
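
A minimal sketch for A5 on the five preview rows. It computes the principal components with a plain NumPy SVD of the mean-centered data rather than a library PCA class; the columns are centered but not scaled, which is a choice the question leaves open. The sign of a principal component is arbitrary, so the answer to "minimum coefficient" depends on the sign convention; the one used here (largest-magnitude loading made positive) is an assumption. Marker shape encodes input_4 and color encodes output, so both appear in a single plot.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

# The five preview rows from the question (only the columns needed here).
df = pd.DataFrame({
    "input_4": ["Male", "Male", "Female", "Female", "Female"],
    "input_7": [9.2, 6.6, 6.1, 9.9, 7.7],   # 9.2 assumed for the '9.2.' entry
    "input_8": [9.1, 9.5, 7.6, 9.1, 6.7],
    "input_9": [6.0, 7.8, 7.9, 9.4, 9.6],
    "input_10": [0.93, 0.95, 0.92, 0.17, 0.17],
    "output": [39.8, 35.4, 35.7, 41.1, 37.3],
})
features = ["input_7", "input_8", "input_9", "input_10"]

# A5.1: PCA via SVD of the mean-centered feature matrix.
X = df[features].to_numpy(dtype=float)
X_centered = X - X.mean(axis=0)
_, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Assumed sign convention: flip each component so that its
# largest-magnitude loading is positive.
for i in range(Vt.shape[0]):
    if Vt[i, np.argmax(np.abs(Vt[i]))] < 0:
        Vt[i] *= -1

components = Vt[:2]                    # rows are the pc1 and pc2 loadings
scores = X_centered @ components.T     # project the rows onto pc1 and pc2
df["pc1"] = scores[:, 0]
df["pc2"] = scores[:, 1]

# A5.2: column with the minimum coefficient in the second component.
pc2_loadings = pd.Series(components[1], index=features)
min_coef_column = pc2_loadings.idxmin()
print(pc2_loadings)
print("Minimum pc2 coefficient:", min_coef_column)

# A5.3: one scatter plot; marker shape = input_4, color = output.
fig, ax = plt.subplots()
markers = {"Male": "o", "Female": "^"}
vmin, vmax = df["output"].min(), df["output"].max()
for gender, group in df.groupby("input_4"):
    points = ax.scatter(
        group["pc1"], group["pc2"],
        c=group["output"], cmap="viridis", vmin=vmin, vmax=vmax,
        marker=markers[gender], label=gender,
    )
fig.colorbar(points, ax=ax, label="output")
ax.set_xlabel("pc1")
ax.set_ylabel("pc2")
ax.legend(title="input_4")
```

If the features sit on very different scales in the full data, dividing X_centered by each column's standard deviation before the SVD is the usual alternative; the loadings and the answer to A5.2 can change when that is done.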
