
Question

Exercise #1: Principal Component Analysis (60 points)

This exercise is based on the attached file 'soil.data', which includes three types of soils (represented as 1, 2, 3) and the quantities of 13 constituents of these soils.

1. Read the file titled 'soil.data' (uploaded to this homework) into your Jupyter Notebook as a DataFrame using the pd.read_csv() command, and name it 'data'. (1 point)
   Note: make sure you have imported the needed packages first.

2. Show the descriptive statistics of the data. (1 point)
   Hint: use the .describe().T method.

3. How many observations are in the dataset? (1 point)
   Hint: use the .shape attribute.

4. Does the dataset contain missing values? (1 point)
   Hint: use the .isnull().sum() method.

5. a. How many Soil Types do we have and how are they represented in the data? (2 points)
      Hint: use the .value_counts() method on the 'Soil Type' variable OR use the .unique() method on the 'Soil Type' variable.
   b. Convert the 'Soil Type' integer values into strings by using the following mapping (2 points):
      1: 'Soil Type 1'
      2: 'Soil Type 2'
      3: 'Soil Type 3'
      Then replace the values in the original 'Soil Type' column/variable.
      Hint: use the data['Soil Type'] = data['Soil Type'].replace(...) method.
   c. Save the new 'Soil Type' column into a separate variable (not in the dataframe), and name it 'soil_type'. (2 points)

6. Drop the 'Soil Type' column from the dataframe (because we already saved it into a separate variable). (2 points)
   Note: make sure you drop it from the original dataframe named 'data'.
   Hint: use the data = data.drop(columns=...) method.

7. a. Get the correlation matrix between the remaining features/variables in the dataframe. (2 points)
      Hint: use the .corr() method.
   b. Using seaborn, plot a heatmap of the correlation matrix obtained in part 7(a), with a cmap = 'seismic' color parameter, vmin = -1, and vmax = 1. (2 points)
      Hint: use the sns.heatmap(...) function.
   c. By a visual inspection of the heatmap obtained in part 7(b), what do red colors mean and what do blue colors mean? (2 points)
   d. By a visual inspection of the obtained heatmap in part 7(b), are the
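
The steps above can be sketched roughly as follows. This is a minimal sketch, not the official solution: the contents of 'soil.data' are not shown in the exercise, so the code builds a small made-up stand-in with hypothetical constituent columns (c1, c2, c3) and reads it the same way the real file would be read. The helper name plot_corr_heatmap is also an assumption introduced for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for 'soil.data': the real column names and values
# are not given in the exercise, so this small frame is made up.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(9, 3)), columns=["c1", "c2", "c3"])
demo.insert(0, "Soil Type", [1, 2, 3] * 3)
csv_text = demo.to_csv(index=False)

# Step 1: read the data (with the real file: pd.read_csv('soil.data')).
data = pd.read_csv(io.StringIO(csv_text))

# Steps 2-4: descriptive statistics, number of observations, missing values.
summary = data.describe().T
n_obs = data.shape[0]
n_missing = data.isnull().sum()

# Step 5a: which soil types occur, and how often.
type_counts = data["Soil Type"].value_counts()

# Step 5b: map the integer codes to strings in the original column.
mapping = {1: "Soil Type 1", 2: "Soil Type 2", 3: "Soil Type 3"}
data["Soil Type"] = data["Soil Type"].replace(mapping)

# Step 5c: keep the labels in a separate variable.
soil_type = data["Soil Type"].copy()

# Step 6: drop the label column from the original dataframe.
data = data.drop(columns="Soil Type")

# Step 7a: correlation matrix of the remaining features.
corr = data.corr()


def plot_corr_heatmap(corr_matrix):
    """Step 7b: draw the heatmap (hypothetical helper; requires seaborn)."""
    import seaborn as sns

    return sns.heatmap(corr_matrix, cmap="seismic", vmin=-1, vmax=1)
```

Calling plot_corr_heatmap(corr) in a notebook cell draws the figure. With the 'seismic' colormap and limits of -1 and 1, cells close to +1 (strong positive correlation) appear red, cells close to -1 (strong negative correlation) appear blue, and values near 0 appear white.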
