
Question



Select a subset of the dataset and clean it:

a) [10 points] Create a boolean mask and apply it to the CarsData dataframe to select all sports cars (i.e., rows where the 'Sports Car?' column equals 1) that have horsepower >= 350. Name the resulting dataframe SportCars.
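
A minimal sketch of the boolean-mask selection described in (a). The real CarsData is not shown in the question, so the rows below are invented stand-ins, and the 'Horsepower' column name is assumed from the column list given later in the question.

```python
import pandas as pd

# Toy stand-in for CarsData; names and values are invented for illustration.
CarsData = pd.DataFrame({
    "Vehicle Name": ["Car A", "Car B", "Car C", "Car D"],
    "Sports Car?": [1, 1, 0, 1],
    "Horsepower": [400, 300, 450, 350],
})

# Boolean mask: True where the row is a sports car AND has at least 350 hp.
# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (CarsData["Sports Car?"] == 1) & (CarsData["Horsepower"] >= 350)

# .copy() makes SportCars independent of CarsData, which avoids
# SettingWithCopyWarning when SportCars is modified in later steps.
SportCars = CarsData[mask].copy()
print(SportCars)
```

With this toy data only "Car A" and "Car D" satisfy both conditions.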

b) [10 points] There are missing values in the 'City Miles Per Gallon' and 'Highway Miles Per Gallon' columns of the SportCars dataframe (the result of task 1.a). Write code to identify the location of these missing values, then replace each missing value with the minimum value of its column. That is, replace the missing value(s) in the 'City Miles Per Gallon' column with the minimum of that column within SportCars (not the entire dataset). Similarly, replace the missing value(s) in 'Highway Miles Per Gallon' with the minimum of the 'Highway Miles Per Gallon' column.
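
One possible way to approach (b), sketched on invented data: `isna()` locates the missing entries, and `fillna()` with the column's own minimum fills them. The helper names `mpg_cols`, `missing`, and `locations` are illustrative, not part of the assignment.

```python
import numpy as np
import pandas as pd

# Toy stand-in for SportCars; values are invented for illustration.
SportCars = pd.DataFrame({
    "Vehicle Name": ["Car A", "Car B", "Car C"],
    "City Miles Per Gallon": [17.0, np.nan, 12.0],
    "Highway Miles Per Gallon": [np.nan, 24.0, 20.0],
})
mpg_cols = ["City Miles Per Gallon", "Highway Miles Per Gallon"]

# Locate missing values as (row label, column name) pairs.
missing = SportCars[mpg_cols].isna()
locations = [
    (row, col)
    for col in mpg_cols
    for row in missing.index[missing[col].to_numpy()]
]
print(locations)

# Series.min() skips NaN by default, so each minimum is taken over the
# observed values of that column within SportCars only.
for col in mpg_cols:
    SportCars[col] = SportCars[col].fillna(SportCars[col].min())
print(SportCars)
```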

c) [5 points] From the SportCars dataframe, remove the column 'Sports Car?' and every column whose values are ALL 0's (zeros). [2 extra points] for compact code to identify the all-zero columns.
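
A compact way to find all-zero columns, as a sketch: compare the numeric part of the frame to 0 and reduce with `.all()` per column. The 'Minivan?' column below is a hypothetical example of an all-zero column; the real dataset's column names are not given in the question.

```python
import pandas as pd

# Toy stand-in for SportCars; 'Minivan?' is a hypothetical all-zero column.
SportCars = pd.DataFrame({
    "Vehicle Name": ["Car A", "Car B"],
    "Sports Car?": [1, 1],
    "Minivan?": [0, 0],
    "Horsepower": [400, 350],
})

# Restrict to numeric columns so text columns are never compared with 0.
numeric = SportCars.select_dtypes(include="number")

# (numeric == 0).all() is True for each column whose values are all zero.
zero_cols = numeric.columns[(numeric == 0).all()]

SportCars = SportCars.drop(columns=["Sports Car?", *zero_cols])
print(SportCars.columns.tolist())
```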

d) [10 points] Add a new column to SportCars labeled ScaledCityMPG, calculated by normalizing (scaling) the 'City Miles Per Gallon' column to values in the range [0,1].
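
Scaling to [0,1] is commonly done with min-max normalization, (x - min) / (max - min). The sketch below assumes that is the intended method and that the column is not constant (otherwise the denominator is zero); the values are invented.

```python
import pandas as pd

# Toy stand-in for SportCars; values are invented for illustration.
SportCars = pd.DataFrame({"City Miles Per Gallon": [12.0, 16.0, 20.0]})

city = SportCars["City Miles Per Gallon"]

# Min-max scaling: the smallest value maps to 0 and the largest to 1.
SportCars["ScaledCityMPG"] = (city - city.min()) / (city.max() - city.min())
print(SportCars)
```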

Extract some statistical data:

Write Python commands to statistically describe the selected SportCars data using the following columns: ['Suggested Retail Price', 'Engine Size', 'Number of Cylinders', 'Horsepower', 'City Miles Per Gallon', 'Highway Miles Per Gallon', 'Weight']

a) [5 points] Show a descriptive statistics table of the data using all ordinal columns above
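
A sketch of (a) using `DataFrame.describe()`, which reports count, mean, standard deviation, minimum, quartiles, and maximum per column. The four rows below are invented; `stat_cols` is an illustrative name for the column list given in the question.

```python
import pandas as pd

# Toy stand-in for SportCars; all values are invented for illustration.
SportCars = pd.DataFrame({
    "Suggested Retail Price": [45000, 60000, 80000, 95000],
    "Engine Size": [4.6, 5.0, 5.7, 6.0],
    "Number of Cylinders": [8, 8, 8, 12],
    "Horsepower": [350, 400, 450, 500],
    "City Miles Per Gallon": [18, 16, 15, 12],
    "Highway Miles Per Gallon": [26, 24, 22, 19],
    "Weight": [3200, 3400, 3500, 3900],
})

stat_cols = [
    "Suggested Retail Price", "Engine Size", "Number of Cylinders",
    "Horsepower", "City Miles Per Gallon", "Highway Miles Per Gallon",
    "Weight",
]

# One column of summary statistics per selected column.
stats = SportCars[stat_cols].describe()
print(stats)
```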

b) [5 points] Show and plot the correlation matrix of the selected data in SportCars
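
A sketch of (b): `DataFrame.corr()` computes pairwise Pearson correlations, and `imshow` is one of several ways to draw the matrix as a heatmap. Only three of the seven columns are used here to keep the toy data short, and the values are invented.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for the selected SportCars columns; values are invented.
SportCars = pd.DataFrame({
    "Horsepower": [350, 400, 450, 500],
    "Weight": [3200, 3400, 3500, 3900],
    "City Miles Per Gallon": [18, 16, 15, 12],
})

# Show the correlation matrix (Pearson by default).
corr = SportCars.corr()
print(corr.round(2))

# Plot it as a heatmap with a fixed [-1, 1] color scale.
fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=90)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
```

In this toy data horsepower and weight rise together while city MPG falls, so the first pair correlates positively and both correlate negatively with MPG.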

c) [5 points] Plot the scatter matrix of the selected data
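
For (c), pandas ships `pandas.plotting.scatter_matrix`, which draws every pairwise scatter plot with histograms on the diagonal. The sketch below uses three invented columns rather than the full list.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd
from pandas.plotting import scatter_matrix

# Toy stand-in for the selected SportCars columns; values are invented.
SportCars = pd.DataFrame({
    "Horsepower": [350, 400, 450, 500],
    "Weight": [3200, 3400, 3500, 3900],
    "City Miles Per Gallon": [18, 16, 15, 12],
})

# Returns a 2-D array of Axes, one per pair of columns.
axes = scatter_matrix(SportCars, figsize=(6, 6), diagonal="hist")
```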

d) [5 points] From the stats and the two plots in (b) and (c) above, describe the following:

  • relationships (i.e., what happens to one variable when the other increases or decreases) between MPG (city or highway) and each of the following:
    • Number of Cylinders, Suggested Retail Price, Horsepower, Weight, and Engine Size
  • pairwise correlation (negative, positive, or no correlation) between Engine Size, Number of Cylinders, Horsepower, and Weight

Make a bar chart using a subset of the data:

Using the SportCars dataframe created in task 1, write Python commands for each of the following:

a) [5 points] Create an array named mpgColors of 40 colors using the Greens colormap and map the colors to the values of the ScaledCityMPG column
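
The wording of (a) is open to interpretation. The sketch below takes one reading: sample 40 evenly spaced colors from the Greens colormap into mpgColors, then pick one of those colors per car by converting its scaled value to an index from 0 to 39. The names `scaled`, `idx`, and `carColors` are illustrative, and the scaled values are invented.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for SportCars["ScaledCityMPG"]; values are invented.
scaled = np.array([0.0, 0.25, 1.0])

# 40 RGBA colors sampled evenly across the Greens colormap, light to dark.
mpgColors = plt.get_cmap("Greens")(np.linspace(0, 1, 40))

# Map each scaled value in [0, 1] to the nearest palette index in 0..39.
idx = np.rint(scaled * (len(mpgColors) - 1)).astype(int)
carColors = mpgColors[idx]
print(idx, carColors.shape)
```

Under this reading, cars with higher city MPG get darker greens.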

b) [35 points] Make a bar plot using the column 'City Miles Per Gallon' to create a visual comparison between the selected set of cars in SportCars as follows [5 points each]:

  • i) set the plot figure size to 10 x 4 and use the .bar() method to make the plot
  • ii) plot car names on the x-axis vs. their city MPG on the y-axis
    • Hint: Use a list/array of integers as the x-axis instead of directly using the car name column
  • iii) make the space between the bars 30% of the space of each bar (hint: what would the bar width be?)
  • iv) set the edge color of the bars to green and the inside color from the mpgColors colors you created in subtask (3.a) above
  • v) set the x-axis tick labels to the car names (Vehicle Name), center them under the bars, and rotate the tick labels 90 degrees
  • vi) add the x-axis label "Sport car make and model" and the y-axis label "City MPG"
  • vii) add the plot title "City MPG: Sport cars with 350 or more Horsepower"
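
Steps i) through vii) can be sketched as follows on invented data. Two assumptions are worth flagging: the bar width of 0.7 assumes each bar occupies a slot of width 1 with 30% of that slot left empty, and the per-bar colors are picked from a 40-color Greens palette by scaled MPG. The name `bar_colors` is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy stand-in for SportCars; names and values are invented.
SportCars = pd.DataFrame({
    "Vehicle Name": ["Car A", "Car B", "Car C"],
    "City Miles Per Gallon": [12.0, 14.0, 20.0],
})
city = SportCars["City Miles Per Gallon"]
scaled = ((city - city.min()) / (city.max() - city.min())).to_numpy()

# 40-color Greens palette, then one color per bar chosen by scaled MPG.
mpgColors = plt.get_cmap("Greens")(np.linspace(0, 1, 40))
bar_colors = mpgColors[np.rint(scaled * 39).astype(int)]

x = np.arange(len(SportCars))                      # ii) integer x positions
fig, ax = plt.subplots(figsize=(10, 4))            # i) 10 x 4 figure
bars = ax.bar(x, city, width=0.7, align="center",  # iii) 0.3 gap per slot
              color=bar_colors, edgecolor="green") # iv) fill and edge colors
ax.set_xticks(x)                                   # v) ticks at bar centers
ax.set_xticklabels(SportCars["Vehicle Name"], rotation=90)
ax.set_xlabel("Sport car make and model")          # vi) axis labels
ax.set_ylabel("City MPG")
ax.set_title("City MPG: Sport cars with 350 or more Horsepower")  # vii)
fig.tight_layout()
```

If the 30% is instead read as a fraction of the bar width itself, the width would be 1 / 1.3, roughly 0.77.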

c) [5 points] Save the plot in an image file named "CityMPG-Hp350plus-SportCars.png".
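
Saving can be sketched with `Figure.savefig`. The bars below are placeholders, and the file is written to the system temporary directory only so the sketch runs anywhere; in the assignment, passing the bare file name would save it to the working directory. The `dpi` and `bbox_inches` settings are optional choices, not requirements of the task.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Placeholder figure standing in for the bar plot from (b).
fig, ax = plt.subplots(figsize=(10, 4))
ax.bar([0, 1], [12, 20])

# Assumption: write to a temp directory; use just the file name otherwise.
out_path = os.path.join(tempfile.gettempdir(), "CityMPG-Hp350plus-SportCars.png")

# bbox_inches="tight" keeps rotated tick labels from being cut off.
fig.savefig(out_path, dpi=150, bbox_inches="tight")
print(out_path)
```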
