Question

Problem 4: Diamonds Data (Part 2)
We will continue working with the Diamonds dataset in this problem. We will start by using sorting to identify
the most expensive diamonds in the dataset.
Sort the contents of the diamonds DataFrame in descending order by price. Use show() to display the first
5 rows of the sorted DataFrame.
Now, we will identify the five largest diamonds in the dataset.
Sort the contents of the diamonds DataFrame in descending order by carat. Use show() to display the first
5 rows of the sorted DataFrame.
In the next two cells, we will explore the price per carat for diamonds in the dataset.
Create a code cell to complete the following tasks:
1. Create a new DataFrame named diamonds_ppc. This DataFrame should contain all columns from
diamonds but should also contain a column named price_per_carat. Values in this new column
should be equal to the price of the diamond divided by the carat size, rounded to 2 decimal places.
2. Sort the contents of the diamonds_ppc DataFrame in descending order by price_per_carat. Use
show() to display the first 5 rows of the sorted DataFrame.
In the previous cell, we identified the diamonds with the highest price per carat. We will now identify the
diamonds with the lowest.
Sort the contents of the diamonds_ppc DataFrame in ascending order by price_per_carat. Use show()
to display the first 5 rows of the sorted DataFrame.
In the last part of this problem, we will graphically explore the relationship between price_per_carat and
carat.
Create a code cell to complete the following tasks:
1. Use the sample() method to draw a sample from diamonds_ppc. Use fraction=0.25 and
seed=1. Convert the sample to a Pandas DataFrame and store the result in ppc_sample_pdf.
2. Use the data in the sample to create a scatter plot of price_per_carat versus carat. When
creating the scatter plot, set alpha=0.5 and select a named color for the points. Label the x-axis
"Carat" and label the y-axis "Price per Carat". Use plt.show() to display the plot.
