Question
I need Python code that uses only the libraries listed below. This is Part 8 of an assignment, so much of the setup was done in earlier parts.
Part 8: Cylinder Distribution by Region

In this problem, you will create a stacked bar chart displaying the distribution of the number of cylinders in vehicles manufactured in each region.

Use np.unique() to create a list of unique values appearing in the cyl column of the auto DataFrame. Store the result in a variable named cyl_values. Print this array.

In this cell, you will create a DataFrame with one row for each value of cyl and one column for each value of region. Each value in this DataFrame will represent the number of vehicles in the dataset corresponding to the relevant cyl/region pair. This can be accomplished by passing the cyl column and the region column (in this order) to the function pd.crosstab(). Store the value returned by this function in a variable named cyl_counts_by_region. Display this DataFrame (without using the print() function).

Perform the following steps in a single cell:
1. Start by converting the count information into proportions. Create a DataFrame named cyl_props_by_region by dividing cyl_counts_by_region by the column sums of cyl_counts_by_region. The column sums can be calculated using np.sum() or the DataFrame sum() method.
2. We will be creating a stacked bar chart, so we need to know where the bottom of each bar should be located. We can calculate this using the following line of code:
   bar_bottoms = np.cumsum(cyl_props_by_region) - cyl_props_by_region
3. Create a list named colors2 containing five named colors that display well and are easy to distinguish from one another.
4. Create a Matplotlib figure, setting the figure size to [6, 4].
5. Loop over the rows of cyl_props_by_region. Each time this loop executes, add a bar chart to the figure according to the following specifications. The bars should be labeled according to the region they represent. The height of the bars should be determined by the current row of cyl_props_by_region. The bottom position of each bar should be determined by the current row of bar_bottoms. Each bar should have a black border, and a fill color determined by the current value of colors2. The label should be set to the number of cylinders associated with the current row.
6. Set the labels for the x and y axes to be "Region" and "Proportion". Set the title to be "Distribution of Cylinder Numbers by Region".
7. Add a legend to the plot. Set the bbox_to_anchor parameter to place the legend to the right of the plot, near the top.
8. Display the figure using plt.show().
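To make steps 1 and 2 concrete, here is a minimal sketch on a tiny made-up table. The toy DataFrame and its values are assumptions for illustration only (the real auto_mpg.txt is not shown), and the DataFrame cumsum() method is used here as a stand-in for the np.cumsum() call in the assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the auto DataFrame; values are invented.
toy = pd.DataFrame({
    'cyl':    [4, 4, 6, 8, 4, 6, 4, 8],
    'region': ['asia', 'asia', 'asia', 'usa', 'usa', 'usa', 'eur', 'eur'],
})

# Rows are cylinder counts, columns are regions.
counts = pd.crosstab(toy['cyl'], toy['region'])

# counts.sum() gives one total per column (region); dividing aligns on
# column labels, so every column of props adds up to 1.
props = counts / counts.sum()

# Running total down each column, minus the current row, gives the height
# at which each segment of the stacked bar should start.
bottoms = props.cumsum() - props

print(props)
print(bottoms)
```

The first row of bottoms is all zeros (the lowest segment sits on the axis), and each later row starts where the segments below it end.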
Setup code from the earlier parts:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
np.set_printoptions(precision=2)
auto = pd.read_csv(filepath_or_buffer='auto_mpg.txt', sep='\t')
This part of the code works fine:
cyl_values = np.unique(auto['cyl'])
print(cyl_values)
cyl_counts_by_region = pd.crosstab(auto['cyl'], auto['region'])
cyl_counts_by_region
This part is broken:
cyl_props_by_region = cyl_counts_by_region / cyl_counts_by_region.sum()
bar_bottoms = np.cumsum(cyl_props_by_region) - cyl_props_by_region
colors2 = ['red', 'deeppink', 'magenta', 'purple', 'blue']
fig = plt.figure(figsize=[6, 4])
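# iterrows() yields the row label (the cylinder count), not a position,
# so pair each row with a color via zip() and look up bottoms with .loc.
for color, (cyl, row) in zip(colors2, cyl_props_by_region.iterrows()):
    plt.bar(x=row.index, height=row.values, bottom=bar_bottoms.loc[cyl].values, color=color, edgecolor='black', label=str(cyl))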
plt.xlabel('Region')
plt.ylabel('Proportion')
plt.title('Distribution of Cylinder Numbers by Region')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.show()
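A likely cause of the failure is how the loop indexes. iterrows() returns each row's label, which here is a cylinder count rather than a position 0 to 4. bar_bottoms[i] therefore looks for a column named i (the columns are regions), and colors2[i] can run past the end of a five-item list. The sketch below shows one possible way to avoid both problems. The counts are invented for illustration, and the Agg backend is an assumption so the example runs without a display.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this in a notebook
import matplotlib.pyplot as plt

# Made-up counts standing in for pd.crosstab(auto['cyl'], auto['region']).
cyl_counts_by_region = pd.DataFrame(
    {'asia': [2, 10, 0, 3, 0], 'eur': [0, 12, 2, 1, 0], 'usa': [0, 8, 0, 9, 13]},
    index=pd.Index([3, 4, 5, 6, 8], name='cyl'),
)

cyl_props_by_region = cyl_counts_by_region / cyl_counts_by_region.sum()
bar_bottoms = cyl_props_by_region.cumsum() - cyl_props_by_region
colors2 = ['red', 'deeppink', 'magenta', 'purple', 'blue']

fig = plt.figure(figsize=[6, 4])
# zip() pairs colors with rows by position; .loc[cyl] selects the matching
# row of bar_bottoms by its label instead of treating cyl as a column name.
for color, (cyl, row) in zip(colors2, cyl_props_by_region.iterrows()):
    plt.bar(x=row.index, height=row.values,
            bottom=bar_bottoms.loc[cyl].values,
            color=color, edgecolor='black', label=str(cyl))

plt.xlabel('Region')
plt.ylabel('Proportion')
plt.title('Distribution of Cylinder Numbers by Region')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.show()
```

Using enumerate() with bar_bottoms.iloc[k] would work just as well; the point is to keep positional lookups (colors) separate from label lookups (rows).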