Question
For this assignment, we look into a restaurant's tipping dataset, which consists of the following fields:
obs: ID number of the restaurant's bill incident
totbill: total amount of the bill
tip: tip amount
sex: M/F
smoker: Yes/No
day: Mon through Sun
time: Day/Night
size: number of people in the party
To load the data with read_csv from the attached data files (tips.txt or tips.csv), save the files and
use a path of your own that includes the directory of the saved files, as in the example paths
below:
In: import pandas as pd
In: tips = pd.read_csv(r'c:\Users\mowaf\Data\tips.txt', sep=...)
In: tips = pd.read_csv(r'c:\Users\mowaf\Data\tips.csv')
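The loading step can be tried without the attached files. The sketch below is only an illustration: the three rows are made-up sample values (not taken from tips.csv), and the data is read from an in-memory string instead of a path on disk.

```python
import io

import pandas as pd

# Hypothetical sample rows with the same column names as the assignment;
# the real values live in tips.csv / tips.txt.
csv_text = """obs,totbill,tip,sex,smoker,day,time,size
1,20.00,3.00,F,No,Sun,Night,2
2,10.00,1.50,M,No,Sun,Night,3
3,30.00,6.00,M,Yes,Sat,Night,4
"""

# pd.read_csv accepts a file path or any file-like object.
# The default separator is a comma; pass sep=... for other delimiters.
tips = pd.read_csv(io.StringIO(csv_text))

print(tips.head())
```

With a real file, the `io.StringIO(csv_text)` argument is replaced by the path where the file was saved.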
To view the first rows of the dataset:
In: tips.head()
obs totbill tip sex smoker day time size
F No Sun Night
M No Sun Night
M No Sun Night
M No Sun Night
F No Sun Night
To make a stacked bar plot showing the percentage of data points for each party size on each
day, make a cross-tabulation by day and party size and show it:
In: partycount = pd.crosstab(tips['day'], tips['size'])
In: partycount
Out: table of counts with one row per day (Fri, Sat, Sun, Thu) and one column per party size.
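As a rough illustration of what the cross-tabulation produces, here is a minimal sketch on a made-up seven-row frame (the values are hypothetical, not from the tips data). Note that the column is accessed as `tips["size"]` because `tips.size` is a built-in DataFrame attribute (the number of elements), not the column.

```python
import pandas as pd

# Hypothetical day / party-size pairs, one row per bill.
tips = pd.DataFrame({
    "day": ["Fri", "Sat", "Sat", "Sun", "Sun", "Sun", "Thu"],
    "size": [2, 2, 3, 2, 4, 4, 2],
})

# Rows are the distinct days, columns the distinct party sizes,
# and each cell counts the bills with that (day, size) combination.
partycount = pd.crosstab(tips["day"], tips["size"])

print(partycount)
```

Combinations that never occur, such as a party of 3 on Fri in this sample, appear as 0 in the table.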
To normalize the data so that each row sums to 1 and make the plot, first take the slice of the
counts to work with:
In: partycounts = partycount.loc[:, :]
In: partycounts
Out: table of counts by day (Fri, Sat, Sun, Thu) and party size for the selected slice.
Normalize so that each row sums to 1 to make a bar plot of the party size counts over the
weekdays:
# Normalize each row to sum to 1
In: partypcts = partycounts.div(partycounts.sum(axis=1), axis=0)
In: partypcts
Out: table of fractions with one row per day (Fri, Sat, Sun, Thu) and one column per party size.
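The row normalization can be checked on a small table. This is a sketch with hypothetical counts chosen so the arithmetic is easy to follow; it is not the assignment's data.

```python
import pandas as pd

# Hypothetical counts: rows are days, columns are party sizes.
partycounts = pd.DataFrame(
    {2: [6, 10], 3: [3, 5], 4: [1, 5]},
    index=pd.Index(["Fri", "Sat"], name="day"),
)

# sum(axis=1) gives one total per row (per day);
# div(..., axis=0) divides each row by its own total.
row_totals = partycounts.sum(axis=1)
partypcts = partycounts.div(row_totals, axis=0)

print(partypcts)
print(partypcts.sum(axis=1))  # every row should now sum to 1
```

Dividing along `axis=0` matters: it aligns the row totals with the row index, so Fri is divided by 10 and Sat by 20. With matplotlib installed, `partypcts.plot(kind="bar", stacked=True)` would draw the fractions as stacked bars.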
Draw partypcts.plot(kind='bar') using the partypcts from the previous step.
Show your plot output.
To find the percentage of the tip for each bill, create a column tippct:
In: tips['tippct'] = tips['tip'] / tips['totbill']
tips.head()
Similarly, to list the first five rows of the dataset:
tips[:5]
Show your output.
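A minimal sketch of the new column on made-up bills (hypothetical values). The result is a fraction of the bill, so 0.15 means a 15% tip.

```python
import pandas as pd

# Hypothetical bills and tips.
tips = pd.DataFrame({
    "totbill": [20.0, 10.0, 40.0],
    "tip": [3.0, 2.0, 4.0],
})

# Element-wise division creates one value per row.
tips["tippct"] = tips["tip"] / tips["totbill"]

# tips[:5] slices the first five rows, like tips.head().
print(tips[:5])
```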
Draw a histogram (a plot of the frequency of data points split into discrete, evenly spaced bins,
showing the number of data points in each bin) using the tip percentages of the total bill,
tips['tippct'], from the previous step:
tips['tippct'].plot.hist(bins=...)
Show your output.
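To see what the histogram counts before plotting it, the binning can be computed directly. The sketch below uses NumPy's `np.histogram` as a stand-in for the plotting call, with hypothetical tip fractions and an arbitrary choice of 4 bins.

```python
import numpy as np
import pandas as pd

# Hypothetical tip fractions.
tippct = pd.Series([0.05, 0.07, 0.11, 0.12, 0.13, 0.17, 0.25])

# Split the range [min, max] into 4 evenly spaced bins and count
# how many values fall into each one.
counts, edges = np.histogram(tippct.to_numpy(), bins=4)

print(edges)   # 5 bin edges from 0.05 to 0.25
print(counts)  # number of data points per bin
```

The bar heights drawn by `plot.hist` with the same `bins` value correspond to these counts.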
To show summary statistics of the tip percentage tippct for the day, smoker, sex, and time
columns, we can call describe on a groupby object:
tips.groupby('smoker')['tippct'].describe()
Show your output.
Similarly, show tippct aggregated on sex, day, and time.
Show your output.
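The grouped summary can be sketched on a few made-up rows (hypothetical values). Grouping by one key gives one row of statistics per group; passing a list of keys gives one row per combination.

```python
import pandas as pd

# Hypothetical rows with the grouping columns and tippct.
tips = pd.DataFrame({
    "smoker": ["No", "No", "Yes", "Yes"],
    "sex": ["F", "M", "M", "F"],
    "tippct": [0.10, 0.20, 0.15, 0.25],
})

# One row per smoker value; columns are count, mean, std, min,
# the quartiles, and max of tippct within that group.
by_smoker = tips.groupby("smoker")["tippct"].describe()
print(by_smoker)

# Grouping by several keys works the same way with a list of columns.
by_sex_smoker = tips.groupby(["sex", "smoker"])["tippct"].describe()
print(by_sex_smoker)
```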
Aggregating a Series or all of the columns of a DataFrame is a matter of using aggregate with the
desired function or calling a method like mean or std. However, we may want to aggregate using
a different function depending on the column, or multiple functions at once.
To group the tips by day and smoker:
grouped = tips.groupby(['day', 'smoker'])
grouped
Out:
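Both kinds of aggregation mentioned above can be sketched on a small made-up frame (hypothetical values; the choice of `mean`, `std`, `max`, and `sum` is only for illustration). Displaying `grouped` on its own shows a DataFrameGroupBy object rather than data, since nothing is computed until an aggregation is requested.

```python
import pandas as pd

# Hypothetical rows.
tips = pd.DataFrame({
    "day": ["Fri", "Fri", "Sat", "Sat"],
    "smoker": ["No", "Yes", "No", "No"],
    "tip": [2.0, 3.0, 4.0, 6.0],
    "tippct": [0.10, 0.20, 0.10, 0.30],
    "size": [2, 3, 2, 4],
})

grouped = tips.groupby(["day", "smoker"])

# Several functions at once on one column: one output column per function.
multi = grouped["tippct"].agg(["mean", "std"])
print(multi)

# A different function per column, given as a {column: function} mapping.
per_col = grouped.agg({"tip": "max", "size": "sum"})
print(per_col)
```

Groups with a single row, such as (Fri, No) here, get NaN for `std` because the sample standard deviation is undefined for one observation.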