Question
Using numpy and pandas
Create a data frame with 5000 rows with 5 columns as follows:
● Cust per day -- random int between 25 and 150
● Site ID -- one of 5 values randomly ['001', '002', 'A02', 'B02', '003', 'B03']
○ first site needs to be 0.25 probability of occurring, last site needs to be 0.3
○ the other sites you should decide probabilities that sum to 1
● Merch Restock -- [0,1]
○ 75% they need to restock (1)
● Fuel Restock -- [0,1]
○ 90% they need to restock (1)
● Daily Revenue -- Random floating point between 500 and 5000
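The five columns above can be sketched roughly as follows. This is only one possible reading of the spec: the seed, the variable names, and the probabilities for the four middle sites are assumptions, and all six listed site IDs are used even though the text says "5 values".

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility
n_rows = 5000

# Six IDs are listed in the question, so all six are used here.
site_ids = ["001", "002", "A02", "B02", "003", "B03"]
# First site 0.25 and last site 0.30 come from the question;
# the split of the remaining 0.45 is an assumed choice.
site_probs = [0.25, 0.10, 0.10, 0.10, 0.15, 0.30]

df = pd.DataFrame({
    # integers() excludes the upper bound, so 151 makes 150 reachable
    "Cust per day": rng.integers(25, 151, size=n_rows),
    "Site ID": rng.choice(site_ids, size=n_rows, p=site_probs),
    # 1 means a restock is needed
    "Merch Restock": rng.choice([0, 1], size=n_rows, p=[0.25, 0.75]),
    "Fuel Restock": rng.choice([0, 1], size=n_rows, p=[0.10, 0.90]),
    # uniform floats in [500, 5000)
    "Daily Revenue": rng.uniform(500, 5000, size=n_rows),
})

print(df.head())
```

Any other set of middle probabilities works as long as the six values sum to 1.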
Create a state column
● if site 001 or B02 set state to Rhode Island
● if site 002 or A02 set state to Montana
● all remaining set state to Alabama
After adding this column, describe the data set's statistical distributions using the pandas function.
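A minimal sketch of the state column and the summary step, assuming "the pandas function" means `DataFrame.describe()`. The six-row frame and the column name `State` are hypothetical stand-ins for the full 5000-row frame.

```python
import pandas as pd

# Tiny hypothetical frame standing in for the 5000-row one.
df = pd.DataFrame({
    "Site ID": ["001", "002", "A02", "B02", "003", "B03"],
    "Daily Revenue": [1000.0, 2000.0, 3000.0, 4000.0, 1500.0, 2500.0],
})

state_by_site = {
    "001": "Rhode Island", "B02": "Rhode Island",
    "002": "Montana", "A02": "Montana",
}
# Sites missing from the mapping become NaN after map(),
# then fall back to Alabama.
df["State"] = df["Site ID"].map(state_by_site).fillna("Alabama")

# include="all" also summarises the non-numeric columns
# (count, unique, top, freq) alongside the numeric statistics.
summary = df.describe(include="all")
print(summary)
```

Calling `df.describe()` without arguments would report only the numeric columns.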
For daily revenue:
○ Create a sum column that contains the sum for that state on every row. All states should have the same sum
○ Create another column for the mean for that state
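One way to put a per-state sum and mean on every row is `groupby(...).transform(...)`, which returns a result aligned to the original rows. This sketch reads "the same sum" as meaning that every row of a given state carries that state's total. The small frame and the new column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame that already has the State column.
df = pd.DataFrame({
    "State": ["Rhode Island", "Montana", "Rhode Island",
              "Alabama", "Montana", "Alabama"],
    "Daily Revenue": [1000.0, 2000.0, 3000.0, 1500.0, 4000.0, 2500.0],
})

# transform() broadcasts each group's aggregate back to that group's rows.
df["State Revenue Sum"] = df.groupby("State")["Daily Revenue"].transform("sum")
df["State Revenue Mean"] = df.groupby("State")["Daily Revenue"].transform("mean")

print(df)
```

Using `agg` instead would give one row per state, which would then need a merge back onto the original frame.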
Step by Step Solution
Step: 1
Answers NOTE There is an issue in the question in the description of the site ID column as it says 5 ...