Question: The program must provide the following functions to extract some statistics. Note that the data_list parameter specified in these functions may be the same for all functions or different for different functions; that is your choice. A skeleton file is provided on Mirmir.

a) open_file() prompts the user to enter a year number for the data file. The program will check whether the year is between 1990 and 2015 (both inclusive). If the year number is valid, the program will try to open the data file with file name yearXXXX.txt, where XXXX is the year. An appropriate error message should be shown if the data file cannot be opened or if the year number is invalid. This function will loop until it receives proper input and successfully opens the file. It returns a file pointer and the year.

i. Hint: use string concatenation to construct the file name.

b) read_file(fp) has one parameter, a file pointer to read. This function returns a list of your choosing containing the data you need for other parts of this project.

c) find_average(data_list) takes a list of data (of some organization of your choosing) and returns the average salary. The function does not print anything. Hints:

This is NOT (!) the average of the last column of data. It is not mathematically valid to find an average by finding the average of averages; for example, in this case there are many more earners in the lowest category than in the highest category.

How many wage earners are considered in finding the average (denominator)? There are a couple of ways to determine this. I think the easiest uses the cumulative number column (Column 4), but using Column 3 is not hard and may make more sense to some students.

How does one find the total dollar value of income (numerator)? Notice that Column 6 is the combined income of all the individuals in this range of income.

For testing your function, notice that for the 2014 data the average should be $44,569.20. As a check, note that that value is listed on the web page referenced above.
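The hints for find_average can be sketched as follows. This is only a minimal illustration, assuming that data_list is a list of rows in which row[k] holds Column k as a number (row[1] is an unused placeholder); the assignment leaves that organization up to you, and the sample rows are made up, not taken from the data files.

```python
def find_average(data_list):
    """Return total income (Column 6) divided by total earners (Column 3)."""
    total_income = sum(row[6] for row in data_list)
    total_earners = sum(row[3] for row in data_list)
    return total_income / total_earners


# Made-up rows, NOT real data. Assumed layout per row:
# (low, unused, high, count, cumulative count, cumulative %, combined income, average)
sample = [
    (0.01, None, 4999.99, 3, 3, 75.0, 6000.0, 2000.0),
    (5000.0, None, 9999.99, 1, 4, 100.0, 8000.0, 8000.0),
]

print(find_average(sample))  # 14000.0 / 4 = 3500.0
# The naive average of the last column, (2000 + 8000) / 2 = 5000.0,
# gives the single earner in the top range too much weight.
```

The denominator could equally be read from Column 4 of the last row (here sample[-1][4], which is 4), as the hint suggests.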

d) find_median(data_list) takes a list of data (of some organization of your choosing) and returns the median income. The function does not print anything. Unfortunately, this file of data is not sufficient to find the true median, so we need to approximate it.

Here is the rule we will use: find the data line whose cumulative percentage (Column 5) is closest to 50% and return its average income (Column 7). If two data lines are equally close, return either one.

Hint: Python's abs() function (absolute value) is potentially useful here.

Hint: your get_range() function should be useful here.

For testing your function, using our rule the median income for the 2014 data is $27,457.00.
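One way to express the "closest to 50%" rule is with min() and abs(). The sketch below again assumes that each row of data_list stores Column k at index k; the rows are invented for illustration only.

```python
def find_median(data_list):
    """Return Column 7 of the row whose Column 5 is closest to 50 percent."""
    closest = min(data_list, key=lambda row: abs(row[5] - 50.0))
    return closest[7]


# Made-up rows, NOT real data. Assumed layout per row:
# (low, unused, high, count, cumulative count, cumulative %, combined income, average)
sample = [
    (0.01, None, 4999.99, 30, 30, 30.0, 75000.0, 2500.0),
    (5000.0, None, 9999.99, 18, 48, 48.0, 135000.0, 7500.0),
    (10000.0, None, 14999.99, 7, 55, 55.0, 87500.0, 12500.0),
    (15000.0, None, 19999.99, 45, 100, 100.0, 787500.0, 17500.0),
]

print(find_median(sample))  # 48.0 is closer to 50 than 55.0, so 7500.0
```

On a tie, min() returns the first of the equally close rows, which the rule allows.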

e) get_range(data_list, percent) takes a list of data (of some organization of your choosing) and a percent (float). For the first data line whose cumulative percentage (Column 5) is greater than or equal to the percent parameter, it returns the salary range as a tuple (Columns 0 and 2), the cumulative percentage value (Column 5) and the average income (Column 7). Stated another way: ((col_0,col_2),col_5,col_7). The function does not print anything.

i. For testing, using the 2014 data and a percent value of 90 your function will return

 ((90000.0, 94999.99), 90.80624, 92420.5)
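A linear scan is enough for get_range. The sketch assumes rows are sorted by increasing income with Column k at index k, and it returns None when no row reaches the requested percent; that fallback is an assumption, since the assignment does not say what to do in that case. The rows are invented.

```python
def get_range(data_list, percent):
    """Return ((col_0, col_2), col_5, col_7) for the first row with col_5 >= percent."""
    for row in data_list:
        if row[5] >= percent:
            return (row[0], row[2]), row[5], row[7]
    return None  # assumption: behavior for an unreachable percent is unspecified


# Made-up rows, NOT real data. Assumed layout per row:
# (low, unused, high, count, cumulative count, cumulative %, combined income, average)
sample = [
    (0.01, None, 4999.99, 6, 6, 60.0, 12000.0, 2000.0),
    (5000.0, None, 9999.99, 3, 9, 90.0, 22500.0, 7500.0),
    (10000.0, None, 14999.99, 1, 10, 100.0, 12000.0, 12000.0),
]

print(get_range(sample, 90))  # ((5000.0, 9999.99), 90.0, 7500.0)
print(get_range(sample, 95))  # ((10000.0, 14999.99), 100.0, 12000.0)
```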

f) get_percent(data_list, income) takes a list of data (of some organization of your choosing) and an income (float). It finds the data line whose income range (Columns 0 and 2) contains the specified income and returns that income range together with the line's cumulative percentage (Column 5). Stated another way: ((col_0,col_2),col_5). The function does not print anything.

i. For testing, using the 2014 data and an income value of 150,000 your function will return

 ((150000.0, 154999.99), 96.87301)

g) do_plot(x_vals,y_vals,year), provided by us, takes two equal-length lists of numbers and plots them. Note that if you plot the whole file of data, the income ranges are so skewed that the result is a nearly vertical plot at the leftmost edge, so close to the edge that you cannot see it in the plot; it looks like nothing was plotted. Plotting the lowest 40 income ranges results in a more easily readable plot.
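The lookup described for get_percent and the "lowest 40 income ranges" selection for do_plot can be sketched together. As before, the row layout (Column k at index k, rows sorted by increasing income) is an assumption, lowest_ranges is a hypothetical helper name that is not part of the skeleton, and the rows are invented.

```python
def get_percent(data_list, income):
    """Return ((col_0, col_2), col_5) for the row whose range contains income."""
    for row in data_list:
        if row[0] <= income <= row[2]:
            return (row[0], row[2]), row[5]
    return None  # assumption: behavior for an uncovered income is unspecified


def lowest_ranges(data_list, count=40):
    """Hypothetical helper: x values (Column 0) and y values (Column 5) of the first rows."""
    rows = data_list[:count]
    x_vals = [row[0] for row in rows]
    y_vals = [row[5] for row in rows]
    return x_vals, y_vals


# Made-up rows, NOT real data. Assumed layout per row:
# (low, unused, high, count, cumulative count, cumulative %, combined income, average)
sample = [
    (0.01, None, 4999.99, 6, 6, 60.0, 12000.0, 2000.0),
    (5000.0, None, 9999.99, 3, 9, 90.0, 22500.0, 7500.0),
    (10000.0, None, 14999.99, 1, 10, 100.0, 12000.0, 12000.0),
]

print(get_percent(sample, 7500.0))     # ((5000.0, 9999.99), 90.0)
print(lowest_ranges(sample, count=2))  # ([0.01, 5000.0], [60.0, 90.0])
```

The two lists returned by lowest_ranges have equal length by construction, which is what do_plot expects.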

2. main()

a) Open the data file

b) Read the data file (using the file pointer from the opened file).

c) Print the year, the average income, and the median income (and a header). Here is the output format that I used: "{:<6d}${:<14,.2f}${:<14,.2f}"

d) Prompt whether to plot the data and if yes, plot the data: cumulative percentage (Column 5) vs. income (Column 0), using only the lowest 40 income ranges.

e) Loop, prompting for either r for range, p for percent, or nothing.

i. r: prompt for a percent (float) and output the income that is below that percent. Print an error message if an invalid number is entered (a percent must be between 0 and 100). Here is the output format that I used:

"{:4.2f}% of incomes are below ${:<13,.2f}."

ii. p: prompt for an income (float) and output the percent that earn more. Print an error message if an invalid income is entered (income must be positive). Here is the output format that I used:

"An income of ${:<13,.2f} is in the top {:4.2f}% of incomes."

iii. If only a carriage-return is entered, halt the program.

3. Call main() using

 if __name__ == "__main__":
     main()
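The three format strings quoted for main() pad their dollar amounts on the right, which explains the gap before the period in the sample output. A short demonstration, using values that appear in the tests for 2014:

```python
# Format strings copied from the assignment text.
summary = "{:<6d}${:<14,.2f}${:<14,.2f}".format(2014, 44569.20, 27457.00)
below = "{:4.2f}% of incomes are below ${:<13,.2f}.".format(90.0, 90000.0)
top = "An income of ${:<13,.2f} is in the top {:4.2f}% of incomes.".format(100000.0, 92.57)

# repr() makes the padding spaces visible.
print(repr(summary))  # '2014  $44,569.20     $27,457.00     '
print(repr(below))    # '90.00% of incomes are below $90,000.00    .'
print(repr(top))      # 'An income of $100,000.00    is in the top 92.57% of incomes.'
```

The "<" in each field left-aligns the number within its width, and the "," inserts thousands separators.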

Initial Program:

import pylab

def do_plot(x_vals,y_vals,year):
    '''Plot x_vals vs. y_vals where each is a list of numbers of the same length.'''
    pylab.xlabel('Income')
    pylab.ylabel('Cumulative Percent')
    pylab.title("Cumulative Percent for Income in "+str(year))
    pylab.plot(x_vals,y_vals)
    pylab.show()

def open_file():
    '''You fill in the doc string'''
    year_str = input("Enter a year where 1990 <= year <= 2015: ")
    pass # replace this line with your code

def read_file(fp):
    '''You fill in the doc string'''
    pass # replace this line with your code

def find_average(data_lst):
    '''You fill in the doc string'''
    pass # replace this line with your code

def find_median(data_lst):
    '''You fill in the doc string'''
    pass # replace this line with your code

def get_range(data_lst, percent):
    '''You fill in the doc string'''
    pass # replace this line with your code

def get_percent(data_lst,salary):
    '''You fill in the doc string'''
    pass # replace this line with your code

def main():
    # Insert code here to determine year, average, and median
    print("For the year {:4d}:".format(year))
    print("The average income was ${:<13,.2f}".format(avg))
    print("The median income was ${:<13,.2f}".format(median))

    response = input("Do you want to plot values (yes/no)? ")
    if response.lower() == 'yes':
        pass # replace this line
        # determine x_vals, a list of floats -- use the lowest 40 income ranges
        # determine y_vals, a list of floats of the same length as x_vals
        # do_plot(x_vals,y_vals,year)

    choice = input("Enter a choice to get (r)ange, (p)ercent, or nothing to stop: ")
    while choice:
        # Insert code here to handle choice
        choice = input("Enter a choice to get (r)ange, (p)ercent, or nothing to stop: ")

if __name__ == "__main__":
    main()
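The open_file stub in the skeleton might be filled in along the following lines. This is a sketch, not the reference solution: year_to_filename is a hypothetical helper introduced here so the validation can be checked without user input, and the error messages are modeled on the Test 3 transcript, whose exact spacing is not known.

```python
def year_to_filename(year_str):
    """Hypothetical helper: return 'yearXXXX.txt' for a valid year string, else None."""
    try:
        year = int(year_str)
    except ValueError:
        return None
    if 1990 <= year <= 2015:
        # String concatenation, as the hint in the assignment suggests.
        return "year" + str(year) + ".txt"
    return None


def open_file():
    """Loop until a valid year is entered and its file opens; return (fp, year)."""
    while True:
        year_str = input("Enter a year where 1990 <= year <= 2015: ")
        filename = year_to_filename(year_str)
        if filename is None:
            print("Error in year. Please try again.")
            continue
        try:
            fp = open(filename)
        except OSError:
            print("Error in file name: {} Please try again.".format(filename))
            continue
        return fp, int(year_str)


print(year_to_filename("2014"))  # year2014.txt
print(year_to_filename("1900"))  # None
```

Catching OSError covers a missing file as well as other reasons the file cannot be opened.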

Tests

There are unit tests for functions: find_average, find_median, get_range, and get_percent. The tests all call your read_file function to get your data structure to pass to those functions. The file read for these unit tests is the 2014 data.

Test 1

Enter a year where 1990 <= year <= 2015: 2014
Year Mean Median
2014 $44,569.20 $27,457.00
Do you want to plot values (yes/no)? no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:

Test 2

Enter a year where 1990 <= year <= 2015: 2014
Year Mean Median
2014 $44,569.20 $27,457.00
Do you want to plot values (yes/no)? no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 90
90.00% of incomes are below $90,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 100000
An income of $100,000.00 is in the top 92.57% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:

Test 3

Enter a year where 1990 <= year <= 2015: xxxx
Error in year. Please try again.
Enter a year where 1990 <= year <= 2015: 1900
Error in year. Please try again.
Enter a year where 1990 <= year <= 2015: 1999
Error in file name: year1999.txt Please try again.
Enter a year where 1990 <= year <= 2015: 2015
Year Mean Median
2015 $46,119.78 $27,459.59
Do you want to plot values (yes/no)? no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: x
Error in selection.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 104
Error in percent. Please try again
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: -2
Error in percent. Please try again
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 90
90.00% of incomes are below $90,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: -20
Error: income must be positive
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 100000
An income of $100,000.00 is in the top 92.03% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:

Test 4

Enter a year where 1990 <= year <= 2015: 2000
Year Mean Median
2000 $30,846.09 $17,471.75
Do you want to plot values (yes/no)? no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 40
40.00% of incomes are below $15,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 20000
An income of $20,000.00 is in the top 56.96% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:

Test 5 (not on Mirmir because this tests the plot; TAs will run this test.)

Enter a year where 1990 <= year <= 2015: 2015
Year Mean Median
2015 $46,119.78 $27,459.59
Do you want to plot values (yes/no)? yes
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:
