
Question



You can choose to complete this assignment by yourself or with a group of at most two total participants. Each person must turn the assignment in for grading, and each person must contribute to the development of the program. Use the file Donors_Data.csv.

Structured Data Processing

For purposes of this writeup, we will use examples from the Donors data file. The main outline of your assignment is to write a program that will read in the data from a file, such as the .csv file saved from Excel. This will be in a format that is structured with lines of data representing one type of unit (i.e., one donor in the donors file). Your program will represent the data as Python data structures. You may choose for the overall structure to be one or both of the following:

- A list of dictionaries, or some combination of lists, dictionaries, and NumPy arrays
- A pandas dataframe

You will do data exploration and cleaning on this data. Your program will do some processing to convert the data to a form that will answer at least two questions as described below and write files with the data suitable for answering each question. Graphing is optional.

Data: You may choose a data set to work with. As a guideline, data sets should have somewhere between 500 and 4,000 lines of data with some number of columns between 4 and 50. These guidelines are not exact limits, just guidance for selecting data. If the data comes in an Excel spreadsheet with a number of columns, it is okay to first edit the file to remove columns that you do not need for your processing. For example, in the Donors data, you might wish to create a separate spreadsheet with only a few columns of data.

Questions: For this assignment, at least one question that you choose to answer should look at the data in a different unit of analysis than is present in the data file. For example, instead of looking at individual donors, you could look at the donors of each of the nine income or wealth types.
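As one possible starting point for the reading and cleaning steps described above, here is a minimal sketch that uses only the standard library and the list-of-dictionaries option. The column names (DONOR_ID, WEALTH, HOME_VALUE) and the sample rows are assumptions made up for illustration; the real header of Donors_Data.csv may differ, and dropping incomplete rows is only one of several reasonable cleaning choices.

```python
import csv
import io

# Made-up rows standing in for Donors_Data.csv. The column names are
# assumptions for illustration, not the real header of the file.
SAMPLE_CSV = """DONOR_ID,WEALTH,HOME_VALUE
1,3,1200
2,3,
3,7,2500
4,,900
5,7,abc
"""


def load_donors(csv_file):
    """Read CSV rows into a list of dictionaries, one per donor."""
    return list(csv.DictReader(csv_file))


def clean_donors(rows, category_field, numeric_field):
    """Drop rows with a blank category or an unparseable number.

    The numeric field is converted to float in the rows that are kept.
    """
    cleaned = []
    for row in rows:
        category = (row[category_field] or "").strip()
        try:
            value = float(row[numeric_field])
        except (TypeError, ValueError):
            continue  # blank or non-numeric value: skip this donor
        if not category:
            continue  # no wealth type recorded: skip this donor
        cleaned.append({**row, category_field: category, numeric_field: value})
    return cleaned


# With a real file, replace io.StringIO(...) with
# open("Donors_Data.csv", newline="") inside a with-statement.
raw_rows = load_donors(io.StringIO(SAMPLE_CSV))
donors = clean_donors(raw_rows, "WEALTH", "HOME_VALUE")
print(len(raw_rows), "rows read;", len(donors), "rows kept")
```

Counting rows before and after cleaning, as the last line does, gives a figure that can go straight into the data cleaning section of the report.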
Simplest example question (you should do one more complex than this): For each wealth type, what is the average home value of all the donors of that type?

- Unit of analysis: wealth types
- Comparison: for each wealth type, compute the average home values of the neighborhoods of all the donors of that type
- Output: should be in a file with nine rows of data (you may also produce header and label rows), where each row has an income type (1-9) and the average home value

One way to increase the complexity of this particular question would be to add more items to be compared to the income types, e.g., add columns to the output with average total gifts or values of the last gifts. Another way is to introduce a more detailed unit of analysis; for example, suppose that for each income level you reported by gender, giving the average home value for both men and women in each category.

Other ideas:

- Compare donors in the various zip codes with various types or amounts of giving.
- Compare donors by the number of promotions with the total amount of donations and the frequency of donations.
- Compare the number of months since the last donation to the donation amounts.

What to Submit: In addition to the program that you write, you should write a small report. In it you should provide:

- Data and its source
- Description of your data exploration and data cleaning steps
- At least two clearly stated comparison questions with the unit of analysis, the comparison values, and how they are computed
- Brief description of the program
- Description of the output and analysis of that output
- Final conclusion about the data and what your most important takeaway was

For your program, you may use any of the code developed in class as a template, but it is absolutely essential that you use appropriate variable names and that you write original comments for what your program does. Recall that good comments demonstrate your understanding of the code that you write and the problem that you are trying to solve.
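The simplest example question (average home value per wealth type) could be sketched roughly as follows. This is an illustration under stated assumptions, not a model answer: the field names WEALTH and HOME_VALUE, the output labels, and the three sample donors are all made up, and the donors are assumed to have been cleaned already.

```python
import csv
import io
from collections import defaultdict

# Made-up, already-cleaned donors. Field names are assumptions.
donors = [
    {"WEALTH": "1", "HOME_VALUE": 1000.0},
    {"WEALTH": "1", "HOME_VALUE": 2000.0},
    {"WEALTH": "2", "HOME_VALUE": 3000.0},
]


def average_by_group(rows, group_field, value_field):
    """Return {group: mean of value_field} over the rows in each group.

    This changes the unit of analysis from individual donors to groups.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_field]] += row[value_field]
        counts[row[group_field]] += 1
    return {group: totals[group] / counts[group] for group in totals}


def write_averages(averages, out_file, group_label, value_label):
    """Write a header row, then one row per group in sorted order."""
    writer = csv.writer(out_file)
    writer.writerow([group_label, value_label])
    for group in sorted(averages):
        writer.writerow([group, round(averages[group], 2)])


averages = average_by_group(donors, "WEALTH", "HOME_VALUE")

# An in-memory buffer keeps the sketch self-contained; for a real run,
# pass a file opened with open(<output path>, "w", newline="") instead.
buffer = io.StringIO()
write_averages(averages, buffer, "wealth_type", "avg_home_value")
print(buffer.getvalue())
```

To move toward the more detailed unit of analysis mentioned above, the grouping key could become a tuple such as wealth type plus gender, assuming the file has a gender column; the accumulation logic stays the same.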

