Question
Problem 2 (6 points) Download the U.S. Senate 1976-2020 data set from the Harvard Dataverse. Read the data in its original format (.csv) using the function read.csv() in an appropriate way. The dataset has 3629 observations of 19 variables. The variables are listed as they appear in the data file.

* year : year in which the election was held
* state : state name
* state_po : U.S. postal code state abbreviation
* state_fips : state FIPS code
* state_cen : U.S. Census state code
* state_ic : ICPSR state code
* office : U.S. SENATE (constant)
* district : statewide (constant)
* stage : electoral stage, where "gen" means general elections, "runoff" means runoff elections, and "pri" means primary elections
* special : special election, where "TRUE" means special elections and "FALSE" means regular elections
* candidate : name of the candidate in upper case letters
* party_detailed : party of the candidate (always entirely uppercase). Parties are as they appear in the House Clerk report. In states that allow candidates to appear on multiple party lines, separate vote totals are indicated for each party. Therefore, for analysis that involves candidate totals, it will be necessary to aggregate across all party lines within a district. For analysis that focuses on two-party vote totals, it will be necessary to account for major party candidates who receive votes under multiple party labels. Minnesota party labels are given as they appear on the Minnesota ballots. Future versions of this file will include codes for candidates who are endorsed by major parties, regardless of the party label under which they receive votes.
* party_simplified : party of the candidate (always entirely uppercase). The entries will be one of: "DEMOCRAT", "REPUBLICAN", "LIBERTARIAN", "OTHER"
* writein : vote totals associated with write-in candidates, where "TRUE" means write-in candidates and "FALSE" means non-write-in candidates
* mode : mode of voting; states with data that doesn't break down returns by mode are marked as "total"
* candidatevotes : votes received by this candidate for this particular party
* totalvotes : total number of votes cast for this election
* unofficial : TRUE/FALSE indicator for unofficial result (to be updated later); this appears only for 2018 data in some cases
* version : date when this dataset was finalized

(a) Turn the variables year, state, and party_simplified into factor variables.
(b) Subset the dataset by extracting the data for the state of Texas. Only keep the columns year, state, candidatevotes, totalvotes, and party_simplified. Use this data subset for the rest of question 2.
(c) Calculate the average and median number of votes received by Democratic, Republican, Libertarian, and other candidates in the state of Texas. Round your numeric answers to the nearest whole number.
(d) Identify the years in which the Democratic candidate from Texas won.
(e) Create a barplot that shows the number of votes for the Republican candidate by year, then compute the average over the years. If the Republican vote count in a year is below that average, color that bar blue; otherwise color it red.
(f) Create a barplot of the total number of voters by year. Normalize the total number of voters per year by dividing by the population of Texas that year. Make a barplot of this normalized data as well. Note: you must find the yearly population of Texas for the relevant years yourself. Label the x axis of the barplots with appropriate names.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
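A minimal sketch of reading the file and of parts (a) and (b), not a verified solution. The helper name prepare_texas and the file name in the comment are assumptions made for illustration, and the toy rows are made up so the snippet runs without the real download. It also assumes the state column spells the name as "TEXAS" in some letter case.

```r
# Sketch only: `prepare_texas` is a hypothetical helper, not part of the assignment.
prepare_texas <- function(path) {
  # Read the original .csv; keep strings as character so the factor
  # conversion in (a) is explicit rather than implicit.
  senate <- read.csv(path, stringsAsFactors = FALSE)

  # (a) turn year, state and party_simplified into factors
  factor_cols <- c("year", "state", "party_simplified")
  senate[factor_cols] <- lapply(senate[factor_cols], factor)

  # (b) keep only the Texas rows and the five requested columns
  keep <- c("year", "state", "candidatevotes", "totalvotes", "party_simplified")
  is_texas <- toupper(as.character(senate$state)) == "TEXAS"
  texas <- senate[is_texas, keep]

  # Drop factor levels that no longer occur (the other states)
  droplevels(texas)
}

# Toy file with made-up rows, only to demonstrate the call. With the real data
# this would be something like prepare_texas("1976-2020-senate.csv"),
# where the file name is an assumption.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "year,state,candidatevotes,totalvotes,party_simplified,mode",
  "1976,TEXAS,60,100,DEMOCRAT,total",
  "1976,TEXAS,40,100,REPUBLICAN,total",
  "1976,OHIO,55,100,REPUBLICAN,total"
), toy_path)

texas <- prepare_texas(toy_path)
str(texas)
```

Calling droplevels() after subsetting keeps later per-state or per-party summaries from listing empty groups.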
Step: 2
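A possible approach to parts (c) and (d), sketched on a small made-up data frame that stands in for the Texas subset. The vote counts are toy values, not real results. For (d) the sketch assumes the winner of a year is the party with the most summed votes in that year, which ignores special elections and non-general stages, since the subset no longer carries those columns.

```r
# Toy stand-in for the Texas subset; all numbers are made up for illustration.
texas <- data.frame(
  year = factor(c(1976, 1976, 1978, 1978, 1978)),
  state = factor("TEXAS"),
  candidatevotes = c(60, 40, 44, 50, 6),
  totalvotes = c(100, 100, 100, 100, 100),
  party_simplified = factor(c("DEMOCRAT", "REPUBLICAN",
                              "DEMOCRAT", "REPUBLICAN", "LIBERTARIAN"))
)

# (c) mean and median votes per party, rounded to whole numbers
mean_votes <- round(tapply(texas$candidatevotes, texas$party_simplified, mean))
median_votes <- round(tapply(texas$candidatevotes, texas$party_simplified, median))
print(mean_votes)
print(median_votes)

# (d) sum votes per year and party, then pick the party with the most votes
votes <- tapply(texas$candidatevotes,
                list(texas$year, texas$party_simplified), sum)
votes[is.na(votes)] <- 0  # party did not run that year
winner <- colnames(votes)[max.col(votes, ties.method = "first")]
dem_years <- rownames(votes)[winner == "DEMOCRAT"]
print(dem_years)
```

Note that round() in R rounds exact halves to the even neighbor, so a mean of 52.5 would print as 52.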
Step: 3
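Parts (e) and (f) could be sketched as follows, again on made-up numbers. The population vector holds placeholder values only; the assignment asks you to look up the real yearly population of Texas yourself. Taking the first totalvotes entry per year is a simplifying assumption that holds only when each year has a single election.

```r
# Toy stand-in for the Texas subset; all numbers are made up for illustration.
texas <- data.frame(
  year = factor(c(1976, 1976, 1978, 1978, 1982, 1982)),
  candidatevotes = c(60, 40, 50, 50, 50, 30),
  totalvotes = c(100, 100, 100, 100, 80, 80),
  party_simplified = c("DEMOCRAT", "REPUBLICAN", "DEMOCRAT",
                       "REPUBLICAN", "DEMOCRAT", "REPUBLICAN")
)

# Avoid writing Rplots.pdf when the script runs in batch mode
if (!interactive()) pdf(NULL)

# (e) Republican votes per year, colored relative to their average
rep_rows <- texas[texas$party_simplified == "REPUBLICAN", ]
rep_votes <- tapply(rep_rows$candidatevotes, rep_rows$year, sum)
avg_rep <- mean(rep_votes)
bar_cols <- ifelse(rep_votes < avg_rep, "blue", "red")
barplot(rep_votes, col = bar_cols,
        xlab = "Election year", ylab = "Republican votes")
abline(h = avg_rep, lty = 2)  # dashed line marks the average

# (f) total votes per year, then the share of the population that voted
total_votes <- tapply(texas$totalvotes, texas$year, function(v) v[1])
population <- c("1976" = 1000, "1978" = 1000, "1982" = 1000)  # PLACEHOLDERS
turnout_share <- total_votes / population[names(total_votes)]
barplot(total_votes, xlab = "Election year", ylab = "Total votes cast")
barplot(turnout_share, xlab = "Election year",
        ylab = "Total votes / Texas population")
```

Indexing population by names(total_votes) matches the two series by year instead of relying on both being in the same order.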