Please use DataHub (RStudio) and treat all of this as one question. Please respond to every part. Thank you.

Part 2: Election data

Now we are going to look at data from the U.S. 2020 Presidential election published by the NYTimes. We will then look at how this data relates to the COVID-19 data we have been analyzing so far.

    ### JUST RUN THIS CELL (NO NEED TO EDIT ANYTHING)
    csv_path = 'data/nytimes-election2020.csv'
    election2020 = suppressMessages(read_csv(csv_path))
    str(election2020)

Below is a description of the election2020 data: In this dataset, there are 50 observations and 8 variables:

  • state: The state in which the votes were collected
  • electoral_votes: The number of Electoral College votes the state is allotted
  • votes2020: The total number of votes cast in 2020
  • margin2020: The margin in 2020, i.e. the percentage difference between the two leading parties. This value is positive if the leading party was 'republican' and negative if the leading party was 'democrat'.
  • party2020: The party that won the state in 2020
  • votes2016: The total number of votes cast in 2016
  • margin2016: The margin in 2016, i.e. the percentage difference between the two leading parties. This value is positive if the leading party was 'republican' and negative if the leading party was 'democrat'.
  • party2016: The party that won the state in 2016

2.1 Create a new variable in the election2020 data frame called flipped and assign to it whether or not a state voted differently in 2020 than in 2016 (Hint: Compare party2020 to party2016).
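One possible way to approach 2.1 is sketched below. Because the real CSV is not reproduced here, the sketch builds a small toy data frame with made-up states and parties (an assumption for illustration only); with the real data loaded, only the line that creates flipped is needed.

```r
# Toy stand-in for election2020 (made-up values, NOT the real NYTimes data)
election2020 <- data.frame(
  state     = c("State A", "State B", "State C", "State D"),
  party2016 = c("republican", "democrat", "republican", "democrat"),
  party2020 = c("democrat", "democrat", "republican", "republican"),
  stringsAsFactors = FALSE
)

# flipped is TRUE when the 2020 winner differs from the 2016 winner
election2020$flipped <- election2020$party2020 != election2020$party2016

election2020
```

The comparison is element-wise, so flipped is a logical column with one TRUE/FALSE value per state.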

2.2 Use gf_point() to plot margin2016 against margin2020. Color the points in the plot according to the flipped variable you made in 2.1.
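A minimal sketch for 2.2, again on a made-up toy data frame rather than the real file. It reads "margin2016 against margin2020" as margin2016 on the y-axis and margin2020 on the x-axis; that reading is an assumption, and swapping the two sides of the formula gives the other orientation.

```r
library(ggformula)

# Toy stand-in for election2020 (made-up values, NOT the real NYTimes data)
election2020 <- data.frame(
  state      = c("State A", "State B", "State C", "State D"),
  margin2016 = c(3.5, -12.0, 20.1, -0.8),
  margin2020 = c(-1.2, -15.3, 18.4, 0.5),
  party2016  = c("republican", "democrat", "republican", "democrat"),
  party2020  = c("democrat", "democrat", "republican", "republican"),
  stringsAsFactors = FALSE
)
election2020$flipped <- election2020$party2020 != election2020$party2016

# ggformula uses the formula y ~ x; color = ~flipped maps the variable to color
p <- gf_point(margin2016 ~ margin2020, data = election2020, color = ~flipped)
p
```

Writing color = ~flipped (with the tilde) maps the column to the color aesthetic; color = "blue" without a tilde would set a single fixed color instead.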

2.3 Please inspect the plot you made in 2.2 and note any observations you make.

2.4 Create a new variable in election2020 called vote_change that shows how many more votes there were in 2020 than in 2016 (if there were fewer votes vote_change should be negative, if there were more, it will be positive). Then display the top 5 states that had the largest change in voter turnout (Hint: Try using arrange() by vote_change).
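A sketch of 2.4 using dplyr, following the arrange() hint. The vote counts below are made-up toy numbers, not real turnout figures.

```r
library(dplyr)

# Toy stand-in for election2020 (made-up values, NOT the real NYTimes data)
election2020 <- data.frame(
  state     = c("State A", "State B", "State C", "State D", "State E", "State F"),
  votes2016 = c(1000, 5000, 2000, 8000, 3000, 4000),
  votes2020 = c(1200, 5900, 1900, 9500, 3300, 4100),
  stringsAsFactors = FALSE
)

# Positive when more votes were cast in 2020, negative when fewer
election2020 <- election2020 %>%
  mutate(vote_change = votes2020 - votes2016)

# Sort from largest to smallest change and keep the first five rows
top5 <- election2020 %>%
  arrange(desc(vote_change)) %>%
  head(5)

top5
```

This ranks by the largest increase. If "largest change" is meant to include large decreases as well, arrange(desc(abs(vote_change))) would rank by magnitude instead.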

Part 3: Election and COVID-19

In the cell below we have created a new data frame called election_covid_data that combines the US state-level COVID-19 data with some information about the 2020 election.

    ### JUST RUN THIS CELL (NO NEED TO EDIT ANYTHING)
    csv_path = 'data/nytimes-covid19-election.csv'
    election_covid_data = suppressMessages(read_csv(csv_path))
    str(election_covid_data)

3.1 Make a plot of the number of new cases (new_cases) over time using gf_line(). Color the lines in the plot according to party2020. Then use gf_refine(scale_color_manual(values = c('democrat' = 'blue', 'republican'='red'))) to give each party the color it is usually associated with.
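A rough sketch for 3.1 on a made-up toy data frame. The question does not list the columns of election_covid_data, so the date and state column names used here are assumptions; check the str() output and substitute the real names.

```r
library(ggformula)
library(ggplot2)  # provides scale_color_manual()

# Toy stand-in for election_covid_data (made-up values; the `date` and
# `state` column names are assumptions, not confirmed by the source)
election_covid_data <- data.frame(
  date      = rep(as.Date(c("2020-03-01", "2020-04-01", "2020-05-01")), times = 2),
  state     = rep(c("State A", "State B"), each = 3),
  new_cases = c(10, 40, 25, 5, 30, 60),
  party2020 = rep(c("democrat", "republican"), each = 3),
  stringsAsFactors = FALSE
)

# One line per state, colored by the party that won the state in 2020
p <- gf_line(new_cases ~ date, data = election_covid_data,
             color = ~party2020, group = ~state)

# Replace the default palette with the conventional party colors
p <- gf_refine(p, scale_color_manual(
  values = c("democrat" = "blue", "republican" = "red")
))
p
```

The group = ~state argument keeps each state as its own line; without it, all states that share a party would be joined into a single zig-zag line.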

3.2 What patterns do you notice?

 
