
Question



COVID-19 EDA: Perform an Exploratory Data Analysis using R.

Data source R code:

library(dplyr)      # provides %>%, select(), mutate()
library(lubridate)  # provides dmy()

data <- read.csv("https://opendata.ecdc.europa.eu/covid19/nationalcasedeath_eueea_daily_ei/csv", na.strings = "", fileEncoding = "UTF-8-BOM")

# Drop the continent column, parse dates, and convert identifiers to factors.
data <- data %>%
  select(-c("continentExp")) %>%
  mutate(dateRep = dmy(dateRep),
         countriesAndTerritories = as.factor(countriesAndTerritories),
         geoId = as.factor(geoId),
         countryterritoryCode = as.factor(countryterritoryCode))

A data dictionary for the dataset is available here: https://www.ecdc.europa.eu/sites/default/files/documents/Description-and-disclaimer_daily_reporting.pdf

Definitions:

* "Incidence rate" is equal to new daily cases per 100K individuals. Country population estimates can be found in 'popData2020.'

* "Fatality rate" is equal to new daily deaths per 100K individuals. Country population estimates can be found in 'popData2020.'
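The two definitions above amount to the same arithmetic: divide a daily count by the population and multiply by 100,000. A minimal sketch follows, assuming a hypothetical helper named `rate_per_100k` and a toy data frame with made-up values; only the column names `cases`, `deaths`, and `popData2020` are taken from the dataset.

```r
# Hypothetical helper (not part of the assignment): count per 100,000 people.
rate_per_100k <- function(count, population) {
  count / population * 1e5
}

# Toy rows with made-up values; only the column names mirror the ECDC data.
toy <- data.frame(
  countriesAndTerritories = c("Country A", "Country B"),
  cases       = c(500, 120),
  deaths      = c(10, 3),
  popData2020 = c(5e6, 8e5)
)

# Add both rates as new columns of the data frame.
toy$incidence_rate <- rate_per_100k(toy$cases, toy$popData2020)
toy$fatality_rate  <- rate_per_100k(toy$deaths, toy$popData2020)

print(toy)
```

Applied to the real data, the same two assignments would use `data$cases`, `data$deaths`, and `data$popData2020` in place of the toy columns.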

1. Descriptive Statistics

Give example R code for each of the following:

* Creation of a vector, 'incidence_rate,' equal to the daily new cases per 100K individuals, per country. Country populations are provided in 'popData2020.' This vector should be added to the 'data' data frame.

* Creation of a vector, 'fatality_rate,' equal to the new deaths per 100K individuals, per country. Country populations are provided in 'popData2020.' This vector should be added to the 'data' data frame.

* A visualization exploring new cases or incidence rates, per country, over time. Your visualization should include at least five (5) countries and include the entire time frame of the dataset.

* A visualization exploring new deaths or fatality rates, per country, over time. Again, your visualization should include at least five (5) countries.

* A table or visualization exploring some other aspect of the data. For example, you could explore case fatality rates per country; the number of deaths divided by the total number of cases. You will want to look across the entire time of the dataset, looking at the total cases and deaths, per country.
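One possible way to approach the descriptive tasks above is sketched here with dplyr and ggplot2. This is an illustration under stated assumptions, not a reference solution: the data frame `toy` is a synthetic stand-in (made-up counts and populations for five placeholder countries) so that the code runs offline, and the object names `p_incidence`, `p_fatality`, and `cfr` are hypothetical. For the actual analysis, `data` as loaded from the ECDC source would replace `toy`.

```r
library(dplyr)
library(ggplot2)

# Synthetic stand-in for `data`: made-up values, real column names.
set.seed(1)
countries <- paste("Country", LETTERS[1:5])
dates <- seq(as.Date("2021-01-01"), by = "day", length.out = 60)
toy <- data.frame(
  dateRep = rep(dates, times = length(countries)),
  countriesAndTerritories = rep(countries, each = length(dates)),
  popData2020 = rep(c(9e6, 11e6, 6e6, 5e6, 10e6), each = length(dates))
)
toy$cases  <- rpois(nrow(toy), lambda = 2000)
toy$deaths <- rpois(nrow(toy), lambda = 30)

# Rates per 100K individuals, added as columns of the data frame.
toy <- toy %>%
  mutate(
    incidence_rate = cases / popData2020 * 1e5,
    fatality_rate  = deaths / popData2020 * 1e5
  )

# Incidence rate over the full date range, one line per country.
p_incidence <- ggplot(toy, aes(x = dateRep, y = incidence_rate,
                               colour = countriesAndTerritories)) +
  geom_line() +
  labs(x = "Date", y = "New cases per 100K", colour = "Country")

# Fatality rate over the full date range, one line per country.
p_fatality <- ggplot(toy, aes(x = dateRep, y = fatality_rate,
                              colour = countriesAndTerritories)) +
  geom_line() +
  labs(x = "Date", y = "New deaths per 100K", colour = "Country")

# Case fatality rate: total deaths divided by total cases, per country.
cfr <- toy %>%
  group_by(countriesAndTerritories) %>%
  summarise(
    total_cases  = sum(cases, na.rm = TRUE),
    total_deaths = sum(deaths, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(case_fatality_rate = total_deaths / total_cases) %>%
  arrange(desc(case_fatality_rate))

print(cfr)
```

Calling `print(p_incidence)` or `print(p_fatality)` renders the charts. With the real dataset, one would typically first filter to five or more countries of interest, for example with `filter(countriesAndTerritories %in% c(...))`.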

2. Inferential Statistics

Select two (2) countries of your choosing and compare their incidence or fatality rates using hypothesis testing.

Please give example R code for each of the following:

* Visualization(s) comparing the daily incidence or fatality rates of the selected countries.

* A statement of the null hypothesis.

* A short justification of the statistical test selected.
  + Why is the test you selected an appropriate one for the comparison we're making?

* A brief discussion of any distributional assumptions of that test.
  + Does the statistical test we selected require assumptions about our data?
  + If so, does our data satisfy those assumptions?

* Your selected alpha.

* The test function output; i.e. the R output.

* The relevant confidence interval, if not returned by the R test output.

* A concluding statement on the outcome of the statistical test.
  + i.e. Based on our selected alpha, do we reject or fail to reject our null hypothesis?
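The inferential steps above could be sketched as follows. The choice of test here is an assumption made for illustration, not something the assignment prescribes: Welch's two-sample t-test (base R `t.test`) compares mean daily incidence rates without assuming equal variances and returns a confidence interval, while the Wilcoxon rank-sum test (`wilcox.test`) is shown as a rank-based alternative if the rates look strongly skewed. The data are synthetic (made-up gamma draws for two placeholder countries), and alpha = 0.05 is an arbitrary illustrative value. Real daily rates are usually autocorrelated, which both tests' independence assumption does not account for, so that limitation belongs in the discussion of assumptions.

```r
# Synthetic daily incidence rates for two placeholder countries (made-up values).
set.seed(42)
rates <- data.frame(
  country = rep(c("Country A", "Country B"), each = 90),
  incidence_rate = c(rgamma(90, shape = 4, rate = 0.20),
                     rgamma(90, shape = 4, rate = 0.16))
)

# Selected significance level (illustrative choice).
alpha <- 0.05

# H0: the mean daily incidence rate is the same in both countries.
# H1: the mean daily incidence rates differ (two-sided).
welch <- t.test(incidence_rate ~ country, data = rates,
                conf.level = 1 - alpha)
print(welch)            # test output, including the confidence interval
print(welch$conf.int)   # CI for the difference in means

# Rank-based alternative that does not rely on approximate normality.
mw <- wilcox.test(incidence_rate ~ country, data = rates,
                  conf.int = TRUE, conf.level = 1 - alpha)
print(mw)

# Concluding statement based on the selected alpha.
decision <- if (welch$p.value < alpha) "reject H0" else "fail to reject H0"
cat("Welch t-test at alpha =", alpha, ":", decision, "\n")
```

With the real dataset, `rates` would be replaced by the rows of `data` for the two selected countries, and a histogram or Q-Q plot of each country's rates would help judge whether the t-test's assumptions are plausible.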
