Question

We will use the dataset toxins.and.cancer from the nutshell library.

 
 
 
 
  13. Using dplyr, create 2 new vectors (columns) in the data frame:
    • cancer.rate = number of cancer deaths / population
    • state.toxins = total toxic chemicals released / surface area
  14. Now, compare the overall cancer rate (cancer.rate) to the presence of toxins (state.toxins) by making a scatter plot with ggplot2:
    • include a smooth regression line
    • label each point with its state code (State_Abbrev)
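
The two steps above could be sketched roughly as follows. This is a minimal sketch, not the expected answer: the column names deaths_total and total_toxic_chemicals are assumptions (the question does not name them, so check with str(toxins.and.cancer)), while Population, Surface_Area and State_Abbrev come from the exercise text. The object names toxins and p_all are made up for illustration.

```r
library(nutshell)
library(dplyr)
library(ggplot2)

data(toxins.and.cancer)

# Add the two rate columns with dplyr::mutate().
# ASSUMPTION: deaths_total and total_toxic_chemicals are the relevant
# column names; confirm with str(toxins.and.cancer).
toxins <- toxins.and.cancer %>%
  mutate(
    cancer.rate  = deaths_total / Population,
    state.toxins = total_toxic_chemicals / Surface_Area
  )

# Scatter plot with a fitted line and state-code labels.
# method = "lm" draws a straight regression line; dropping the method
# argument lets ggplot2 pick its default smoother instead.
p_all <- ggplot(toxins, aes(x = state.toxins, y = cancer.rate)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_text(aes(label = State_Abbrev), vjust = -0.5, size = 3)

print(p_all)
```

Whether a straight line or a curved smoother is wanted is not stated in the question, so either choice is a judgment call.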

What, if any, conclusions can you draw from this plot?

Next, investigate whether there might be a stronger correlation between airborne toxins and lung cancer.

  15. Create new columns in the data frame:
    • lung.deaths = deaths_lung / Population
    • airborne.toxins = air_on_site / Surface_Area
    • create a scatter plot using these data
    • label each point with the state ID code
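
A possible sketch for the lung cancer comparison, using only the column names given in the exercise (deaths_lung, Population, air_on_site, Surface_Area, State_Abbrev). The object names lung and p_lung are hypothetical, and the fitted line and the correlation coefficient are optional extras, not something the question asks for.

```r
library(nutshell)
library(dplyr)
library(ggplot2)

data(toxins.and.cancer)

# Per-capita lung cancer deaths and airborne releases per unit area.
lung <- toxins.and.cancer %>%
  mutate(
    lung.deaths     = deaths_lung / Population,
    airborne.toxins = air_on_site / Surface_Area
  )

# Scatter plot labelled with state codes; the lm line is an optional aid
# for comparing this plot with the overall cancer rate plot.
p_lung <- ggplot(lung, aes(x = airborne.toxins, y = lung.deaths)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_text(aes(label = State_Abbrev), vjust = -0.5, size = 3)

print(p_lung)

# Optional numeric companion to the visual comparison (Pearson correlation).
lung_cor <- cor(lung$airborne.toxins, lung$lung.deaths, use = "complete.obs")
print(lung_cor)
```

A correlation coefficient summarizes only the linear association, so it supplements the plot and does not replace it.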

 
 
  16. Comparing the 2 plots, does one show a stronger possible correlation with cancer rates? Explain.

One more data set to explore. Again from library nutshell, we'll now look at birthweight data. Recall from BIOL 1308 that birthweight is under stabilizing selection.

Load the data and save it to the variable births.

 

Let's reduce the size of the data to make it easier to work with.

  17. Subset the data to contain just the first 10000 rows, and store this in births2.
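
Loading and subsetting might look like the sketch below. The question does not name the birthweight dataset; this assumes it is births2006.smpl from the nutshell package, so substitute the right name if your course uses a different one.

```r
library(nutshell)
library(dplyr)

# ASSUMPTION: the birthweight data is the nutshell dataset births2006.smpl.
data(births2006.smpl)
births <- births2006.smpl

# Keep only the first 10000 rows.
births2 <- births %>% slice_head(n = 10000)

# Base R equivalent, if you prefer not to use dplyr here:
# births2 <- births[1:10000, ]

dim(births2)
```

Taking the first rows is what the exercise asks for; note that it is not a random sample of the full data.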

  18. First, set up our base plot (p) with the reduced dataset, births2.
  19. Second, starting with the base plot p, create a bar chart (geom_bar) of the number of births per day of the week (DOB_WK) using births2 and ggplot.
    • x axis is DOB_WK
      • note: we have to tell R that these values are factors, so define: aes(x=factor(DOB_WK))
    • store this as p1, our starting point
  20. The data are rich enough, so let's add more aesthetics.
    • Fill each bar with different colors corresponding to what type of birth it was (C-section, Vaginal, or Unknown).
    • Start with our 1st plot (p), but add a bar chart with the x-axis DOB_WK, and also fill columns using different colors corresponding to what type of birth it was (DMETH_REC).
    • store this as plot p2
  21. Now, let's make a histogram where the x axis is all women's ages (x=MAGER), to get an idea of what this overall distribution looks like.
    • start with base plot (p)
    • create a histogram (geom_histogram)
    • color by what type of birth it was (DMETH_REC)
    • set binwidth = 1 to set the width of the bins
  22. Now we want to look more deeply at the data.
    • In ggplot it is easy to facet this single graph to make multiple graphs along a dimension.
    • Facet (facet_wrap) based on the dimension of day of birth, DOB_WK.
  23. Lastly, let's tease apart day of birth (DOB_WK) by the sex (SEX) of the child.
    • Accomplish this using facet_grid
      • you want to facet DOB_WK by SEX
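
The plotting sequence above could be sketched as follows. The names p, p1 and p2 come from the exercise; p3, p4 and p5 are hypothetical names added here for the histogram and its faceted versions. As before, this assumes the birthweight data is births2006.smpl from nutshell, and it reads "color by type of birth" as the fill aesthetic, which is an interpretation.

```r
library(nutshell)
library(ggplot2)

# ASSUMPTION: births2006.smpl is the birthweight dataset meant by the exercise.
data(births2006.smpl)
births2 <- head(births2006.smpl, 10000)

# Base plot: data only, no layers yet.
p <- ggplot(births2)

# Births per day of the week; factor() makes DOB_WK a discrete axis.
p1 <- p + geom_bar(aes(x = factor(DOB_WK)))

# Same bars, filled by delivery method (DMETH_REC).
p2 <- p + geom_bar(aes(x = factor(DOB_WK), fill = DMETH_REC))

# Histogram of mother's age, one-year bins, filled by delivery method.
p3 <- p + geom_histogram(aes(x = MAGER, fill = DMETH_REC), binwidth = 1)

# One panel per day of the week.
p4 <- p3 + facet_wrap(~ DOB_WK)

# Rows are days of the week, columns are the sex of the child.
p5 <- p3 + facet_grid(DOB_WK ~ SEX)

print(p1)
print(p2)
print(p3)
print(p4)
print(p5)
```

Putting the aesthetics inside each geom, and not in the base plot, is what lets the same p be reused for both the bar charts and the histogram.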

Plotting

You may want to refer to the documentation for ggplot2.

Setup code shown in the question image:

  library(nutshell)
  ## Loading required package: nutshell.bbdb
  ## Loading required package: nutshell.audioscrobbler
  data(toxins.and.cancer)
  help("toxins.and.cancer")
  attach(toxins.and.cancer)
  #str(toxins.and.cancer)
