Question

Description of the data

The Beijing PM2.5 Data Set is taken from the website https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data. It contains the PM2.5 data of the US Embassy in Beijing and the meteorological data from Beijing Capital International Airport. These data are stored in the R data frame Beijing_pollution.RData, which contains the following columns:

  1. No: row number
  2. year: year of data in this row
  3. month: month of data in this row
  4. day: day of data in this row
  5. hour: hour of data in this row
  6. pm2.5: PM2.5 concentration (ug/m3)
  7. DEWP: Dew Point
  8. TEMP: Temperature
  9. PRES: Pressure (hPa)
  10. cbwd: Combined wind direction: NE (North-East), SE (South-East), NW (North-West), cv (South-West)
  11. Iws: Cumulated wind speed (m/s)
  12. Is: Cumulated hours of snow
  13. Ir: Cumulated hours of rain

  • You may use both graphical and analytical methods (such as linear regression and scatterplot smoother) to find interesting patterns in the data that may help in building predictive statistical models.
  • The data analysis, which should address the following questions, needs to be summarized in the form of a report (a maximum of 1000 words, i.e., approximately 2 pages).
  • The variable pm2.5 has many missing values, which should be accounted for while carrying out any analysis.
  • While building a predictive model (question 4 below), you may consider transforming certain variables and fitting linear regression models using original or transformed variables.
  • Questions:
  1. Provide a graphical summary, with explanations, to describe how the various numeric variables relate to each other.
  2. Briefly describe if you find any relationship between pm2.5 and month. How do such patterns change across hour and cbwd?
  3. Consider the variables DEWP, TEMP and PRES and describe how the relationships among any pair of these variables vary across strata determined by factors such as year, month and hour. (You need to think carefully about how to present such information graphically).
  4. Can you identify a subset of variables that are effective in terms of predicting pm2.5? Are there differential effects across month, hour and cbwd? Propose a predictive model based on your analysis.

I am stuck on question 4. R gives me an error and I don't know how to solve it.
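For reference, here is a minimal sketch of the kind of workflow question 4 seems to call for: drop the rows where pm2.5 is missing, log-transform the response, and compare a main-effects linear model against one with cbwd interactions. The actual error message is not reproduced here, so this does not claim to fix it. The data below are synthetic stand-ins generated only so the code runs on its own (they are not the real Beijing_pollution values), and the names `dat`, `complete`, `log_pm`, `fit_main` and `fit_int` are hypothetical.

```r
# Synthetic stand-in for the Beijing_pollution data frame.
# ASSUMPTION: these values are made up for illustration; only the column
# names follow the description of the data set.
set.seed(1)
n <- 500
dat <- data.frame(
  month = factor(sample(1:12, n, replace = TRUE), levels = 1:12),
  hour  = sample(0:23, n, replace = TRUE),
  cbwd  = factor(sample(c("NE", "SE", "NW", "cv"), n, replace = TRUE)),
  DEWP  = rnorm(n, mean = 2, sd = 14),
  TEMP  = rnorm(n, mean = 12, sd = 12),
  PRES  = rnorm(n, mean = 1016, sd = 10),
  Iws   = rexp(n, rate = 1 / 20)
)
dat$pm2.5 <- exp(4 + 0.03 * dat$DEWP - 0.04 * dat$TEMP -
                 0.005 * dat$Iws + rnorm(n, sd = 0.5))
dat$pm2.5[sample(n, 25)] <- NA  # mimic the missing pm2.5 readings

# Keep only the rows where the response is observed.
complete <- dat[!is.na(dat$pm2.5), ]

# Log-transform the right-skewed response; log1p() tolerates zero readings.
complete$log_pm <- log1p(complete$pm2.5)

# Main-effects model: month and cbwd enter as factors.
fit_main <- lm(log_pm ~ DEWP + TEMP + PRES + log1p(Iws) + month + cbwd,
               data = complete)

# Same model, but letting the DEWP and TEMP slopes differ by wind direction.
fit_int <- lm(log_pm ~ (DEWP + TEMP) * cbwd + PRES + log1p(Iws) + month,
              data = complete)

# F-test for whether the cbwd interactions improve the fit.
print(anova(fit_main, fit_int))
```

With the real data, the synthetic block would be replaced by `load("Beijing_pollution.RData")`, and month and cbwd may need converting with `factor()` before fitting. The same comparison could be repeated with interactions involving month or hour to look for differential effects across those factors.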
