Question
Description of the data
The Beijing PM2.5 Data Set is taken from the website https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data. It contains the PM2.5 data from the US Embassy in Beijing and the meteorological data from Beijing Capital International Airport. These data are collected in the R data file Beijing_pollution.RData, which contains a data frame with the following columns:
- No: row number
- year: year of data in this row
- month: month of data in this row
- day: day of data in this row
- hour: hour of data in this row
- pm2.5: PM2.5 concentration (ug/m3)
- DEWP: Dew Point
- TEMP: Temperature
- PRES: Pressure (hPa)
- cbwd: Combined wind direction: NE (North-East), SE (South-East), NW (North-West), cv (calm and variable)
- Iws: Cumulated wind speed (m/s)
- Is: Cumulated hours of snow
- Ir: Cumulated hours of rain
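A quick way to check that a data frame matches this description is to inspect its structure and count the missing values per column. The sketch below is only an illustration: it builds a small simulated stand-in called `beijing` (a hypothetical name; the object stored in Beijing_pollution.RData may be named differently), and every value in it is invented.

```r
# Simulated stand-in with the same column names as the description above.
# All values are invented for illustration; they are not the real data.
set.seed(1)
n <- 200
beijing <- data.frame(
  No    = seq_len(n),
  year  = sample(2010:2014, n, replace = TRUE),
  month = sample(1:12, n, replace = TRUE),
  day   = sample(1:28, n, replace = TRUE),
  hour  = sample(0:23, n, replace = TRUE),
  pm2.5 = round(rexp(n, rate = 1 / 90)),
  DEWP  = round(rnorm(n, 2, 14)),
  TEMP  = round(rnorm(n, 12, 12)),
  PRES  = round(rnorm(n, 1016, 10)),
  cbwd  = sample(c("NE", "NW", "SE", "cv"), n, replace = TRUE),
  Iws   = round(rexp(n, rate = 1 / 20), 2),
  Is    = rpois(n, 0.05),
  Ir    = rpois(n, 0.2)
)
beijing$pm2.5[sample(n, 20)] <- NA    # mimic the missing PM2.5 readings

# Treat the wind direction as a categorical variable.
beijing$cbwd <- factor(beijing$cbwd)

str(beijing)                          # column types
na_counts <- colSums(is.na(beijing))  # missing values per column
print(na_counts)
```

With the real file, the same `str()` and `colSums(is.na(...))` calls would apply to whatever object `load("Beijing_pollution.RData")` creates.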
- You may use both graphical and analytical methods (such as linear regression and scatterplot smoother) to find interesting patterns in the data that may help in building predictive statistical models.
- The data analysis, which should address the following questions, needs to be summarized in the form of a report (a maximum of 1000 words, i.e., approximately 2 pages).
- The variable pm2.5 has many missing values, which should be accounted for while carrying out any analysis.
- While building a predictive model (question 4 below), you may consider transforming certain variables and fitting linear regression models using original or transformed variables.
- Questions:
- Provide a graphical summary, with explanations, to describe how the various numeric variables relate to each other.
- Briefly describe if you find any relationship between pm2.5 and month. How do such patterns change across hour and cbwd?
- Consider the variables DEWP, TEMP and PRES and describe how the relationships among any pair of these variables vary across strata determined by factors such as year, month and hour. (You need to think carefully about how to present such information graphically).
- Can you identify a subset of variables that are effective in terms of predicting pm2.5? Are there differential effects across month, hour and cbwd? Propose a predictive model based on your analysis.
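For the questions about how patterns change across strata, one analytic complement to the plots is to compute the same summary within each level of a factor. The sketch below does this on simulated stand-in data (the name `beijing` and all values are assumptions, not the real data set): it computes the median pm2.5 for each month and wind direction, and the correlation between TEMP and DEWP within each month.

```r
# Simulated stand-in data; values are invented for illustration only.
set.seed(2)
n <- 2000
beijing <- data.frame(
  month = sample(1:12, n, replace = TRUE),
  cbwd  = factor(sample(c("NE", "NW", "SE", "cv"), n, replace = TRUE)),
  TEMP  = rnorm(n, 12, 12)
)
beijing$DEWP  <- beijing$TEMP - rexp(n, 1 / 10)  # dew point below temperature
beijing$pm2.5 <- rexp(n, 1 / 90)
beijing$pm2.5[sample(n, 100)] <- NA              # missing PM2.5 readings

# Median PM2.5 for each month x wind-direction cell.
# The formula interface of aggregate() drops rows with NA by default.
med_pm <- aggregate(pm2.5 ~ month + cbwd, data = beijing, FUN = median)
head(med_pm)

# Correlation between TEMP and DEWP within each month.
cor_by_month <- sapply(split(beijing, beijing$month),
                       function(d) cor(d$TEMP, d$DEWP))
round(cor_by_month, 2)
```

The same `split()` pattern works for year or hour and for any other pair among DEWP, TEMP and PRES. For a graphical version, a conditioning plot such as `coplot(DEWP ~ TEMP | factor(month), data = beijing)` is one option.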
I am stuck on question 4. R gives me an error and I don't know how to solve it.
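The error message itself is not shown, so its cause cannot be identified from the question. One possible cause, offered only as a guess: if pm2.5 contains zeros, then `log(pm2.5)` is `-Inf` for those rows and `lm()` stops with an "NA/NaN/Inf in 'y'" error. A shifted log such as `log1p()` avoids this. The sketch below shows that pattern on simulated stand-in data (the name `beijing`, the values, and the choice of predictors are all assumptions, not a recommended final model).

```r
# Simulated stand-in data; values are invented for illustration only.
set.seed(3)
n <- 1500
beijing <- data.frame(
  month = sample(1:12, n, replace = TRUE),
  cbwd  = factor(sample(c("NE", "NW", "SE", "cv"), n, replace = TRUE)),
  DEWP  = rnorm(n, 2, 14),
  TEMP  = rnorm(n, 12, 12),
  PRES  = rnorm(n, 1016, 10),
  Iws   = rexp(n, 1 / 20)
)
beijing$pm2.5 <- round(exp(4 + 0.03 * beijing$DEWP - 0.02 * beijing$TEMP -
                           0.3 * log1p(beijing$Iws) + rnorm(n, 0, 0.6)))
beijing$pm2.5[sample(n, 75)] <- NA    # missing responses
beijing$pm2.5[1:3] <- 0               # zeros would break log()

# Keep only rows with an observed response. lm() would also drop them
# through its default na.action, but being explicit makes n clear.
obs <- beijing[!is.na(beijing$pm2.5), ]

# log1p(y) = log(1 + y) stays finite when y is 0.
fit_main <- lm(log1p(pm2.5) ~ DEWP + TEMP + PRES + log1p(Iws) +
                 factor(month) + cbwd, data = obs)

# Does the effect of wind speed differ by wind direction?
fit_int <- update(fit_main, . ~ . + log1p(Iws):cbwd)

summary(fit_main)$adj.r.squared
anova(fit_main, fit_int)              # F-test for the interaction terms
```

Comparing nested models with `anova()` is one way to check for differential effects; the same approach extends to interactions with `factor(month)` or `factor(hour)`.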