Here is the data:
2. Recordings of the levels of pollutants and various meteorological conditions are made hourly at several stations by the Los Angeles Pollution Control District. This agency attempts to construct mathematical/statistical models to predict pollution levels and to gain a better understanding of the complexities of air pollution. Obviously, very large quantities of data are collected and analyzed, but only a small set of data will be considered in this problem. The file airpollution.csv (in the Data folder) contains the maximum level of an oxidant (a photochemical pollutant) and the morning averages of four meteorological variables: wind speed, temperature, humidity, and insolation (a measure of the amount of sunlight). The data cover 30 days during one summer.

a) Examine the relationship of oxidant level to each of the four meteorological variables and the relationship of the meteorological variables to each other. Which of the covariates are significantly associated with air pollution levels?

b) The standard statistical model used for multiple linear regression assumes that the errors are random and independent of one another. In data that are collected over time, the error at any given time may well be correlated with the error from the preceding time. This phenomenon is called serial correlation, and in its presence, the estimated standard errors of the coefficients developed in this chapter may be incorrect.
Can you detect serial correlation in the errors from your model fits?

Day Wind Temperature Humidity Insolation Oxidant
77 67 78 15 50 47 80 66 77 20 57 75 77 73 13 38 72 73 69 21 71 75 78 12 74 75 80 12 78 64 75 12 82 59 78 11 82 60 75 12 82 62 58 20 82 59 76 11 40 80 66 76 17 20 42 31 68 71 74 23 40 85 62 48 82 70 73 17 79 66 72 16 16 50 55 72 63 69 10 57 11 18 52 72 61 19 48 76 60 74 11 20 52 77 59 72 9 52 73 58 67 5 21 22 48 68 63 30 5 67 65 4 23 65 23 72 7 24 53 71 53 78 18 25 36 75 54 26 45 81 44 81 17 43 84 46 78 23 28 42 83 43 78 23 29 87 44 77 24 35 43 92 35 79 25 30
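
For anyone working through parts a) and b), here is a minimal sketch of the two calculations the question points at: a pairwise Pearson correlation (part a) and two common checks for first-order serial correlation in residuals (part b), the Durbin-Watson statistic and the lag-1 autocorrelation. It is plain Python with no libraries. The function names are my own, and the residuals in the usage example are made-up numbers, not values from airpollution.csv; you would pass in the residuals from your own regression fit, kept in day order.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


def lag1_autocorr(resid):
    """Sample lag-1 autocorrelation of the residuals."""
    n = len(resid)
    mean = sum(resid) / n
    num = sum((resid[t] - mean) * (resid[t - 1] - mean) for t in range(1, n))
    den = sum((e - mean) ** 2 for e in resid)
    return num / den


if __name__ == "__main__":
    # Made-up residuals for illustration only (NOT from the data set).
    alternating = [1, -1, 1, -1]   # sign flips every day
    persistent = [1, 1, -1, -1]    # neighbouring days look alike
    print(durbin_watson(alternating), lag1_autocorr(alternating))
    print(durbin_watson(persistent), lag1_autocorr(persistent))
```

A Durbin-Watson value near 2 is what you would expect with no first-order serial correlation; values well below 2 point to positive correlation between consecutive errors, and values well above 2 point to negative correlation. Plotting the residuals against day, or residual t against residual t-1, is a useful visual companion to these numbers.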