Basic Data Analysis For Time Series With R 1st Edition DeWayne R. Derryberry - Solutions

7. Develop a simulation similar to the one in Section 20.5 with 0.65 < R² < 0.85. Use the ARMA(m,l) structure below and pick your own parameters for the periodic functions. (i) Simulate the data with n =
6. In the book the phrase “residuals not distinguishable from white noise” is used a lot. Why would it be less correct to say “residuals that are white noise”?
5. A student argues: “It doesn’t matter where the filter (A) came from, if the result is quasi-independent observations [observations not distinguishable from white noise based on ar.yw() or
4. Show that, if a model is AR(1) with constant variance and gaps appear in equally spaced data, the ratio of the largest variance to the smallest variance is at most 1/[1 − a].
3. In the text, the time stamps for the temperature data were explored to determine the degree of equal or unequal spacing. Compute a “lag” variable for the CO2 time stamp and perform a similar
2. Show that ar.mle() and ar.yw() provide similar analysis for the residuals from the cubic Vostok model, both in the choice of model and in which models are considered close to the best models based
1. Filter the Vostok cubic regression with an initial AR(2) model, the one suggested by ar.yw(), and find the order of the residuals after filtering.
1. Perform a similar analysis comparing the high- and low-elevation groups. (i) Fit a Fourier series to each watershed and find the estimated AR(m) order of the residuals. (ii) Plot the averaged
6. Perform a complete analysis of the final regression (save all R code and turn it in). (i) Fit a model log(breast cancer) ∼ log(prostate cancer) + log(prostate cancer)². (ii) Use the residuals to
5. For the second data set, why are the quadratic models smooth fits on the original scale, but rough/choppy fits on the filtered scale?
4. Verify the regression log(breast cancer) ∼ log(colon cancer) + log(colon cancer)² must be filtered twice. (i) Fit a regression model to the data. (ii) Use ar.mle() to develop a filtering matrix A
3. For the female regressions: log(breast cancer) ∼ log(colon cancer) + log(colon cancer)²; log(breast cancer) ∼ log(lung cancer) + log(lung cancer)²; log(breast cancer) ∼ log(ovary cancer) +
2. Using ar.yw() for small samples: The purpose of this exercise is to show that the residuals are not filtered properly when using ar.yw() for a specific data set. (i) Fit the model log(prostate
1. Perform a nonparametric analysis of the first data set. Compare the results to the t-tests found in the text.
8. (A very open-ended problem). Perhaps more a project than an exercise: simulate some periodic data with a varying mean, period, phase, and amplitude, but with a moderate amount of random ARMA(2,2)
7. (An open-ended problem). The one- and two-step ahead methods did not use the data as efficiently as possible, because the AR(m) model was built on the training set and never updated, as the model
6. In the one-step ahead model selection table (Table 17.1), the models with very small or very large windows have more complex structure in the residuals. Is this evidence that these models are not
5. Repeat the steps from the Lynx exercise with the file “Zuerich sunpots.txt” from the Exercises folder (DataMarket–Time Series Data Library–Physics–Zuerich monthly sunspot numbers
4. Modify the R code from the book to create a two-step ahead method and apply this to the Lynx data.
3. This exercise uses the file “LYNX.txt” from the Exercises folder. (i) Assess whether a transformation is required for the Lynx data. If a transformation is appropriate, use it to complete all
2. Consider the model y = 3 + 0.5x + 0.2x² + ε, for −2 < x < 2. (i) Simulate n = 50 observations from this model with a definite signal and some definite noise, perhaps 0.7 < R² < 0.9. (ii) Fit a
1. Verify the estimates of μ, B, and C for the simple periodic model when k = 10.7.
12. Produce prediction and prediction intervals for AR(2) errors εj+1, εj+1, and εj+1, similar to Section 16.2.5.
11. (i) Simulate n = 800 observations with a simple periodic function, a period of 100, and errors of the AR(3) form given in exercise 4. Choose the simple periodic function so that initially 0.75 < R² <
10. For the models compared in Table 16.9 (LA ozone), the models were all nested. Use hypothesis testing to pick the best model.
9. Perform an analysis of the passenger miles flown in the UK (“UK road deaths.txt” from the Exercises folder). Use model selection and filtering to find the best model. Are passenger miles
8. Perform an intervention analysis for the monthly Minneapolis public drunkenness data (“Minn drunks.txt” from the Exercises folder or DataMarket–TSDLme–Monthly Minneapolis public drunkenness
7. Use a trend and periodic model to fit the monthly Boston armed robberies data (“Boston robberies.txt” in the Exercises folder or DataMarket–TSDL–Crime–Monthly Boston armed robberies
6. Use a trend and periodic model to fit the monthly milk production data (“Milk production.txt”, Exercises folder). Use model selection and filtering to find the best model. When are the low and
5. Fit a periodic model to the Ozone, Arosa (or Ozone Azusa) data in the Exercises folder (“Ozone Arosa.txt” and “Ozone Azura.txt”, Exercises folder or DataMarket–Time Series Data Library
4. Consider the AR(3) model (1 − 0.7B)(1 − [0.4 − 0.1i]B)(1 − [0.4 + 0.1i]B)εn = wn: (i) Find a1, a2, and a3. (ii) Simulate errors for this model with n = 400 and σ = 2.5. (iii) Construct a
3. Consider the AR(2) model (1 − 0.7B)(1 + 0.1B)εn = wn: (i) Find a1 and a2. (ii) Simulate errors for this model with n = 300 and σ = 1.5. (iii) Construct a filtering matrix A using a1 and
2. (There could be many reasonable approaches to this exercise. A good statistician will wrestle with this question all their life.) Consider a unique event such as the 2037 Super bowl of the 3011
1. Let b(B) = (1 − 0.6B)²(1 + 0.4B) = 1 − 0.8B − 0.12B² + 0.144B³. (i) Simulate errors for this model with n = 5000 and standard deviation 1.0. (ii) Plot acf(error) and pacf(error). (iii) Use
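
The factored and expanded forms of b(B) in exercise 1 above can be checked numerically before any simulation is attempted. The following is a minimal sketch, assuming polynomials are stored as coefficient vectors ordered from the constant term upward; `poly_mult` is a hypothetical helper written for this illustration and is not a function from the book or from base R.

```r
# Hypothetical helper (not from the book): multiply two polynomials in B.
# Coefficient vectors are ordered from the constant term upward.
poly_mult <- function(p, q) {
  out <- numeric(length(p) + length(q) - 1)
  for (i in seq_along(p)) {
    for (j in seq_along(q)) {
      out[i + j - 1] <- out[i + j - 1] + p[i] * q[j]
    }
  }
  out
}

f1 <- c(1, -0.6)  # 1 - 0.6B
f2 <- c(1, 0.4)   # 1 + 0.4B
b_coef <- poly_mult(poly_mult(f1, f1), f2)
print(round(b_coef, 3))  # coefficients of 1, B, B^2, B^3
```

The rounded coefficients should agree with the expansion 1 − 0.8B − 0.12B² + 0.144B³ stated in the exercise. The same kind of check could be applied to the real factors in exercise 3 of this group.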
11. Consider the MA(2) model εj = −b2wj−2 − b1wj−1 + wj with invertibility conditions (i) b1 + b2 < 1, (ii) b2 − b1 < 1, and (iii) |b2| < 1. Pick choices of b1 and b2 that violate each
10. Show that for MA(2), for all k, |Rk| < 1 for any choice of b1 and b2. In other words, it is always “well behaved.”
9. Consider the following three AR(2) (quadratic) problems of the form a(B) = (1 − c1B)(1 − c2B). Case 1: c1 = 0.7, c2 = 0.6; Case 2: c1 = 0.7, c2 = −0.6; Case 3: c1 = 0.6 − 0.3i, c2 = 0.6 +
8. Using (1 − 0.7B)(1 + [0.1 − 0.1i]B)(1 − [0.1 + 0.1i]B)εn = wn: (i) Find the values for a1, a2, and a3. (ii) Why do we know this is a stationary process? (iii) Write out the Yule–Walker
7. Give the matrix representation for the Yule–Walker equations for AR(4).
6. Simulate AR(2) data with a1 = 0.7, a2 = −0.4, and n = 400. Find the autocorrelation plot and partial autocorrelation plot. Do the plots follow the pattern suggested in this chapter? Explain.
5. Simulate MA(2) data with b1 = −0.5, b2 = 0.1, and n = 500. Find the autocorrelation plot and partial autocorrelation plot. Do the plots follow the pattern suggested in this chapter? Explain.
4. Repeat Exercise 3 for an AR(2) model with a1 = 0.7 and a2 = −0.3.
3. Simulate 400 AR(1) errors with a = 0.8 and σ = 2.0. Use ar.yw() to get the estimates of a and σ and verify that these estimates satisfy the Yule–Walker equations.
2. Using pacf() and ar.yw() or ar.mle(), verify that the following models from Chapter 13 are, indeed, quite plausibly AR(1). That is, the residuals are AR(1) after fitting the signal: (i) the logging
1. Show that, for complex numbers, if f = g∕h, then f̄ = ḡ∕h̄ (a brute force approach involves defining g and h quite generally and showing the result).
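
One possible version of the brute force approach mentioned in exercise 1 above is sketched here. The real numbers a, b, c, d are introduced only for this illustration and are unrelated to the AR coefficients used elsewhere in these exercises.

```latex
\begin{aligned}
g &= a + bi, \qquad h = c + di \quad (h \neq 0),\\
f = \frac{g}{h} &= \frac{(a+bi)(c-di)}{c^2+d^2}
   = \frac{(ac+bd) + (bc-ad)\,i}{c^2+d^2},\\
\bar f &= \frac{(ac+bd) - (bc-ad)\,i}{c^2+d^2},\\
\frac{\bar g}{\bar h} = \frac{a-bi}{c-di} &= \frac{(a-bi)(c+di)}{c^2+d^2}
   = \frac{(ac+bd) - (bc-ad)\,i}{c^2+d^2} = \bar f .
\end{aligned}
```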
9. Consider an AR(2) model and the required stability conditions. (i) Find values of a1 and a2 that violate one and only one of the three stationarity conditions (you should have three sets of values,
8. Find the impulse response function for the following and write a program to plot the first 20 values (g0 to g19): (i) Exercise 2, (ii) Exercise 3, (iii) Exercise 4, (iv) Exercise 5, (v) Exercise 6
7. For the model ARMA(2,2) with AR(2) cj 0.7 and −0.6, and MA(2) cj 0.2 and −0.8: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is invertible and
6. For the model ARMA(2,2) with AR(2) cj 0.7 ± 0.1i, and MA(2) cj 0.2 ± 0.6i: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is invertible and stationary?
5. For the model ARMA(2,2) with AR(2) cj 0.7 ± 0.1i, and MA(2) cj −0.45 and 0.65: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is invertible and
4. For the model ARMA(0,4) with cj 0.7 ± 0.1i, −0.45, and 0.65: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is invertible?
3. For the model ARMA(4,0) with cj 0.7 ± 0.1i, 0.45, and 0.65: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is stationary?
2. For the model ARMA(3,0) with cj 0.7 ± 0.1i and 0.65: (i) Find a(B) and b(B), (ii) simulate and plot the data, (iii) how is it known the model is stationary?
1. Find the impulse response function for (i) MA(1), (ii) MA(2), (iii) white noise, and (iv) AR(3).
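
Exercise 8 in the group above asks for a program that plots g0 to g19. The following is a minimal sketch for the pure AR case only, assuming the convention εn = a1εn−1 + ... + amεn−m + wn and taking the impulse response to be the weights gk in εn = Σ gk wn−k. The function name `ar_impulse` and the coefficient 0.8 are hypothetical choices for illustration and do not come from the exercises.

```r
# Sketch: impulse response g_0, ..., g_{n_values - 1} of a pure AR model,
# assuming eps_n = a_1 eps_{n-1} + ... + a_m eps_{n-m} + w_n.
# The recursion is g_0 = 1 and g_k = sum_i a_i * g_{k-i} over available lags.
ar_impulse <- function(a, n_values = 20) {
  m <- length(a)
  g <- numeric(n_values)
  g[1] <- 1  # g_0 = 1 (R vectors are 1-indexed, so g[k] holds g_{k-1})
  if (n_values > 1) {
    for (k in 2:n_values) {
      lags <- seq_len(min(m, k - 1))
      g[k] <- sum(a[lags] * g[k - lags])
    }
  }
  g
}

g <- ar_impulse(a = 0.8)  # hypothetical AR(1) coefficient, for illustration only
plot(0:19, g, type = "h", xlab = "k", ylab = "g_k")
```

For an AR(1) model the recursion reduces to gk = a^k, which gives a quick sanity check on the output. Models with an MA part need the additional b(B) terms and are not covered by this sketch.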
18. Show log10(a)∕log10(b) = loge(a)∕loge(b).
17. Find the number of observations to skip in the following scenarios. Also, compute the achieved value of c and verify that it meets the criterion: (i) â = 0.15, c ≤ 1.1; (ii) â = 0.25, c ≤
16. Suppose data has been collected and it is known the serial correlation is AR(1) with some fixed a. If samples were taken twice as often, the model would still be AR(1); how would a change?
15. Using the file “Mitta 36 68.txt” from the Exercises folder (DataMarket–Time Series Data Library–Hydrology–Monthly flows for Mitta Mitta River, Tallandoon, January 1936–December 1968),
14. If you have had a course involving the definitions of expectation, variance, covariance, and correlation, (i) Recall for the unfiltered regression model yk = β0 + β1xk + εk, εk =
13. Produce prediction intervals for the next 3 years for the global warming data.
12. Write the R code to filter the NYC (adjusted) data and produce a summary similar to that given in the book. Compute the correct R² for this model.
11. In Section 13.5, it was found that for the one-step method, the width of the interval is ±2σ̂w = ±2√(1 − â²) σ̂AR(1) and, for the two-step method, the width of the interval is
10. Create the code for the one- and two-step AR(1) method graphs. Simulate some data with a = 0.5 and duplicate all four graphs (Figures 13.11 and 13.12 of Section 13.5).
9. For the Semmelweis case, analyze the data (producing confidence intervals and a hypothesis test) using the transformation y = loge(p∕[1 − p]). One modification will be required. There is one
8. Describe how the logging case would be carried out as an intervention (Semmelweis) and describe how the Semmelweis case would have been carried out in a fashion similar to the logging case.
7. A different modeling philosophy attaches more importance to the equal variance assumption when performing t-tests and suggests that the two series in, for example, the logging case have the same
6. Based on the case in Section 13.3, does the relationship σ²AR(1)(1 − a²) = σ²w seem to hold, at least approximately, for this data?
5. Produce 2000 simulations like the one given in Section 13.3 and store the estimated values of a. Make a histogram and produce summary statistics for a. How well is a estimated in these simulations?
4. “The lines are not much different.” Using the simulated data from Section 13.3, use the summaries of fit (Table 13.3) to make predictions when x = 5, 25, 45, 65, and 85 for both the filtered
3. Justify the comment made in the book, paraphrased as follows: “Because all confidence intervals are too narrow, all p-values are too small.”
2. For the simulated data presented in Section 13.3, use the summaries of fit (Table 13.3) to construct a confidence interval for the slope in the filtered and unfiltered model. How much wider is the
1. A conceptual question related to the patch-cut versus uncut watershed case. If there are differences, are they clearly due to patch-cut versus uncut forest? What could be learned by looking at the
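
Exercises 6 and 11 in the group above both rely on the link between the AR(1) error variance and the white noise variance. A short sketch of where that relationship comes from is given here, assuming a stationary AR(1) model with |a| < 1 and with w_n uncorrelated with the previous error; these assumptions are stated for the illustration rather than taken from the text.

```latex
\begin{aligned}
\varepsilon_n &= a\,\varepsilon_{n-1} + w_n,\\
\operatorname{Var}(\varepsilon_n) &= a^2 \operatorname{Var}(\varepsilon_{n-1}) + \operatorname{Var}(w_n)
  \quad\text{(since } w_n \text{ is uncorrelated with } \varepsilon_{n-1}\text{)},\\
\sigma^2_{AR(1)} &= a^2\,\sigma^2_{AR(1)} + \sigma^2_w
  \quad\text{(stationarity: both variances equal } \sigma^2_{AR(1)}\text{)},\\
\sigma^2_{AR(1)}\,(1 - a^2) &= \sigma^2_w
  \quad\Longrightarrow\quad \sigma_w = \sqrt{1 - a^2}\;\sigma_{AR(1)} .
\end{aligned}
```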
2. From the Exercises folder scan the file “Furnas 31 78.txt” (aka DataMarket–Time Series Data Library–Hydrology–Monthly riverflow in cms, Furnas–vazoes medias mensais, 1931–1978) (i)
1. Develop the sequential likelihood ratio test for the Boise river flow data. Is the model chosen the same as the one chosen using AIC?
10. (For those familiar with likelihood functions in a general setting). (i) Simulate 200 observations from an exponential distribution and compute both the likelihood function and the average of the
9. Using the data “Melbmax.txt” from the Exercises folder: Use data splitting, leave-one-out cross-validation (PRESS), and AIC on this data (remember the data is daily, f = 365) to find R² values.
8. Justify each equality or approximate equality in the chain: AIC1 − AIC2 = n ⋅ loge([1 − R²_{1,AIC}]∕[1 − R²_{2,AIC}]) ≈ n ⋅ loge([1 − R²_{1,pred}]∕[1 − R²_{2,pred}]) = n ⋅
7. Show R²_pred − R²_AIC ≈ 2 ⋅ (1 − ρ²)∕n when all the “hat” values are equal (where, as sample size increases, R² → ρ²).
6. Compute AIC and BIC for the models from Table 11.7. How much more likely is the quadratic model than the straight-line model in this case? (Use both AIC and BIC.)
5. Some authors define information criteria as follows: Akaike’s information: AIC = n ⋅ loge(SSE∕n) + 2(p + 1); Schwarz information: BIC = n ⋅ loge(SSE∕n) + p ⋅ loge(n). Explain why this is
4. Simulate quadratic data using the previous formulation with the exception that sigma_2
3. Simulate quadratic data using the previous formulation with the exception that y_2
2. For Table 11.6, what would the ANOVA table for a linear fit look like in R?
1. For the first simulated data (y_1, simple regression), form confidence intervals for the slope and intercept using the summary and assess whether the intervals contain the true values.
8. Plot the number of deaths and serious injuries in UK road accidents each month from January 1969–December 1984 versus time—a seatbelt law was introduced in February 1983 (“UK road
7. Monthly milk production in pounds per cow from January 1962 to December 1975 (“Milk production.txt” from the Exercises folder). Plot this data versus time. What complications do you foresee
6. For the lynx data (“LYNX.txt” from the Exercises folder): (i) Plot the original data versus time. Does it appear that the minima are curved/rounded and the maxima are sharply peaked? (ii) Plot
5. Consider the Melbourne maximum daily temperature data (“Melbmax.txt” from the Exercises folder). Fit a simple periodic model to this data and write up a conclusion similar to the summarizing
4. The impact of one outlier in 168 observations: fit the simple periodic model to the New York temperature data with the outlier still included: (i) recompute values of the summary paragraph; (ii)
3. Let M ⋅ cos(2π[ft + φ]) = B ⋅ cos(2πft) + C ⋅ sin(2πft). Show that B = M ⋅ cos(2πφ) and C = −M ⋅ sin(2πφ).
2. In the following four cases, determine M and φ from B and C: (i) B = 3.7, C = 2.1; (ii) B = −2.1, C = −7.1; (iii) B = −3.2, C = 1.3; (iv) B = 2.2, C = −2.3.
1. (i) Show that the partial derivatives of min Σ[y(j) − μ − M ⋅ cos(2π[j∕k + φ])]² do not form a system of linear equations. (ii) Show that the partial derivatives of min Σ[y(i) −
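
Exercises 2 and 3 in the group above connect the amplitude and phase form of the periodic model to the coefficients B and C. The sketch below inverts that relationship, assuming B = M ⋅ cos(2π ⋅ phase) and C = −M ⋅ sin(2π ⋅ phase) as stated in exercise 3. The function name `amp_phase` is hypothetical, and the phase returned is only one of many equivalent values, since adding any integer to it leaves the model unchanged.

```r
# Sketch: recover amplitude M and phase from B and C, assuming
# B = M * cos(2 * pi * phase) and C = -M * sin(2 * pi * phase).
amp_phase <- function(B, C) {
  M <- sqrt(B^2 + C^2)
  phase <- atan2(-C, B) / (2 * pi)  # lies in (-0.5, 0.5]; unique only mod 1
  c(M = M, phase = phase)
}

# The four (B, C) pairs listed in exercise 2.
cases <- rbind(c(3.7, 2.1), c(-2.1, -7.1), c(-3.2, 1.3), c(2.2, -2.3))
res <- t(apply(cases, 1, function(bc) amp_phase(bc[1], bc[2])))
print(cbind(B = cases[, 1], C = cases[, 2], res))
```

Using `atan2()` rather than `atan()` keeps the quadrant information carried by the signs of B and C, which matters for cases (ii) and (iii) where B is negative.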
11. Lynx data (“LYNX.txt” from the Exercises folder): Using the smoothing methods, estimate the period, in years, of the lynx data. Comment on each method.
10. Melbourne temperatures (“Melbmax.txt” from the Exercises folder): Develop smoothed periodograms for the Melbourne daily temperature data. What is the correct dominant frequency? Which smoothing
9. Find the weights associated with each of these smoothing approaches (you do not need to find these by hand):(i) sp3
8. Repeat the steps in Exercise 7 for MA(2) with n = 200, b1 = 0.3, b2 = −0.3.
7. Assessing the smoothers for MA(2) with n = 500, b1 = −0.8, b2 = 0.7: (i) Simulate the data. (ii) Fit the periodogram to the data and plot it with “log(spec)” as the y variable and “freq”
6. Suppose lowess() were used with f = 1∕∛n; would the method produce consistent estimators? What about f = 1∕n²? Justify your answer in terms of bias and variance.
5. Repeat Exercise 4 with the sunspot data.
4. (i) Develop the explicit formula for df in smooth.spline() and r in spans() so that the smoothing of these smoothers is about the same as lowess() with f = ????∕√n, where 0.5 < ???? < 2.(ii)
