Question:
Consider the US monthly unemployment rates from January 1948 to September 2017, a total of 837 observations. The data can be downloaded from FRED (code: UNRATE) and are seasonally adjusted. The series has strong serial correlations. The goal of this exercise is to explore the effect of strong serial dependence on the performance of bagging and random forests. To this end, use the first 700 observations as the modeling subsample and reserve the last 137 observations as the forecasting subsample. Let $X_t = (x_{t-1}, \ldots, x_{t-24})'$ be the predictors, where $x_t$ is the unemployment rate at time $t$.
(a) Apply bagging with 1000 bootstrap iterations to the modeling subsample and compute the mean absolute forecast error and the root mean squared forecast error of the out-of-sample predictions.
(b) Apply random forests with mtry = 5 and ntree = 500 and compute the same out-of-sample prediction measures as in part (a).
(c) Repeat part (b) with mtry = 8 and ntree = 500.
(d) Compare and comment on the forecasting accuracy of parts (a) to (c).
In particular, what is the effect of strong serial correlations on bagging and random forests, if any?
Step by Step Answer:
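As a starting point, the data setup and the two accuracy measures from the question can be sketched in plain Python. This is only an illustrative sketch, not a worked solution: the helper names (`make_lagged`, `split_by_time`, `mae`, `rmse`) are hypothetical, and the series used here is a synthetic placeholder standing in for the 837 UNRATE values, which would need to be downloaded from FRED separately.

```python
import math

def make_lagged(series, p=24):
    """Build rows X[i] = [x_{t-1}, ..., x_{t-p}] with target y[i] = x_t, for t = p, ..., len(series)-1."""
    X, y = [], []
    for t in range(p, len(series)):
        X.append([series[t - k] for k in range(1, p + 1)])
        y.append(series[t])
    return X, y

def split_by_time(X, y, n_model=700, p=24):
    """Rows whose target lies in the first n_model observations form the modeling set; the rest are held out."""
    cut = n_model - p  # the first p observations have no complete lag vector
    return X[:cut], y[:cut], X[cut:], y[cut:]

def mae(actual, predicted):
    """Mean absolute forecast error."""
    return sum(abs(a - f) for a, f in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / len(actual))

# Synthetic placeholder of the right length (NOT real unemployment data).
series = [5.0 + 0.01 * i for i in range(837)]

X, y = make_lagged(series, p=24)
X_train, y_train, X_test, y_test = split_by_time(X, y, n_model=700, p=24)
print(len(X_train), len(X_test))  # 676 modeling rows, 137 forecasting rows
```

With 24 lags, the first 24 observations serve only as predictors, so the modeling subsample yields 676 usable rows while all 137 forecasting targets are retained. The fitted models for parts (a) to (c) would then be trained on `X_train`, `y_train` and scored with `mae` and `rmse` on the held-out rows. Note that bagging regression trees is the special case of a random forest in which all 24 predictors are candidates at every split, so part (a) corresponds to mtry = 24 in the notation of the question.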