
Question


Can someone please help me with the R code, with explanations?


Question 1 (35 points)

This question is related to the prediction of the weekly electricity consumption (called electricity load) in the Nord Pool electricity market (Scandinavian countries) through Fourier expansions and regression. We shall attempt to predict the weekly load by using a predictor related to time which is meant to represent seasonality. Indeed, we shall use the following model:

$$L_t = \beta_0 + \sum_{j=1}^{P} \left( \beta_j C^{(\sin)}_{j,t} + \beta_{P+j} C^{(\cos)}_{j,t} \right) + \epsilon_t \qquad (1)$$

$$C^{(\sin)}_{j,t} = \sin\left( \frac{2 \pi j t}{\dots} \right), \qquad C^{(\cos)}_{j,t} = \cos\left( \frac{2 \pi j t}{\dots} \right) \qquad (2)$$

where
- $t$ denotes the week, i.e. $t = 0$ is the first week, $t = 1$ is the second week, etc.,
- $L_t$ is the weekly load (electricity consumption) during week $t$,
- $\epsilon_t$ is a noise term representing the errors of the model,
- $C^{(\sin)}_{j,t}$ and $C^{(\cos)}_{j,t}$ are the sinusoidal predictors representing the first terms of a Fourier series with respect to time.

Note that this model is a simplified version of a model found in the paper of Dupuis et al. (2016).

The objective of the question is to find suitable values for $P$ and the parameters $\beta_j$, $j = 0, \dots, 2P$, and then to make predictions with this optimized model.

Load the content of the file Loaddata.RData into R with the function load. This should load the following two vectors:
- Loadseries: a time series of the weekly electricity consumption on the Nord Pool market (Scandinavian countries).
- Weekseries: the date corresponding to the first day of the week for each entry of the time series.

Part (b), 5 points: Construct the predictors design matrix associated with the model (1)-(2) when P = 5, and store it in the matrix Designmatrix. The design matrix must be in matrix format, not data frame format.
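For Part (b), a minimal sketch of how the design matrix could be built. Several things here are assumptions rather than part of the assignment: the helper name `FourierDesign`, the column ordering (intercept, then the P sine terms, then the P cosine terms), and above all the seasonal period of 365.25 / 7 weeks (one year), since the period is not legible in the question as posted. The series length is a placeholder so that the block runs without Loaddata.RData.

```r
# Hypothetical helper (not part of the assignment): Fourier design matrix
# with an intercept column, P sine columns and P cosine columns.
# ASSUMPTION: the seasonal period is one year, taken as 365.25 / 7 weeks.
FourierDesign <- function(week, P, period = 365.25 / 7) {
  j <- seq_len(P)
  # outer(week, j) is a length(week) x P matrix whose (t, j) entry is t * j
  angles <- 2 * pi * outer(week, j) / period
  X <- cbind(1, sin(angles), cos(angles))
  colnames(X) <- c("Intercept", paste0("sin", j), paste0("cos", j))
  X
}

# With the real data one would instead run:
#   load("Loaddata.RData"); nWeeks <- length(Loadseries)
nWeeks <- 314                 # placeholder length, for illustration only
week <- 0:(nWeeks - 1)        # t = 0 is the first week

Designmatrix <- FourierDesign(week, P = 5)

dim(Designmatrix)             # rows = weeks, columns = 2 * P + 1
is.matrix(Designmatrix)       # a matrix, not a data frame
```

Because `cbind` is applied to numeric matrices, the result is a plain numeric matrix, which is what the question requires.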
Part (c), 5 points: Create a function OLSSSE which takes as inputs
- yTrain: a vector of responses from your training set,
- XdesignTrain: the predictors design matrix from your training set,
- Xdesignvalid: the predictors design matrix for observations in your validation set,
- Valid: a vector of responses from your validation set,
and which outputs
- SSE: the sum of squared prediction errors on your validation set when predictions are made through an OLS regression whose parameters were estimated on the training set.

Part (d), 10 points: You will now perform 6-fold cross-validation to identify the best value of P among P = 1, 2, 3, 4, 5. The dataset will be split as follows: the first subset corresponds to all data points observed in 2007, the second subset corresponds to all data points observed in 2008, ..., the last subset corresponds to all data points observed in 2012. Use your function OLSSSE from part (c) to fill a 5 x 6 matrix SSEmat containing as element (i, j) the out-of-sample sum of squared prediction errors for subsample j when P = i. Then sum across the columns of this matrix (one total per row) to obtain a cross-validation sum of squared errors (SSE) for each possible value of P = 1, 2, 3, 4, 5, which you should store in a vector SSEtot. Write the values from SSEtot in your report. According to this procedure, which is the best value of P?

Part (e), 5 points: Use the optimal value of P you obtained in part (d) to retrain your model (i.e. re-estimate the parameters) on your full data sample. What are the parameter estimates you obtain (i.e. show them in a table)? Use these re-estimated parameters to perform predictions (which you should store in a vector OptPreds) for each load value in the time series using your model (1)-(2). Plot the time series of realized load values (the true time series) as a black curve, and then the time series of predicted load values as a blue curve. Add a legend to your plot to identify both curves.
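Parts (c) and (d) could be approached along the following lines. This is a sketch under stated assumptions, not a verified solution: the data are simulated stand-ins for Loadseries and Weekseries (weekly Mondays from 2007-01-01 to 2012-12-31), the helper `FourierDesign` and the period of 365.25 / 7 weeks are assumptions, and `lm.fit` is used because the design matrix already carries its own intercept column. The numbers produced say nothing about the real Nord Pool data.

```r
# Hypothetical helper: intercept, P sine columns, P cosine columns.
# ASSUMPTION: seasonal period of 365.25 / 7 weeks (one year).
FourierDesign <- function(week, P, period = 365.25 / 7) {
  j <- seq_len(P)
  angles <- 2 * pi * outer(week, j) / period
  cbind(1, sin(angles), cos(angles))
}

# Part (c): out-of-sample SSE of an OLS fit (argument names as in the question).
OLSSSE <- function(yTrain, XdesignTrain, Xdesignvalid, Valid) {
  fit <- lm.fit(x = XdesignTrain, y = yTrain)        # OLS on the training set
  preds <- as.numeric(Xdesignvalid %*% fit$coefficients)
  sum((Valid - preds)^2)                             # SSE on the validation set
}

# SIMULATED stand-ins for the real vectors (replace with load("Loaddata.RData")).
set.seed(1)
Weekseries <- seq(as.Date("2007-01-01"), as.Date("2012-12-31"), by = "week")
n <- length(Weekseries)
week <- 0:(n - 1)
Loadseries <- 7000 + 1500 * cos(2 * pi * week / (365.25 / 7)) +
  300 * sin(4 * pi * week / (365.25 / 7)) + rnorm(n, sd = 200)

# Part (d): one fold per calendar year, one row of SSEmat per value of P.
years <- as.integer(format(Weekseries, "%Y"))
folds <- 2007:2012
Pvals <- 1:5
SSEmat <- matrix(NA_real_, nrow = length(Pvals), ncol = length(folds),
                 dimnames = list(paste0("P=", Pvals), as.character(folds)))

for (i in seq_along(Pvals)) {
  X <- FourierDesign(week, Pvals[i])
  for (j in seq_along(folds)) {
    isValid <- years == folds[j]                     # hold out one year
    SSEmat[i, j] <- OLSSSE(Loadseries[!isValid], X[!isValid, , drop = FALSE],
                           X[isValid, , drop = FALSE], Loadseries[isValid])
  }
}

SSEtot <- rowSums(SSEmat)            # total cross-validation SSE for each P
bestP <- Pvals[which.min(SSEtot)]    # P with the smallest total SSE
print(SSEtot)
print(bestP)
```

Summing with `rowSums` adds the six yearly SSE values for each P, which matches the description of element (i, j) as "subsample j when P = i".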
Part (f), 5 points: Using your model, what is the predicted value of the weekly load for the week starting on Monday August 13, 2012?
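For Parts (e) and (f), one possible outline is to refit on the full sample, tabulate the coefficients, plot realized against fitted load, and then evaluate the fitted model at the week index of 2012-08-13. As before, this is only a sketch: the data are simulated, `FourierDesign`, `betaHat`, `coefTable` and `predAug13` are illustrative names, the period of 365.25 / 7 weeks is an assumption, and `P <- 2` is a placeholder for whatever value the cross-validation in part (d) selects on the real data.

```r
# Hypothetical helper: intercept, P sine columns, P cosine columns.
# ASSUMPTION: seasonal period of 365.25 / 7 weeks (one year).
FourierDesign <- function(week, P, period = 365.25 / 7) {
  j <- seq_len(P)
  angles <- 2 * pi * outer(week, j) / period
  X <- cbind(1, sin(angles), cos(angles))
  colnames(X) <- c("Intercept", paste0("sin", j), paste0("cos", j))
  X
}

# SIMULATED stand-ins for the real vectors (replace with load("Loaddata.RData")).
set.seed(1)
Weekseries <- seq(as.Date("2007-01-01"), as.Date("2012-12-31"), by = "week")
n <- length(Weekseries)
week <- 0:(n - 1)
Loadseries <- 7000 + 1500 * cos(2 * pi * week / (365.25 / 7)) +
  300 * sin(4 * pi * week / (365.25 / 7)) + rnorm(n, sd = 200)

P <- 2   # PLACEHOLDER: use the value of P chosen in part (d)

# Part (e): re-estimate the parameters on the full sample.
X <- FourierDesign(week, P)
betaHat <- lm.fit(x = X, y = Loadseries)$coefficients
coefTable <- data.frame(Parameter = paste0("beta", 0:(2 * P)),
                        Term = colnames(X),
                        Estimate = as.numeric(betaHat))
print(coefTable)

# In-sample predictions for every week of the series.
OptPreds <- as.numeric(X %*% betaHat)

# Realized load in black, predicted load in blue, with a legend.
plot(Weekseries, Loadseries, type = "l", col = "black",
     xlab = "Week", ylab = "Weekly load")
lines(Weekseries, OptPreds, col = "blue")
legend("topright", legend = c("Realized load", "Predicted load"),
       col = c("black", "blue"), lty = 1)

# Part (f): week index of the target date, counted from the first week (t = 0).
target <- as.Date("2012-08-13")
tTarget <- as.numeric(target - Weekseries[1]) / 7
predAug13 <- as.numeric(FourierDesign(tTarget, P) %*% betaHat)
print(predAug13)
```

Computing `tTarget` from the date difference, rather than looking the date up in Weekseries, means the same code still works if the target week lies outside the observed sample.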
