Question
Reuse your submitted data and code from the previous assignment (Linear Regression Project) to implement a demonstration of Regularized Linear Regression, showing how a proper choice of the regularization coefficient can prevent the model from over-fitting;
Make the hypothesis space rich enough that the model can "interpolate" the training data, i.e. the error on the training data tends to zero.
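One common way to enrich the hypothesis space is a polynomial basis expansion; the assignment leaves the choice open, so the function below is just an illustrative sketch (its name and the use of a polynomial basis are our assumptions, not requirements):

```python
import numpy as np

def polynomial_design(x, degree):
    """Map a scalar input to the basis [1, x, x^2, ..., x^degree].

    With degree = n - 1 (n training points, distinct inputs), the design
    matrix is square and invertible, so the model interpolates the
    training data exactly. (Polynomial features are one possible choice
    of enrichment, not the one prescribed by the assignment.)
    """
    return np.vander(x, degree + 1, increasing=True)

# Demo: 5 points, degree-4 polynomial -> training error is (numerically) zero.
x = np.linspace(0.0, 1.0, 5)
y = np.array([0.0, 1.0, 0.5, -0.3, 2.0])
X = polynomial_design(x, 4)
w = np.linalg.solve(X, y)          # square system: exact interpolation
```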
Implement five learning approaches;
2.1. Task CF. Learning via the closed-form equation: a full-batch method that solves the linear system derived from Regularized Linear Regression with quadratic regularization (also known as Tikhonov regularization, or Ridge Regression). Name this task CF (Closed-form equation, Full-batch).
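A minimal sketch of the closed-form solve, assuming the standard Ridge system (Xᵀ X + λI) w = Xᵀ y; the function name and the synthetic demo data are ours, not from the assignment:

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for the Ridge weights.

    X: (n, d) design matrix (include a column of ones for the bias),
    y: (n,) targets, lam: regularization coefficient.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

# Demo on synthetic data (not the assignment's dataset):
# y = 2 + 3x plus small noise, so w should land near (2, 3).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = 2.0 + 3.0 * X[:, 1] + 0.1 * rng.normal(size=50)
w = ridge_closed_form(X, y, lam=0.01)
```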
2.2. Task GD. Derive a Gradient Descent equation to iteratively update the weight parameters for Regularized Linear Regression with quadratic and lasso regularization. In this task, compute the gradient of the regularized mean-square error over the whole training set, i.e. a full batch. This task forks into two regularizations: quadratic (GD-R) and lasso (GD-L).
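A sketch of the full-batch update, assuming the regularized objective MSE(w) + λ·reg(w); the ridge term contributes gradient 2λw and the lasso term the subgradient λ·sign(w). Function name, learning rate, and epoch count are illustrative choices:

```python
import numpy as np

def gd_regularized(X, y, lam, kind="ridge", lr=0.1, epochs=500):
    """Full-batch gradient descent on the regularized mean-square error.

    kind="ridge": adds 2*lam*w, the gradient of lam*||w||^2.
    kind="lasso": adds lam*sign(w), a subgradient of lam*||w||_1.
    """
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = (2.0 / n) * X.T @ (X @ w - y)   # gradient of the MSE term
        if kind == "ridge":
            grad += 2.0 * lam * w
        else:
            grad += lam * np.sign(w)           # lasso subgradient
        w -= lr * grad
    return w

# Demo on noiseless synthetic data: with lam=0, GD should recover the
# generating weights (bias 1.0, slope 2.0) almost exactly.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = 1.0 + 2.0 * X[:, 1]
w = gd_regularized(X, y, lam=0.0)
```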
2.3. Task SGD. Do the same as GD, but as Online (Sequential) Learning, where the gradient is computed for the regularized squared error committed by a single sample drawn at random from the training data. This task also forks into two regularizations: quadratic (SGD-R) and lasso (SGD-L).
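The online variant can be sketched as below; each update uses one randomly drawn sample's regularized squared error. The hyperparameter defaults are placeholders, not values prescribed by the assignment:

```python
import numpy as np

def sgd_regularized(X, y, lam, kind="ridge", lr=0.01, epochs=50, seed=0):
    """Online (sequential) learning: each step takes the gradient of the
    regularized squared error of a single random training sample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    for _ in range(epochs * n):
        i = rng.integers(n)                    # one random training sample
        grad = 2.0 * (X[i] @ w - y[i]) * X[i]  # gradient of its squared error
        if kind == "ridge":
            grad += 2.0 * lam * w
        else:
            grad += lam * np.sign(w)           # lasso subgradient
        w -= lr * grad
    return w

# Demo: noiseless data with lam=0, so SGD converges near the true
# weights (bias 0.5, slope -1.5).
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = 0.5 - 1.5 * X[:, 1]
w = sgd_regularized(X, y, lam=0.0)
```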
Train your regression model on 80% of the data and compute the RMSD of the trained model with respect to the remaining 20%. Do this once for each of the five learning methods described in 2, i.e. CF, GD-R, GD-L, SGD-R, SGD-L.
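The split and the evaluation metric can be sketched as follows (helper names are ours; the assignment only fixes the 80/20 ratio and the RMSD metric):

```python
import numpy as np

def rmsd(X, y, w):
    """Root-mean-square deviation of the predictions X @ w from y."""
    return np.sqrt(np.mean((X @ w - y) ** 2))

def split_80_20(X, y, seed=0):
    """Shuffle the data and hold out the last 20% for evaluation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(0.8 * len(y))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]

# Demo on 10 points: 8 go to training, 2 are held out; the exact
# weights (0, 1) give zero deviation on the held-out points.
X = np.column_stack([np.ones(10), np.arange(10.0)])
y = np.arange(10.0)
Xtr, ytr, Xte, yte = split_80_20(X, y)
```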
Vary the regularization coefficient over a log-scale range, e.g. lambda = 0.01, 0.1, 1, 10, and repeat the whole procedure of step 3 for each value.
Prepare a summary, either as a table or a plot, reporting the RMSD with respect to the learning method and the value of the regularization coefficient. Precisely, it is a table of RMSD values with five rows for the five learning methods and four columns for the four values of lambda; alternatively, it can be shown as five curves, one per learning method, plotting RMSD against the regularization coefficient.
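One row of such a table can be produced as sketched below for the closed-form method; the full table repeats the same loop for GD-R, GD-L, SGD-R, and SGD-L. The synthetic data and helper name are ours, used only to make the sketch runnable:

```python
import numpy as np

def ridge_cf(X, y, lam):
    """Closed-form Ridge solution (the CF row of the summary table)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Synthetic stand-in data: 6 features (bias + 5), noise std 0.2,
# 160 training / 40 evaluation points.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 5))])
y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.0, 0.5]) + 0.2 * rng.normal(size=200)
Xtr, ytr, Xte, yte = X[:160], y[:160], X[160:], y[160:]

# Sweep lambda on the log-scale grid and record the test RMSD.
row = {}
for lam in (0.01, 0.1, 1.0, 10.0):
    w = ridge_cf(Xtr, ytr, lam)
    row[lam] = np.sqrt(np.mean((Xte @ w - yte) ** 2))
for lam, err in row.items():
    print(f"lambda={lam:6.2f}  test RMSD={err:.4f}")
```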
For GD-L and SGD-L, count and compare the number of non-zero weights. Write a short discussion of your results: in particular, try to find a logical explanation, rooted in the real-world problem you chose as the subject of this assignment, for the zero weights (or, from the other side, the non-zero weights) obtained via GD-L and SGD-L. All your implementations must be done without any high-level ML or Deep Learning library functions.
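One practical note for the count: iterative (sub)gradient lasso rarely drives weights to exact zero, so a small tolerance usually stands in for "zero". The tolerance value below is an arbitrary choice of ours:

```python
import numpy as np

def count_nonzero_weights(w, tol=1e-6):
    """Count weights whose magnitude exceeds a small tolerance.

    Subgradient-based lasso updates oscillate around zero rather than
    landing on it exactly, so |w_j| <= tol (tol=1e-6 here, an arbitrary
    choice) is treated as a zero weight.
    """
    return int(np.sum(np.abs(w) > tol))

# Demo: two weights are effectively zero, two are genuinely non-zero.
n_nonzero = count_nonzero_weights(np.array([0.0, 3e-7, 0.8, -1.2]))
```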