Question

1 Approved Answer

Posted on Sep 11, 2024

Solve all parts with code The google colab code/file is : { cells: [ { cell_type: markdown, metadata: {}, source: [ # Linear Regression for

image text in transcribed Solve all parts with code

The google colab code/file is :

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear Regression for Red Wine Quality Classification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this problem, we are interested in the problem of classifying red wine qualities. You will train a linear regression classifier to predict the **sensory score** of wines based on their physicochemical properties including but not limited to pH, chlorides, density, and amount of residual sugar after fermentaion stops. You have been provided with a dataset of $m=1279$ training samples and $320$ test samples, where each sample contains 11 physicochemical properties for the selected wine. ", " ", "Formally, you can build the regression models as follows: ", " ", "\\begin{align*} ", " Y_{score} &= w_0 + w_1^{\\top} x_1 + w_2 x_2 + \\dots + w_{11} x_{11}\\\\ ", "\\end{align*} ", " ", "where $x_i \\in \\mathbb{R}^N$ corresponds to the $i$ th physicochemical property of interest for all $N$ wines." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports ", " ", "We are only using numpy, pandas, and a few plotting functions here. Please do not import any additional packages. Additional package imports will result in zero credit, unless the TAs have announced additional allowances on piazza." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os ", "import numpy as np ", "import pandas as pd ", "# Plotting ", "import seaborn as sb ", "import matplotlib.pyplot as plt ", "import time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Data ", " ", "Here, we load the wine data into memory, and examine its shape. You should notive that there are 1279 training samples, and 320 test samples, with 11 features each. We will also visualize a histogram of the sensory scores." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Train Data ", "X_train = np.load(os.path.join('train', 'wine_features.npy')) ", "y_train = np.load(os.path.join('train', 'wine_quality.npy')) ", "print(\"X.shape \", X_train.shape) ", " ", "# Test Data ", "X_test = np.load(os.path.join('test', 'wine_features.npy')) ", "y_test = np.load(os.path.join('test', 'wine_quality.npy')) ", "print(\"X_test.shape \", X_test.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we will quickly visualize our target variables. We can quickly observe that the training sensory scores _seem_ to roughly follow a gaussian distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sb.set(font_scale=1.2) ", "target_fig, target_ax = plt.subplots(1, figsize=(5,3)) ", "target_ax.hist(y_train) ", "target_ax.set_title('Sensory Score') ", "target_ax.set_ylabel('Count') ", "target_ax.set_xlabel('Sensory Score')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem a) Regularized Linear Regression - Gradient Desecnt ", " ", "Implement linear regression to predict $Y_{score}$. Use **batch gradient descent** as your optimization technique with the **Least Mean Square objective** function. Use a learning rate of $r=10^{-3}$ and train for a maximum of $T=10$ epochs. Report your final LMS error on the test data, and provide a plot of the training error computed at each epoch. Initialze your parameters randomly from a normal distribution $w_i \\sim \\mathcal{N}(0,1) $. You will need to implement linear regression with no regularization, l1 regularization, and l2 regularizaiton. ", " ", "First, modify the function below to initialize parameters by drawing from the ", "normal distribution. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# MODIFY THIS ", "def initialize_parameters(n): ", " \"\"\" ", " This function initializes the parameters by drawing from a normal distribution. ", " args: ", " n_features - the number of features to return a vector for ", " output: ", " w - ( feature x 1 column vector ) - vector of randomly initialized parameters ", " ", " ", " \"\"\" ", " return None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below, we have already initialized the hyper-parameters needed for this experiment. We have also provided several functions which you should fill out. At the bare minimum, you must implement the LMSE loss function (with no regularization, with l1 regularization, with l2 regularization) and the train function. The loss function must return a real-valued loss given data, labels, and parameters, and the training function should use batch SGD to train the model for a given set of epochs, and return the loss accumulated over each epoch. ", " ", "Please refer to the following formula when implementing the loss function with no regularization, l1 regularization, and l2 regularization respectively. ", " ", "\\begin{align*} ", " \\text{LMSE with no regularization} &= \\frac{1}{2} \\sum^m_{i=1} (y_i - w^\\top x_i)^2\\\\ ", " \\text{LMSE with l1 regularization} &= \\frac{1}{2} \\sum^m_{i=1} (y_i - w^\\top x_i)^2 + \\lambda \\sum^n_{j=1} |w_j|\\\\ ", " \\text{LMSE with l2 regularization} &= \\frac{1}{2} \\sum^m_{i=1} (y_i - w^\\top x_i)^2 + \\lambda \\sum^n_{j=1} w_j^2 ", "\\end{align*} ", " ", "**You are welcome to add additional functions as needed, but please make sure to thoroughly comment everything you add. Also do not import any additional functions from other libraries.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DEFAULT_R=1e-3 # fixed ", "DEFAULT_EPOCHS=10 ", "DEFAULT_LAMBDA=0.1 ", "DEFAULT_BATCH_SIZE=32 ", "NO_REGULARIZATION='no_reg' ", "L1_LOSS = 'l1' ", "L2_LOSS = 'l2' ", " ", "# Loss function - MODIFY THIS ", "def loss(X, y, w, regularization_type=NO_REGULARIZATION, lam=DEFAULT_LAMBDA): ", " \"\"\"Return the real value loss over a given data matrix X, with its labels y, ", " and parameters, w. You should compute loss with no regularization, l1 regularization, ", " and l2_regularization. ", " ", " ARGS: ", " X - (m x n ) matrix of features ", " y - (m x 1 ) column vector of targets ", " w - (n x 1 ) column vector of parameters ", " regularization_type - one of NO_REGULARIZATION, L1_LOSS, L2_LOSS ", " lam - hyperparameter for regularization ", " \"\"\" ", " loss = 0.0 ", " if regularization_type == NO_REGULARIZATION: ", " # TODO: Calculate the loss when there is no regularization ", " pass ", " elif regularization_type == L1_LOSS: ", " # TODO: Calculate the loss when there is l1 regularization ", " pass ", " elif regularization_type == L2_LOSS: ", " # TODO: Calculate the loss when there is l2 regularization ", " pass ", " return loss ", " ", "# Training function - MODIFY THIS ", "def train(X, Y, w, ", " r=DEFAULT_R, ", " epochs=DEFAULT_EPOCHS, ", " regularization_type=NO_REGULARIZATION, ", " lam=DEFAULT_LAMBDA, ", " batch_size=DEFAULT_BATCH_SIZE): ", " \"\"\"Train the model using batch gradient descent for a certain number of epochs. ", " Note for each batch you must do the following: ", " - record the loss ", " - compute the gradient for that batch ", " - update the parameters according to that gradient ", " And for each epoch, you should record the sum of the error over all batches. ", " After training is complete, return the parameters and the ", " list of training losses for each epoch as a tuple ", " ", " You may define however many functions you need to do this, but please ", " comment all additional functions thoroughly. You may NOT import additional ", " libraries. ", " ", " Please do not add egregious print statements to your code. 1 print statement ", " per epoch should suffice. ", " ", " ARGS: ", " X - (m x n ) matrix of features ", " y - (m x 1 ) column vector of targets ", " w - (n x 1 ) column vector of parameters ", " KWARGS: ", " r - (real > 0) the learning rate ", " epochs - (natural > 0) the number of epochs to train ", " lam - (real) the regularization parameter ", " batch_size - (natural > 0 ) the size of each batch ", " \"\"\" ", " losses = [] ", " for epoch in range(epochs): ", " # TODO: Put internal logic for each batch here ", " # You will need to handle eatch batch separately ", " pass ", " return w, losses ", " ", "# Testing function - NO MODIFICATION NEEDED ", "def test(X, Y, w, regularization_type=NO_REGULARIZATION): ", " \"\"\" ", " In testing, we only compute the loss over the test samples, without updating ", " the gradient. This function returns the loss (a real number) using the ", " function implemented below. You do not need to modify this. ", " \"\"\" ", " X_normalized = Xp.max(X) ", " return loss(X_normalized, Y, w, regularization_type) ", " ", "# Plotting function - NO MODIFICATION NEEDED ", "def plot_loss(losses, title=\"Loss\"): ", " \"\"\"Create a simple seaborn lineplot for the losses ", " \"\"\" ", " sb.set(font_scale=2) ", " fig, ax = plt.subplots(1, 1, figsize=(10,10)) ", " sb.lineplot(x=range(len(losses)), y=losses, ax=ax) ", " plt.title(title) ", " plt.xlabel(\"Epochs\") ", " plt.ylabel(\"Loss\") ", " return fig" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have defined our needed functions, we can train and test the model, and then plot/ report the results below. ", " ", "We first train the model with no regularization." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(314159) # DO NOT MODIFY THE SEED ", "m, n = X_train.shape ", "# Initialize parameters ", " ", "# Training with no regularization ", "print(\"Computing for sensory score with no regularization\") ", "start = time.time() ", "w = initialize_parameters(n) ", "X_train = np.concatenate([X_train, np.ones((X_train.shape[0], 0))],1) ", "w_no_reg, loss_no_reg = train(X_train, y_train, w, regularization_type=NO_REGULARIZATION) ", "no_reg_train_duration = time.time() - start ", "# Testing with no regularization ", "X_test = np.concatenate([X_test, np.ones((X_test.shape[0], 0))], 1) ", "test_loss_no_reg = test(X_test, y_test, w, NO_REGULARIZATION) ", " ", " ", "print(\"**GRADIENT DESCENT**\") ", "plot_loss(loss_no_reg, title=\"Loss for Sensory Score Prediction with no regularization\") ", "plt.savefig(\"sensory_score_no_reg.png\") ", "print(\"Final training loss achieved {loss}\".format(loss=loss_no_reg[-1])) ", "print(\"Test loss achieved {loss}\".format(loss=test_loss_no_reg)) ", "print(\"Duration {time}\".format(time=no_reg_train_duration))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now train the model with l1 regularization." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Training with l1 regularization ", "print(\"Computing for sensory score with l1 regularization\") ", "start = time.time() ", "w = initialize_parameters(n) ", "X_train = np.concatenate([X_train, np.ones((X_train.shape[0], 0))],1) ", "w_l1_reg, loss_l1_reg = train(X_train, y_train, w, regularization_type=L1_LOSS) ", "l1_reg_train_duration = time.time() - start ", "# Testing with l1 regularization ", "X_test = np.concatenate([X_test, np.ones((X_test.shape[0], 0))], 1) ", "test_loss_l1_reg = test(X_test, y_test, w, L1_LOSS) ", " ", " ", "print(\"**GRADIENT DESCENT**\") ", "plot_loss(loss_l1_reg, title=\"Loss for Sensory Score Prediction with l1 regularization\") ", "plt.savefig(\"sensory_score_l1_reg.png\") ", "print(\"Final training loss achieved {loss}\".format(loss=loss_l1_reg[-1])) ", "print(\"Test loss achieved {loss}\".format(loss=test_loss_l1_reg)) ", "print(\"Duration {time}\".format(time=l1_reg_train_duration))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now train the model with l2 regularization" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Training with l2 regularization ", "print(\"Computing for sensory score with l2 regularization\") ", "start = time.time() ", "w = initialize_parameters(n) ", "X_train = np.concatenate([X_train, np.ones((X_train.shape[0], 0))],1) ", "w_l2_reg, loss_l2_reg = train(X_train, y_train, w, regularization_type=L2_LOSS) ", "l2_reg_train_duration = time.time() - start ", "# Testing with l2 regularization ", "X_test = np.concatenate([X_test, np.ones((X_test.shape[0], 0))], 1) ", "test_loss_l2_reg = test(X_test, y_test, w, L2_LOSS) ", " ", " ", "print(\"**GRADIENT DESCENT**\") ", "plot_loss(loss_l2_reg, title=\"Loss for Sensory Score Prediction with l2 regularization\") ", "plt.savefig(\"sensory_score_l2_reg.png\") ", "print(\"Final training loss achieved {loss}\".format(loss=loss_l2_reg[-1])) ", "print(\"Test loss achieved {loss}\".format(loss=test_loss_l2_reg)) ", "print(\"Duration {time}\".format(time=l2_reg_train_duration))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the results you obtained with the different regularization methods. Which type of regularization performed best on the test set? Why?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### >" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem b) ", " ", " As we explored above, the exact solution for linear regression can be computed using an analytic solution. The solution for **l2 regularized** linear regression can be given as ", " $\\begin{align*} ", " \\mathbf{w}^* &= (XX^T + \\lambda I)^{-1}XY ", " \\end{align*}$ ", "implement this exact solution for the parameters $\\mathbf{w}^*$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def analytic_solution(X, y, lam): ", " \"\"\"Compute the analytic solution for regularized regression using the equation above. ", " Return the value of the updated parameters. ", " ", " ARGS: ", " X - (n x m ) matrix of features ", " y - (n x 1 ) column vector of targets ", " lam - real number denoting the regularization term ", " \"\"\" ", " return None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute the analytic solution for the problem of sensory scores. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "start = time.time() ", "print(\"Computing analytic soln for sensory score\") ", "w = analytic_solution(X_train, y_train, DEFAULT_LAMBDA) ", "duration = time.time() - start ", "training_loss = loss(X_train, y_train, w, L2_LOSS) ", "test_loss = loss(X_test, y_test, w, L2_LOSS) ", " ", "print(\"Training loss achieved {loss}\".format(loss=training_loss)) ", "print(\"Test loss achieved {loss}\".format(loss=test_loss)) ", "print(\"Duration {time}\".format(time=duration))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the final LMS error using this solution with the error achieved in part a. Why are the test loss higher for the test set in part a compared to this method? " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### >" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem c) ", " ", "Sometimes, a different choice of hyper-parameter can provide better results. In machine learning, we perform a **line-search** over a hyper-parameter by selecting a range of values for one particular hyper-parameter and retraining the model to see if better results are achievable. For this problem, pick five different values of your own for **learning rate**, and retrain the models using gradient descent from part a) for 10 epochs. Compare the training and testing error you achieve with these values by reporting the final values. You may wish to do additional plotting with seaborn and matplotlib to summarize your results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# PUT YOUR CODE FOR SIMPLE LINE SEARCH HERE ", "# YOU ONLY NEED TO TEST FIVE VALUES OF LEARNING RATE" ] } ], "metadata": { "interpreter": { "hash": "40d3a090f54c6569ab1632332b64b2c03c39dcf918b08424e98f38b5ae0af88f" }, "kernelspec": { "display_name": "Python 3.7.4 ('base')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }

4 Programming - Linear Regression for Red Wine Quality Assessment [35 pts ] For this problem, use the provided google colab notebook part4.ipynb. The text here restates the problem as it is written in that notebook, but excludes the code snippets you are encouraged to build off of. In this problem, we are interested in the problem of classifying red wine qualities. You will train a linear regression classifier to predict the sensory score of wines based on their physicochemical properties including but not limited to pH, chlorides, density, and amount of residual sugar after fermentaion stops. You have been provided with a dataset of m=1279 training samples and 320 test samples, where each sample contains 11 physicochemical properties for the selected wine. Formally, you can build the regression models as follows: Yscore=w0+w1x1+w2x2++w11x11 where xiRN corresponds to the i th physicochemical property of interest for all N wines. (a) Implement linear regression to predict Yscore. Use batch gradient descent as your optimization technique with the Least Mean Square objective function. Use a learning rate of r=103 and train for a maximum of T=10 epochs. Report your final LMS error on the test data, and provide a plot of the training error computed at each epoch. Initialze your parameters randomly from a normal distribution wiN(0,1). You will need to implement linear regression with no regularization, 11 regularization, and 12 regularizaiton. Please refer to the following formula when implementing the loss function with no regularization, 11 regularization, and 12 regularization respectively. [15 pts] LMSEwithnoregularizationLMSEwith11regularizationLMSEwith12regularization=21i=1m(yiwxi)2=21i=1m(yiwxi)2+j=1nwj=21i=1m(yiwxi)2+j=1nwj2 (b) As we explored above, the exact solution for linear regression can be computed using an analytic solution. The solution for 12 regularized linear regression can be given as w=(XXT+I)1XY implement this exact solution for the parameters w. Compare the final LMS error using this solution with the error achieved in part a. Why are the test loss higher for the test set in part a compared to this method? [10 pts ] (c) Sometimes, a different choice of hyper-parameter can provide better results. In machine learning, we perform a line-search over a hyper-parameter by selecting a range of values for one particular hyperparameter and retraining the model to see if better results are achievable. For this problem, pick five 3 different values of your own for learning rate, and retrain the models using gradient descent from part a for 10 epochs. Compare the training and testing error you achieve with these values by reporting the final values. You may wish to do additional plotting with seaborn and matplotlib to summarize your results. [10pts]