Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Suppose we are given a training dataset consisting of feature vectors x 1 , dots, x N i n R d and corresponding target values

Suppose we are given a training dataset consisting of feature vectors
x1,dots,xNinRd and corresponding target values y1,dots,yNinR. The
vector xiinRd can be written in more detail as
xi=[xi1vdotsxid].
We would like to find a vector
=[01vdotsd]inRd+1
such that
yi~~0+1xi1+cdots+dxid for i=1,dots,N.
(a) For any given vector inRd+1, the mean squared error can be written
as
L()=1N||x-y||2.
What is the matrix x and what is the vector y? Explain why the
above formula computes the mean squared error.
(b) Let's compute L'() using the chain rule. The function L can be
expressed as
L()=g(h())
for certain functions g and h. What are g and h? Write down for-
mulas for h'() and g'(u). What is L'()? What is gradL()?
(c) In single variable calculus, we learn to minimize functions by setting
the derivative equal to 0. In multivariable calculus, we can attempt
to minimize L by setting gradL()=0 and solving for . Write down
the equation gradL()=0 and simplify. Do we know of a technique
that can be used to solve for ?
(d) Write Python code that uses the above approach to implement linear
regression from scratch using the California housing dataset. You can
use scikit-learn to load the dataset and split the dataset into training
and validation datasets. However, you should not use scikit-learn
to train your model. It's ok to use numpy to help with vector and
matrix operations.
(e) An alternative strategy to find a vector inRd+1 which minimizes
L() is to use the gradient descent algorithm. Write Python code
that trains a linear regression model from scratch using gradient de-
scent. You can use scikit-learn to load the California housing dataset
and to split it into training and validation datasets, but you should
not use scikit-learn to train your model. Make a plot (called a "con-
vergence plot") that shows the value of L() after each iteration of
gradient descent. Try out different learning rates. Which learning
rate leads to the fastest convergence? How many iterations of gradi-
ent descent do you need in order for the gradient descent iteration to
converge?
image text in transcribed

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

More Books

Students also viewed these Databases questions

Question

Differentiate between legislation and corporate governance.

Answered: 1 week ago