Question
Please help with the questions below.
Introduction
This script will walk you through the process of fitting a linear model using polynomial basis functions, and the selection of a hyper-parameter using a validation data set.
import numpy as np
import matplotlib.pyplot as plt

# Defining data generation model
def f(x):
    return 5 * (x - 0) * (x - 0.5) * (x - 1)

# Defining std of noise
stdNoise = 0.2

# Setting a seed for random number generator
rng = np.random.default_rng(seed=42)

# Generating training, validation and test data
N = 80
x_train = rng.random(N)
y_train = f(x_train) + stdNoise * rng.normal(size=N)
N = 20
x_test = rng.random(N)
y_test = f(x_test) + stdNoise * rng.normal(size=N)

# Plotting data purely for verification
plt.plot(x_train, y_train, 'k.', x_test, y_test, 'r.')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['Training', 'Testing'])  # a list, not a set: sets have no guaranteed order
plt.show()
# Function that creates the X matrix as defined for fitting our model
def create_X(x, deg):
    X = np.ones((len(x), deg + 1))
    for i in range(1, deg + 1):
        X[:, i] = x ** i
    return X

# Function for predicting the response
def predict(x, beta):
    return np.dot(create_X(x, len(beta) - 1), beta)

# Function for fitting the model
def fit(x, y, deg):
    return np.linalg.lstsq(create_X(x, deg), y, rcond=None)[0]

# Function for computing the RMSE
def rmse(y, yPred):
    se = (y - yPred) ** 2
    return np.sqrt(np.mean(se))
# Fitting model
deg = 2
beta = fit(x_train, y_train, deg)

# Computing training error
y_train_pred = predict(x_train, beta)
err = rmse(y_train, y_train_pred)
print('Training Error = {:2.3}'.format(err))

# Computing test error
y_test_pred = predict(x_test, beta)
err = rmse(y_test, y_test_pred)
print('Test Error = {:2.3}'.format(err))

# Plotting fitted model
x = np.linspace(0, 1, 100)
y = predict(x, beta)
plt.plot(x, y, 'b-', x_train, y_train, 'ks', x_test, y_test, 'rs')
plt.legend(['Prediction', 'Training Points', 'Test Points'])
plt.show()
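As a quick sanity check of the design matrix built by create_X above (a small illustrative example, not part of the assignment), each row of X is the basis expansion [1, x, x^2, ..., x^deg] of one sample:

```python
import numpy as np

def create_X(x, deg):
    X = np.ones((len(x), deg + 1))
    for i in range(1, deg + 1):
        X[:, i] = x ** i
    return X

# Each row is [1, x, x^2] for one sample
print(create_X(np.array([2.0, 3.0]), 2))
# → [[1. 2. 4.]
#    [1. 3. 9.]]
```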
Question 1 [20 pts]
Your first task is to split the data into pre-validation training and validation sets. You should use the last 30 samples for validation and the rest for the pre-validation training set. Keep all measurements in the same order as the original training set. Make sure the variables specified below are used for this purpose.
x_preval, y_preval = [], []
x_val, y_val = [], []
print(len(x_val), y_val)
# YOUR CODE HERE
raise NotImplementedError()

"""Check that the dimensions are correct and the correct data is included in each variable"""
assert len(x_val) == 30
assert len(y_val) == 30
assert len(x_preval) == len(x_train) - 30
assert len(y_preval) == len(y_train) - 30
assert x_val[-1] == x_train[-1]
assert y_val[-1] == y_train[-1]
assert x_preval[0] == x_train[0]
assert y_preval[0] == y_train[0]
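Since the validation set must be the last 30 samples kept in their original order, plain NumPy slicing is one way to satisfy the checks. A minimal sketch, assuming the `x_train`/`y_train` arrays generated by the setup script above:

```python
import numpy as np

# Reproducing the setup script's training data (seed 42, N = 80)
rng = np.random.default_rng(seed=42)
x_train = rng.random(80)
y_train = 5 * (x_train - 0) * (x_train - 0.5) * (x_train - 1) + 0.2 * rng.normal(size=80)

# Slicing preserves the original order: first N-30 samples for
# pre-validation training, last 30 samples for validation.
x_preval, y_preval = x_train[:-30], y_train[:-30]
x_val, y_val = x_train[-30:], y_train[-30:]
```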
Question 2 [40 pts]
Next, compute training and validation errors for each of the listed degrees. The training error should show a decreasing pattern. The validation error should decrease and then increase.
# List of degrees considered for the analysis
degList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Initializing range of degree values to be tested and errors
errTrain = np.zeros(len(degList))
errVal = np.zeros(len(degList))

# Computing training and validation RMSE errors for each degree value
# YOUR CODE HERE
raise NotImplementedError()

# Plotting results
plt.plot(degList, errTrain, 'b.-', degList, errVal, 'r.-')
plt.xlabel('degree')
plt.ylabel('RMSE')
plt.legend(['Pre-Validation Training Error', 'Validation Error'])
plt.show()

"""Check that the correct trends and the correct values are present"""
assert -np.max(np.diff(errTrain)) > 0       # Checking for monotonicity of the training error
assert -np.min(np.diff(errVal)) > 0         # Checking for some decreasing trend in the validation error
assert np.max(np.diff(errVal)) > 0          # Checking for some increasing trend in the validation error
assert np.abs(min(errTrain) - 0.14) < 1e-2  # Checking the minimum of the training error
assert np.abs(min(errVal) - 0.22) < 1e-2    # Checking the minimum of the validation error
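One way to fill in the loop: fit on the pre-validation data only, then evaluate the fitted model on both splits at each degree. A self-contained sketch, reusing the helper functions and data generation from the setup script and the slice split from Question 1:

```python
import numpy as np

# Reproducing the setup script's training data (seed 42, N = 80)
rng = np.random.default_rng(seed=42)
f = lambda x: 5 * (x - 0) * (x - 0.5) * (x - 1)
x_train = rng.random(80)
y_train = f(x_train) + 0.2 * rng.normal(size=80)

def create_X(x, deg):
    X = np.ones((len(x), deg + 1))
    for i in range(1, deg + 1):
        X[:, i] = x ** i
    return X

def fit(x, y, deg):
    return np.linalg.lstsq(create_X(x, deg), y, rcond=None)[0]

def predict(x, beta):
    return np.dot(create_X(x, len(beta) - 1), beta)

def rmse(y, yPred):
    return np.sqrt(np.mean((y - yPred) ** 2))

# Split as in Question 1
x_preval, y_preval = x_train[:-30], y_train[:-30]
x_val, y_val = x_train[-30:], y_train[-30:]

degList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
errTrain = np.zeros(len(degList))
errVal = np.zeros(len(degList))
for i, deg in enumerate(degList):
    beta = fit(x_preval, y_preval, deg)                 # fit on pre-validation data only
    errTrain[i] = rmse(y_preval, predict(x_preval, beta))
    errVal[i] = rmse(y_val, predict(x_val, beta))
```

Fitting on the pre-validation split alone is what makes the validation error an honest estimate for choosing the degree: the validation samples never influence the fitted coefficients.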
Performance of Optimal Model
We demonstrate the performance of the model by comparing the test error when training with only the pre-validation training data versus training with the full training set (pre-validation plus validation data), after the hyper-parameter has already been selected.
Question 3 [20 pts]
Complete the code to compute the desired test errors for the models trained using the pre-validation training set and the full training set.
# Selecting optimal degree
degOpt = degList[np.argmin(errVal)]
print('Optimal Degree = {:1}'.format(degOpt))

# Initializing variable for the error using only the pre-validation training set
errTest_PreVal = []
# Initializing variable for the error using the full training set
errTest_FullTrain = []
# YOUR CODE HERE
raise NotImplementedError()

# Printing results
print('Test Error [Preval Dataset Only] = {:2.3}'.format(errTest_PreVal))
print('Test Error [Full Training Dataset] = {:2.3}'.format(errTest_FullTrain))

# Plotting fitted model
x = np.linspace(0, 1, 100)
y = predict(x, beta)  # Use the beta from the full training set for better visualization
plt.plot(x, y, 'b-', x_train, y_train, 'ks', x_test, y_test, 'rs')
plt.legend(['Prediction', 'Training Points', 'Test Points'])
plt.show()

"""Check that the correct values are present"""
assert errTest_PreVal > errTest_FullTrain  # The full-training-set error should be lower
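A sketch of one possible completion, assuming the split from Question 1 and a degree already chosen by the validation search (degree 3 is used here as a stand-in for degOpt, since the true f is cubic). `numpy.polynomial.polynomial.polyfit` solves the same polynomial least-squares problem as the create_X + lstsq pipeline above:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Reproducing the setup script's data (seed 42)
rng = np.random.default_rng(seed=42)
f = lambda x: 5 * (x - 0) * (x - 0.5) * (x - 1)
x_train = rng.random(80)
y_train = f(x_train) + 0.2 * rng.normal(size=80)
x_test = rng.random(20)
y_test = f(x_test) + 0.2 * rng.normal(size=20)

def rmse(y, yPred):
    return np.sqrt(np.mean((y - yPred) ** 2))

x_preval, y_preval = x_train[:-30], y_train[:-30]
degOpt = 3  # assumption: the degree selected by the validation search

# Model trained on the pre-validation data only
beta_preval = P.polyfit(x_preval, y_preval, degOpt)
errTest_PreVal = rmse(y_test, P.polyval(x_test, beta_preval))

# Model trained on the full training set (pre-validation + validation)
beta_full = P.polyfit(x_train, y_train, degOpt)
errTest_FullTrain = rmse(y_test, P.polyval(x_test, beta_full))
```

Both models share the same hyper-parameter; only the amount of training data differs, which is exactly the comparison the assignment's final assert checks.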
Question 4 [15 pts]
Is the performance using the full training set better than what was observed when training the model only with the pre-validation training set? Why is that the case? Please enter your response below.
YOUR ANSWER HERE
Question 5 [5 pts]
Do you always expect this to be the case? Please enter your response below.
YOUR ANSWER HERE