4 Under-Parameterization and Over-Parameterization
In the previous section, we had more data points than features in our data, i.e., we were looking at the case n > d, where n is the number of data points and d is the number of features. This tends to be the ideal situation: we need to find an unknown weight for each feature, and having more data points than unknowns gives us enough information to determine each weight, similar to how two data points are enough to find the slope and intercept, the two unknowns, of a line.
Sometimes, however, we may have fewer data points than we have features. This makes it difficult to determine how the underlying model should depend on each feature; we just don't have enough data. In the following problems, consider a training data set and a test data set of fixed sizes.
Problem: Let R be a matrix of random values, with k rows and d columns (d being the number of original input features), where each entry is sampled from a fixed distribution. Note that for any input vector x, the product Rx will be a vector of k values. We could then consider performing linear regression on the transformed data points (Rx_i, y_i) rather than (x_i, y_i). Note that if k is smaller than the number of data points, this transformed data set will have fewer input features than we have data points, and thus we restore linear regression to working order.
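As a rough illustration, here is a minimal Python/NumPy sketch of the pieces described above. The helper names (random_transform, fit_least_squares, test_error) are made up for this sketch, and the problem statement does not pin down the distribution of R's entries, so treat this as one possible reading rather than the intended implementation.

```python
import numpy as np

def random_transform(X, R):
    """Map each d-dimensional input row of X to the k-dimensional vector R x."""
    return X @ R.T                        # result has shape (n, k)

def fit_least_squares(Z_train, y_train):
    """Ordinary least squares on the transformed data points (R x_i, y_i)."""
    w, *_ = np.linalg.lstsq(Z_train, y_train, rcond=None)
    return w

def test_error(w, Z_test, y_test):
    """Mean squared error of the learned weights on the (transformed) test set."""
    return np.mean((Z_test @ w - y_test) ** 2)
```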
Plot the testing error over a range of values of k: for a given k, pick a random R, transform the input vectors by R, then do linear regression on the result. You'll need to repeat the experiment for a number of random choices of R for each k to get a good plot. What do you notice? Does this seem to be a reasonable trend?
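One way to run this experiment, reusing the helpers above, might look like the following sketch. The arrays X_train, y_train, X_test, y_test, the range of k values, the number of repetitions per k, and the standard normal entries of R are all placeholders or assumptions, not values taken from the original problem.

```python
import numpy as np
import matplotlib.pyplot as plt

# X_train, y_train, X_test, y_test are assumed to come from the data set
# described in the problem; they are not defined here.
rng = np.random.default_rng(0)
d = X_train.shape[1]
ks = range(1, X_train.shape[0] + 1)   # placeholder range of k values
n_repeats = 20                        # placeholder number of random R per k

avg_errors = []
for k in ks:
    errs = []
    for _ in range(n_repeats):
        R = rng.normal(size=(k, d))   # assumes standard normal entries
        w = fit_least_squares(random_transform(X_train, R), y_train)
        errs.append(test_error(w, random_transform(X_test, R), y_test))
    avg_errors.append(np.mean(errs))

plt.plot(list(ks), avg_errors)
plt.xlabel("k (number of random features)")
plt.ylabel("average test error")
plt.show()
```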
Problem: Notice that there's nothing stopping us from continuing to increase k. This puts us in a regime of over-parameterization: we have more features in our data than data points, and increasingly so if we were bold enough to keep growing k. One possible solution is, when performing linear regression on the transformed data, to do ridge regression instead, introducing the ridge penalty λ‖w‖² into the loss we are minimizing.
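For reference, ridge regression on the transformed data can be solved in closed form as w = (ZᵀZ + λI)⁻¹Zᵀy. A minimal sketch, with lam as a hypothetical name for the ridge parameter:

```python
import numpy as np

def fit_ridge(Z_train, y_train, lam):
    """Ridge regression: minimize ||Z w - y||^2 + lam * ||w||^2, in closed form."""
    k = Z_train.shape[1]
    A = Z_train.T @ Z_train + lam * np.eye(k)
    return np.linalg.solve(A, Z_train.T @ y_train)
```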
Continue the experiment for larger values of k, plotting the resulting testing error averaged over multiple choices of R. How did you choose a good value of λ? Note that the number of weights we need to find changes with k; should this influence λ? What do you notice?
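Continuing the earlier sketch with fit_ridge swapped in for fit_least_squares might look roughly like this; the ridge parameter, the range of k, and the number of repetitions are placeholders to tune, not values from the problem.

```python
# Reuses rng, d, n_repeats, plt, and the helper functions from the sketches above.
lam = 1.0                                   # placeholder ridge parameter; consider scaling it with k
ks_large = range(1, 4 * X_train.shape[0])   # placeholder range extending well past n

avg_errors_ridge = []
for k in ks_large:
    errs = []
    for _ in range(n_repeats):
        R = rng.normal(size=(k, d))
        w = fit_ridge(random_transform(X_train, R), y_train, lam)
        errs.append(test_error(w, random_transform(X_test, R), y_test))
    avg_errors_ridge.append(np.mean(errs))

plt.plot(list(ks_large), avg_errors_ridge)
plt.xlabel("k (number of random features)")
plt.ylabel("average test error with ridge")
plt.show()
```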
Bonus: Why does this happen?