Question
This module has been concerned with multiple regression: including multiple terms in our regression models, be they higher-power terms, dummy variables, interaction terms, or just additional quantitative variable terms. Models can become very complex, but model complexity can be a problem. This discussion post is concerned with the issue of 'overfitting' a model to the data. Make your initial post by Thursday night and your response post by the end of the module.
Initial Post
Watch this video: https://www.youtube.com/watch?v=ls3XKoGntXg
This article illustrates the issue of overfitting as well: Model selection and overfitting - nature.pdf
When you are studying a response variable, you often have the opportunity to measure many predictor variables in your study. Every additional predictor variable, interaction term, power term, etc. will increase R² (or at least never decrease it), and thus your model will appear to explain more of the variation in the response variable. Why, then, would we not want to include a predictor variable in our regression model? Please explain the reasoning for model simplification.
Give an example of a population with a response variable you would be interested in modeling, and give examples of at least five predictor variables that you might consider when collecting data to create a model.
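The claim that R² never goes down as terms are added can be checked with a small simulation. The sketch below is illustrative only and is not taken from the video or the article: the data are simulated, and the sample sizes, coefficients, noise level, and seed are arbitrary assumptions. It fits ordinary least squares to one genuine predictor and then keeps adding predictors that have nothing to do with the response, reporting R² on the training data, adjusted R², and R² on a separate test sample.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

# Assumed sizes for illustration: small training set, large test set,
# and 15 predictors that are pure noise.
n_train, n_test, n_noise = 40, 1000, 15

def simulate(n):
    # Column 0 is the only predictor that truly affects y;
    # the remaining n_noise columns are unrelated to y.
    X = rng.normal(size=(n, 1 + n_noise))
    y = 2.0 + 3.0 * X[:, 0] + rng.normal(scale=2.0, size=n)
    return X, y

def design(X, k):
    # Intercept column followed by the first k predictors.
    return np.column_stack([np.ones(X.shape[0]), X[:, :k]])

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

X_train, y_train = simulate(n_train)
X_test, y_test = simulate(n_test)

train_r2, adj_r2, test_r2 = [], [], []
for k in range(1, n_noise + 2):
    # Least-squares fit using the first k predictors.
    beta, *_ = np.linalg.lstsq(design(X_train, k), y_train, rcond=None)
    r2_tr = r2(y_train, design(X_train, k) @ beta)
    train_r2.append(r2_tr)
    # Adjusted R^2 penalizes each extra term.
    adj_r2.append(1.0 - (1.0 - r2_tr) * (n_train - 1) / (n_train - k - 1))
    # R^2 on data the model was not fitted to.
    test_r2.append(r2(y_test, design(X_test, k) @ beta))
    print(f"k={k:2d}  train R2={train_r2[-1]:.3f}  "
          f"adj R2={adj_r2[-1]:.3f}  test R2={test_r2[-1]:.3f}")
```

Because each model contains the previous one, the training R² column cannot decrease from row to row. The adjusted and test columns carry no such guarantee, and in runs like this they typically stall or fall as noise terms are added, which is the practical argument for leaving out predictors that do not earn their place.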