Question
The purpose of this assignment is to review the theory and mechanics of linear regression, as previously covered in DSC-520. Linear regression is one of the cornerstones of predictive modeling, and knowledge of this methodology is necessary to progress further in data science.
To demonstrate completion of this assignment, create a Word document with your working code, screenshots of program results, and written answers to the questions. Writing should be professional and rigorous, and should include scientific or mathematical justification, where appropriate, for all conclusions reached. Upload your final Word document to the LMS when complete.
Part 1: Operational Tasks
For the following exercises, work with the adult data set. Use Python to solve each problem.
- Partition the data set into a training set and a test set, each containing about half of the records.
- Run a regression model to predict Hours per Week using Age and Education Num. Obtain a summary of the model. Are there any predictor variables that should not be in the model?
- Validate the model from the previous exercise.
- Use the regression equation to complete the sentence: "The estimated hours per week equals...."
- Interpret the coefficient for Age.
- Interpret the coefficient for Education Num.
- Find and interpret the value of s.
- Find and interpret R²adj (adjusted R-squared).
- Find MAE Baseline and MAE Regression, and determine whether the regression model outperformed its baseline model.
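The Part 1 tasks can be sketched in Python roughly as follows. This is only an illustrative outline under stated assumptions, not a worked answer: the data are synthetic stand-ins generated with made-up coefficients (the real adult data set is not loaded here), the variable names `age`, `education_num`, and `hours` are hypothetical, and the baseline model is assumed to be "always predict the training-set mean". Only NumPy is used, so the summary statistics are computed by hand rather than read from a model summary table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the adult data set (NOT the real data).
# The coefficients 30, 0.2 and 1.0 are invented for illustration only.
n = 1000
age = rng.uniform(18, 70, n)
education_num = rng.integers(1, 17, n).astype(float)
hours = 30 + 0.2 * age + 1.0 * education_num + rng.normal(0, 5, n)

# Partition into training and test halves using a random permutation.
idx = rng.permutation(n)
train, test = idx[: n // 2], idx[n // 2:]


def design(rows):
    """Design matrix: intercept column, Age, Education Num."""
    return np.column_stack([np.ones(len(rows)), age[rows], education_num[rows]])


X_train, y_train = design(train), hours[train]
X_test, y_test = design(test), hours[test]

# Ordinary least squares fit on the training half.
beta_train, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Validation: refit on the test half and compare the two sets of coefficients.
beta_test, *_ = np.linalg.lstsq(X_test, y_test, rcond=None)

# Standard error of the estimate (s) and adjusted R^2 on the training half.
resid = y_train - X_train @ beta_train
n_train, p = X_train.shape[0], X_train.shape[1] - 1
sse = float(resid @ resid)
sst = float(((y_train - y_train.mean()) ** 2).sum())
s = float(np.sqrt(sse / (n_train - p - 1)))
r2 = 1 - sse / sst
r2_adj = 1 - (1 - r2) * (n_train - 1) / (n_train - p - 1)

# MAE on the test half: regression vs. a baseline that predicts the training mean.
mae_regression = float(np.mean(np.abs(y_test - X_test @ beta_train)))
mae_baseline = float(np.mean(np.abs(y_test - y_train.mean())))

print("coefficients (intercept, Age, Education Num):", beta_train)
print("s:", s, "adjusted R^2:", r2_adj)
print("MAE baseline:", mae_baseline, "MAE regression:", mae_regression)
```

With the real data, the same quantities would be read against the actual fitted coefficients; a regression MAE below the baseline MAE would indicate that the model outperforms its baseline.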
Part 2: Mathematical and Statistical Basis
- Referring to your notes from DSC-520, show that the regression model created in Part 1 is mathematically consistent with the least-squares methodology for linear regression.
- Do the regression model's correlation coefficient r and coefficient of determination r² indicate a strong, medium, or weak relationship between the predictor and criterion variables? Is the regression result statistically significant given the p-value? Assume a 0.05 cutoff value.
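One way to check consistency with the least-squares methodology is to confirm that the fitted coefficients satisfy the normal equations, (XᵀX)b = Xᵀy, and that the residuals are orthogonal to every column of the design matrix. The sketch below illustrates this on synthetic stand-in data (again hypothetical, not the adult data set) and also computes r², the multiple correlation coefficient, and the overall F statistic. The p-value is the upper-tail probability of that F statistic under an F distribution with p and n − p − 1 degrees of freedom; it could be obtained with, for example, `scipy.stats.f.sf`, which is left out here to keep the sketch NumPy-only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (hypothetical; not the adult data set).
n = 500
age = rng.uniform(18, 70, n)
education_num = rng.integers(1, 17, n).astype(float)
hours = 30 + 0.2 * age + 1.0 * education_num + rng.normal(0, 5, n)

X = np.column_stack([np.ones(n), age, education_num])
y = hours

# Solve the normal equations (X^T X) b = X^T y directly.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Compare with NumPy's general least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the least-squares solution the residual vector is orthogonal to
# every column of X, so X^T e should be zero up to rounding error.
resid = y - X @ b_normal
orthogonality = X.T @ resid

# Strength of fit and the overall F statistic.
p = X.shape[1] - 1
sse = float(resid @ resid)
sst = float(((y - y.mean()) ** 2).sum())
r_squared = 1 - sse / sst
multiple_r = float(np.sqrt(r_squared))
f_stat = ((sst - sse) / p) / (sse / (n - p - 1))

print("normal equations:", b_normal)
print("lstsq:           ", b_lstsq)
print("X^T e:", orthogonality)
print("r:", multiple_r, "r^2:", r_squared, "F:", f_stat)
```

If the two coefficient vectors agree and Xᵀe is numerically zero, the fitted model is the least-squares solution. A p-value below the 0.05 cutoff would indicate a statistically significant regression, which is a separate question from whether r and r² indicate a strong relationship.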
Include references to all theoretical concepts and works cited. Show all your steps with explanations. Explain major components of complex solutions, code, and any output. Include captions to tables, images, and diagrams. Use formal and detailed mathematical and scientific notation throughout the document.