Question

Regression Project Guidelines

Data source:

SPSS survival manual 4e, 2010 by Julie Pallant. You need to refer to this under heading labeled "data source" in your report.

The author obtained the scaled scores of the following variables from an experiment in psychology through a set of survey questions answered by the respondents.

Dependent variable (DV):

tlifesat: total life satisfaction

Independent variables (IDVs):

toptim: total optimism

tmast: total mastery

tposaff: total positive affect

tnegaff: total negative affect

tpstress: total perceived stress

sex: 1 if male; 2 if female

age: in years

Note: You will apply the following guidelines and your class notes to the data given by your instructor.

  • Split your data sample randomly as shown in class: Training (80%) and Test (20%). Run descriptive statistics on your regression dependent variable (DV). Identify any outliers using the "mean ± 3*standard deviation" rule. Keep the outliers in the regressions.
  • Use analysis of descriptive statistics to summarize the data. Comment on the findings.
  • Develop an estimated regression equation (regression model) and use that to predict your DV in the test sample. Identify which independent variables are statistically significant. Use variable names from the header row in the data file to write the regression equation.
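The split and the outlier rule above can be sketched in code. This is only an illustration in Python (the assignment itself is done in Excel); the function names `split_train_test` and `flag_outliers`, the seed, and the use of the sample standard deviation are assumptions, not part of the guidelines.

```python
import random
import statistics

def split_train_test(rows, train_frac=0.8, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists."""
    rng = random.Random(seed)          # fixed seed is an assumption, for repeatability
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def flag_outliers(values, k=3.0):
    """Return the values lying outside mean +/- k * sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 denominator), an assumption
    low, high = mean - k * sd, mean + k * sd
    return [v for v in values if v < low or v > high]
```

Flagged values are only reported; per the guidelines they stay in the data used for the regressions.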

Training Sample:

  • Run correlation on all variables (except sex) on the training sample. Analyze.
  • Run regression on the training sample (include sex and all other IDVs).
  • Report adjusted R² and other relevant statistics as discussed in class.
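For reference, the training-sample regression and its adjusted R² can be sketched as follows. This is a minimal illustration in plain Python, assuming ordinary least squares solved through the normal equations; the helper names `solve_linear` and `fit_ols` are hypothetical, and Excel's regression tool reports the same quantities directly.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... ; return (coefficients, adjusted R^2, training MSE)."""
    Z = [[1.0] + list(row) for row in X]   # prepend the intercept column
    n, p = len(y), len(Z[0])               # p counts the intercept too
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve_linear(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, z)) for z in Z]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)
    mse = sse / (n - p)                    # residual mean square, as in an ANOVA table
    return beta, adj_r2, mse
```

Here each row of `X` would hold the seven IDVs (toptim, tmast, tposaff, tnegaff, tpstress, sex, age) for one training respondent and `y` the matching tlifesat values; `beta[0]` is the intercept.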

Test Sample:

  • Predict your DV values on test sample using the regression model from training sample.
  • Report generalization mean squared error (prediction accuracy).
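The two test-sample steps can be sketched as below. This is an illustrative Python version, assuming the coefficients are stored as a list with the intercept first; `predict` and `holdout_mse` are hypothetical names, and the coefficients in the example are made up rather than estimated from the project data.

```python
def predict(beta, row):
    """Apply the fitted equation: intercept plus the sum of coefficient * value."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def holdout_mse(beta, X_test, y_test):
    """Mean squared error between actual and predicted DV on the test sample."""
    errors = [y - predict(beta, row) for row, y in zip(X_test, y_test)]
    return sum(e * e for e in errors) / len(errors)

# Made-up coefficients and observations, only to show the calculation.
example_beta = [1.0, 2.0, -0.5]
example_X = [[1, 2], [2, 4]]
example_y = [3, 3]
print(holdout_mse(example_beta, example_X, example_y))
```

The test MSE divides by the number of test observations, since no coefficients are estimated on the test sample.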

Analytical Report:

The report will consist of three sections in a single Word file:

  1. A page with all group members' names.
  2. (Section 1) Summarize the results in at most three single-spaced pages. As discussed in class, the summary pages should contain brief descriptions of the following: problem description or research question, data sources, DV descriptive statistics, correlation, regression model equation, model estimation and fit statistics, residual plots, model generalization, and conclusion and recommendation.
  3. (Section 2) Cut and paste the results and plots analyzed in Section 1 from your Excel file.
  4. (Section 3) Data columns for all the regression variables in the test file only, plus ID, random number, forecast of DV, error, and squared error. The test RMSE must be reported there. Do not keep any extra columns that you may have generated during the project.
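The error columns and the RMSE asked for in Section 3 can be sketched as follows. This is a rough Python illustration, assuming error is defined as actual minus forecast; `error_table` and `rmse` are hypothetical names, and in Excel the same columns are plain cell formulas.

```python
import math

def error_table(ids, actual, forecast):
    """Build one row per test observation with error and squared error."""
    rows = []
    for obs_id, a, f in zip(ids, actual, forecast):
        err = a - f                      # sign convention is an assumption
        rows.append({"id": obs_id, "actual": a, "forecast": f,
                     "error": err, "sq_error": err * err})
    return rows

def rmse(rows):
    """Root mean squared error: square root of the average squared error."""
    return math.sqrt(sum(r["sq_error"] for r in rows) / len(rows))
```

Squaring the RMSE gives back the test MSE used in the generalization check.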

Thresholds:

Use the following thresholds. For correlation strength (positive or negative), apply the rule to the absolute value: between 0 and 0.3 is a weak correlation, between 0.3 and 0.5 is a moderate correlation, and greater than 0.5 is a strong correlation. Use 0.7 as the cutoff for a potential multicollinearity problem. For outlier detection on the DV, use the "mean ± 3*standard deviation" method.
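These thresholds can be written down as a small sketch. The code below is illustrative Python; the names `pearson`, `strength`, and `collinear_pairs` are hypothetical, and how the exact boundary values (0.3, 0.5, 0.7) are treated is an assumption, since the rule does not say which side they fall on.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label a correlation as weak, moderate, or strong by its absolute value."""
    a = abs(r)
    if a < 0.3:            # assumption: exactly 0.3 counts as moderate
        return "weak"
    if a <= 0.5:           # assumption: exactly 0.5 counts as moderate
        return "moderate"
    return "strong"

def collinear_pairs(columns, cutoff=0.7):
    """Return (name1, name2, r) for variable pairs with |r| at or above the cutoff."""
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= cutoff:   # assumption: the cutoff itself is flagged
                pairs.append((first, second, r))
    return pairs
```

`collinear_pairs` would be given a dictionary of IDV columns only (for example toptim, tmast, tposaff), since multicollinearity concerns correlation among the predictors rather than with the DV.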

Model Generalization:

To test the generalization power of your model, randomly split the sample into 80%/20%. Use the 80% data set to run the linear regression. The regression gives you a model equation and an estimate of the mean squared error (MSE of the training data). Use the model equation to predict your DV in the test data. Note that your test data already has the actual DV value for each test observation. Compute the test MSE from the actual DV and the predicted DV. If the two MSEs (one from the regression output and one from the test data) are close (i.e. MSE_test is not more than 1.5*MSE_training), your model is generalizing. Your task is to report the two MSE numbers correctly and conclude whether the model is generalizing or not. Based on your overall analysis, report whether the model is implementable or not.
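The decision rule above reduces to a single comparison, sketched here in Python. The function name `is_generalizing` is hypothetical and the MSE values in the example are made up purely to show the arithmetic.

```python
def is_generalizing(mse_train, mse_test, ratio=1.5):
    """True when the test MSE is not more than ratio times the training MSE."""
    return mse_test <= ratio * mse_train

# Made-up numbers: the limit is 1.5 * 40.0 = 60.0.
print(is_generalizing(40.0, 55.0))   # 55.0 is within the limit
print(is_generalizing(40.0, 61.0))   # 61.0 exceeds the limit
```

Both MSE values still need to be reported alongside the conclusion, as the guidelines require.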
