
Question

Data 1 - Prestige Data

We are using a mock dataset named "Prestige" to predict the average salary of Canadian occupations. The dataset consists of the following variables:

income: average income (in $)

education: average education (in years)

women: percentage of women in the profession (%)

prestige: prestige score for occupation (numeric, continuous)

type: type of occupation (bc = blue collar, wc = white collar, prof = professional/managerial/technical)

NOTE:

You feel that the 'type' variable would be better reduced to only two categories, blue collar and white collar, with 'prof' counted as white collar. Before beginning any analysis, recode the 'type' variable to reflect this change (i.e., all 'prof' occupations are reassigned to the 'wc' category).
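The recode described in the note can be sketched as follows. This is a minimal illustration in plain Python, not the course's intended solution: the rows are made-up stand-ins for the Prestige data, and `recode_type` is a hypothetical helper name. Missing type values, if the data has any, are assumed to be left as they are.

```python
# Hypothetical rows standing in for the Prestige data; the values are made up.
rows = [
    {"occupation": "A", "type": "bc"},
    {"occupation": "B", "type": "wc"},
    {"occupation": "C", "type": "prof"},
    {"occupation": "D", "type": None},  # a missing type is left untouched
]


def recode_type(value):
    """Map 'prof' to 'wc'; leave 'bc', 'wc', and missing values unchanged."""
    return "wc" if value == "prof" else value


# Build new rows rather than mutating the originals, so the raw data survives.
recoded = [{**row, "type": recode_type(row["type"])} for row in rows]

# Tabulate the recoded variable to confirm that 'prof' no longer appears.
type_counts = {}
for row in recoded:
    type_counts[row["type"]] = type_counts.get(row["type"], 0) + 1

print(type_counts)
```

Tabulating the variable after the recode is a quick check that only the two intended categories (plus any missing values) remain.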

1. You are interested in building a model to predict income, but first, you want to examine its distribution and determine whether a transformation is necessary.

a. Compute a numeric summary and create a histogram and a boxplot of the income data. Describe the shape of the distribution.

b. Consider the log transformation. Create a histogram and a boxplot of the log-transformed income.

c. Do you suggest using the log-transformed income as the outcome variable in your model? Why or why not?
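One way to back up the visual comparison in question 1 is to compute the skewness of income before and after the log transformation. The sketch below uses made-up incomes with a long right tail, not the Prestige values, and `skewness` is a hypothetical helper implementing the population skewness (the mean of the cubed z-scores).

```python
import math
import statistics

# Made-up incomes (in $) with a long right tail; not values from the Prestige data.
incomes = [3000, 4200, 5100, 5800, 6500, 7400, 8900, 11000, 17500, 25900]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Natural log of each income; requires strictly positive values.
log_incomes = [math.log(v) for v in incomes]

# A mean well above the median is a first hint of right skew.
print("mean:", statistics.fmean(incomes), "median:", statistics.median(incomes))
print("skewness (raw):", round(skewness(incomes), 2))
print("skewness (log):", round(skewness(log_incomes), 2))
```

If the skewness of the log-transformed values is much closer to zero than that of the raw values, the transformed outcome is likely to suit a linear model better. The histograms and boxplots remain the primary evidence.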

2. Next, you are interested in whether the effect of prestige on income depends on the type of occupation. Create an appropriate graphic to check this. Based on this graphic, is an interaction between type and prestige worth including in a model to predict income?
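A common graphic for this question is a scatterplot of income against prestige with a separate fitted line for each occupation type. As a numeric companion to that plot, the sketch below fits a one-predictor least-squares slope within each type. The rows are made up for illustration, and `ols_slope` is a hypothetical helper.

```python
import statistics

# Made-up (prestige, income, type) rows; not values from the Prestige data.
rows = [
    (20, 3000, "bc"), (30, 3500, "bc"), (40, 4100, "bc"), (50, 4400, "bc"),
    (40, 4000, "wc"), (55, 6500, "wc"), (70, 9800, "wc"), (85, 12500, "wc"),
]


def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single predictor."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Fit the income-on-prestige slope separately within each occupation type.
slopes = {}
for group in ("bc", "wc"):
    xs = [prestige for prestige, _, kind in rows if kind == group]
    ys = [income for _, income, kind in rows if kind == group]
    slopes[group] = ols_slope(xs, ys)

print(slopes)
```

Clearly different slopes across the two types (non-parallel lines in the plot) suggest that a prestige-by-type interaction is worth including. Roughly equal slopes suggest it is not.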

3. Run a regression to predict income (i.e., using the outcome you chose in 1c) using all of the variables and an interaction between prestige and type (if you deemed it worth including in number 2 above). Copy and paste the regression output below.

a. Write a sentence interpreting the effect of each variable on income (i.e., interpret the model coefficients).

b. What is the adjusted R² value? Interpret this value.

c. Are there any potential outliers in your model? Look at the standardized residuals and discuss.

d. Examine the correlation matrix. Do you see any potential multicollinearity issues? Why?
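Parts b and d both rest on standard formulas, sketched below in plain Python. The adjusted R² is computed as 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors, and the pairwise check uses the Pearson correlation. All inputs are placeholders: the predictor columns are made up, and the R², n, and p passed to `adjusted_r2` are hypothetical values, not results from the Prestige regression.

```python
import math
import statistics


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up predictor columns; not values from the Prestige data.
columns = {
    "education": [8, 10, 12, 14, 16],
    "prestige": [25, 38, 47, 62, 70],
    "women": [60, 15, 75, 30, 45],
}

# Print the upper triangle of the correlation matrix, one pair per line.
names = list(columns)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(columns[first], columns[second])
        print(f"{first} vs {second}: {r:.2f}")

# Placeholder inputs, chosen only to show the size of the adjustment.
print(round(adjusted_r2(0.80, n=100, p=5), 4))
```

A frequently used rule of thumb treats pairwise correlations above roughly 0.8 in absolute value as a warning sign for multicollinearity, although the threshold is a convention rather than a strict cutoff.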

