Question

Hello Tutors! I have a Stats question! There is no way that I am able to post the data here because of the Excel file which contains way too many numbers! I have tried but in vain, and I cannot attach the Excel file here!

Kindly get back to me at michaelcedzynskiy at yahoo dot com!

I will provide you with the Excel file! I REALLY NEED HELP ON THIS ASSIGNMENT. PLEASE HELP A STUDENT OUT!

All hypothesis tests should include hypotheses, test statistic, p-value or critical value, decision, and conclusion. Please minimize the listing of computer output or the excessive use of appendices in reporting your results. Summarize the results of each regression model simply by displaying the regression equation, the coefficients and their standard errors, as well as the usual summary statistics such as the standard error, R-square and R-square(adj).

A policy analyst for a local school board wanted to determine what relationships between income and aggregate level of education might be used to encourage students to stay in school. Although there were potential problems with interpreting relationships based on aggregate data, she decided to begin with data from the 2016 Census. She collected data for the almost 276 census tracts in Ottawa-Gatineau, before compiling a dataset with the following variables:

Census T      identifying code for the census tract
Pop 15+       the number of adults aged 15 and over with earned income
P_hsgrad      the proportion of adults with high school graduation
P_trades      the proportion of adults with qualifications in a trade
P_collcert    the proportion of adults with a college certificate
P_univdipl    the proportion of adults with a university diploma (no degree)
P_bachdegr    the proportion of adults with a bachelor's degree
P_meddent     the proportion of adults with a medical, dental or related degree
P_masters     the proportion of adults with a masters degree
MedInc        the median employment income for individuals above 15 years
AvgInc        the average employment income for individuals above 15 years
MedInc*       the median employment income, with missing values
FITS1         the fits or predicted values from the model in part (f)

Note that each proportion tracks the relative number of individuals whose highest level of education is as indicated and the categories are mutually exclusive. The data is in the file ottawagatineau.xlsx.

(a) Plot the average incomes against the median incomes, or draw a boxplot or histogram of the average incomes. What two words would best describe the shape of income distributions in general and of average incomes in particular?
(b) Perform a multiple regression analysis using the seven educational variables as predictor variables and the median income (MedInc) as the response variable. Are there any problems with multicollinearity? The VIF values are given below for each variable.

Predictor     VIF
P_univdipl    1.1
P_hsgrad      4.1
P_bachdegr    8
P_trades      7.1
P_meddent     2
P_collcert    2.6
P_masters     9.1

(c) For the regression model in (b), graph the residuals against the fitted values and comment on whether the linear regression model assumptions are warranted. If you are using Minitab 17, you should plot the standardized residuals; otherwise, plotting the residuals is fine.
(d) Regress the P_bachdegr variable against the other six educational variables. What is the relationship between the R-square from this model and the VIF of the P_bachdegr variable from the model in (b)?
(e) The MedInc* variable copies the data from the MedInc variable, but a missing value code (*) has been inserted for a number of census tracts. Examine the MedInc* data and describe the nature of these census tracts (hint: look at the standardized residual values for the "unusual observations" from the regression model in part (b)). The remaining questions pertain to regression models based on the MedInc* variable and not the original MedInc variable. The elimination of these observations means that subsequent models may not predict well the median incomes for these unusual census tracts.
(f) Re-estimate the multiple regression model using MedInc* as the new response variable.
(g) What changes do you notice, comparing the model in (f) with the model in (b)?
(h) Plot the residuals against the fitted values. Do you see any other problems with the model assumptions? If you are using Minitab 17, you should plot the standardized residuals; otherwise, plotting the residuals is fine.
(i) Calculate the correlation coefficient between the fitted values (see FITS1 in worksheet) and the MedInc* variable. Show the relationship between this correlation coefficient and the value of R².
(j) Perform an F-test for the overall usefulness of the model, using the 1% level of significance. What do you conclude?
(k) Using the model developed in part (f), test the marginal usefulness or importance of the P_bachdegr variable, given the other variables in the model, using a 1% level of significance. What do you conclude?
(l) If the proportion of adults with a bachelor's degree were to increase by 0.1 in a set of census tracts (that is, from 0.1 to 0.2 or from 0.3 to 0.4, as the case may be), assuming the other predictor variables remain constant, what is the estimated average increase in the median incomes for these census tracts? (Give an estimate using a 99% confidence level.) Would you conclude that a university degree is beneficial in terms of increasing aggregate incomes?
(m) Regress MedInc* against only the P_bachdegr variable and find the estimated slope of the regression line. Is the coefficient of the P_bachdegr variable in the simple regression model consistent with the coefficient of the same variable in the multiple regression model? Explain briefly why they might differ.
(n) Use the model developed in part (f) to calculate a 99% prediction interval for the actual median income in census tract 907.00. In Minitab, select "Predict" under "Regression". Show manually how the standard error for the prediction interval is calculated using the standard error for the confidence interval and the standard error (s_e = √MSE) of the estimated regression model.
(o) Explain why you would not expect the prediction interval to cover the actual median income for this census tract.
(p) Finally, re-estimate the multiple regression model, but this time drop the variables that are the least useful. You can also use the "Best Subsets" option in Minitab to find the best model. Explain whether this final model is better than the model from part (f).
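
Parts (d), (i) and (n) each rest on a standard identity for least-squares regression with an intercept, and those identities can be checked numerically without the assignment's Excel file. The sketch below is only an illustration under stated assumptions: the data are synthetic (randomly generated, not the Ottawa-Gatineau census tracts), there are three made-up predictors rather than seven, and the helper name ols and the new observation x0 are hypothetical. It checks that VIF = 1 / (1 - R²) when a predictor is regressed on the others, that the squared correlation between fits and response equals R², and that the prediction standard error is sqrt(SE_fit² + MSE).

```python
import numpy as np


def ols(X, y):
    """Least-squares fit with an intercept (hypothetical helper).

    Returns the design matrix, fitted values, R-square and MSE.
    """
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    fits = A @ beta
    sse = np.sum((y - fits) ** 2)                  # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)              # total sum of squares
    mse = sse / (len(y) - A.shape[1])              # SSE / (n - number of coefficients)
    return A, fits, 1 - sse / sst, mse


# Synthetic data (assumption): three predictors, the third correlated with the first.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
X[:, 2] = 0.8 * X[:, 0] + 0.6 * rng.normal(size=n)
y = 30000 + X @ np.array([4000.0, -2000.0, 5000.0]) + rng.normal(scale=3000, size=n)

A, fits, r2, mse = ols(X, y)

# Part (d): VIF of a predictor from the R-square of regressing it on the others.
_, _, r2_j, _ = ols(np.delete(X, 2, axis=1), X[:, 2])
vif_from_r2 = 1 / (1 - r2_j)
# Cross-check: VIFs are the diagonal of the inverse predictor correlation matrix.
vif_from_corr = np.linalg.inv(np.corrcoef(X, rowvar=False))[2, 2]

# Part (i): correlation between fitted values and the response, squared, equals R-square.
r = np.corrcoef(fits, y)[0, 1]

# Part (n): standard error of the fit (confidence interval) at a new point x0,
# and the standard error used for the prediction interval.
x0 = np.array([1.0, 0.5, -0.2, 0.1])               # leading 1.0 is the intercept term
leverage = x0 @ np.linalg.inv(A.T @ A) @ x0
se_fit = np.sqrt(mse * leverage)
se_pred = np.sqrt(se_fit ** 2 + mse)

print("VIF via R-square:", vif_from_r2, " via correlation matrix:", vif_from_corr)
print("corr(fits, y)^2:", r ** 2, " R-square:", r2)
print("SE fit:", se_fit, " SE prediction:", se_pred)
```

With the real worksheet, the same three checks would use the Minitab output instead: the R-square from part (d) against the listed VIF of P_bachdegr, the correlation of FITS1 with MedInc* against the R-square from part (f), and the "SE Fit" value together with the model's standard error for census tract 907.00.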

