Question
Note: Please include both R code and results in your solutions. (You may use the "Compile Report" function under the "File" menu in RStudio to generate a Word/PDF report of both R code and results.)
Question 1: Linear Regression (50 Points) Load the dataset NILT2012GR_SUBSET.csv into R. The dataset contains 9 variables for 1204 citizens; it comes from Queen's University in Belfast (Northern Ireland) and is based on the Northern Ireland Life and Times Survey (NILT) 2012. Make a subset named Q1 in which the variables persinc2 (personal income) and rage (age) contain no missing values. Then create a new variable named log_income, the log transformation of persinc2, and use it to answer the questions below.
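The data preparation step might be sketched roughly as follows. The small `toy` data frame is a made-up stand-in (its values are invented, not taken from the survey) so that the snippet runs without the file, and the commented `read.csv()` path is only a placeholder.

```r
# Hypothetical stand-in for the survey data; values are invented for illustration.
toy <- data.frame(
  persinc2 = c(12000, NA, 25000, 18000, 40000),
  rage     = c(34, 51, NA, 27, 62)
)

# With the real file, one would read it in instead (path is a placeholder):
# toy <- read.csv("path/to/the/dataset.csv")

# Keep only rows where both personal income and age are observed.
Q1 <- subset(toy, !is.na(persinc2) & !is.na(rage))

# Natural-log transformation of personal income.
Q1$log_income <- log(Q1$persinc2)

print(Q1)
```

Note that `log()` returns `-Inf` for an income of zero, so it may be worth checking the range of persinc2 before transforming.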
(a) We want to explore the effect of age (variable rage) on log income (variable log_income) with a simple linear regression model. What kind of relationship does this model assume between the two variables, and how would you express that relationship as an equation?
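As a point of reference, a simple linear regression of this kind is usually written in the form below. The symbols \beta_0 (intercept), \beta_1 (slope), \varepsilon_i (error term) and the index i (respondent) are conventional notation assumed here, not notation given in the assignment.

```latex
\text{log\_income}_i = \beta_0 + \beta_1 \, \text{rage}_i + \varepsilon_i
```

Under this form the expected value of log income changes by a constant amount, \beta_1, for each additional year of age.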
(b) Implement the linear regression in R and interpret the results accordingly.
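A minimal sketch of fitting such a model with `lm()` is shown here. It uses simulated data so that it runs on its own; the sample size, coefficients and noise level are arbitrary assumptions and say nothing about the actual survey results.

```r
# Simulated stand-in for Q1; all numbers below are arbitrary assumptions.
set.seed(1)
n  <- 200
Q1 <- data.frame(rage = sample(18:90, n, replace = TRUE))
Q1$log_income <- 9.5 - 0.005 * Q1$rage + rnorm(n, sd = 0.6)

# Simple linear regression of log income on age.
fit <- lm(log_income ~ rage, data = Q1)

# Coefficient table, R-squared and residual summary.
print(summary(fit))

# Exact proportional change in income per extra year of age: exp(slope) - 1.
print(exp(coef(fit)["rage"]) - 1)
```

Because the outcome is on the log scale, the slope is read as the change in log income per additional year of age, which for small values is approximately 100 times the slope in percent.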
(c) Visualize the relationship between age and log_income with a scatter plot. It should look similar to the plot below.
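One possible way to draw such a plot with base R graphics is sketched here, again on simulated data with arbitrary parameters. The axis labels, point style and the overlaid fitted line are assumptions; the reference figure may differ.

```r
# Simulated stand-in for Q1; all numbers below are arbitrary assumptions.
set.seed(1)
n  <- 200
Q1 <- data.frame(rage = sample(18:90, n, replace = TRUE))
Q1$log_income <- 9.5 - 0.005 * Q1$rage + rnorm(n, sd = 0.6)

fit <- lm(log_income ~ rage, data = Q1)

# Scatter plot of log income against age.
plot(Q1$rage, Q1$log_income,
     xlab = "Age (rage)",
     ylab = "Log income (log_income)",
     main = "Age vs. log income",
     pch  = 16, col = "grey40")

# Overlay the fitted regression line (optional).
abline(fit, col = "red", lwd = 2)
```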