Question
Please help me with this ASAP. I will give you the best rating, I swear.
Question 1 (5 marks) Choose the dataset that will be used throughout the rest of the assignment. B. Medical Cost Data: This is a dataset on individual medical costs billed by health insurance. The response variable of interest is 'charges'. A description of the variables can be found here: https://www.kaggle.com/datasets/mirichoi0218/insurance Read the dataset into R. In words, introduce your dataset: explain the chosen dataset to someone who isn't familiar with it, and introduce the general topic area and any important variables.
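For the "Read the dataset into R" step, a minimal sketch could look like the following. It assumes the file has been saved as insurance.csv in the working directory and has the columns listed on the Kaggle page (age, sex, bmi, children, smoker, region, charges). The helper name load_insurance is made up for illustration.

```r
# Hypothetical helper: read the CSV and turn text columns
# (sex, smoker, region) into factors so lm() treats them as categorical.
load_insurance <- function(path = "insurance.csv") {
  read.csv(path, stringsAsFactors = TRUE)
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- load_insurance()
  str(insurance)      # variable types and the first few values
  summary(insurance)  # ranges and counts, useful for the written introduction
}
```

The str() and summary() output gives the number of rows, the variable types, and the range of 'charges', which is the kind of detail a written introduction of the dataset usually draws on.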
Question 2 (5 marks) Fit a multiple linear regression model in R using the data. Decide which predictor variables to use. Clearly state what the response and predictor variables are. Show the standard R output using the summary() function. It is not necessary to use a model selection algorithm.
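One possible way to fit the model is sketched below. The response is 'charges', as stated in Question 1. The predictors age, bmi and smoker are only an illustrative choice, not one prescribed by the assignment, and fit_charges_model is a hypothetical helper name.

```r
# Hypothetical helper: multiple linear regression of charges on
# age, bmi and smoker (illustrative predictor choice).
fit_charges_model <- function(data) {
  lm(charges ~ age + bmi + smoker, data = data)
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- read.csv("insurance.csv", stringsAsFactors = TRUE)
  fit <- fit_charges_model(insurance)
  print(summary(fit))  # the standard R output the question asks for
}
```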
Question 3 (15 marks) Check the assumptions using the residuals. Show the appropriate plots. For each of the 4 assumptions, state how well the assumption is satisfied, referencing applicable plots as necessary. For the purposes of this assignment, even if the assumptions are not met, please proceed with the rest of the assignment.
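The residual plots for Question 3 could be produced roughly as follows. This is a sketch that assumes the four assumptions meant here are linearity, independence, normality and equal variance, and that reuses an illustrative model with age, bmi and smoker as predictors. The helper name plot_diagnostics is made up.

```r
# Hypothetical helper: draw four residual plots in a 2 x 2 grid
# and return the residuals invisibly.
plot_diagnostics <- function(fit) {
  old <- par(mfrow = c(2, 2))
  on.exit(par(old))
  plot(fit, which = 1)  # residuals vs fitted: linearity
  plot(fit, which = 2)  # normal Q-Q plot: normality of residuals
  plot(fit, which = 3)  # scale-location: equal variance
  # Residuals in row order: only informative about independence
  # if the row order reflects how the data were collected.
  plot(resid(fit), type = "b", xlab = "Observation order",
       ylab = "Residual", main = "Residuals vs order")
  abline(h = 0, lty = 2)
  invisible(resid(fit))
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- read.csv("insurance.csv", stringsAsFactors = TRUE)
  fit <- lm(charges ~ age + bmi + smoker, data = insurance)
  plot_diagnostics(fit)
}
```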
Question 4 (6 marks) For TWO of your predictors, provide an interpretation of the slope. Provide an interpretation of the intercept. Does the intercept have a meaningful interpretation in practice?
Question 5 (10 marks) State the hypotheses and conclusion to the ANOVA F-test and t-tests for each slope. Additionally, interpret the coefficient of determination. (For ease of marking, please show the standard R output using the summary() function in your Question 5 answer - the same one from Question 2.)
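The quantities Question 5 refers to are all stored in the object returned by summary(). A minimal sketch of pulling them out, assuming an illustrative model with age, bmi and smoker as predictors (test_summary is a hypothetical helper name):

```r
# Hypothetical helper: collect the overall F-test, the per-coefficient
# t-tests and the coefficient of determination from summary().
test_summary <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  list(
    f_statistic   = unname(f["value"]),
    f_p_value     = unname(pf(f["value"], f["numdf"], f["dendf"],
                              lower.tail = FALSE)),
    t_tests       = s$coefficients,  # estimate, std. error, t value, p-value
    r_squared     = s$r.squared,
    adj_r_squared = s$adj.r.squared
  )
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- read.csv("insurance.csv", stringsAsFactors = TRUE)
  fit <- lm(charges ~ age + bmi + smoker, data = insurance)
  print(test_summary(fit))
}
```

The F-test compares the null hypothesis that all slopes are zero against the alternative that at least one is not. Each t-test compares the null hypothesis that a single slope is zero, given the other predictors in the model, against the alternative that it is not.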
Question 6 (6 marks) For a combination of predictor variable values of your choice, calculate the confidence interval and prediction interval. Provide an interpretation for each of the intervals. You are encouraged to choose a combination of predictor values that makes sense and would have a meaningful interpretation.
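Both intervals can be obtained from predict() by changing its interval argument. The sketch below assumes an illustrative model with age, bmi and smoker as predictors. The values age = 40, bmi = 30, smoker = "no" are an arbitrary example, and intervals_for is a hypothetical helper name.

```r
# Hypothetical helper: confidence interval for the mean response and
# prediction interval for a single new individual, at the same level.
intervals_for <- function(fit, newdata, level = 0.95) {
  list(
    confidence = predict(fit, newdata = newdata,
                         interval = "confidence", level = level),
    prediction = predict(fit, newdata = newdata,
                         interval = "prediction", level = level)
  )
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- read.csv("insurance.csv", stringsAsFactors = TRUE)
  fit <- lm(charges ~ age + bmi + smoker, data = insurance)
  new_person <- data.frame(
    age = 40, bmi = 30,
    smoker = factor("no", levels = levels(insurance$smoker))
  )
  print(intervals_for(fit, new_person))
}
```

The confidence interval is for the mean charges of all individuals with these predictor values. The prediction interval is for the charges of one new individual with these values, so it is always the wider of the two.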
Question 7 (10 marks) Fit another linear regression model with the same response variable. Choose a model where either the second model's predictors are a subset of the first model's predictors OR the first model's predictors are a subset of the second model's predictors. Use 2 methods to determine which model is preferable. Show and explain the results of both of these methods. Do the 2 methods agree? (reduced or complete model approach?)
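For the nested-model comparison, one option is a partial F-test (the reduced versus complete model approach) together with adjusted R-squared or AIC. The sketch below is only an illustration: it assumes a full model with age, bmi and smoker and a reduced model that drops bmi. The helper name compare_nested is made up.

```r
# Hypothetical helper: compare two nested linear models.
compare_nested <- function(reduced, full) {
  list(
    partial_f = anova(reduced, full),  # H0: the extra slopes are all zero
    adj_r2    = c(reduced = summary(reduced)$adj.r.squared,
                  full    = summary(full)$adj.r.squared),  # higher is better
    aic       = c(reduced = AIC(reduced), full = AIC(full))  # lower is better
  )
}

# Assumption: insurance.csv is in the working directory.
if (file.exists("insurance.csv")) {
  insurance <- read.csv("insurance.csv", stringsAsFactors = TRUE)
  full    <- lm(charges ~ age + bmi + smoker, data = insurance)
  reduced <- lm(charges ~ age + smoker, data = insurance)
  print(compare_nested(reduced, full))
}
```

A small p-value in the partial F-test favours the full model. The two methods agree if the full model also has the higher adjusted R-squared (or the lower AIC).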
insurance csv (insurance.csv):
Here is the link to it: https://drive.google.com/file/d/10XxY1lWW69gwfr7sH0MURtu4-xHY1AZY/view?usp=sharing