Question
### Data

In this assignment, we will use the following data. We assume that y is the response of interest and x is a predictor of y, and that we have 19 independent observations of the pairs (x,y):

```{r}
rm(list=ls())
x = c(1, 1.5, 2, 3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 8, 9, 10, 11, 12, 13, 14, 15)
y = c(21.9, 27.8, 42.8, 48.0, 52.5, 52.0, 53.1, 46.1, 42.0, 39.9, 38.1, 34.0, 33.8, 30.0, 26.1, 24.0, 20.0, 11.1, 6.3)
length(x)
```

```
[1] 19
```

### Evaluation

Two problems, 10 pts each. The first problem is on simple linear regression, the second deals with polynomial regression models.

## Problem 1 [10 pts]

The goal of this problem is to calculate a bootstrap distribution of the SLOPE of the linear regression of y on x. You will then use the bootstrap distribution to estimate the uncertainty of the slope.

### Plot [1 pt]

Use the `plot` function to make a scatterplot of the x,y data. Apply the function `lines` to x,y to overlay a continuous series of linear segments joining the observed data points (hence your plot should show the raw data points and the line). Use a point choice (argument `pch` of the function `plot`) of 19 and a red color for plotting the points, and a blue color for plotting the lines. Your plot should look like:

![](https://www.afhalifax.ca/disp/coursedata/plotline.png)

```{r}
plot(x, y, main = "scatterplot", xlab = "x", ylab = "y", pch = 19, col = "red")
lines(x, y, col = "blue")
```

### Bootstrap distribution [5 pts]

Write code that produces a vector named `bootdist` that contains a bootstrap distribution of 1000 values of the slope of the linear regression of y on x. You can mimic/reuse code used in the lectures. You will need to use a for loop, produce a bootstrap sample of the data, apply the linear regression model, extract the slope and append it to the distribution.

```{r}
# (1) fit a linear regression model of y on x
# and print a summary of this model
lm.out = lm(y~x)
coef(lm.out)
ypred = predict(lm.out)
resids = y - ypred
SSE = sum(resids^2)
print(paste("Error sum of squares is", SSE))
```

```
(Intercept)           x 
  49.945183   -2.169989 
[1] "Error sum of squares is 1850.32457364341"
```

```{r}
set.seed(239) # keep this line
data = cbind(x, y)
#Nboot=? # set Nboot to the required value
# initialize an empty vector that will store the bootstrap distribution
#bootdist=?
# create a for loop of size Nboot
# use sample to create a bootstrap sample index
#index=?
# create a temporary dataframe datai
# that corresponds to the rows in index of data
#datai=?
# fit a linear regression model using the datai data
# save it in an object called lmout
#lmout=?
# now use coef to extract the slope from object lmout
# and append this value to the bootstrap distribution
#bootdist=?
# close your loop
```

### Histogram [1 pts]

Make a histogram of the bootstrap distribution `bootdist`. Use the `main` argument to add the title 'Bootstrap distribution of the slope parameter'. Use the `xlab` argument to add the x-axis label: 'Estimated slope value'.

```{r}
```

### Confidence interval [2 pts]

Use the `quantile` function to calculate a 94% percentile bootstrap confidence interval (make sure that you leave out a lower tail of probability 3% and an upper tail of probability 3%). Store this confidence interval in a variable named `bootci` and print this variable.

```{r}
#bootci=?
```

Note: you can use the following code to nicely print your interval. Here a and b are chosen arbitrarily, just to show you how to round the numbers and how to embed the numbers in your text.

```{r}
a = -2.19021
b = 3.230157
myint = c(a, b) # define some interval
```

You can round the values like this:

```{r}
round(myint[1], 3)
round(myint[2], 3)
```

You can embed the values in your text like this: The interval is `r paste("(", round(myint[1], 3), ", ", round(myint[2], 3), ")")`. When you knit your file, you will see the numerical values printed in your text.
So now mimic this to write a sentence to report the CI that you have calculated above:

A 94\% percentile interval for the slope is ...

### Comparison with summary [1 pt]

Read the summary of your linear model (see (1)). What is the estimated value of the regression slope? What is the estimated standard error of this slope? Now, apply the `mean` and `sd` functions to the bootstrap distribution of the slope that you have calculated above. Write a short sentence to qualitatively compare these values to the corresponding values found by `lm`
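
The Problem 1 steps (bootstrap loop, histogram, percentile interval, comparison with `lm`) could be sketched roughly as follows. This is one possible reading of the template, not the course's reference solution. It assumes a case-resampling bootstrap, where whole (x, y) rows are drawn with replacement. Two departures from the template are assumptions made here for convenience: the data are held in a `data.frame` rather than the `cbind` matrix, and `bootdist` is preallocated with `numeric(Nboot)` instead of being grown by appending.

```r
# Data from the assignment
x <- c(1, 1.5, 2, 3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 8, 9, 10, 11, 12, 13, 14, 15)
y <- c(21.9, 27.8, 42.8, 48.0, 52.5, 52.0, 53.1, 46.1, 42.0, 39.9,
       38.1, 34.0, 33.8, 30.0, 26.1, 24.0, 20.0, 11.1, 6.3)

set.seed(239)                        # seed required by the assignment
data <- data.frame(x = x, y = y)     # assumption: data frame instead of cbind(x, y)
Nboot <- 1000                        # number of bootstrap replicates
bootdist <- numeric(Nboot)           # assumption: preallocated instead of appended

for (i in seq_len(Nboot)) {
  # draw row indices with replacement (same size as the original sample)
  index <- sample(nrow(data), replace = TRUE)
  datai <- data[index, ]
  # refit the regression on the resampled rows and keep the slope
  lmout <- lm(y ~ x, data = datai)
  bootdist[i] <- coef(lmout)[["x"]]
}

# Histogram of the bootstrap slopes
hist(bootdist,
     main = "Bootstrap distribution of the slope parameter",
     xlab = "Estimated slope value")

# 94% percentile interval: leave 3% in each tail
bootci <- quantile(bootdist, probs = c(0.03, 0.97))
print(bootci)

# Slope estimate and standard error from lm, next to the bootstrap mean and sd
lm.out <- lm(y ~ x, data = data)
summary(lm.out)$coefficients["x", c("Estimate", "Std. Error")]
c(mean = mean(bootdist), sd = sd(bootdist))
```

The last two expressions put the `lm` slope and its standard error side by side with the mean and standard deviation of `bootdist`, which is the qualitative comparison the final part asks for. The exact interval depends on the seed and on how the resampling is coded, so no numerical values are quoted here.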