Show your working clearly. Thanks
11 A car insurance company wishes to investigate the relationship between the age of drivers and the average annual mileage. The company has asked drivers of specific ages about their annual mileage. The age of the drivers is denoted by x (where x = 40, 45, ..., 75), and the annual mileage (in 1,000 miles) is denoted by y. The company asked 100 drivers of each age. The average annual mileage and the sample variance for the annual mileage for each age are shown in the following table, together with some relevant statistics.

                                                                               Sum      Sum of squares
age x               40     45     50     55     60     65     70     75       460      27,500
average mileage y   15     14.5   14.1   13.4   13     12.1   11.8   11.4     105.3    1,398.23
sample variance     2.25   2.56   1.69   1.96   3.24   4.00   1.44   1.21
x × y               600    652.5  705    737    780    786.5  826    855      5,942

The second last column contains the sum of the eight other columns and the last column contains the sum of the squares of the eight other columns.

(i) Determine a 95% confidence interval for the average annual mileage of drivers aged 50 based on the sample of 100 drivers at this age, justifying any assumptions you make. [3]

(ii) Perform a test of the null hypothesis that the average annual mileage of drivers aged 40 is equal to the average annual mileage of drivers aged 50 based on the two samples of 100 drivers each. You should calculate an approximate p-value, make a test decision and justify your decision and any approximations. [4]

(iii) Determine the correlation coefficient between the observed average annual mileage y and the age x of the driver. [4]

Further studies show that the correlation coefficient between the actual annual mileage for each individual driver and the age x of the driver based on the entire sample of 800 drivers is -0.63. You are not required to confirm this result.

(iv) Explain the difference between this correlation coefficient and the correlation coefficient calculated in part (iii). [2]

(v) State the circumstances under which the two correlation coefficients would be equal. [1]

(vi) Determine the parameters of the simple linear regression model with the actual annual mileage y for each individual driver being the response variable and age x the explanatory variable, including writing down the equation. [7]

[Total 21]
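
In case it helps anyone checking the numbers, here is a rough Python sketch of the arithmetic behind parts (i) to (iii). It is not a model answer. It assumes the table is read with ages 40, 45, ..., 75 (so including 60 and 70) and average mileages 15, 14.5, 14.1, 13.4, 13, 12.1, 11.8, 11.4, which is the reading that reproduces the stated totals 460, 27,500, 105.3, 1,398.23 and 5,942. It also assumes that with 100 drivers per age the sample means are approximately normal, so normal quantiles are used in (i) and (ii). All variable names are my own.

```python
from math import isclose, sqrt
from statistics import NormalDist

# Table values as read from the question (assumed reading, see note above).
ages = [40, 45, 50, 55, 60, 65, 70, 75]
mean_mileage = [15.0, 14.5, 14.1, 13.4, 13.0, 12.1, 11.8, 11.4]
sample_var = [2.25, 2.56, 1.69, 1.96, 3.24, 4.00, 1.44, 1.21]
n_per_age = 100  # drivers asked at each age

# Recompute the "Sum" and "Sum of squares" columns from the eight entries.
sum_x = sum(ages)
sum_x2 = sum(x * x for x in ages)
sum_y = sum(mean_mileage)
sum_y2 = sum(y * y for y in mean_mileage)
sum_xy = sum(x * y for x, y in zip(ages, mean_mileage))

# True if this reading of the table agrees with the totals printed in the question.
totals_match = (
    sum_x == 460
    and sum_x2 == 27500
    and isclose(sum_y, 105.3, abs_tol=1e-9)
    and isclose(sum_y2, 1398.23, abs_tol=1e-9)
    and isclose(sum_xy, 5942.0, abs_tol=1e-9)
)

# Part (i): 95% interval for the mean at age 50, normal approximation (assumption).
z975 = NormalDist().inv_cdf(0.975)
i50 = ages.index(50)
se_50 = sqrt(sample_var[i50] / n_per_age)
ci_50 = (mean_mileage[i50] - z975 * se_50, mean_mileage[i50] + z975 * se_50)

# Part (ii): two-sample z statistic for age 40 vs age 50, two-sided p-value.
i40 = ages.index(40)
se_diff = sqrt(sample_var[i40] / n_per_age + sample_var[i50] / n_per_age)
z_stat = (mean_mileage[i40] - mean_mileage[i50]) / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Part (iii): correlation between age and the eight average mileages.
n = len(ages)
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
r = s_xy / sqrt(s_xx * s_yy)

print("totals match:", totals_match)
print("95% CI at age 50:", ci_50)
print("z statistic, p-value:", z_stat, p_value)
print("Sxx, Syy, Sxy:", s_xx, s_yy, s_xy)
print("correlation of averages:", r)
```

The correlation here is computed from the eight averages only, so it ignores the spread of individual drivers around each average, which is what parts (iv) to (vi) go on to ask about.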