Question
1. Insurance companies use many different variables to determine the rates that they charge their customers. One of those variables is Body Mass Index (BMI), which is a (certainly not perfect) way of measuring how heavy a person is in relation to their height. Below is a chart comparing a person's BMI to how much they were charged by the insurance company.

[Chart: insurance charges versus BMI]

a) (5 pts) The formula for the regression line is: charges = 13186.58 + 1473.11 * bmi. Interpret the slope of this line in context.
b) (10 pts) What would the predicted charges be for someone who has a BMI of 25? What about for someone with a BMI of 35?
c) (10 pts) There is someone with a particularly large residual who has a BMI of 30.36 and was charged $62,592.87. Compute the residual for this individual. Recall that residual = observed - predicted.
d) (Extra Credit) Use the picture above to explain why a single linear regression may not be the most accurate way of describing the data.

2. The three-point line in college basketball was moved further away from the hoop back in 2019. The NCAA (the governing body of college athletics) would like players to be able to make shots behind that line no more than 1/3 (33%) of the time. Last season, there were 178,749 three-point shots attempted, and 60,741 were made.

a. (2 pts) State the hypotheses for testing whether this sample provides evidence that the proportion of shots made was more than 0.33.
b. (2 pts) If you were to use a simulation to test this hypothesis, would you use a bootstrap distribution or a randomization distribution?
c. (17 pts) Using the distribution you chose above, explain the steps you use to obtain a p-value for this test and state what the p-value is. DO NOT USE A NORMAL DISTRIBUTION FOR THIS PROBLEM.
d. (4 pts) State the conclusion for this test using a 5% significance level in the context of the problem.
e. (Extra Credit) If you were to use the same sample proportion, but had a smaller sample size, your p-value would be larger. For which of the following samples would we have a p-value smaller than 0.05? (There could be more than one right choice, but you will only get credit if you select all of the right answers and none of the wrong answers.)
   34/100
   340/1000
   3,400/10,000

3. Popular Twitch show Critical Role's first Dungeons and Dragons campaign ran for over 100 episodes. Over the course of the campaign, the players rolled a 20-sided die a combined 8,849 times. The average number rolled was about 11.6. One guest player (Wil) was notorious for rolling very low numbers. Wil's average roll was only a 10 over 49 rolls, with a standard deviation of 5.9. You have decided to test whether this is enough evidence to say Wil's rolls were significantly worse than average.

a. (2 pts) State the null and alternative hypotheses for testing whether Wil was significantly worse at rolling than the average of 11.6.
b. (2 pts) Would you use a normal or a t-distribution for this test?
c. (17 pts) Give the test statistic and the p-value for this test using the distribution you chose above.
d. (4 pts) State the conclusion of the test in the context of the problem.
e. (Extra Credit) Taliesin was known for rolling very high. 114 out of 1318 rolls were a 20. Is this significantly more than expected? (The expected proportion would be 0.05.)

4. A survey asked almost 300 data scientists questions about their jobs during 2020 and 2021. Of the 40 who stated that they worked entirely on-site during the time period, their mean salary was $84,962.56 with a standard deviation of $86,907.47. In contrast, the mean salary for the 134 who worked entirely remotely was $115,107.68 with a standard deviation of $90,058.65. We want to create a 90% confidence interval for how much more money remote-working data scientists make compared to their in-person counterparts.

a. (2 pts) What is the variable of interest? (Mean? Proportion? Difference of Means? Difference of Proportions? Paired Difference of Means?)
b. (4 pts) What is the sample statistic?
c. (17 pts) Show all steps required to determine the 90% confidence interval for how much more money remote-working data scientists make compared to their in-person counterparts.
d. (2 pts) Is it plausible that all remote-working and in-person data scientists have the same mean salary? Explain.
e. (Extra Credit) Do you think we could perform a hypothesis test on this data if the sample sizes were less than 30? Why or why not? Be as specific as possible.
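As a rough illustration of the arithmetic behind the interval in Problem 4, the sketch below computes a difference in sample means, its standard error, and an interval of the form (difference) ± t* × SE from the summary statistics stated in the question. It is only one possible approach, not the official solution: the function name `diff_means_interval` is hypothetical, and the critical value t* = 1.685 is an assumption (an approximate 95th-percentile t value for df = min(40, 134) - 1 = 39, a conservative convention used in some introductory courses). Other choices of degrees of freedom, such as Welch's approximation, would give a slightly different critical value.

```python
import math


def diff_means_interval(mean_a, sd_a, n_a, mean_b, sd_b, n_b, critical_value):
    """Interval for (mean_a - mean_b) from summary statistics.

    Returns (difference, standard_error, lower, upper).
    The caller supplies the critical value (for example a t* or z*).
    """
    diff = mean_a - mean_b
    # Standard error for a difference of two independent sample means.
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    margin = critical_value * se
    return diff, se, diff - margin, diff + margin


# ASSUMPTION: approximate t* for a 90% interval with df = 39.
T_STAR_90 = 1.685

# Summary statistics as stated in Problem 4 (remote minus on-site).
diff, se, lower, upper = diff_means_interval(
    mean_a=115107.68, sd_a=90058.65, n_a=134,  # entirely remote
    mean_b=84962.56, sd_b=86907.47, n_b=40,    # entirely on-site
    critical_value=T_STAR_90,
)

print(f"difference of means: {diff:.2f}")
print(f"standard error:      {se:.2f}")
print(f"90% interval:        ({lower:.2f}, {upper:.2f})")
```

Checking whether 0 falls inside the printed interval is one way to approach part d, since an interval that excludes 0 suggests equal mean salaries are not plausible at that confidence level.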