Question
6. Consider the following data set, where eleven tea brands are surveyed, \(x\) denotes the cost per box (in USD), and \(y\) denotes the amount of caffeine (in mg per 100ml).

    i       1      2      3      4      5      6      7      8      9     10     11
    x    10.0    8.0   13.0    9.0   11.0   14.0    6.0    4.0   12.0    7.0    5.0
    y    8.04   6.95   7.58   8.81   8.33   9.96   7.24   4.26  10.84   4.82   5.68

We can compute easily that \(\bar{x} = 9\), \(s_x = 3.317\), \(\bar{y} = 7.5\) and \(s_y = 2.032\). When we compute the z-scores for \(x\) and \(y\) by using

\[ z_{x_i} = \frac{x_i - \bar{x}}{s_x} \quad \text{and} \quad z_{y_i} = \frac{y_i - \bar{y}}{s_y}, \]

and when we compute \(z_{x_i} z_{y_i}\), we obtain:

    i                     1      2      3      4      5      6      7      8      9     10     11
    z_{x_i} z_{y_i}   0.080  0.082  0.047      0  0.246  1.825  0.116  2.405  1.487  0.796  1.081

Remember that \(SS_{xy}\) can be written as

\[ SS_{xy} = s_x s_y \left( \sum_{i=1}^{n} z_{x_i} z_{y_i} \right). \]

a) Write \(SS_{xx}\) and \(SS_{yy}\) in terms of \(s_x\), \(s_y\) and \((n-1)\).

    SS_xx =                SS_yy =

b) First simplify the definition of the Pearson correlation coefficient

\[ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}} \]

to an expression that contains \(\sum_{i=1}^{n} (z_{x_i} z_{y_i})\) and \(n\), using your results from part (a). Then compute \(r\) using the given \(z_{x_i} z_{y_i}\) values. Briefly interpret the correlation coefficient \(r\) you found.

c) Find the equation of the least squares line \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\) using \(\bar{x} = 9\), \(s_x = 3.317\), \(\bar{y} = 7.5\), \(s_y = 2.032\) and the correlation coefficient \(r\) you computed in part (b).

d) Use the summary table from R to check your answers for parts (b) and (c). If you made a mistake but cannot find it, just use the correct values from the table to answer the remaining questions. Find \(r\) and \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\) from the summary table:

    Call:
    lm(formula = y ~ x, data = df_tea)

    Residuals:
         Min       1Q   Median       3Q      Max
    -1.92127 -0.45577 -0.04136  0.70941  1.83882

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   3.0001     1.1247   2.667  0.02573 *
    x             0.5001     0.1179   4.241  0.00217 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 1.237 on 9 degrees of freedom
    Multiple R-squared:  0.6665, Adjusted R-squared:  0.6295
    F-statistic: 17.99 on 1 and 9 DF,  p-value: 0.00217

    r =                    ŷ =

e) Use the equation of the regression line to answer the following questions:
(i) How much caffeine (per 100ml) would the tea contain if it costs $7.5 per box?
(ii) You're a consultant and your customer asks you: "How much caffeine (per 100ml) would the tea contain if it costs $25 per box?" What's your answer? (Don't blindly plug in numbers.)
(iii) On average, how much more (or less) caffeine corresponds to a one-dollar increase in the price of tea?

f) Thanks to our probabilistic model approach, we can actually give much better answers than the ones in part (e). You can find useful formulas on the last page, and use \(t_{0.025} = 2.262\) when the degrees of freedom, df, is 9. Answer the following questions using the summary table:
(i) By using the relevant numbers in the model, justify the existence of \(\beta_1\) in our model by running a hypothesis test with \(H_0: \beta_1 = 0\), \(H_{\text{alternative}}: \beta_1 \neq 0\) and \(\alpha = 0.05\).
(ii) Find a 95% confidence interval for your answer to (e-iii), namely, on average how much more (or less) caffeine corresponds to a one-dollar increase in the price of tea.
(iii) Give a 95% confidence interval for the average caffeine if your customer plans to purchase lots of tea boxes costing $7.5 per box.
(iv) Give a 95% interval for the caffeine if your customer plans to purchase a single tea box that costs $7.5 (that is, an interval for an individual value rather than for the average).
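The quantities asked for in parts (b) through (f) can be checked numerically. The following is a minimal Python sketch, not the course's official solution: the names `predict`, `T_CRIT`, `s_res` and the other variables are illustrative, and the interval formulas are the standard ones for simple linear regression, which are assumed (not confirmed) to match the formula sheet mentioned in part (f).

```python
import math
import statistics

# Data transcribed from the question (eleven tea brands).
x = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]          # cost per box, USD
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]  # caffeine, mg per 100ml

n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample SDs (n - 1 in the denominator)

# Parts (a)-(b): SS_xx = (n - 1) s_x^2 and SS_yy = (n - 1) s_y^2,
# so r = SS_xy / sqrt(SS_xx SS_yy) reduces to sum(z_x z_y) / (n - 1).
z_products = [((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)]
r = sum(z_products) / (n - 1)

# Part (c): least squares slope and intercept from r and the summary statistics.
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar


def predict(x0):
    """Fitted caffeine (mg per 100ml) at price x0, using the line above."""
    return b0 + b1 * x0


# Part (f): residual standard error and the standard error of the slope.
ss_xx = (n - 1) * s_x ** 2
sse = sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))
s_res = math.sqrt(sse / (n - 2))
se_b1 = s_res / math.sqrt(ss_xx)
t_stat = b1 / se_b1

T_CRIT = 2.262  # t_{0.025} with 9 degrees of freedom, as given in the question

# (f-ii) 95% confidence interval for the slope.
slope_ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

# (f-iii) interval for the mean response and (f-iv) for a single new box at x0.
x0 = 7.5
y0 = predict(x0)
spread = 1 / n + (x0 - x_bar) ** 2 / ss_xx
half_mean = T_CRIT * s_res * math.sqrt(spread)
half_pred = T_CRIT * s_res * math.sqrt(1 + spread)
mean_ci = (y0 - half_mean, y0 + half_mean)
pred_int = (y0 - half_pred, y0 + half_pred)

# (e-ii) a price of 25 USD lies outside the observed range of x, so using the
# fitted line there would be extrapolation.
x_query = 25.0
is_extrapolation = not (min(x) <= x_query <= max(x))

print(f"r = {r:.4f}, line: y_hat = {b0:.4f} + {b1:.4f} x")
print(f"t = {t_stat:.3f} vs critical value {T_CRIT}")
print(f"slope CI: {slope_ci}, mean CI at {x0}: {mean_ci}, prediction interval: {pred_int}")
print(f"x = {x_query} is extrapolation: {is_extrapolation}")
```

Comparing `b0`, `b1`, `se_b1`, `t_stat`, `s_res` and `r ** 2` against the R summary table is a quick way to confirm the hand computation. The interval for a single box is always wider than the interval for the average, because it adds the variability of an individual observation to the uncertainty in the fitted line.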