
Question

ANOVA and Multiple Regression Projects

A national distributor of "Eagle" brand snacks is attempting to develop a model to explain sales of their product. To do so, data have been gathered on monthly sales (measured in hundreds of dollars) from its many marketing areas. From these, sixty observations have been selected at random. The data reported on the data sheet are as follows:

Column 1 = Monthly sales in thousands of dollars per marketing area.
Column 2 = Promotional budget for the sales area, in thousands of dollars.
Column 3 = Median family income in the sales area, in thousands of dollars.
Column 4 = Product recognition index: proportion of respondents to a marketing survey in the market district that recognized the Eagle brand name (reported as a decimal value).
Column 5 = Average retail price of product in dollars.
Column 6 = Average retail price of leading competitor brand in dollars.
Column 7 = A coded variable representing the advertising method employed in the sales area. There are four levels of this categorical variable, labeled A, B, C, and D as follows:
    A = Sports Magazine only
    B = Radio Sport Show Ads
    C = TV Sports Show Ad
    D = TV General Advertising

Multiple Regression Project

Of particular interest to the company is the effect of advertising expenditures on sales, the effect their price has on sales, and the effect of the different package designs the company has been using on sales (note: the dependent variable is not unit sales but dollar sales reported in hundreds of dollars).

You will need to adjust your Excel spreadsheet file from the Anova Project to do the following: in Excel, use the Data > Data Analysis > Regression procedure to fit a simple regression of Sales (dependent, or Y, variable) as a function of Advertising Expenditures (independent, or X, variable).

1. Report the estimated regression equation.
2. Conduct the F-test for model significance. Interpret your results.
3. Interpret the slope coefficient and test for significance using a t test.
4. a.
Use a t test to determine whether the slope coefficient is significantly different (two-tail test) from the value 1.0; that is, test $H_0: \beta_1 = 1$ against $H_1: \beta_1 \neq 1$.
   b. Of what importance is it to the company to know if this slope coefficient differs significantly from 1.0?
5. Be sure to include a copy of your Excel printout.

Econ 222: Economic and Business Statistics II
Lecture 17: Multiple Regression Topics

Review

Last class:
- Reviewed calculation formulas for simple regression.
- F-test from the Anova table for model adequacy.
- Measures of goodness-of-fit: R-square, Se (root MSE).
- Digression on the correlation coefficient.
- Interpretation of slope coefficients.
- Inferences from slopes: t-tests, confidence intervals.

Interpretation of the Slope Coefficient, b1

$\widehat{\text{house price}} = 98.24833 + 0.10977\,(\text{square feet})$

b1 measures the estimated change in the average value of Y as a result of a one-unit change in X. Here, b1 = 0.10977 tells us that the value of a house increases by 0.10977 thousands of dollars = $109.77, on average, for each additional one square foot of size.

Note: the unit of measure of the slope is the same as that of the dependent variable.

Inference about the Slope: t-test for Significance

Null and alternative hypotheses:
$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (linear relationship does exist)

Test statistic:
$t = \dfrac{b_1 - \beta_1}{s_{b_1}} = \dfrac{0.10977 - 0}{0.03297} = 3.329, \qquad \text{d.f.} = n - 2$
where $s_{b_1}$ = standard error of the slope.

Excel Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
             df    SS            MS            F         Significance F
Regression   1     18934.9348    18934.9348    11.0848   0.01039
Residual     8     13665.5652    1708.1957
Total        9     32600.5000

From the coefficient table: b1 = 0.10977, sb1 = 0.03297, t = 3.329. Significance F (0.01039) is also the p-value for the t-test.

Inferences about the Slope: t Test Example

$H_0: \beta_1 = 0$, $H_A: \beta_1 \neq 0$
Degrees of freedom for the t-test = denominator d.f. from the ANOVA table.
Test statistic: t = 3.329, d.f. = 10 - 2 = 8
Rejection regions (two-tail, $\alpha/2 = .025$ in each tail): reject H0 if t < -2.3060 or t > 2.3060. The test statistic 3.329 falls in the upper rejection region.
Decision: Reject H0 at $\alpha = 0.05$.
Conclusion: There is sufficient evidence at the 5% significance level to indicate that square footage affects house price.

Inferences from Regression

Confidence interval estimate of the slope:
$b_1 \pm t_{\alpha/2}\, s_{b_1}, \qquad \text{d.f.} = n - 2$

Excel printout for house prices:

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858). Since the house price variable is in $1000s, we are 95% confident that the average impact on sales price of each additional square foot of house size is between $33.70 and $185.80.

Steps in Regression Modeling

Use the model. There are two purposes in developing regression models:
a. Discovering determinants
   - Interpretation of slope coefficients
   - Inferences from slopes: t-tests, confidence intervals
b. Prediction or forecasting
   - Prediction for a mean value of Y.
   - Prediction for an individual value of Y.

Prediction of y, Given x

Confidence interval estimate for the mean of y given a particular value $x_p$ (CLM: confidence limits for the mean y). The size of the interval varies according to the distance of $x_p$ from the mean, $\bar{x}$:

$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$

Confidence Interval for an Individual y, Given x

Interval estimate for an individual value of y given a particular $x_p$ (CLI: confidence limits for an individual y):

$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$

The extra term (the leading 1 under the square root) adds to the interval width to reflect the added uncertainty for an individual case.

Interval Estimates for Different Values of x

[Figure: fitted line $\hat{y} = b_0 + b_1 x$ with the confidence interval for the mean of y given $x_p$ and the wider prediction interval for an individual y given $x_p$; the x-axis marks $\bar{x}$ and $x_p$.]

Example: House Prices

Use the model to predict the price for a house with 2000 square feet:

$\widehat{\text{house price}} = 98.25 + 0.1098\,(\text{sq. ft.}) = 98.25 + 0.1098(2000) = 317.85$

The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850.

Estimation of Mean Values: Example

Confidence interval estimate for $E(y) \mid x_p$: find the 95% confidence interval for the average price of 2,000 square-foot houses.
Predicted price $\hat{y}$ = 317.85 ($1,000s)

$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}} = 317.85 \pm 37.12$

The confidence interval endpoints are 280.66 to 354.90, or from $280,660 to $354,900.

Estimation of Individual Values: Example

Prediction interval estimate for $y \mid x_p$: find the 95% prediction interval for an individual house with 2,000 square feet.
Predicted price $\hat{y}$ = 317.85 ($1,000s)

$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}} = 317.85 \pm 102.28$

The prediction interval endpoints are 215.50 to 420.07, or from $215,500 to $420,070.

Finding Confidence and Prediction Intervals on Excel

In Excel PHStat, use Add-Ins > PHStat | regression | simple linear regression ... Check the "confidence and prediction interval for X=" box and enter the x-value and confidence level desired.

MegaStat | Correlation/Regression | Regression Analysis...
Select the Y and X variable ranges (including labels), select "type in predictor variable" and enter the desired value, then click OK.

Finding Confidence and Prediction Intervals: Excel-MegaStat

A Multiple Regression Model

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
Dependent variable: Pie sales (units per week)
Independent variables: Price (in $), Advertising Exp. ($100's)
Data are collected for 15 weeks.

Pie Sales Model

Week   Pie Sales   Price ($)   Advertising ($100s)
1      350         5.50        3.3
2      460         7.50        3.3
3      350         8.00        3.0
4      430         8.00        4.5
5      350         6.80        3.0
6      380         7.50        4.0
7      430         4.50        3.0
8      470         6.40        3.7
9      450         7.00        3.5
10     490         5.00        4.0
11     340         7.20        3.5
12     300         7.90        3.2
13     440         5.90        4.0
14     450         5.00        3.5
15     300         7.00        2.7

Multiple regression model: Sales = b0 + b1 (Price) + b2 (Advertising)

Scatter Diagrams

[Figure: scatter diagrams of Sales vs. Price and Sales vs. Advertising.]

Estimating a Multiple Linear Regression Equation

Computer software is generally used to generate the coefficients and measures of goodness of fit for multiple regression.
Excel: Data (or Tools) > Data Analysis... > Regression
PHStat: Add-Ins (PHStat) > Regression > Multiple Regression...
MegaStat: Add-Ins (MegaStat) > Correlation/Regression > Regression Analysis...

Multiple Regression Output

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA
             df    SS           MS           F         Significance F
Regression   2     29460.027    14730.013    6.53861   0.01201
Residual     12    27033.306    2252.776
Total        14    56493.333

$\widehat{\text{Sales}} = 306.526 - 24.975\,(\text{Price}) + 74.131\,(\text{Advertising})$

Is the Model Significant?

F-Test for Overall Significance of the Model
- Shows if there is a linear relationship between all of the x variables considered together and y.
- Uses the F test statistic.
- Hypotheses:
  $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (no linear relationship)
  $H_1$: at least one $\beta_i \neq 0$ (at least one independent variable affects y)
  For the pie sales model: $H_0: \beta_1 = \beta_2 = 0$; $H_1$: at least one $\beta_i \neq 0$.

F-Test for Overall Significance (continued)

From the ANOVA table above, with 2 and 12 degrees of freedom:

$F = \dfrac{MSR}{MSE} = \dfrac{14730.0}{2252.8} = 6.5386$

The p-value for the F-test is the Significance F value, 0.01201.

$H_0: \beta_1 = \beta_2 = 0$; $H_A$: $\beta_1$ and $\beta_2$ not both zero
$\alpha = .05$, df1 = 2, df2 = 12
Critical values: $F_{0.05,\,2,\,12} = 3.89$ (3.885 to three decimals); $F_{0.01,\,2,\,12} = 6.93$
Test statistic: F = MSR / MSE = 6.5386
Decision: Reject H0 at $\alpha = 0.05$, since 6.5386 > 3.885.
Conclusion: The regression model does explain a significant portion of the variation in pie sales at the 5% significance level. (There is evidence that at least one independent variable affects y when tested at $\alpha = 0.05$.)

Multiple Coefficient of Determination

Reports the proportion of total variation in y explained by all x variables taken together:

$R^2 = \dfrac{SSR}{SST} = \dfrac{\text{Sum of squares regression}}{\text{Total sum of squares}}$

Multiple Coefficient of Determination (continued)

$R^2 = \dfrac{SSR}{SST} = \dfrac{29460.0}{56493.3} = .52148$

52.1% of the variation in pie sales is explained by the variation in price and advertising.

Adjusted R2

- R2 never decreases when a new x variable is added to the model. This can be a disadvantage when comparing models.
- What is the net effect of adding a new variable?
- We lose a degree of freedom when a new x variable is added.
- Did the new x variable add enough explanatory power to offset the loss of one degree of freedom?

Adjusted R2 (continued)

Shows the proportion of variation in y explained by all x variables, adjusted for the number of x variables used:

$R_A^2 = 1 - (1 - R^2)\left(\dfrac{n - 1}{n - k - 1}\right)$

(where n = sample size, k = number of independent variables)

- Penalizes excessive use of unimportant independent variables.
- Smaller than R2.
- Useful in comparing among models, since R2 will always increase as X variables are added but adjusted R2 may increase or decrease.

Multiple Coefficient of Determination (continued)

$R_A^2 = .44172$ (Adjusted R Square in the regression output)

44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.

Standard Deviation of the Regression Model

The estimate of the standard deviation of the regression model is:

$s_e = \sqrt{\dfrac{SSE}{n - k - 1}} = \sqrt{MSE}$

Is this value large or small?
It must be compared to the overall standard deviation of y.

Standard Deviation of the Regression Model (continued)

The standard deviation of the regression model is $47.46 = \sqrt{2252.776}$ (Standard Error in the regression output).
The standard deviation of y alone is $63.52 = \sqrt{56493.333 / 14}$.

Interpretation of Estimated Coefficients

Slope (bi): estimates that the average value of y changes by bi units for each 1 unit increase in Xi, holding all other variables constant.
Example: if b1 = -25, then sales (y) is expected to decrease by an estimated 25 pies per week for each $1 increase in selling price (x1), net of the effects of changes due to advertising (x2).

y-intercept (b0): the estimated average value of y when all xi = 0 (assuming all xi = 0 is within the range of observed values).

The Multiple Regression Equation

$\widehat{\text{Sales}} = 306.526 - 24.975\,(\text{Price}) + 74.131\,(\text{Advertising})$

where Sales is in number of pies per week, Price is in $, and Advertising is in $100's.

b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price when advertising does not change.
b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising when selling price does not change.

Are Individual Variables Significant?
Use t-tests of individual variable slopes. Each test shows if there is a linear relationship between the variable xi and y.

Hypotheses:
$H_0: \beta_i = 0$ (no linear relationship)
$H_1: \beta_i \neq 0$ (linear relationship does exist between xi and y)

For the pie sales model: $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$, and $H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \neq 0$.

Test statistic:

$t = \dfrac{b_i - 0}{s_{b_i}} \qquad (\text{df} = n - k - 1)$

Are Individual Variables Significant? (continued)

From the regression output: the t-value for Price is t = -2.306, with p-value .0398; the t-value for Advertising is t = 2.855, with p-value .0145.

Inferences about the Slope: t Test Example

$H_0: \beta_i = 0$, $H_A: \beta_i \neq 0$
d.f. = 15 - 2 - 1 = 12, $\alpha = .05$, $t_{\alpha/2} = 2.1788$

From Excel output:

              Coefficients   Standard Error   t Stat     P-value
Price         -24.97509      10.83213         -2.30565   0.03979
Advertising   74.13096       25.96732         2.85478    0.01449

Rejection regions (two-tail, $\alpha/2 = .025$ in each tail): reject H0 if t < -2.1788 or t > 2.1788.
The test statistic for each variable falls in the rejection region (p-values < .05).
Decision: Reject H0 for each variable.
Conclusion: There is evidence that both Price and Advertising affect pie sales at $\alpha = .05$.

Confidence Interval Estimate for the Slope

Confidence interval for the population slope $\beta_1$ (the effect of changes in price on pie sales):

$b_i \pm t_{\alpha/2}\, s_{b_i}$, where t has (n - k - 1) d.f.

              Coefficients   Standard Error   ...   Lower 95%   Upper 95%
Intercept     306.52619      114.25389        ...   57.58835    555.46404
Price         -24.97509      10.83213         ...   -48.57626   -1.37392
Advertising   74.13096       25.96732         ...   17.55303    130.70888

Example: Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each increase of $1 in the selling price.

Using The Model to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

$\widehat{\text{Sales}} = 306.526 - 24.975\,(5.50) + 74.131\,(3.5) = 428.62$

Predicted sales is 428.62 pies. Note that Advertising is in $100's, so $350 means that x2 = 3.5.

In simple regression we had interval formulas as follows:

for the mean: $\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$

for an individual: $\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$

The regression model standard error is $s_e = \sqrt{MSE} = 47.46$. A rough prediction range for pie sales in a given week is

$\hat{y} \pm 2(47.46) = \hat{y} \pm 94.92$, that is, $428.62 \pm 94.92$, or 333.70 to 523.54.

Predictions in MegaStat

Rough interval: $\hat{y} \pm 2\sqrt{MSE}$, or 333.70 to 523.54.

Steps in Regression Modeling

4. Examine estimated parameters (coefficients) of the model.
   a. Inferences from slopes: confidence intervals, t-tests
   b. Interpretation of slope coefficients
   c. Determine if included explanatory variables are appropriate
5. Check for violations of conditions.
   a. Non-normality of error
   b. Multicollinearity
   c. Heteroscedasticity
   d. Autocorrelation of errors

Multicollinearity

Multicollinearity: high correlation exists between two independent variables. This means the two variables contribute redundant information to the multiple regression model.

Multicollinearity (continued)

Including two highly correlated independent variables can adversely affect the regression results:
- No new information is provided.
- It can lead to unstable coefficients (large standard errors and low t-values).
- Coefficient signs may not match prior expectations.

Some Indications of Severe Multicollinearity

- Incorrect signs on the coefficients.
- Large change in the value of a previous coefficient when a new variable is added to the model.
- A previously significant variable becomes insignificant when a new independent variable is added.
- The estimate of the standard deviation of the model increases when a variable is added to the model.

The Correlation Matrix

Correlation between the dependent variable and selected independent variables can be found using Excel: Tools / Data Analysis... / Correlation. The statistical significance of a correlation can be checked with a t test.

Pie Sales Correlation Matrix

              Pie Sales   Price     Advertising
Pie Sales     1
Price         -0.44327    1
Advertising   0.55632     0.03044   1

Price vs. Sales: r = -0.44327. There is a negative association between price and sales.
Advertising vs. Sales: r = 0.55632. There is a positive association between advertising and sales.

Detect Collinearity (Variance Inflationary Factor)

VIFj is used to measure collinearity:

$VIF_j = \dfrac{1}{1 - R_j^2}$

$R_j^2$ is the coefficient of determination when the jth independent variable is regressed against the remaining k - 1 independent variables. If VIFj > 5, xj is highly correlated with the other explanatory variables.

Detecting Multicollinearity in MegaStat

MegaStat > correlation/regression > regression analysis. Check the "variance inflationary factors" box.

Output for the pie sales example: since there are only two explanatory variables, both VIF values are the same. Since VIF < 5, there is no evidence of multicollinearity between Price and Advertising.

Dummy Variables in Regression

Categorical explanatory variables can be included in regression analysis as independent variables. Each categorical variable would have two or more levels. The dependent variable is still quantitative, and one or more independent variables are also quantitative.

Example: $y = f(x_1, x_2, x_3)$
y = accident rate in plant
x1 = speed of operation in parts per minute (quantitative)
x2 = operating shift (D, N, M)
x3 = day of the week (M, T, W, R, F)

Qualitative (Dummy) Variables

Dummy variables are also called indicator variables, with each level of the variable coded as a binary, 0 or 1. To represent categorical variables with more than two categories, more than one dummy variable is needed. The number of dummy variables needed for each categorical explanatory variable equals the number of categories - 1.

Dummy Variable Schemes

Day = dummy for day shift = 1 if day shift, 0 otherwise
Night = dummy for night shift = 1 if night shift, 0 otherwise
Midnights = dummy for midnight shift = 1 if midnight shift, 0 otherwise

Only two of the three dummies can be included in the regression model. Exclude the category to which you want to make all comparisons.
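The coding scheme above can be sketched in plain Python. This is only an illustration, not part of the lecture: the function name `dummy_code` and the five sample shift labels are hypothetical, and the day shift is assumed to be the excluded comparison category.

```python
def dummy_code(values, baseline):
    """Return (names, rows): one 0/1 column per level except the baseline.

    The baseline level gets no column, so a categorical variable with
    c levels yields c - 1 dummy variables.
    """
    levels = sorted(set(values))
    if baseline not in levels:
        raise ValueError("baseline must be one of the observed levels")
    kept = [level for level in levels if level != baseline]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical shift labels: D = day, N = night, M = midnight.
shifts = ["D", "N", "M", "D", "M"]
names, rows = dummy_code(shifts, baseline="D")
for shift, row in zip(shifts, rows):
    print(shift, dict(zip(names, row)))
```

Day-shift rows come out with every dummy equal to 0, so each dummy coefficient in a fitted model is read as a difference from the day shift.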
Next Class

Examples of dummy variables in regression.

Dummy-Variable Model Example (with 2 Levels)

Let:
y = pie sales
x1 = price
x2 = holiday (x2 = 1 if a holiday occurred during the week; x2 = 0 if there was no holiday that week)

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

Dummy-Variable Model Example (with 2 Levels) (continued)

Holiday: $\hat{y} = b_0 + b_1 x_1 + b_2 (1) = (b_0 + b_2) + b_1 x_1$
No holiday: $\hat{y} = b_0 + b_1 x_1 + b_2 (0) = b_0 + b_1 x_1$

The two lines have different intercepts ($b_0 + b_2$ and $b_0$) and the same slope.

[Figure: y (sales) against x1 (Price); parallel lines for Holiday and No Holiday with intercepts $b_0 + b_2$ and $b_0$.]

If $H_0: \beta_2 = 0$ is rejected, then "Holiday" has a significant effect on pie sales.

Interpretation of the Dummy Variable Coefficient (with 2 Levels)

Example: $\widehat{\text{Sales}} = 300 - 30\,(\text{Price}) + 15\,(\text{Holiday})$

Sales: number of pies sold per week
Price: pie price in $
Holiday: 1 if a holiday occurred during the week, 0 if no holiday occurred

b2 = 15: on average, sales were 15 pies greater in weeks with a holiday than in weeks without a holiday, given the same price.

Dummy-Variable Model Example (with 2 Dummies for 3 Levels)

Model: $\hat{y} = b_0 + b_1\,\text{Speed} + b_2\,DN + b_3\,DM$

Estimated coefficients: b0 = .618, b1 (Speed) = 0.212, b2 (DN) = 0.146, b3 (DM) = .338. The signs of the coefficients are not legible in the source.

Questions:
- Report the estimated regression equation (above).
- Is the model as a whole significant in explaining the accident rate?
- Comment on the goodness of fit of the model.
- Interpret the values of the estimated regression coefficients.
- Conduct t-tests to determine the significance of the regression coefficients.
- Report three regression equation estimates, one for each level of Shift.
- What does this model predict to be the mean difference in accident rate between the day shift and the midnight shift? Is this difference significant?
What does this model predict to be the mean difference in accident rate between the day shift and the night shift? Is this difference significant? Multiple regression Data Set 114.5 70.1 111.4 122.1 127.9 83.3 97.0 114.6 118.4 75.2 71.7 80.0 71.4 105.0 97.6 76.2 86.6 138.2 90.4 88.5 98.1 71.6 111.9 83.5 112.8 75.5 84.2 78.1 90.1 119.0 120.0 117.4 120.9 97.2 120.5 104.6 96.4 95.1 72.0 99.8 111.5 51.1 80.4 109.6 106.1 79.5 94.1 97.7 61.6 47.3 51.3 51.1 43.7 54.2 46.3 53.6 51.6 53.9 53.4 53.6 46.1 49.1 51.3 55.9 51.7 48.7 59.2 54.6 58.4 42.2 57.7 50.5 48.8 49.7 48.0 58.1 56.2 47.2 51.7 47.5 56.2 47.1 52.1 51.9 52.0 55.0 54.1 45.0 52.2 47.9 44.2 49.7 54.2 58.9 51.8 57.7 53.8 53.8 43.5 38.5 37.7 42.9 39.2 38.8 42.7 44.6 42.4 36.3 44.4 43.2 39.0 43.7 43.6 36.8 40.3 43.5 42.2 35.8 44.4 36.0 38.1 39.4 39.1 35.8 41.5 39.2 40.5 38.7 42.6 35.6 38.2 37.4 43.4 39.7 35.8 44.3 39.0 42.5 44.6 35.6 35.0 38.4 42.4 38.2 43.1 39.8 40.0 0.65 0.76 0.75 0.83 0.65 0.84 0.69 0.75 0.77 0.77 0.85 0.82 0.64 0.67 0.77 0.70 0.80 0.82 0.72 0.72 0.72 0.84 0.62 0.82 0.69 0.82 0.76 0.80 0.73 0.81 0.74 0.79 0.63 0.78 0.67 0.71 0.80 0.77 0.82 0.78 0.78 0.72 0.72 0.82 0.76 0.79 0.67 0.75 0.79 2.40 2.55 2.74 2.46 2.55 2.75 2.83 2.60 2.54 3.12 3.27 2.95 2.93 2.71 2.93 3.02 2.45 2.50 2.98 3.11 2.42 2.90 2.93 2.92 3.02 2.95 2.75 2.79 2.69 2.84 2.77 2.87 2.51 2.72 2.99 2.38 2.68 3.06 2.97 2.59 2.51 2.88 2.69 2.73 3.28 2.75 2.92 2.79 3.27 3.25 3.06 3.12 1.85 2.52 2.87 2.65 1.87 2.24 1.87 2.46 3.37 2.97 2.61 2.14 1.58 1.51 2.02 3.14 1.82 3.40 2.91 1.75 2.06 1.84 1.58 3.03 3.08 2.13 1.82 1.81 2.08 2.23 3.02 1.85 1.50 1.81 1.72 2.34 1.90 3.12 3.40 2.03 1.50 3.16 2.70 2.84 2.52 1.55 B A C C C B D D B B A D D D A A A C B D A D C B C B B A D C C C C B C D B D B A C A D D C A D A A 64.1 92.1 67.4 82.5 70.9 64.6 104.7 103.9 79.3 126.8 83.6 44.0 46.0 54.0 55.1 49.2 49.1 57.6 53.4 48.2 50.2 41.9 39.8 35.4 36.2 44.1 38.0 44.4 37.4 43.6 44.8 44.5 39.8 0.73 0.78 0.80 0.72 0.77 0.77 0.78 0.81 0.76 0.72 0.76 3.02 2.99 
3.09 2.68 3.06 2.66 2.81 3.05 2.48 2.14 2.55 1.93 2.48 1.55 2.49 3.22 2.95 3.41 2.19 2.83 3.18 2.57 A C A A B A B B A B A

Instructions

Please refer to the Anova and Simple Regression Projects. Your company would like you to complete their sales prediction model. They would like you to ascertain if the other variables for which they have data also affect sales. A complete model will have to include advertising expenditures and package design along with the other variables listed above.

1. Create three dummy variables named DA, DB, and DC to capture the effects of the four levels of the categorical variable. Then use Tools > Data Analysis > Regression to fit a regression of Sales as a function of all the variables in your data set (variables 3 through 7 above), plus the three dummies DA, DB, and DC.
2. Conduct the F-test for model significance and report your results.
3. Does your model appear to be adequate for the purpose intended? (Refer to goodness-of-fit measures, in particular, R-square, adjusted R-square, and the standard error of estimate.)
4. Your boss wants to know what you predict will be the effect on company sales if the company increases its price. What will be your response?
5. Do changes in your competitor's price have a significant impact on your company's sales and, if so, at what significance level?
6. Are any of the other variables in your model significant in determining sales at the 5% significance level or better?
7. Your boss also wants to know about the effectiveness of the various advertising methods. Report your findings with regard to this variable.
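The fit, F-test, and goodness-of-fit steps listed above can be checked outside Excel. The sketch below is an assumption-laden illustration rather than the course's method: it uses plain Python, solves the normal equations directly, and the helper names `solve` and `fit_ols` are hypothetical. It reruns the 15-week pie sales data from the lecture notes, so the results should agree with the lecture's Excel output up to rounding.

```python
# Pie sales data copied from the lecture table:
# (sales in pies per week, price in $, advertising in $100s)
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0), (430, 8.00, 4.5),
    (350, 6.80, 3.0), (380, 7.50, 4.0), (430, 4.50, 3.0), (470, 6.40, 3.7),
    (450, 7.00, 3.5), (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(y, predictors):
    """Least-squares fit with an intercept; returns coefficients and fit measures."""
    n = len(y)
    rows = [[1.0] + list(p) for p in predictors]  # leading 1.0 is the intercept
    p = len(rows[0])
    k = p - 1  # number of independent variables
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sst - sse
    r2 = ssr / sst
    return {
        "coef": coef,
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - k - 1),
        "se": (sse / (n - k - 1)) ** 0.5,     # root MSE
        "f": (ssr / k) / (sse / (n - k - 1)),  # MSR / MSE
    }


sales = [row[0] for row in DATA]
fit = fit_ols(sales, [row[1:] for row in DATA])
b0, b1, b2 = fit["coef"]
prediction = b0 + b1 * 5.50 + b2 * 3.5  # price $5.50, advertising $350
print([round(c, 3) for c in fit["coef"]])
print(round(fit["r2"], 5), round(fit["adj_r2"], 5), round(fit["se"], 3), round(fit["f"], 4))
print(round(prediction, 2))
```

For the Eagle project, the same `fit_ols` call would take one row per marketing area, with the three dummy columns DA, DB, and DC appended to the quantitative predictors. The F statistic still has to be compared with a critical value or converted to a p-value, which this sketch does not do.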
