Question
ABC Company, a discount clothing retailer, would like to compare the sales per square foot in its different locations, taking into account the substantial differences in income and population. The management is planning to use this information to guide their location selections for new store openings.
The table below includes retail sales data at 87 locations. The data set has been randomly split into two parts: the Training dataset contains 67 observations, representing the bulk of the locations, and the Holdout dataset contains 20 observations, used to test and validate models built with the Training dataset.
For each of the locations in the dataset, we have values of the following variables:
Variable | Coding & explanation |
STORE ID | Each store has a unique identifier number |
SALES | Sales per square foot (in dollars/sq-ft) |
INCOME | The median household income in the surrounding community (in dollars) |
POPULATION | The size of the surrounding population (in thousands) |
MARKET | The market type, which can take 3 possible values: Rural, Urban and Suburban |
1. Using the Training dataset, run a regression with Sales as the dependent variable and Population as the only explanatory variable. Print out the regression output on a single page, and answer the following:
a) Is the sign of the slope coefficient for Population as you expected? Please provide a brief justification for your conclusion.
b) Provide a brief interpretation of the value of the Population coefficient in your model.
c) What percent of the variation in Sales is explained by your model?
d) Using the Holdout dataset, measure the RMSPR for the regression model.
2. Build the best model you can to predict the sales per square foot at any ABC location. You may use any of the variables listed above and/or create new ones. Once you have settled on your model, print the regression output and provide a 1-2 page discussion of your regression model that covers the following topics:
a) How did you choose the explanatory variables in your regression?
b) If you created new variables, explain what they are.
c) Explain how you validated your model, and justify why your model is better than the one you created in Question 1 and any other models you created.
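The workflow in Question 1 (fit on the Training dataset, then score on the Holdout dataset) can be sketched roughly as follows. This is a minimal illustration, not the intended solution: it assumes RMSPR denotes the root mean squared prediction error, and because the question does not state which stores belong to which set, it uses stores 1-8 as a stand-in training set and stores 9-12 as a stand-in holdout set. The function names are hypothetical.

```python
import math

# (population in thousands, sales in $/sq ft) for stores 1-12, copied from the data table.
# ASSUMPTION: the real Training/Holdout membership is not given, so this
# 8/4 split is illustrative only.
train = [(710, 393.4), (947, 488.8), (831, 471.4), (470, 532.9),
         (829, 588.3), (774, 409.9), (684, 561.1), (592, 564.2)]
holdout = [(222, 213.8), (1034, 417.5), (395, 290.9), (713, 468.4)]


def fit_simple_ols(pairs):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(pairs, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in pairs)
    sst = sum((y - mean_y) ** 2 for _, y in pairs)
    return 1 - sse / sst


def rmspe(pairs, intercept, slope):
    """Root mean squared prediction error (assumed meaning of RMSPR)."""
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


intercept, slope = fit_simple_ols(train)
print(f"intercept = {intercept:.2f}, slope = {slope:.4f}")
print(f"training R^2 = {r_squared(train, intercept, slope):.3f}")
print(f"holdout RMSPE = {rmspe(holdout, intercept, slope):.2f}")
```

Since POPULATION is recorded in thousands, the slope reads as the change in sales per square foot associated with 1,000 additional people. With the actual 67/20 split, the same three functions would answer parts (b), (c) and (d).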
Store ID | Sales ($/sq ft) | Income ($) | Population (000s) | Market | Random | |
1 | 393.4 | 65000 | 710 | Urban | 0.9421167 | |
2 | 488.8 | 93000 | 947 | Suburban | 0.8753115 | |
3 | 471.4 | 67000 | 831 | Rural | 0.325584 | |
4 | 532.9 | 75000 | 470 | Urban | 0.809741 | |
5 | 588.3 | 72000 | 829 | Urban | 0.8358955 | |
6 | 409.9 | 60000 | 774 | Rural | 0.5073342 | |
7 | 561.1 | 69000 | 684 | Urban | 0.6258087 | |
8 | 564.2 | 52000 | 592 | Urban | 0.3979399 | |
9 | 213.8 | 63000 | 222 | Rural | 0.6684851 | |
10 | 417.5 | 76000 | 1034 | Suburban | 0.6246485 | |
11 | 290.9 | 65000 | 395 | Rural | 0.4298419 | |
12 | 468.4 | 86000 | 713 | Suburban | 0.0520569 | |
13 | 319.8 | 66000 | 260 | Rural | 0.4669508 | |
14 | 209.4 | 62000 | 410 | Rural | 0.8547977 | |
15 | 476.2 | 87000 | 850 | Suburban | 0.8999421 | |
16 | 283.5 | 64000 | 635 | Suburban | 0.8277788 | |
17 | 353.2 | 52000 | 551 | Urban | 0.9272327 | |
18 | 441.5 | 78000 | 326 | Urban | 0.023217 | |
19 | 481.2 | 78000 | 463 | Urban | 0.7893981 | |
20 | 572.6 | 64000 | 760 | Urban | 0.7812228 | |
21 | 355.8 | 73000 | 798 | Suburban | 0.5811773 | |
22 | 376 | 63000 | 612 | Rural | 0.3660083 | |
23 | 428 | 70000 | 926 | Suburban | 0.1530077 | |
24 | 310.5 | 65000 | 438 | Rural | 0.7012616 | |
25 | 555.2 | 73000 | 992 | Suburban | 0.1562667 | |
26 | 455.9 | 74000 | 786 | Suburban | 0.5304186 | |
27 | 489.1 | 68000 | 712 | Urban | 0.1864392 | |
28 | 454 | 92000 | 424 | Suburban | 0.0726347 | |
29 | 469.3 | 86000 | 1224 | Suburban | 0.3940539 | |
30 | 349.2 | 56000 | 646 | Rural | 0.8580489 | |
31 | 475.7 | 80000 | 953 | Suburban | 0.5251337 | |
32 | 162.8 | 50000 | 560 | Rural | 0.8733175 | |
33 | 458.2 | 69000 | 548 | Rural | 0.0416188 | |
34 | 495.8 | 84000 | 796 | Suburban | 0.7544696 | |
35 | 612.8 | 73000 | 872 | Urban | 0.7337205 | |
36 | 466.6 | 78000 | 923 | Suburban | 0.1718359 | |
37 | 401.7 | 71000 | 626 | Suburban | 0.2691121 | |
38 | 375.1 | 66000 | 323 | Rural | 0.5256277 | |
39 | 490.1 | 67000 | 559 | Urban | 0.608761 | |
40 | 646.4 | 78000 | 531 | Urban | 0.8947365 | |
41 | 607.6 | 72000 | 499 | Urban | 0.0783688 | |
42 | 476.4 | 56000 | 947 | Rural | 0.4114759 | |
43 | 305.1 | 54000 | 705 | Rural | 0.0819157 | |
44 | 587.4 | 71000 | 1077 | Urban | 0.0897385 | |
45 | 491.1 | 59000 | 610 | Urban | 0.132568 | |
46 | 483.5 | 59000 | 663 | Urban | 0.7551134 | |
47 | 379.3 | 86000 | 849 | Suburban | 0.2820937 | |
48 | 437.9 | 95000 | 884 | Suburban | 0.4860746 | |
49 | 508.9 | 67000 | 672 | Urban | 0.4371968 | |
50 | 573.8 | 76000 | 769 | Urban | 0.2634168 | |
51 | 376.8 | 85000 | 828 | Suburban | 0.2548404 | |
52 | 518.1 | 63000 | 628 | Rural | 0.9827471 | |
53 | 467.9 | 78000 | 421 | Urban | 0.5087367 | |
54 | 582 | 76000 | 569 | Urban | 0.8624967 | |
55 | 550.5 | 64000 | 452 | Urban | 0.4769442 | |
56 | 544.3 | 89000 | 503 | Urban | 0.2909594 | |
57 | 467.1 | 61000 | 581 | Urban | 0.7059228 | |
58 | 126.9 | 55000 | 321 | Rural | 0.64167 | |
59 | 382 | 84000 | 740 | Suburban | 0.0885328 | |
60 | 403.3 | 61000 | 685 | Urban | 0.3819721 | |
61 | 407.5 | 63000 | 945 | Rural | 0.3342186 | |
62 | 512 | 71000 | 1075 | Suburban | 0.3314819 | |
63 | 691.7 | 76000 | 699 | Urban | 0.6074417 | |
64 | 569.4 | 73000 | 650 | Urban | 0.7365464 | |
65 | 425.6 | 70000 | 552 | Rural | 0.1462433 | |
66 | 481 | 84000 | 514 | Urban | 0.8343099 | |
67 | 465.5 | 90000 | 641 | Suburban | 0.9501743 | |
68 | 442 | 74000 | 862 | Suburban | 0.6062411 | |
69 | 272.7 | 59000 | 434 | Rural | 0.0514388 | |
70 | 630.2 | 79000 | 531 | Urban | 0.1486485 | |
71 | 416.7 | 73000 | 985 | Suburban | 0.8221382 | |
72 | 527.5 | 71000 | 597 | Urban | 0.9333084 | |
73 | 341.4 | 77000 | 729 | Suburban | 0.8096481 | |
74 | 325.7 | 58000 | 611 | Rural | 0.982407 | |
75 | 461.1 | 79000 | 793 | Suburban | 0.1957593 | |
76 | 352.7 | 75000 | 827 | Suburban | 0.2156417 | |
77 | 325.9 | 74000 | 270 | Rural | 0.2121122 | |
78 | 458.3 | 64000 | 452 | Rural | 0.50538 | |
79 | 362.9 | 80000 | 578 | Suburban | 0.5865473 | |
80 | 442.9 | 66000 | 738 | Rural | 0.4057885 | |
81 | 641.9 | 88000 | 755 | Urban | 0.3748682 | |
82 | 287.2 | 56000 | 401 | Rural | 0.5828727 | |
83 | 313.1 | 62000 | 433 | Rural | 0.6306487 | |
84 | 468.4 | 72000 | 656 | Rural | 0.7036573 | |
85 | 616.5 | 69000 | 726 | Urban | 0.3942697 | |
86 | 276.6 | 65000 | 658 | Rural | 0.2217373 | |
87 | 580.2 | 71000 | 482 | Urban | 0.7480467 | |
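For Question 2, one common way to bring MARKET into a regression is to create dummy variables, with one category left out as the baseline. The sketch below is only an illustration under stated assumptions: it fits Sales on Income, Population and two Market dummies using stores 1-12 from the table as a stand-in training set (the real split is not given), treats Rural as the baseline, and rescales Income to thousands of dollars for numerical stability. The helper names `design_row`, `solve` and `fit_ols` are hypothetical, and this is not claimed to be the best model.

```python
# (sales $/sq ft, income $, population in thousands, market) for stores 1-12,
# copied from the table. ASSUMPTION: illustrative subset, not the real Training set.
rows = [
    (393.4, 65000, 710, "Urban"), (488.8, 93000, 947, "Suburban"),
    (471.4, 67000, 831, "Rural"), (532.9, 75000, 470, "Urban"),
    (588.3, 72000, 829, "Urban"), (409.9, 60000, 774, "Rural"),
    (561.1, 69000, 684, "Urban"), (564.2, 52000, 592, "Urban"),
    (213.8, 63000, 222, "Rural"), (417.5, 76000, 1034, "Suburban"),
    (290.9, 65000, 395, "Rural"), (468.4, 86000, 713, "Suburban"),
]


def design_row(income, population, market):
    """Intercept, income in $000s, population in 000s, Urban and Suburban dummies.
    Rural is the omitted baseline category (an assumption of this sketch)."""
    return [1.0, income / 1000.0, float(population),
            1.0 if market == "Urban" else 0.0,
            1.0 if market == "Suburban" else 0.0]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


X = [design_row(inc, pop, mkt) for _, inc, pop, mkt in rows]
y = [sales for sales, _, _, _ in rows]
coefs = fit_ols(X, y)
names = ["intercept", "income_k", "population", "urban", "suburban"]
for name, value in zip(names, coefs):
    print(f"{name:>10}: {value:.3f}")
```

Under this coding, the Urban and Suburban coefficients are read as differences in sales per square foot relative to a Rural store with the same income and population. Comparing the holdout prediction error of such a model against the Population-only model is one way to address part (c).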