Question
The file banking.txt attached to this assignment provides data acquired from banking and census records for different zip codes in the bank’s current market. Such information can be useful in targeting advertising for new customers or for choosing locations for branch offices. The data show:
median age of the population (AGE)
median income (INCOME) in $
average bank balance (BALANCE) in $
median years of education (EDUCATION)
In this exercise you are asked to apply regression analysis techniques to describe the effect of age, education, and income on average account balance.
Analyze the distribution of average account balance using a histogram, and compute appropriate descriptive statistics. Write a paragraph describing the distribution of Balance, using appropriate descriptive statistics to describe its center and spread. Discuss your findings. Do you see any outliers? Include the histogram.
Create scatterplots to visualize the associations between bank balance and the other variables. Discuss the patterns displayed by the scatterplots. Do the associations appear to be linear? (You can create individual scatterplots or a matrix plot.) Include the scatterplots.
Compute correlation values of bank balance vs the other variables. Interpret the correlation values, and discuss which pairs of variables appear to be strongly associated. Include the relevant output that shows the correlation values.
What is the dependent variable and what are the independent variables in this regression analysis?
Use SAS to fit a regression model to predict balance from age, education, and income. Analyze the model parameters. Which predictors have a significant effect on balance? Use the t-tests on the parameters at alpha = 0.05. Include the relevant regression output.
If one of the predictors is not significant, remove it from the model and refit the new regression model. Write the expression of the newly fitted regression model.
Interpret the value of the parameters for the variables in the model.
Report the value of the R² coefficient and describe what it indicates. Include the portion of the output that shows the R² value.
According to census data, the population of a certain zip code area has a median age of 34.8 years, a median education of 12.5 years, and a median income of $42,401.
Use the final model computed in step (f) above to compute the predicted average balance for the zip code area.
If the observed average balance for the zip code area is $21,572, what’s the model prediction error?
Copy and paste your SAS code into the Word document along with your answers.
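The assignment asks for SAS. As a cross-check only, the descriptive statistics and correlations requested above could be sketched in Python with the standard library. This is a minimal sketch under stated assumptions: the helper names (`parse_table`, `describe`, `pearson`) are hypothetical, and `SAMPLE` holds just the first six rows of the data table for illustration, so the printed numbers are not the answers for the full data set. The histogram and scatterplots are not covered here.

```python
import math
import statistics

# Illustration only: the header and first six rows of the data table.
# A real analysis would use every row of banking.txt.
SAMPLE = """\
Age Education Income Balance
35.9 14.8 91033 38517
37.7 13.8 86748 40618
36.8 13.8 72245 35206
35.3 13.2 70639 33434
35.3 13.2 64879 28162
34.8 13.7 75591 36708
"""


def parse_table(text):
    """Turn whitespace-delimited text with a header row into {column: [floats]}."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return {name: [float(r[i]) for r in rows] for i, name in enumerate(header)}


def describe(values):
    """Center and spread: mean, median, sample standard deviation, min, max."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


data = parse_table(SAMPLE)
print(describe(data["Balance"]))
for name in ("Age", "Education", "Income"):
    print(name, round(pearson(data[name], data["Balance"]), 3))
```

Comparing the mean with the median gives a quick hint about skewness, and a correlation close to +1 or -1 suggests a strong linear association; both should still be checked against the plots.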
Age Education Income Balance
35.9 14.8 91033 38517
37.7 13.8 86748 40618
36.8 13.8 72245 35206
35.3 13.2 70639 33434
35.3 13.2 64879 28162
34.8 13.7 75591 36708
39.3 14.4 80615 38766
36.6 13.9 76507 34811
35.7 16.1 107935 41032
40.5 15.1 82557 41742
37.9 14.2 58294 29950
43.1 15.8 88041 51107
37.7 12.9 64597 34936
36 13.1 64894 32387
40.4 16.1 61091 32150
33.8 13.6 76771 37996
36.4 13.5 55609 24672
37.7 12.8 74091 37603
36.2 12.9 53713 26785
39.1 12.7 60262 32576
39.4 16.1 111548 56569
36.1 12.8 48600 26144
35.3 12.7 51419 24558
37.5 12.8 51182 23584
34.4 12.8 60753 26773
33.7 13.8 64601 27877
40.4 13.2 62164 28507
38.9 12.7 46607 27096
34.3 12.7 61446 28018
38.7 12.8 62024 31283
33.4 12.6 54986 24671
35 12.7 48182 25280
38.1 12.7 47388 24890
34.9 12.5 55273 26114
36.1 12.9 53892 27570
32.7 12.6 47923 20826
37.1 12.5 46176 23858
23.5 13.6 33088 20834
38 13.6 53890 26542
33.6 12.7 57390 27396
41.7 13 48439 31054
36.6 14.1 56803 29198
34.9 12.4 52392 24650
36.7 12.8 48631 23610
38.4 12.5 52500 29706
34.8 12.5 42401 21572
33.6 12.7 64792 32677
37 14.1 59842 29347
34.4 12.7 65625 29127
37.2 12.5 54044 27753
35.7 12.6 39707 21345
37.8 12.9 45286 28174
35.6 12.8 37784 19125
35.7 12.4 52284 29763
34.3 12.4 42944 22275
39.8 13.4 46036 27005
36.2 12.3 50357 24076
35.1 12.3 45521 23293
35.6 16.1 30418 16854
40.7 12.7 52500 28867
33.5 12.5 41795 21556
37.5 12.5 66667 31758
37.6 12.9 38596 17939
39.1 12.6 44286 22579
33.1 12.2 37287 19343
36.4 12.9 38184 21534
37.3 12.5 47119 22357
38.7 13.6 44520 25276
36.9 12.7 52838 23077
32.7 12.3 34688 20082
36.1 12.4 31770 15912
39.5 12.8 32994 21145
36.5 12.3 33891 18340
32.9 12.4 37813 19196
29.9 12.3 46528 21798
32.1 12.3 30319 13677
36.1 13.3 36492 20572
35.9 12.4 51818 26242
32.7 12.2 35625 17077
37.2 12.6 36789 20020
38.8 12.3 42750 25385
37.5 13 30412 20463
36.4 12.5 37083 21670
42.4 12.6 31563 15961
19.5 16.1 15395 5956
30.5 12.8 21433 11380
33.2 12.3 31250 18959
36.7 12.5 31344 16100
32.4 12.6 29733 14620
36.5 12.4 41607 22340
33.9 12.1 32813 26405
29.6 12.1 29375 13693
37.5 11.1 34896 20586
34 12.6 20578 14095
28.7 12.1 32574 14393
36.1 12.2 30589 16352
30.6 12.3 26565 17410
22.8 12.3 16590 10436
30.3 12.2 9354 9904
22 12 14115 9071
30.8 11.9 17992 10679
35.1 11 7741 6207
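The regression, R², and prediction-error steps of this problem can likewise be sketched outside SAS. The following is a minimal Python sketch, not the required SAS solution: it fits ordinary least squares through the normal equations on centered data, and the function names (`solve`, `fit_ols`, `predict`, `r_squared`) are hypothetical. It is fitted on only eight rows from the table above and keeps all three predictors, so its coefficients are not the assignment's final model. The prediction error is taken as observed minus predicted, which is an assumed sign convention. The t-tests are not covered.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] from least squares on centered data."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    mean_y = sum(y) / n
    xc = [[r[j] - means[j] for j in range(k)] for r in rows]
    yc = [yi - mean_y for yi in y]
    xtx = [[sum(r[i] * r[j] for r in xc) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(xc, yc)) for i in range(k)]
    slopes = solve(xtx, xty)
    intercept = mean_y - sum(b * mu for b, mu in zip(slopes, means))
    return [intercept] + slopes


def predict(coef, row):
    """Predicted response for one row of predictor values."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def r_squared(coef, rows, y):
    """Share of the variation in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - predict(coef, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


# Illustration only: eight rows of (Age, Education, Income) and Balance.
rows = [
    (35.9, 14.8, 91033), (37.7, 13.8, 86748), (36.8, 13.8, 72245),
    (35.3, 13.2, 70639), (34.8, 13.7, 75591), (39.3, 14.4, 80615),
    (36.6, 13.9, 76507), (35.7, 16.1, 107935),
]
balance = [38517, 40618, 35206, 33434, 36708, 38766, 34811, 41032]

coef = fit_ols(rows, balance)
r2 = r_squared(coef, rows, balance)

# Zip code from the problem: age 34.8, education 12.5, income 42401.
predicted = predict(coef, (34.8, 12.5, 42401))
error = 21572 - predicted  # assumed convention: observed minus predicted
print(coef, r2, predicted, error)
```

Centering the predictors before forming the normal equations keeps the large Income values from dominating the arithmetic. To refit after dropping a predictor, pass rows with that column removed.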
Problem 2 [5 points] - ONLY for Graduate Students
Historical data about the Boston Marathon can be found on its website. The graph shows winning times (in minutes) for men and women against the year in which the race was run. Men’s times are represented by “M” and women’s times by “W”. The graph also displays two regression lines of winning time vs. year, one for men and one for women. There is no dataset for this question; answer the following questions based on the graph.
Consider the men’s winning times. Is there evidence of a linear trend? Would you expect the slope of the regression line to be positive or negative?
Now consider the winning times for women. Is there evidence of a linear trend? Discuss.
If we fit two separate linear regression models for men’s and women’s winning times, which slope will be greater in absolute value?
Step by Step Solution
Step: 1
Please note that we can’t provide solutions using paid software such as SAS; however, an open so...