Question
2.1 Import/read the external CorporateBonds.csv data file into a new variable named bonds in RStudio using the read.table() function. Remember to specify the header and sep arguments correctly in the function. Then use the tail() function to show the last 5 rows of bonds. (5 points)
2.2 First, develop a baseline simple linear regression (SLR) model that uses a single independent variable, Years (x), to predict the dependent variable, Yield (y). Store the estimated baseline SLR equation's regression results in a new variable called bonds.slr.fit. Then apply the summary() function to bonds.slr.fit to show the regression report for this estimated baseline equation. What percentage of the variability in y is explained by the estimated baseline equation according to R²? (10 points)
2.3 Use the plot() function on bonds.slr.fit to get the diagnostic plots. Among the 4 diagnostic plots, show the diagnostic plot of Residuals vs Fitted here and use it to explain if the linearity assumption is violated or not. (5 points)
2.4 Assume the baseline model violates the linearity assumption. You may now consider using a quadratic regression model to capture the remaining quadratic pattern in the residual plot. Develop a quadratic regression model with two independent variables, Years (x) and Years Squared (x²), to predict the dependent variable, Yield (y). Store the estimated quadratic regression equation's results in a new variable called bonds.quad.fit. Then apply the summary() function to bonds.quad.fit to show the regression report for this estimated quadratic regression equation. (10 points)
2.5 According to the regression report for the estimated quadratic regression equation from part 2.4, is the overall quadratic regression model significant at the 5% significance level, and which number do you use to reach that conclusion? Are the individual coefficient estimates of Years and Years Squared significant at the 5% level, and which numbers do you use to reach those conclusions? Per the reported R² value, what percentage of the variability in y is explained by the estimated quadratic regression equation? Do you see any improvement in R² from this quadratic regression equation compared to the previous baseline SLR equation? (10 points)
2.6 Use the plot() function on bonds.quad.fit to get the diagnostic plots. Show the diagnostic plot of Residuals vs Fitted here and use it to explain whether the linearity assumption is severely violated. As a result, do you think the quadratic regression model is better than the baseline SLR model in terms of the linearity assumption and R²? Why or why not? (10 points)
CORPORATEBONDS.CSV
CompanyTicker | Years | Yield |
GE | 1 | 0.767 |
MS | 1 | 1.816 |
WFC | 1.25 | 0.797 |
TOTAL | 1.75 | 1.378 |
TOTAL | 3.25 | 1.748 |
GS | 3.75 | 3.558 |
MS | 4 | 4.413 |
JPM | 4.25 | 2.31 |
C | 4.75 | 3.332 |
RABOBK | 4.75 | 2.805 |
TOTAL | 5 | 2.069 |
MS | 5 | 4.739 |
AXP | 5 | 2.181 |
MTNA | 5 | 4.366 |
BAC | 5 | 3.699 |
VOD | 5 | 1.855 |
SHBASS | 5 | 2.861 |
AIG | 5 | 3.452 |
HCN | 7 | 4.184 |
MS | 9.25 | 5.798 |
GS | 9.25 | 5.365 |
GE | 9.5 | 3.778 |
GS | 9.75 | 5.367 |
C | 9.75 | 4.414 |
BAC | 9.75 | 4.949 |
RABOBK | 9.75 | 4.203 |
WFC | 10 | 3.682 |
TOTAL | 10 | 3.27 |
MTNA | 10 | 6.046 |
LNC | 10 | 4.163 |
FCX | 10 | 4.03 |
NEM | 10 | 3.866 |
PAA | 10.25 | 3.856 |
HSBC | 12 | 4.079 |
GS | 25.5 | 6.913 |
C | 25.75 | 8.204 |
GE | 26 | 5.13 |
GE | 26.75 | 5.138 |
T | 28.5 | 4.93 |
BAC | 29.75 | 5.903 |
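One possible way to work through parts 2.1 to 2.6 in R is sketched below. It is a sketch under stated assumptions, not a definitive solution: it assumes the real CorporateBonds.csv is comma-separated with a header row, and it embeds the rows listed above through the `text =` argument of read.table() so that it runs on its own. The names csv_text, slr.r2, quad.r2, fstat, and overall.p are illustrative and are not required by the question.

```r
# Assumption: the file is comma-separated with a header row.
# With the real file, replace `text = csv_text` by the file name:
#   bonds <- read.table("CorporateBonds.csv", header = TRUE, sep = ",")
csv_text <- "CompanyTicker,Years,Yield
GE,1,0.767
MS,1,1.816
WFC,1.25,0.797
TOTAL,1.75,1.378
TOTAL,3.25,1.748
GS,3.75,3.558
MS,4,4.413
JPM,4.25,2.31
C,4.75,3.332
RABOBK,4.75,2.805
TOTAL,5,2.069
MS,5,4.739
AXP,5,2.181
MTNA,5,4.366
BAC,5,3.699
VOD,5,1.855
SHBASS,5,2.861
AIG,5,3.452
HCN,7,4.184
MS,9.25,5.798
GS,9.25,5.365
GE,9.5,3.778
GS,9.75,5.367
C,9.75,4.414
BAC,9.75,4.949
RABOBK,9.75,4.203
WFC,10,3.682
TOTAL,10,3.27
MTNA,10,6.046
LNC,10,4.163
FCX,10,4.03
NEM,10,3.866
PAA,10.25,3.856
HSBC,12,4.079
GS,25.5,6.913
C,25.75,8.204
GE,26,5.13
GE,26.75,5.138
T,28.5,4.93
BAC,29.75,5.903"

# 2.1: read the data and show the last 5 rows
bonds <- read.table(text = csv_text, header = TRUE, sep = ",")
tail(bonds, 5)

# 2.2: baseline SLR, Yield (y) on Years (x)
bonds.slr.fit <- lm(Yield ~ Years, data = bonds)
summary(bonds.slr.fit)
slr.r2 <- summary(bonds.slr.fit)$r.squared   # share of variability in y explained

# 2.3: Residuals vs Fitted is the first of the lm diagnostic plots
plot(bonds.slr.fit, which = 1)

# 2.4: quadratic model; I() makes ^ mean arithmetic squaring inside a formula
bonds.quad.fit <- lm(Yield ~ Years + I(Years^2), data = bonds)
summary(bonds.quad.fit)

# 2.5: overall F-test p-value, coefficient t-tests, and R-squared
fstat <- summary(bonds.quad.fit)$fstatistic
overall.p <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                       lower.tail = FALSE))
coef(summary(bonds.quad.fit))                # estimates, t values, p-values
quad.r2 <- summary(bonds.quad.fit)$r.squared

# 2.6: Residuals vs Fitted for the quadratic model
plot(bonds.quad.fit, which = 1)
```

In the summary() output, the overall significance in 2.5 is read from the p-value on the F-statistic line, and the individual significance from the Pr(>|t|) column for each coefficient. Because the quadratic model contains the SLR model as a special case, its R² can never be lower than the baseline's, so the Adjusted R-squared in the same report is a fairer basis for comparing the two.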