
Question


Simple Regression Analysis

What are we trying to accomplish?
    Quantify the relationship between two or more variables.

How can we quantify a relationship between two variables?
    1. Correlation Analysis
    2. Regression Analysis

How does the correlation analysis work?
    1. Correlation coefficient calculation
    2. t-test for significance

How does the regression analysis work?

1. Types of regression analysis
    Simple vs. multiple regression analysis

2. Names of variables
    Dependent vs. independent variables

3. Types of relationships
    Linear vs. nonlinear

4. The Estimation of a Linear Regression Line or Equation
    Population: $Y = \alpha + \beta X + \varepsilon$
    Estimated: $Y = \hat{Y} + e$ where $\hat{Y} = a + bX$
    Because we do not know $\alpha$ and $\beta$, we must estimate them by a and b, respectively. Among the many possible ways to estimate a and b, it makes the best sense to choose a and b such that the error sum of squares (ESS, or sum of errors squared) is the smallest (or least). This is called the Ordinary Least Squares (OLS) Method.

5. Assumptions of Regression Analysis
    Normal distribution of the dependent variable -> Errors are normally distributed
    Independent distribution of errors
    Identical distribution of errors about the mean of zero and a fixed standard error

6. What do these assumptions allow one to do?
    They enable one to conduct a statistical significance test of individual regression coefficients, using the t-distribution.

7. How does the significance test for individual regression coefficients work?
    The null and the alternative hypothesis
    The decision criteria: exactly the same as in single-sample hypothesis testing

8. What other information can be obtained from regression analysis?
    The measure of goodness of fit and explanatory power: R-square
    The measure of model adequacy and efficiency: Adjusted R-square
    The measure of model acceptability: F-statistic

9. What does an F-test do? How does it work?
    The null and the alternative hypothesis
    The decision criteria

10. Is there a single reference table that condenses all this information?
    The ANOVA table is it.

11. Is there a way to know if certain assumptions are violated?
    Yes. We will only study the detection of first-order autocorrelation via the Durbin-Watson statistic.
    The Durbin-Watson statistic, DW, is defined as:
        $DW = \dfrac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
    The decision criteria are, with $d_L$ and $d_U$ the lower and upper critical values from the Durbin-Watson table:
        $DW < d_L$: positive first-order autocorrelation
        $d_L \le DW \le d_U$: inconclusive
        $d_U < DW < 4 - d_U$: no first-order autocorrelation
        $4 - d_U \le DW \le 4 - d_L$: inconclusive
        $DW > 4 - d_L$: negative first-order autocorrelation

12. What other violation of assumptions is possible?
    Homoscedasticity is violated -> Heteroscedasticity -> Unequal variance of errors
    When heteroscedasticity is detected, the Generalized Least Squares (GLS) method is used to estimate the regression equation.

13. Show me an Example!!!

GSB 420 Applied Quantitative Methods
Dr. Jin Choi
Chapters 13 and 14. Regression Analysis

A. The Ordinary Least Squares (OLS) Method

1. The Simple Regression Model
    Given the population regression of
        $Y_t = \alpha + \beta X_t + \varepsilon_t$
    where
        $Y_t$ = dependent variable = actual value of Y at time period t
        $X_t$ = independent variable = actual value of X at time period t
        $\alpha$ = population intercept term
        $\beta$ = population slope or coefficient term
        $\varepsilon_t$ = population error term at time period t
    Note: We assume that $Y_t$ and $X_t$ are measured without error.

2. The Goal
    We are to estimate the above population regression equation by
        $\hat{Y}_t = a + b X_t$
    where
        $\hat{Y}_t$ = estimated value for $Y_t$
        a = estimated intercept term
        b = estimated slope coefficient term

3. The Inevitable Dilemma
    The estimation and use of the above equation creates an error of:
        $e_t = Y_t - \hat{Y}_t$
    This means
        $Y_t = \hat{Y}_t + e_t = a + b X_t + e_t$

4. A Solution to the Above Dilemma
    Find a and b such that $e_t$ would be the smallest for all time periods. Mathematically, this means finding a and b that minimize the sum of squared errors (SSE):
        Minimize $SSE = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2 = \sum_{t=1}^{n} (Y_t - a - b X_t)^2$
    Use your optimization skill to find a and b as:
        $b = \dfrac{S_{XY}}{S_X^2} = \dfrac{\sum (X_t - \bar{X})(Y_t - \bar{Y})}{\sum (X_t - \bar{X})^2}$
        $a = \bar{Y} - b\bar{X}$
    Thus, this method is called the Least Squares Method.

B. Assumptions for Statistical Tests

Once you estimate a and b, the next issue is to know how reliable and trustworthy these estimates are in relation to the population parameters $\alpha$ and $\beta$.
Thus, we must conduct statistical tests. However, before we do the tests, we must understand what assumptions are present in doing so. The assumptions are as follows:

    1. $X_t$ are measured without error.
    2. $Y_t$ are normally distributed given $X_t$ -> $e_t$ are normally distributed around a mean of zero, $E(e_t) = 0$.
    3. $e_t$ are independently distributed in relation to other error terms such as $e_{t-1}$, $e_{t-2}$, ..., and $e_{t-k}$ -> no serial- or auto-correlation.
    4. $e_t$ are identically distributed -> have the same variance, $\sigma_e^2$.

These assumptions are condensed and simplified as:
    $e_t \sim \text{n.i.i.d.}(0, \sigma_e^2)$

C. The Two-Tail Significance Tests of Coefficients

We can ask whether the intercept term or the coefficient term is equal to (or different from) a specific value such as one, zero, or 10. This implies that we are hypothesizing either $\alpha = \alpha_0$ or $\beta = \beta_0$, where $\alpha_0$ and $\beta_0$ can be any value of one's interest. Let's choose the case of $\beta = \beta_0$. If so, we must set up the following:

    Test at a 5% significance level
    $H_0: \beta = \beta_0$ vs. $H_1: \beta \ne \beta_0$

1. The Critical Value Approach
    a. Identify the critical t-value (= table t-value) from the t-table. That is, the test statistic to use is t with (n-k-1) degrees of freedom, where k = the number of independent variables used in the regression equation. In this case of simple regression, k = 1. That is, find the table value of t, $t_{table} = t_{\alpha/2,(n-k-1)}$, from the t-table at a 5% significance level.
    b. Calculate the calculated t-value by:
        $t_{calculated} = t_c = \dfrac{b - \beta_0}{S_b}$
       where
        b = estimated value from the regression equation
        $\beta_0$ = the hypothesized value of $\beta$ ($\beta_0 = 0$ in this case)
        $S_b$ = standard error of b
    c. Apply the decision criteria:
        If $|t_c| < t_{\alpha/2,(n-k-1)}$, then accept $H_0: \beta = \beta_0$ ($\beta_0 = 0$ in this case).
        Otherwise, reject $H_0: \beta = \beta_0$ and accept $H_1: \beta \ne \beta_0$.

2. The Confidence Interval Approach
    a. Construct a 95% C.I. for $\beta$ as:
        $\beta = b \pm t_{\alpha/2,(n-k-1)} S_b$
    b. Apply the decision criteria:
        If the hypothesized value, $\beta_0$, falls within the confidence interval, accept $H_0: \beta = \beta_0$ (such as $\beta_0 = 0$).
        Otherwise, reject $H_0$.

3. The p-value Approach
    a.
       Find the p-value from the t-table (or take it from the computer output).
    b. Apply the decision criteria:
        If the p-value found > the chosen significance ($\alpha$) level, accept $H_0: \beta = \beta_0$ (such as $\beta_0 = 0$).
        If the p-value found < the chosen significance ($\alpha$) level, reject $H_0$.

Exercise Problems on Chapter 13. Simple Regression Analysis

This set of exercise problems has 23 problems, worth 25 points.

1. In simple regression analysis, the dependent variable is also known as a/an _____ variable and the independent variable is also known as a/an _____ variable.
    a. exogenous; endogenous
    b. response; explanatory
    c. to-be-determined; predetermined
    d. all of the above
    e. only (b) and (c) of the above

2. In simple regression analysis, you will find _____ independent variable(s) and _____ dependent variable(s).
    a. one; one
    b. one; two
    c. more than one; one
    d. two; two
    e. two; one

3. Given the relationship of $Y = \beta_0 + \beta_1 X + \varepsilon$, $\beta_0$ and $\beta_1$ are both called regression coefficients. More specifically, however, $\beta_0$ is called a/an _____ and $\beta_1$ is called a/an _____.
    a. base; intercept
    b. intercept; slope coefficient
    c. slope coefficient; intercept
    d. slope coefficient; slope coefficient
    e. Y-intercept; X-intercept

4. The least-squares method for estimating regression coefficients is named as such because it tries to find the coefficient values by minimizing _____.
    a. the product of coefficients squared
    b. the sum of coefficients squared
    c. the sum of errors
    d. the sum of errors squared
    e. the sum of the dependent variable squared

5. Given the following data:
        Covariance between X and Y = $S_{XY}$ = 40
        Variance of X = $S_X^2$ = 50
        Number of observations = n = 10
        $\bar{X} = 5$ and $\bar{Y} = 8$
   the intercept value is _____ and the slope coefficient is _____.
    Options:
        1.25; 1.75
        4; 0.8
        1.75; 1.25
        0.8; 4
        0.4; 0.08

6. A positive slope coefficient means that _____.
    a. X and Y are positively and directly related.
    b. as X decreases, Y decreases as well.
    c. as X increases, Y increases as well.
    d. all of the above are true.
    e. only (a) and (c) are true.

7. If a slope coefficient is statistically equal to zero, it means that _____.
    a. X and Y are positively and directly related.
    b. as X decreases, Y increases as well.
    c. as X increases, Y decreases as well.
    d. X can determine the value of Y.
    e. none of the above.

8. The analysis-of-variance (ANOVA) table utilizes the partition of the various sums of squares. Given the total sum of squares (TSS), the regression sum of squares (RSS), and the error or residual sum of squares (ESS), which of the following relationships is true?
    a. TSS = RSS + ESS
    b. ESS = TSS - RSS
    c. RSS = TSS - ESS
    d. all of the above
    e. only (a) and (c) of the above

9. The regression sum of squares (RSS) is also called _____ variation and the error or residual sum of squares (ESS) is also called _____ variation.
    a. explained; unexplained
    b. unexplained; explained
    c. total; uncertain
    d. good; bad
    e. only (c) and (d) of the above

10. Given the following data:
        The regression sum of squares (RSS) = 104
        The error or residual sum of squares (ESS) = 56
    the coefficient of determination is _____ and the correlation coefficient between X and Y is _____.
    Options:
        0.65; 0.8062
        0.7338; 0.5384
        0.65; 0.4225
        0.5384; 0.7338
        0.8062; 0.65

11. A statistic that measures how much actual values of Y vary around the predicted values of Y on the basis of a regression equation is called the _____.
    a. standard deviation of Y
    b. standard error of the estimate
    c. variance of the predicted Y
    d. variance of Y
    e. any of the above

12. Which of the following is an assumption of regression analysis?
    a. errors (or residuals) are independent of one another.
    b. errors are normally distributed.
    c. errors are identically distributed and thus, have an equal variance.
    d. all of the above.
    e. only (a) and (b) of the above.

13. Autocorrelation means that lagged values of _____ with one another.
    a. errors (or residuals) are correlated
    b. an independent variable are correlated
    c. errors (or residuals) are not correlated
    d. an independent variable are not correlated
    e. errors (or residuals) are identical.

14. An autocorrelation problem can be detected by the use of the _____.
    a. Z-statistic
    b. t-statistic
    c. F-statistic
    d. Durbin-Watson statistic
    e. either (b) or (d) of the above

15. The significance test for a slope coefficient in an estimated regression equation can be done by the use of the _____.
    a. Z-statistic
    b. t-statistic
    c. F-statistic
    d. Durbin-Watson statistic
    e. either (a) or (b) of the above

Given that the following demand relationship between the quantity demanded (Q) and its per-unit price (P) is estimated by a simple regression equation on the basis of 20 observations, answer Questions 16 through 21:

                 Coefficient    Standard Error    p-value
    Intercept        100              45           0.03
    P                 -5               2           0.005

The corresponding ANOVA table shows only the sums of squares as follows:

                             Sum of Squares
    Regression                    200
    Error (or Residual)           150

16. Which of the following represents the estimated regression equation on the basis of the above information?
    a. Q = 100 - 5P
    b. Q = 45 + 2P
    c. P = 100 - 5Q
    d. P = 45 + 2Q
    e. Q = 2.22 - 2.5P

17. If you are to conduct a significance test for the above regression coefficients at a 5% significance level, you would find the calculated t-value for the intercept to be _____ and that for the slope coefficient to be _____.
    a. 0; 0
    b. 100; -5
    c. 45; 2
    d. 2.22; 2.5
    e. 2.22; -2.5

18. If you are to conduct a significance test for each of the estimated regression coefficients at a 5% significance level, you would conclude that the intercept term is _____ and the slope coefficient is _____.
    a. insignificant; insignificant
    b. insignificant; significant
    c. significant; significant
    d. significant; insignificant
    e. normal; normal

19. If the calculated t-value for the slope coefficient is -1.9, you would find the table value of t to be _____ and thus, conclude to _____ the null hypothesis of the slope coefficient being equal to zero at the 5% significance level.
    a. 1.7247; reject
    b. 1.7341; accept
    c. 1.7341; reject
    d. 2.1009; accept
    e. 2.1009; reject

20. Based on the sums of squares given, the coefficient of determination is _____ and the corresponding calculated F-statistic is _____.
    a. 0.5714; 24
    b. 0.4286; 24
    c. 0.5714; 1.333
    d. 0.4286; 1.333
    e. 0.75; 1.333

21. Given that an F-test is to be done for the joint significance of coefficients, you would find that the degrees of freedom for the numerator and the denominator are respectively _____ and _____.
    a. 1; 19
    b. 1; 18
    c. 2; 20
    d. 2; 19
    e. 2; 18

The following are worth 2 points each.

22. On the basis of 15 observations, you found a correlation coefficient between the inflation rate change (via the CPI) and your salary change to be 0.65. If you are to test the significance of this correlation coefficient, you would use a/an _____ and conclude that there is a/an _____ relationship between the inflation rate change and your salary change at a 5% significance level.
    a. Z-statistic; significant
    b. Z-statistic; insignificant
    c. t-statistic; significant
    d. t-statistic; insignificant
    e. F-statistic; significant

23. If the slope coefficient is found to be 5 in a simple regression equation and its standard error to be 3 on the basis of 10 observations, the 95% confidence interval for this slope coefficient lies between _____ and _____.
    a. -1; 11
    b. -1.918; 11.918
    c. -0.5785; 10.5785
    d. -0.4375; 10.4375
    e. 0; 12
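
The arithmetic behind several of the problems above can be checked with a few lines of code. What follows is a minimal Python sketch, not part of the original handout: the function names are hypothetical, and the critical t-values are typed in from a standard t-table (2.1009 for 18 degrees of freedom and 2.306 for 8 degrees of freedom, both two-tail at 5%) rather than computed. It applies the formulas from the notes, namely b = S_XY / S_X^2, a = Y-bar - b X-bar, t_c = (b - beta_0) / S_b, R-square = RSS / TSS, and the F-ratio with (k, n-k-1) degrees of freedom.

```python
import math


def ols_from_moments(s_xy, s_xx, x_bar, y_bar):
    """Return (a, b): slope b = S_XY / S_X^2, intercept a = Y-bar - b * X-bar."""
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def t_statistic(estimate, std_error, hypothesized=0.0):
    """Calculated t-value: (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * standard error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


def r_squared(rss, ess):
    """Coefficient of determination: RSS / TSS, with TSS = RSS + ESS."""
    return rss / (rss + ess)


def f_statistic(rss, ess, n, k=1):
    """F = (RSS / k) / (ESS / (n - k - 1)), with (k, n - k - 1) degrees of freedom."""
    return (rss / k) / (ess / (n - k - 1))


# Assumed table values (two-tail, 5% significance level), typed in by hand.
T_CRIT_18 = 2.1009  # 18 degrees of freedom (n = 20, k = 1)
T_CRIT_8 = 2.306    # 8 degrees of freedom (n = 10, k = 1)

# Problem 5: S_XY = 40, S_X^2 = 50, X-bar = 5, Y-bar = 8
a, b = ols_from_moments(40, 50, 5, 8)
print("Problem 5: intercept", round(a, 4), "slope", round(b, 4))

# Problem 10: RSS = 104, ESS = 56; in simple regression |r| = sqrt(R-square)
r2_p10 = r_squared(104, 56)
print("Problem 10: R-square", round(r2_p10, 4), "|r|", round(math.sqrt(r2_p10), 4))

# Problems 17 and 18: intercept 100 (SE 45), slope -5 (SE 2), n = 20
t_intercept = t_statistic(100, 45)
t_slope = t_statistic(-5, 2)
print("Problem 17: t-values", round(t_intercept, 2), round(t_slope, 2))
print("Problem 18: significant?",
      abs(t_intercept) > T_CRIT_18, abs(t_slope) > T_CRIT_18)

# Problems 20 and 21: RSS = 200, ESS = 150, n = 20, k = 1
r2_p20 = r_squared(200, 150)
f_p20 = f_statistic(200, 150, n=20, k=1)
print("Problem 20: R-square", round(r2_p20, 4), "F", round(f_p20, 4))
print("Problem 21: degrees of freedom", (1, 20 - 1 - 1))

# Problem 23: slope 5, standard error 3, n = 10
low, high = confidence_interval(5, 3, T_CRIT_8)
print("Problem 23: 95% C.I.", round(low, 3), round(high, 3))
```

Passing the critical value in as a number keeps the sketch free of third-party dependencies; a statistics library could supply it instead. The sign of the correlation coefficient in Problem 10 is not determined by the sums of squares alone, so the sketch reports only its absolute value.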
