
Question


Econ 295, Fall 2017, Prof. Lesica
DUE: December 6, 2017, AT THE BEGINNING OF CLASS ONLY
ASSIGNMENT 3, Points: 60 + 6 Bonus

This is not a group assignment. Even if you work in a group, you have to hand in individual answers. Write your answers clearly, legibly and completely. Partial marks are possible if the answer is correct up to a point. Questions from the textbook refer to the Stock and Watson 3rd Updated edition. Make sure you put your name and student ID on every paper you hand in as an answer. STAPLE, CLIP, TAPE, HOOK! Neither the professor nor the T.A. is responsible for lost pages if you do not bind them together securely.

Question 1 [10 points]. Sir Francis Galton (1822-1911), an anthropologist and cousin of Charles Darwin, created the term regression. In his article "Regression towards Mediocrity in Hereditary Stature," Galton compared the height of children to that of their parents, using a sample of 930 adult children and 205 couples. In essence he found that tall (short) parents will have tall (short) offspring, but that the children will not be quite as tall (short) as their parents, on average. Hence there is regression towards the mean, or as Galton referred to it, mediocrity. This result is obviously a fallacy if you attempted to infer behavior over time since, if true, the variance of height in humans would shrink over generations. This is not the case. To research this result, you collect data from 110 college students and estimate the following relationship:

$$\hat{Y}_i = \underset{(7.2)}{19.6} + \underset{(0.1)}{0.73}\, Midparh_i, \qquad R^2 = 0.45, \quad n = 110$$

where Studenth is the height of students in inches and Midparh is the average of the parental heights.
(a) Interpret the estimated coefficients. What is the prediction for the height of a child whose parents have an average height of 70.06 inches?
(b) Test for the statistical significance of the slope coefficient.
(c) Construct a 95% confidence interval for a one inch increase in the average of parental height.
(d) If children, on average, were expected to be of the same height as their parents, then this would imply two hypotheses, one for the slope and one for the intercept.
(i) What should the null hypothesis be for the intercept? Calculate the relevant t-statistic and carry out the hypothesis test at the 1% level.
(ii) What should the null hypothesis be for the slope? Calculate the relevant t-statistic and carry out the hypothesis test at the 5% level.
(e) Galton was concerned about the height of the English aristocracy and referred to the above result as "regression towards mediocrity." What was his concern? Why do you think we refer to this result today as "Galton's Fallacy"?

Question 2 [10 points]. In the process of collecting weight and height data from 29 female and 81 male students at your university, you also asked the students for the number of siblings they have. Although it was not quite clear to you initially what you would use that variable for, you construct a new theory that suggests that children who have more siblings come from poorer families and will have to share the food on the table. Although a friend tells you that this theory does not pass the "straight-face" test, you decide to hypothesize that peers with many siblings will weigh less, on average, for a given height. In addition, you believe that the muscle/fat tissue composition of male bodies suggests that females will weigh less, on average, for a given height. To test these theories, you perform the following regression:

$$\widehat{Studentw}_i = \underset{(44.01)}{-229.92} - \underset{(5.52)}{6.52}\, Female_i + \underset{(2.25)}{0.51}\, Sibs_i + \underset{(0.62)}{5.58}\, Height_i, \qquad R^2 = 0.5$$

where Studentw is in pounds, Height is in inches, Female takes a value of 1 for females and is 0 otherwise, and Sibs is the number of siblings (heteroskedasticity-robust standard errors in parentheses).
(a) Carrying out hypothesis tests using the relevant t-statistics to test your two claims separately, is there strong evidence in favor of your hypotheses?
Is it appropriate to use two separate tests in this situation?
(b) You also perform an F-test on the joint hypothesis that the two coefficients for females and siblings are zero. The calculated F-statistic is 0.84. Find the critical value from the F-table. Can you reject the null hypothesis? Is it possible that one of the two parameters is zero in the population, but not the other?
(c) You are now a bit worried that the entire regression does not make sense and therefore also test for the height coefficient to be zero. The resulting F-statistic is 57.25. Does that prove that there is a relationship between weight and height?

Question 3 [10 points]. Do Exercise 8.2 in your textbook (page 301 in the 3rd Updated edition). You HAVE TO use the Updated edition. [Careful! NOT the Review the Concepts questions.]

The last 2 questions are empirical, requiring the use of statistical software. For empirical questions, meaning those involving data and estimation, whenever you run regressions or plot data, you have to report the results. In particular, you have to print and submit your .log file with the output (if using STATA) or .R file with the output (if using R). You must submit a hard copy of these as your answers. Materials should be stapled together in order by problem. The most readable and elegant format for assignment answers incorporates student comments, code, output, and graphics into a seamless document.

Question 4 [15 points]. Use the minwage.dta data for this problem. The study of minimum wage effects on employment (among other outcomes) is (still) very current and popular in economics. In this problem you will work through a famous example to learn more about regression dummy variables and interaction terms. To quickly contextualize what this is about, a summary of the study is in your textbook, in a blue box, on page 497, in chapter 13. This is just for information purposes. You should read it.
Also, read the summary about the study and data used here: http://www.stat.ucla.edu/projects/datasets/fastfood-explanation.html. These are the key variables that you need for the assignment.

EMPFT = number of full-time employees BEFORE the min wage changed
EMPPT = number of part-time employees BEFORE the min wage changed
NMGRS = number of managers/ass't managers BEFORE the min wage changed
EMPFT2 = number of full-time employees AFTER the min wage changed
EMPPT2 = number of part-time employees AFTER the min wage changed
NMGRS2 = number of managers/ass't managers AFTER the min wage changed
REMPFT = number of full time employees
REMPPT = number of part time employees
RNMGRS = number of managers
time = 1 for observations AFTER, = 0 for observations BEFORE the minimum wage change
STATE = 1 if NJ; 0 if PA

Questions
(a) Load the minwage.dta and create two different variables:
(1) FTE1 = EMPFT + (0.5 × EMPPT) + NMGRS
(2) FTE2 = EMPFT2 + (0.5 × EMPPT2) + NMGRS2
The first (FTE1) is a measure of full time employment BEFORE the min wage change. The second (FTE2) is a measure of full time employment AFTER the min wage change.
(b) Now calculate 4 different conditional means (expected values): [1]
(1) mean(FTE1 | STATE = 0)
(2) mean(FTE1 | STATE = 1)
(3) mean(FTE2 | STATE = 0)
(4) mean(FTE2 | STATE = 1)
How do you interpret these employment means? What does the condition STATE = 1 or = 0 indicate?
(c) Lastly, create these variables:
(1) FTE = REMPFT + (0.5 × REMPPT) + RNMGRS
(2) an NJ × time interaction variable (something like "gen TS = time*STATE" in Stata).
Run the following regression:

$$FTE_{ist} = \beta_0 + \beta_1\, STATE_s + \beta_2\, time_t + \beta_3\, (STATE_s \times time_t) + u_{ist}$$

What are the estimated coefficients? How do they compare to your answers in (b)?

Question 5 [15 points]. In your macroeconomics class you learned about the quantity theory of money (QTM). This idea has actually been around for several hundred years.
It states a simple relationship between money and prices. First, changes in the quantity of money have a positive effect on the general price level. Second, as an empirical matter, movements in the money stock should account for the major long run changes in the price level. QTM has been written as the following relationship:

$$P Y = M V,$$

where M is the money supply, V is the velocity of money, P is the price level and Y is real output (income). Dividing both sides by Y gives

$$P = \frac{M V}{Y}.$$

If V and Y are constant, then changes in M result in equal changes in P. However, this version of the QTM is too strict to hold in the real world. A less restrictive version can be written in terms of % changes:

$$\frac{\Delta P}{P} = \frac{\Delta M}{M} - \frac{\Delta Y}{Y} + \frac{\Delta V}{V}$$

Therefore, changes in money supply, income and velocity cause changes in the price level. Suppose we treat velocity of money as an unobserved error term. We can then express the QTM by the regression

$$infl_i = \beta_0 + \beta_1\, mgrowth_i + \beta_2\, ygrowth_i + u_i. \tag{1}$$

Note that $infl_i$ is simply $\Delta P / P$ for country i, $mgrowth_i$ is $\Delta M / M$ and $ygrowth_i$ is $\Delta Y / Y$. Finally, $u_i$ represents $\Delta V / V$ for country i. If the QTM holds, then $\beta_1 = 1$ and $\beta_2 = -1$. Use the money.csv data to test this proposition. The data contains the following variables.

infl_i = inflation of country i
mgrowth_i = money growth of country i
ygrowth_i = output growth of country i

Questions
(a) Estimate regression (1) and report the results. Are the coefficient (point) estimates consistent with the QTM?
(b) Test individually the hypotheses corresponding to the QTM; that is, individually test that $\beta_1 = 1$ and $\beta_2 = -1$. Are test results consistent with the QTM? [In Stata, try using both test and ttest commands, and compare the results.]
(c) Evaluate the joint hypothesis $\beta_1 = 1$, $\beta_2 = -1$ using an F-test. Are your results consistent with the QTM? Does the joint test give the same result as the individual tests?

[1] Careful, 'calculating' something here means actually computing or estimating a value.
Not the same as 'creating' a variable from other variables in the data.

BONUS Question [6 points]. In this problem you will learn the "charms" of constructing datasets and working with raw data. You have likely seen the big banner hung over the building across from Lazaridis Hall, proudly pointing out how WLU is ranked no. 1 in student satisfaction, and for the second year in a row, to boot. Let's analyze this proposition with an economist's hat on, using our knowledge of econometrics. WLU's no. 1 ranking in student satisfaction, in the comprehensive university category, comes from the Maclean's survey of students. You can find the rankings for comprehensive universities here: http://www.macleans.ca/education/university-rankings/comprehensive-universities/ Maclean's also has data on the average number of hours students spend partying per week. The data can be found here: http://www.macleans.ca/education/ranking-canadas-top-party-universities-for-2018/

Questions
(a) First, you need to create a data set containing these variables. Specifically, open a spreadsheet program and create 4 variables in 4 different columns. Open the above links and enter the corresponding variables across universities.
(1) university - Enter appropriate university name/acronym. (Best to avoid spaces.)
(2) year - Set 2018 for every university. I know it looks weird, but this is how Maclean's chooses to do it.
(3) ss - rank number. Keep in mind it is only for comprehensive universities. [2] Be careful that you do not make an error when entering data! Measurement errors suck.
(4) partyh - carefully match the data on hours spent partying with the corresponding university.
Save the file as a .csv, BUT open it (load it) in Stata or R for econometric analysis. You may not use Excel under any circumstance!
(b) Estimate the relationship between Student satisfaction rank and hours of partying. What is the $\beta_1$ coefficient on party hours? What is its significance? Is partying good?
Create a scatter plot of the relationship between Student Satisfaction (on Y-axis) and Party Hours (on X-axis). [3] Include the regression line (fit) through this scatter plot.
(c) Luckily, Maclean's also has data on student satisfaction ranking of comprehensive universities and party hours for 2 previous years: 2016 and 2017. You should include these in your data file, thereby creating a panel-data structure. I included articles with the data for those two years with the assignment on MLS. You can also google for corresponding articles in previous years. Go back to your .csv file, create 2 more rows for each university and enter corresponding values of ss and partyh for those two years. What is this type of data structure called? (HINT: Check chapter 1.) Is the data balanced? How many more observations do you now have compared to question (a)?
(d) Estimate the OLS regression of student satisfaction on party hours using all 3 years of data you collected. Is partying (still) good? Again, create a scatter plot of the relationship between Student Satisfaction (on Y-axis) and Party Hours (on X-axis) as in (b). Include the new OLS regression line (fit) through this scatter plot.
(e) Finally, create a set of dummy variables: one for each year (so, 3 dummies) and one for each university you have (yes, 15 university dummies!). Using the gen function in Stata is probably easiest. Now, regress ss on partyh, but include a whole set of year and university dummies in this regression. (It might look long and weird, but if you did it correctly that will be fine.) You can ignore the coefficients on the dummies, they do not interest us, but what is the coefficient on partyh now? Is partying (still) good?

Do not lose the code you created for these problems. You will likely need it again and will find it useful.

Last updated: November 28, 2017

[2] Which means WLU is really not no. 1. Welcome to lying with statistics.
[3] Given the regression of Y on X, you should always put the Y (X) variable on the Y (X) axis.

Step by Step Solution

There are 3 Steps involved in it

Step: 1
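One way to check the arithmetic for Question 1, parts (a) to (d), is sketched below. It is only a sketch: it assumes the large-sample normal approximation for critical values (the usual Stock and Watson convention), and the helper names `t_stat`, `critical_value` and `confidence_interval` are made up for illustration. The estimates and standard errors are the ones reported in the question.

```python
from statistics import NormalDist

# Estimates and standard errors as reported in Question 1.
B0, SE_B0 = 19.6, 7.2   # intercept
B1, SE_B1 = 0.73, 0.1   # slope on Midparh


def t_stat(estimate, std_error, null_value=0.0):
    """t-statistic for H0: coefficient == null_value."""
    return (estimate - null_value) / std_error


def critical_value(alpha):
    """Two-sided critical value from the standard normal distribution
    (assumption: the large-sample approximation is acceptable)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def confidence_interval(estimate, std_error, alpha=0.05):
    """Large-sample (1 - alpha) confidence interval for a coefficient."""
    z = critical_value(alpha)
    return estimate - z * std_error, estimate + z * std_error


predicted_height = B0 + B1 * 70.06                 # part (a)
t_slope_zero = t_stat(B1, SE_B1)                   # part (b), H0: slope = 0
ci_low, ci_high = confidence_interval(B1, SE_B1)   # part (c)
t_intercept_zero = t_stat(B0, SE_B0)               # part (d)(i), H0: intercept = 0
t_slope_one = t_stat(B1, SE_B1, null_value=1.0)    # part (d)(ii), H0: slope = 1

print(f"(a) predicted height: {predicted_height:.2f} inches")
print(f"(b) t for slope = 0: {t_slope_zero:.2f}")
print(f"(c) 95% CI for slope: ({ci_low:.3f}, {ci_high:.3f})")
print(f"(d)(i) t for intercept = 0: {t_intercept_zero:.2f}, "
      f"reject at 1%: {abs(t_intercept_zero) > critical_value(0.01)}")
print(f"(d)(ii) t for slope = 1: {t_slope_one:.2f}, "
      f"reject at 5%: {abs(t_slope_one) > critical_value(0.05)}")
```

The same `t_stat` helper covers the nulls in (d) because only the hypothesized value changes; the interpretation of the results is left to the written answer.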


Step: 2
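Question 4(c) asks how the regression coefficients compare with the conditional means from (b). In a regression on STATE, time and their interaction, the coefficients are exact functions of the four cell means. The sketch below illustrates this with made-up numbers (NOT the minwage.dta data) and a small hand-rolled OLS solver; the function names are hypothetical, and Python is used only for illustration, since the assignment itself asks for Stata or R.

```python
# Made-up records: (STATE, time, FTE). These are NOT from minwage.dta.
DATA = [
    (0, 0, 20.0), (0, 0, 26.0),                 # PA, before
    (0, 1, 19.0), (0, 1, 23.0),                 # PA, after
    (1, 0, 18.0), (1, 0, 22.0), (1, 0, 20.0),   # NJ, before
    (1, 1, 21.0), (1, 1, 20.0), (1, 1, 22.0),   # NJ, after
]


def cell_mean(data, state, time):
    """Mean of FTE conditional on STATE and time."""
    values = [fte for s, t, fte in data if s == state and t == time]
    return sum(values) / len(values)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Regressors: constant, STATE, time, STATE * time.
X = [[1.0, s, t, s * t] for s, t, _ in DATA]
Y = [fte for _, _, fte in DATA]
beta = ols(X, Y)

pa_before, pa_after = cell_mean(DATA, 0, 0), cell_mean(DATA, 0, 1)
nj_before, nj_after = cell_mean(DATA, 1, 0), cell_mean(DATA, 1, 1)
from_means = [
    pa_before,                                        # beta_0
    nj_before - pa_before,                            # beta_1
    pa_after - pa_before,                             # beta_2
    (nj_after - nj_before) - (pa_after - pa_before),  # beta_3
]

print("OLS coefficients:   ", [round(b, 6) for b in beta])
print("From the cell means:", [round(b, 6) for b in from_means])
```

The last entry, the difference in differences, is the coefficient on the interaction term, so the four means from (b) are enough to reproduce every coefficient in (c).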


Step: 3
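For Question 5(b) and (c), the individual t-tests and the joint F-test of $\beta_1 = 1$, $\beta_2 = -1$ use the same ingredients: the estimates, the hypothesized values, and the estimated covariance matrix of the two coefficients. The sketch below shows the mechanics only. Every number in it is hypothetical (nothing is estimated from money.csv), the function names are invented for illustration, and the critical value assumes the large-sample $F_{2,\infty}$ distribution, which for two restrictions equals $-\ln(\alpha)$.

```python
import math

# Hypothetical estimates for (beta_1, beta_2) and their covariance matrix.
# These numbers are invented for illustration, NOT results from money.csv.
ESTIMATES = (0.9, -0.8)
NULL_VALUES = (1.0, -1.0)        # QTM: beta_1 = 1, beta_2 = -1
COV = ((0.01, 0.004),
       (0.004, 0.04))


def t_stats(estimates, null_values, cov):
    """Individual t-statistics for H0: beta_j == null_j."""
    return tuple(
        (b - r) / math.sqrt(cov[j][j])
        for j, (b, r) in enumerate(zip(estimates, null_values))
    )


def joint_f_stat(estimates, null_values, cov):
    """Wald-type F statistic for two restrictions: d' V^{-1} d / 2."""
    d1 = estimates[0] - null_values[0]
    d2 = estimates[1] - null_values[1]
    (v11, v12), (v21, v22) = cov
    det = v11 * v22 - v12 * v21
    # d' V^{-1} d, using the explicit inverse of the 2x2 matrix V.
    wald = (d1 * (v22 * d1 - v12 * d2) + d2 * (-v21 * d1 + v11 * d2)) / det
    return wald / 2


def f_critical_two_restrictions(alpha):
    """Large-sample critical value of F(2, infinity).

    A chi-squared variable with 2 degrees of freedom has CDF 1 - exp(-x / 2),
    so its (1 - alpha) quantile is -2 ln(alpha); dividing by q = 2 gives -ln(alpha).
    """
    return -math.log(alpha)


t1, t2 = t_stats(ESTIMATES, NULL_VALUES, COV)
f_stat = joint_f_stat(ESTIMATES, NULL_VALUES, COV)
f_crit = f_critical_two_restrictions(0.05)

print(f"t for beta_1 = 1:  {t1:.2f}")
print(f"t for beta_2 = -1: {t2:.2f}")
print(f"joint F: {f_stat:.2f}, 5% critical value: {f_crit:.2f}, "
      f"reject: {f_stat > f_crit}")
```

Because the joint statistic depends on the covariance between the two estimates, it need not agree with the pair of individual t-tests, which is the comparison part (c) asks about.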

