Assignment 3

This assignment is due Tuesday, March 26.

  1. This question is practice in using the questionnaire and codebook to find exactly the data you want. It will also be the first part of a later assignment, which will be about studying gender gaps in pay for public school teachers.

Your goal for now is to identify elementary (primary) and secondary public school teachers in the 2017 NSCG data. (The data are in nscg17.rdata.) You need to make a dummy variable in the nscg17 dataframe that is 1 for the teachers. To do this you need to take into account two things:

Who they work for. This requires using at least two variables. Public refers to schools run by local governments.

What they do. Include only individuals explicitly identified as an elementary or secondary teacher. Exclude kindergarten teachers.

  1. Do each of those two things in separate codeblocks and describe in words what each block is doing.
  2. Use R to verify that you ended up with 3,791 teachers. If you have fewer, you probably missed a category of teacher.

Hint 1: Remember that not all variables that look numeric in the codebook are actually numeric when you read them into R.

Hint 2: The best way to start on this little project is to look at the 2017 questionnaire to find questions that will tell you about those two criteria. The questionnaire has annotations tying variables to questions. Look up the variables in the codebook.

Hint 3: In some cases there is more than one variable that seems relevant. These usually differ in how much detail they offer. For this question, you probably want the one with the most detail, even though it might seem like a hassle.
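
Hint 1 is worth a quick illustration. The sketch below uses a toy data frame, not nscg17. The column names employer_sector, school_type and job_code, and every code value in it, are made-up placeholders, so the real variable names and codes still have to come from the questionnaire and codebook. It only shows the mechanics: codes that look numeric are often stored as character, so they are compared as strings, and %in% handles a set of several codes at once.

```r
# Toy data: placeholder column names and codes, NOT the real NSCG ones.
toy <- data.frame(
  employer_sector = c("1", "1", "2", "1"),
  school_type     = c("A", "A", "A", "B"),
  job_code        = c("100", "200", "100", "100"),
  stringsAsFactors = FALSE
)

# These codes look numeric but are stored as character.
is.character(toy$job_code)

# Block 1: who they work for (two variables, compared as strings).
toy <- within(toy, {
  right_employer <- employer_sector == "1" & school_type == "A"
})

# Block 2: what they do (several job codes at once), then the 0/1 dummy.
toy <- within(toy, {
  right_job <- job_code %in% c("100", "200")
  teacher   <- as.numeric(right_employer & right_job)
})

# Count the rows flagged as 1.
sum(toy$teacher)
table(toy$teacher)
```

Keeping the two criteria in separate logical columns makes it easy to tabulate each one on its own when the final count does not match the expected number.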

  2. This question uses data from the National Longitudinal Survey of Youth 1997 cohort (NLSY97) and will help you understand interaction effects and how to implement them. Pay careful attention to the clarity and precision of your writing about the regressions (that doesn't mean you need to write a lot).

Notes about the NLSY data: (1) Negative values indicate missing data, which should be converted to NAs in R. (2) There is a codebook in the codebooks directory, which you will need, but there is a trick. The original variable names are mysterious, so I renamed them in the file renameNLSY97.r. These are the ones you need:

age97              R1194000
sex                R0536300
height.footpart97  R0322500
height.inchpart97  R0322600
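
The missing-data note can be sketched on a toy data frame. The values below are invented and only the renamed variable names come from the list above. Measuring height in inches is an assumption made for illustration; the assignment does not fix the unit.

```r
# Toy stand-in for the NLSY97 extract; values are invented.
# Negative numbers play the role of the missing-data codes.
nlsy <- data.frame(
  age97             = c(12, 14, -4, 16),
  sex               = c(1, 2, 2, 1),
  height.footpart97 = c(5, 5, 4, -2),
  height.inchpart97 = c(2, 6, 11, 3)
)

# Replace every negative value with NA, column by column.
nlsy[] <- lapply(nlsy, function(x) replace(x, which(x < 0), NA))

# Combine the two parts into one height (assumed unit: inches).
# The result is NA whenever either part is missing.
nlsy <- within(nlsy, {
  height <- 12 * height.footpart97 + height.inchpart97
})

nlsy$height
```

Converting to NA before building new variables matters: a negative code left in place would otherwise be treated as a real measurement in the arithmetic.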

  1. Load the data and convert missing data values to NA. Make a height variable from height.footpart97 and height.inchpart97. You'll also need a dummy variable for female and a variable for age minus 12 years. As usual, all of this should be done using within.
  2. Regress height on age in 1997. Then regress height on age minus 12 (the youngest NLSY respondents were 12 years old in 1997). What is the predicted height of an individual who was 12 years old according to each regression? (Use lots of decimal places to get the point of this.) What is the predicted height of an individual who is 15 years old according to each regression?
  3. One or two sentence answer. Explain why age should be interacted with sex if we're interested in explaining the heights of teenagers in the NLSY97 (think back to people's heights as you went through high school). Remember that everybody in the data is between 12 and 16 years old. It may help to sketch your idea about how age and height are related for boys and girls in their teens (you don't need to include your sketch in your assignment).
  4. Run a regression that includes an interaction of age minus 12 with female. You should have three explanatory variables.
  5. Sketch the estimated relationship between age and height for boys and girls between 12 and 16. You should have two lines and two intercepts. (You don't need to turn in your sketch, but it will help a lot in understanding what's going on here.) What are the intercepts and what do they represent? What are the slopes and what do they represent in terms of teenagers' heights? (Remember that you're using age minus 12 in the regression, not age.)
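
The mechanics of the regressions in this question can be tried out on simulated data first. Everything in the sketch below is an assumption made for illustration: the data frame sim, the variable names age12 and female, and the coefficients used to generate height are invented and are not NLSY97 estimates. It shows how recentering age affects a simple regression and how to read two lines off an interaction model.

```r
set.seed(1)

# Simulated data only: the numbers below are invented, not NLSY97 results.
n <- 200
sim <- data.frame(
  age    = sample(12:16, n, replace = TRUE),
  female = rbinom(n, 1, 0.5)
)
sim <- within(sim, {
  age12  <- age - 12
  height <- 60 + 2.5 * age12 - 1 * female - 1.5 * age12 * female + rnorm(n)
})

# Same regression with age and with age minus 12.
m_age   <- lm(height ~ age,   data = sim)
m_age12 <- lm(height ~ age12, data = sim)
coef(m_age)
coef(m_age12)

# Predictions at age 12 from each model.
predict(m_age,   newdata = data.frame(age = 12))
predict(m_age12, newdata = data.frame(age12 = 0))

# Interaction model: three explanatory variables
# (age12, female, and age12:female).
m_int <- lm(height ~ age12 * female, data = sim)
b <- coef(m_int)

# The two lines implied by the fit.
c(intercept = b[["(Intercept)"]],
  slope     = b[["age12"]])                         # female == 0
c(intercept = b[["(Intercept)"]] + b[["female"]],
  slope     = b[["age12"]] + b[["age12:female"]])   # female == 1
```

In R's formula syntax, age12 * female expands to age12 + female + age12:female, which is why the fitted model has three explanatory variables plus an intercept.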
