Question
1. (a) Each subject in a study has two lines of data. Line 1 contains an ID, GENDER (M or F), and DOB; line 2 contains HEIGHT and WEIGHT. All data values are separated by commas.
Write a SAS DATA step using list input to read the following sample lines of data into a data set named Q1_A:
001,M,06/14/1944
68,155
002,F,12/25/1967
52,99
003,M,07/04/1983
72,128
004,M,08/05/1982
70,115
005,F,09/13/1975
56,113
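One way part (a) could be approached is sketched below. It assumes the data are supplied in-stream with DATALINES, that ID should be kept as a 3-character value so the leading zeros survive, and that DOB should be stored as a SAS date. The DSD option treats commas as delimiters, and the `/` line pointer moves to the second line of each subject.

```sas
data Q1_A;
    infile datalines dsd;                   /* DSD: comma-delimited list input */
    input ID :$3. GENDER :$1. DOB :mmddyy10. /* line 1 of each subject */
        / HEIGHT WEIGHT;                    /* "/" advances to line 2 */
    format DOB mmddyy10.;                   /* display DOB as a date (assumed preference) */
datalines;
001,M,06/14/1944
68,155
002,F,12/25/1967
52,99
003,M,07/04/1983
72,128
004,M,08/05/1982
70,115
005,F,09/13/1975
56,113
;
run;
```

The colon modifier keeps the read in list-input mode while still applying an informat, which is what allows the date to be read without column positions.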
(b) Each data line below contains ID in columns 1-3, DOB (in MMDDYY form) in columns 4-11, SEX (M or F) in column 12, HEIGHT in columns 13-14, and WEIGHT in columns 15-17. Create a SAS data set named Q1_B consisting of data for females only. Do this by first testing the value of SEX and reading the remaining values only if the line contains female data.
01204/04/77M69110
01103/06/80F55 99
01002/08/74F58120
00906/05/86M76160
00803/03/79F51 95
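A minimal sketch of part (b), assuming in-stream data and formatted input with column pointers. The trailing `@` holds the current line after SEX is read, so the remaining fields are read only when the subsetting IF lets the line through.

```sas
data Q1_B;
    input @12 SEX $1. @;          /* read SEX only, hold the line */
    if SEX = 'F';                 /* subsetting IF: male lines stop here */
    input @1  ID     $3.
          @4  DOB    mmddyy8.
          @13 HEIGHT 2.
          @15 WEIGHT 3.;
    format DOB mmddyy10.;         /* display choice, an assumption */
datalines;
01204/04/77M69110
01103/06/80F55 99
01002/08/74F58120
00906/05/86M76160
00803/03/79F51 95
;
run;
```

With the sample lines above, this should keep the three female records (IDs 011, 010, and 008).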
(c) Merge the data sets Q1_A and Q1_B into a data set named Q1_ALL. (Hint: the variable that records male or female has a different name in each of the two data sets.)
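One possible reading of part (c) is sketched here. It assumes Q1_A stores the variable as GENDER and Q1_B stores it as SEX (as in the variable lists given in parts (a) and (b)), that ID has the same type and length in both data sets, and that GENDER is the name to keep. Because the sample IDs in the two data sets do not overlap, a match-merge by ID simply interleaves the observations.

```sas
proc sort data=Q1_A; by ID; run;
proc sort data=Q1_B; by ID; run;

data Q1_ALL;
    /* RENAME= aligns the differently named variable before merging */
    merge Q1_A
          Q1_B(rename=(SEX=GENDER));
    by ID;
run;
```

A plain concatenation with a SET statement and the same RENAME= option would give the same rows here; MERGE is used only because the question asks for a merge.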
(d) Using Q1_ALL, create a new data set named Q1_FINAL containing two more variables, AGE and GROUP (Old or Young). Calculate the ages of the five subjects as of 03/01/2011. Depending on AGE, determine the value of GROUP: if AGE > 60, GROUP is Old; if AGE <= 60, GROUP is Young. Remove the variable DOB. Attach the SAS code and the PROC PRINT output.
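A hedged sketch of part (d), assuming DOB in Q1_ALL is a numeric SAS date value and that age in fractional years is acceptable. The 'AGE' basis of YRDIF is available in SAS 9.3 and later; on older releases a different basis such as 'ACT/ACT' would be needed.

```sas
data Q1_FINAL;
    set Q1_ALL;
    length GROUP $5;                         /* long enough for 'Young' */
    AGE = yrdif(DOB, '01MAR2011'd, 'AGE');   /* age as of 03/01/2011 */
    if AGE > 60 then GROUP = 'Old';
    else GROUP = 'Young';                    /* note: a missing AGE also lands here */
    drop DOB;
run;

proc print data=Q1_FINAL;
run;
```

Wrapping the YRDIF call in FLOOR() would give age in completed years, if that is what is wanted.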
2. (a) Write a SAS DATA step to read a series of variables ID, SCORE1, and SCORE2, where each data line holds several sets of ID, SCORE1, and SCORE2. Each set of ID, SCORE1, and SCORE2 forms one observation. ID is a character variable; SCORE1 and SCORE2 are numeric variables. For the score variables, change the value N/A to 0 and change the value 9999 to 100. The name of the data set is Q2_A. Here are the data:
001 N/A 97 003 9999 85 002 98 9999
004 9999 86 005 60 N/A 006 100 100
008 98 9999 010 N/A N/A 009 87 98
011 85 59
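The following sketch shows one way to handle part (a). It reads each score as text first so that N/A does not trigger invalid-data notes, then converts to numeric. C1 and C2 are hypothetical temporary variable names, and the double trailing `@@` holds the line so that several observations can be read from it.

```sas
data Q2_A;
    length ID $3;
    input ID $ C1 $ C2 $ @@;              /* @@: several observations per line */

    /* N/A becomes 0; anything else is converted from text to a number */
    if C1 = 'N/A' then SCORE1 = 0;
    else SCORE1 = input(C1, 8.);
    if C2 = 'N/A' then SCORE2 = 0;
    else SCORE2 = input(C2, 8.);

    /* 9999 is recoded to 100 */
    if SCORE1 = 9999 then SCORE1 = 100;
    if SCORE2 = 9999 then SCORE2 = 100;

    drop C1 C2;
datalines;
001 N/A 97 003 9999 85 002 98 9999
004 9999 86 005 60 N/A 006 100 100
008 98 9999 010 N/A N/A 009 87 98
011 85 59
;
run;
```

Reading the scores directly as numeric with the `??` modifier and then recoding missing values to 0 is another option, but it would also turn any other unreadable value into 0.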
(b) Combine the data sets Q1_FINAL and Q2_A by matching IDs to create a data set named Q2_FINAL. Remove all observations that contain missing values. Attach the SAS code and the PROC PRINT output.
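A minimal sketch of part (b), assuming ID is a character variable of the same length in both data sets. CMISS counts missing values across both character and numeric arguments, so keeping rows where the count is zero drops every observation with any missing value, including IDs that appear in only one data set.

```sas
proc sort data=Q1_FINAL; by ID; run;
proc sort data=Q2_A;     by ID; run;

data Q2_FINAL;
    merge Q1_FINAL Q2_A;
    by ID;
    if cmiss(of _all_) = 0;    /* keep only fully populated observations */
run;

proc print data=Q2_FINAL;
run;
```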
3. (a) Using the data set Q2_FINAL, test the normality of the variable HEIGHT. Attach the SAS code, the output, and a brief explanation of the output.
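One common way to run such a test is the NORMAL option of PROC UNIVARIATE, sketched here. The question does not say which test to use; this option reports several, including Shapiro-Wilk.

```sas
proc univariate data=Q2_FINAL normal;
    var HEIGHT;                /* tests for normality appear in the output */
run;
```

For a small sample such as this one, the Shapiro-Wilk p-value is usually the one to read: a p-value above the chosen significance level (for example 0.05) means normality is not rejected.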
(b) Using the data set Q2_FINAL, count the frequencies of the males and females given that at least one of SCORE1 and SCORE2 is greater than 60.
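Part (b) could be sketched with PROC FREQ and a WHERE statement, assuming the male/female variable in Q2_FINAL is named GENDER.

```sas
proc freq data=Q2_FINAL;
    where SCORE1 > 60 or SCORE2 > 60;   /* at least one score above 60 */
    tables GENDER;
run;
```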
(c) Use PROC GPLOT to generate one graph that displays the data for GENDER and GROUP. (Note: there are four types of data: 1. Male and Old; 2. Male and Young; 3. Female and Old; 4. Female and Young. Hint: create a new variable such as Gender_Group.)
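A hedged sketch of part (c) follows. The question does not say which variables go on the axes, so plotting WEIGHT against HEIGHT is an assumption, as are the data set name Q3_C and the symbol choices. PROC GPLOT requires SAS/GRAPH to be licensed.

```sas
data Q3_C;                                  /* hypothetical working data set */
    set Q2_FINAL;
    length Gender_Group $8;
    Gender_Group = catx('_', GENDER, GROUP); /* e.g. M_Old, F_Young */
run;

/* One plotting symbol per Gender_Group level (illustrative choices) */
symbol1 value=dot;
symbol2 value=circle;
symbol3 value=square;
symbol4 value=triangle;

proc gplot data=Q3_C;
    plot WEIGHT*HEIGHT=Gender_Group;        /* third variable sets the grouping */
run;
quit;
```

Only the Gender_Group levels that actually occur in Q2_FINAL will appear in the legend.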