Question
Problem 3: 30 points. You will use the data frame patient for this problem.
> patient
  ID GLUC TGL HDL LDL HRT MAMM SMOKE
1  A   88  NA  32  99   Y
In Assignment 1, we wrote a function called table1, which calculates the descriptive statistics for numeric variables. For this problem, you need to write a function that calculates the descriptive statistics for both numeric and categorical variables (factors).
For numeric variables, you will calculate the mean (MEAN), median (MEDIAN), and standard deviation (SD), and count the number of missing values (NMiss), as you did in Assignment 1.
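As an illustration of the numeric part, here is a minimal sketch for a single vector. The helper name num_stats and the sample values are assumptions made for this example; they are not part of the assignment or the patient data.

```r
# Hypothetical helper (name assumed): summary statistics for one numeric vector.
num_stats <- function(x) {
  data.frame(
    MEAN   = mean(x, na.rm = TRUE),    # na.rm = TRUE skips missing values
    MEDIAN = median(x, na.rm = TRUE),
    SD     = sd(x, na.rm = TRUE),
    NMiss  = sum(is.na(x))             # is.na() marks each missing value
  )
}

# Made-up values, not taken from the patient data frame.
x <- c(10, 20, NA, 30, NA)
num_res <- num_stats(x)
print(num_res)
```

Without na.rm = TRUE, mean(), median(), and sd() all return NA as soon as the vector contains one missing value, which is why the missing count is computed separately.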
For character variables, you will need to tabulate the count within each level of the variable and count the number of missing values.
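The tabulation for one categorical variable could be sketched as follows. The helper name cat_counts and the sample vector are assumptions made for this example only.

```r
# Hypothetical helper (name assumed): counts per level plus a missing-value row.
cat_counts <- function(x) {
  tab <- table(x)                        # table() ignores NA by default
  data.frame(
    group = c(names(tab), "NMiss"),      # level names, then the NMiss label
    count = c(as.vector(tab), sum(is.na(x))),
    stringsAsFactors = FALSE
  )
}

# Made-up values, not taken from the patient data frame.
x <- c("Y", "N", NA, "Y", NA, "Y")
cat_res <- cat_counts(x)
print(cat_res)
```

Because table() drops NA unless told otherwise, the missing count is appended as its own row, matching the NMiss row in the expected output.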
This function takes three arguments:
dat : the name of the data frame, such as patient.
numvar : a character vector of one or more names of numeric variables. Set the default value to NULL. If the argument is left at NULL, no statistics are calculated for the numeric variables.
charvar : a character vector of one or more names of categorical variables. Set the default value to NULL. If the argument is left at NULL, no statistics are calculated for the categorical variables.
The returned result is a list of length 2. Each component of the list is either a data frame that contains the statistics or a NULL value. The first component contains the descriptive statistics for the numeric variables, and the second component contains the counts for the character variables. The format of the returned list looks like the one below:
> table1 (dat=patient, numvar=c("TGL", "HDL", "LDL"), charvar=c("HRT", "MAMM"))
$numericStats
  varName      MEAN MEDIAN       SD NMiss
1     TGL 180.66667  180.0 23.03620     4
2     HDL  55.66667   62.5 19.00175     4
3     LDL 160.28571  165.0 40.06126     3

$FactorStats
  varName group count
1     HRT     N     2
2             Y     3
3         NMiss     5
4    MAMM    no     2
5           yes     4
6         NMiss     4

> table1 (dat=patient, numvar="LDL", charvar=c("HRT", "MAMM","SMOKE"))
$numericStats
  varName     MEAN MEDIAN       SD NMiss
1     LDL 160.2857    165 40.06126     3

$FactorStats
  varName group count
1     HRT     N     2
2             Y     3
3         NMiss     5
4    MAMM    no     2
5           yes     4
6         NMiss     4
7   SMOKE  ever     3
8         never     4
9         NMiss     3

> table1 (dat=patient, numvar=c("HDL", "LDL"))
$numericStats
  varName      MEAN MEDIAN       SD NMiss
1     HDL  55.66667   62.5 19.00175     4
2     LDL 160.28571  165.0 40.06126     3

$FactorStats
NULL

> table1 (dat=patient, charvar=c("HRT", "MAMM","SMOKE"))
$numericStats
NULL

$FactorStats
  varName group count
1     HRT     N     2
2             Y     3
3         NMiss     5
4    MAMM    no     2
5           yes     4
6         NMiss     4
7   SMOKE  ever     3
8         never     4
9         NMiss     3
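One possible way to put the pieces together is sketched below. It is not the official solution, only a minimal version under stated assumptions: the listed columns exist in dat, varName is shown only on the first row of each variable (as in the output above), and the toy data frame is made up because the full patient data is not shown here.

```r
# A sketch of table1: both variable lists default to NULL.
table1 <- function(dat, numvar = NULL, charvar = NULL) {
  numericStats <- NULL
  FactorStats  <- NULL

  if (!is.null(numvar)) {
    # One row per numeric variable; missing values are skipped in each statistic.
    numericStats <- data.frame(
      varName = numvar,
      MEAN    = vapply(numvar, function(v) mean(dat[[v]], na.rm = TRUE),
                       numeric(1), USE.NAMES = FALSE),
      MEDIAN  = vapply(numvar, function(v) median(dat[[v]], na.rm = TRUE),
                       numeric(1), USE.NAMES = FALSE),
      SD      = vapply(numvar, function(v) sd(dat[[v]], na.rm = TRUE),
                       numeric(1), USE.NAMES = FALSE),
      NMiss   = vapply(numvar, function(v) sum(is.na(dat[[v]])),
                       integer(1), USE.NAMES = FALSE),
      stringsAsFactors = FALSE
    )
  }

  if (!is.null(charvar)) {
    # One small data frame per variable, stacked with rbind().
    pieces <- lapply(charvar, function(v) {
      tab <- table(dat[[v]])                       # NA is excluded by default
      data.frame(
        varName = c(v, rep("", length(tab))),      # name on the first row only
        group   = c(names(tab), "NMiss"),
        count   = c(as.vector(tab), sum(is.na(dat[[v]]))),
        stringsAsFactors = FALSE
      )
    })
    FactorStats <- do.call(rbind, pieces)
  }

  # list() keeps NULL components, so the result always has length 2.
  list(numericStats = numericStats, FactorStats = FactorStats)
}

# Made-up toy data (an assumption for this example, not the patient data).
toy <- data.frame(
  LDL = c(100, 120, NA, 140),
  HRT = c("Y", "N", NA, "Y"),
  stringsAsFactors = FALSE
)
res_both <- table1(dat = toy, numvar = "LDL", charvar = "HRT")
res_num  <- table1(dat = toy, numvar = "LDL")
print(res_both)
```

Building the result with list(numericStats = ..., FactorStats = ...) matters here: assigning NULL to an element of an existing list would delete it, whereas list() preserves the NULL component that the assignment asks for.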