Question
code in R
Requirements:
1. Write the function(s) which accurately implement the algorithm(s) as described or requested.
2. Include error handling to ensure your function(s) work properly.

Note: The requirements apply to your messy_impute() and tidy_impute() functions.

One scenario which naturally creates non-tidy data is a teacher's gradebook. Table 1 shows an example with five homework assignments and five quizzes.

Table 1: An example of a teacher's gradebook

UID        Homework_1  Homework_2  ...  Homework_5  Quiz_1  ...  Quiz_5
123456787  70          90          ...  80          76      ...  70
123456788  91          85          ...  73          90      ...  80
123456789  60          71          ...  78          88      ...  73

(a) Create a simulated dataset in R called gradebook that represents a possible gradebook in the same basic format as Table 1:
- Each row of the gradebook should contain all measurements for a student.
- Each column should contain scores for one assignment.
- The last column should be "Quiz_5."
The simulated gradebook should contain the grades for 100 students and scores (out of 100) for 5 homework assignments and 5 quizzes. Set the seed for simulating your data with your UID. Please check how to use runif().

(b) Write R code in the R markdown file to randomly replace 10% of the Homework_5 scores and 10% of the Quiz_5 scores with NA, and then use is.na() in conjunction with sum() to show your results.

(c) Write a function messy_impute() that imputes missing values in the gradebook. Please also present your algorithm or flowchart to answer this question in the R markdown file.

Note: Imputation is the process of replacing missing values by estimated values. The simplest (far from preferred) method to impute values is to replace missing values by the most typical value, say the mean.

Assume the format of the gradebook is fixed (five homework assignments and five quizzes), but NA values may occur in any assignment column (never in UID). The messy_impute() function should have at least three arguments and ....
- A data frame containing the gradebook as specified in the example.
- A center argument, which should be a character object indicating the impute function (Mean or Median).
- A margin argument, which should be an integer (1 or 2) indicating whether the function imputes the missing values by row/student (1) or by column/assignment (2). If imputing by row, the function should process homework and quizzes separately.
The function should return the imputed data frame (or tibble).

Toy example with three homework assignments only: Suppose the input data frame contains the following data.

UID        Homework_1  Homework_2  Homework_3
111111111  41.00       27.00       52.00
222222222  43.00       94.00       88.00
333333333   2.00       79.00       51.00
444444444  46.00       45.00       54.00
555555555  60.00       23.00       NA

To impute the NA value by row with the mean, we compute (60 + 23) / 2 = 41.5 to replace NA. If we do it by column, the function should compute (52 + 88 + 51 + 54) / 4 = 61.25 to replace NA.

(e) Write R code in the R markdown file to convert the gradebook into the tidy format. Name this object gradebook_tidy.

(f) Write a function tidy_impute() that imputes missing values in the gradebook_tidy object. The tidy_impute() function should have the same arguments as the messy_impute() function. You should return an imputed gradebook_tidy object. Please also present your algorithm or flowchart to answer this question in the R markdown file.

Note: Don't convert the gradebook_tidy object to a messy format or reuse your messy_impute() in any step of your tidy_impute().

(g) Please use the cases you select from (d) to demonstrate your function in the R markdown file.
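The toy example's arithmetic can be checked with a short base R sketch. The helper impute_block() below is a hypothetical name introduced only for illustration. It is not the required messy_impute(), and it assumes the caller passes the names of the score columns to treat as one block (for row-wise imputation, homework and quiz columns would be passed in separate calls).

```r
# Toy data from the question: three homework columns, one missing value.
toy <- data.frame(
  UID        = c(111111111, 222222222, 333333333, 444444444, 555555555),
  Homework_1 = c(41, 43, 2, 46, 60),
  Homework_2 = c(27, 94, 79, 45, 23),
  Homework_3 = c(52, 88, 51, 54, NA)
)

# Hypothetical helper (an assumption, not part of the assignment): fills NA
# values in the columns `cols` with the row or column mean/median.
impute_block <- function(df, cols, center = "Mean", margin = 2L) {
  # Basic error handling on the arguments.
  if (!is.data.frame(df)) stop("`df` must be a data frame.")
  if (!is.character(cols) || !all(cols %in% names(df))) {
    stop("`cols` must name columns of `df`.")
  }
  if (!is.character(center) || length(center) != 1 ||
      !tolower(center) %in% c("mean", "median")) {
    stop("`center` must be \"Mean\" or \"Median\".")
  }
  if (!is.numeric(margin) || length(margin) != 1 || !margin %in% c(1, 2)) {
    stop("`margin` must be 1 (by row) or 2 (by column).")
  }
  f <- if (tolower(center) == "mean") mean else median

  scores <- as.matrix(df[, cols, drop = FALSE])
  # One fill value per row (margin = 1) or per column (margin = 2).
  fill <- apply(scores, margin, f, na.rm = TRUE)
  # Two-column matrix of (row, col) positions of the missing cells.
  na_pos <- which(is.na(scores), arr.ind = TRUE)
  # Pick each cell's fill value by its row index or its column index.
  scores[na_pos] <- fill[na_pos[, margin]]
  df[cols] <- as.data.frame(scores)
  df
}

hw <- c("Homework_1", "Homework_2", "Homework_3")
by_row <- impute_block(toy, hw, center = "Mean", margin = 1)
by_col <- impute_block(toy, hw, center = "Mean", margin = 2)
by_row$Homework_3[5]  # (60 + 23) / 2 = 41.5
by_col$Homework_3[5]  # (52 + 88 + 51 + 54) / 4 = 61.25
```

One case this sketch leaves open: if every score in a row (or column) is NA, mean(..., na.rm = TRUE) returns NaN, so a fuller implementation would need to decide how to handle that.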