
Question


code in R


Requirements:
1. Write the function(s) which accurately implement the algorithm(s) as described or requested.
2. Include error handling to ensure your function(s) work properly.

Note: The requirements apply to your messy_impute() and tidy_impute() functions.

One scenario which naturally creates non-tidy data is a teacher's gradebook. Table 1 shows an example with five homework assignments and five quizzes.

Table 1: An example of a teacher's gradebook

UID        Homework_1  Homework_2  ...  Homework_5  Quiz_1  ...  Quiz_5
123456787  70          90          ...  80          76      ...  70
123456788  91          85          ...  73          90      ...  80
123456789  60          71          ...  78          88      ...  73

(a) Create a simulated dataset in R called gradebook that represents a possible gradebook in the same basic format as Table 1:
- Each row of the gradebook should contain all measurements for a student.
- Each column should contain scores for one assignment.
- The last column should be "Quiz_5".
The simulated gradebook should contain the grades for 100 students and scores (out of 100) for 5 homework assignments and 5 quizzes. Set the seed for simulating your data with your UID. Hint: check how to use runif().

(b) Write R code in the R markdown file to randomly replace 10% of Homework_5 and 10% of Quiz_5 by NA, and then use is.na() in conjunction with sum() to show your results.

(c) Write a function messy_impute() that imputes missing values in the gradebook. Please also present your algorithm or flowchart to answer this question in the R markdown file.

Note: Imputation is the process of replacing missing values by estimated values. The simplest (far from preferred) method to impute values is to replace missing values by the most typical value, say the mean.

Assume the format of the gradebook is fixed (five homework assignments and five quizzes), but NA values may occur in any assignment column (never in UID). The messy_impute() function should have at least three arguments and ....
- A data frame that contains the gradebook as specified in the example.
- A center argument: a character object indicating the impute function (Mean or Median).
- A margin argument: an integer (1 or 2) indicating whether the function imputes the missing values by row/student (1) or by column/assignment (2). If imputing by row, the function should process homework and quizzes separately.
The function should return the imputed data frame (or tibble).

Toy example with three homework assignments only: suppose the input data frame contains the following data.

UID        Homework_1  Homework_2  Homework_3
111111111  41.00       27.00       52.00
222222222  43.00       94.00       88.00
333333333   2.00       79.00       51.00
444444444  46.00       45.00       54.00
555555555  60.00       23.00       NA

To impute the NA value by row with the mean, we compute (60 + 23)/2 = 41.5 to replace NA. If we do it by column, the function should compute (52 + 88 + 51 + 54)/4 = 61.25 to replace NA.

(e) Write R code in the R markdown file to convert the gradebook into the tidy format. Name this object gradebook_tidy.

(f) Write a function tidy_impute() that imputes missing values in the gradebook_tidy object. The tidy_impute() function should have the same arguments as the messy_impute() function. You should return an imputed gradebook_tidy object. Please also present your algorithm or flowchart to answer this question in the R markdown file.

Note: Don't convert the gradebook_tidy object to a messy format or reuse your messy_impute() in any step of your tidy_impute().

(g) Please use the cases you select from (d) to demonstrate your function in the R markdown file.
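For parts (a) and (b), one possible way to simulate the gradebook and inject missing values is sketched below. The seed 123456789 and the UID range are placeholders, not values given in the question (use your own UID as the seed), and rounding the runif() draws to whole points is an assumption.

```r
# Hypothetical seed: the assignment asks you to use your own UID here.
set.seed(123456789)

n_students  <- 100
assignments <- c(paste0("Homework_", 1:5), paste0("Quiz_", 1:5))

# One runif() draw per score, rounded to whole points between 0 and 100
# (rounding is an assumption; the question only says "out of 100").
scores <- matrix(
  round(runif(n_students * length(assignments), min = 0, max = 100)),
  nrow = n_students,
  dimnames = list(NULL, assignments)
)

# Placeholder UIDs; one row per student, Quiz_5 ends up as the last column.
gradebook <- data.frame(UID = 123456001:123456100, scores)

# Part (b): replace 10% of Homework_5 and 10% of Quiz_5 with NA,
# choosing the affected students independently for each column.
n_missing <- round(0.10 * n_students)
gradebook$Homework_5[sample(n_students, n_missing)] <- NA
gradebook$Quiz_5[sample(n_students, n_missing)]     <- NA

# Count the missing values with is.na() and sum().
sum(is.na(gradebook$Homework_5))
sum(is.na(gradebook$Quiz_5))
```

Because sample() draws without replacement by default, each of the two columns receives exactly 10 NA values.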
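For part (c), a minimal base-R sketch of messy_impute() might look like the following. It is one possible reading of the specification, not a reference solution: it assumes assignment columns can be recognised by the prefixes "Homework_" and "Quiz_", it accepts center case-insensitively, and the error messages are illustrative. The algorithm is: validate the arguments, pick mean or median, then either fill each column with its own center (margin 2) or fill each student's homework block and quiz block with that student's center for the block (margin 1).

```r
messy_impute <- function(gradebook, center = "Mean", margin = 2L) {
  # Error handling: validate every argument before touching the data.
  if (!is.data.frame(gradebook)) stop("`gradebook` must be a data frame.")
  if (!"UID" %in% names(gradebook)) stop("`gradebook` must have a UID column.")
  if (!is.character(center) || length(center) != 1L ||
      !tolower(center) %in% c("mean", "median")) {
    stop("`center` must be \"Mean\" or \"Median\".")
  }
  if (!is.numeric(margin) || length(margin) != 1L || !margin %in% c(1, 2)) {
    stop("`margin` must be 1 (by row) or 2 (by column).")
  }
  center_fun <- if (tolower(center) == "mean") mean else median

  # Assumption: assignment columns are identified by their name prefix.
  hw_cols    <- grep("^Homework_", names(gradebook), value = TRUE)
  quiz_cols  <- grep("^Quiz_", names(gradebook), value = TRUE)
  score_cols <- c(hw_cols, quiz_cols)
  if (length(score_cols) == 0L) stop("No Homework_* or Quiz_* columns found.")
  if (!all(vapply(gradebook[score_cols], is.numeric, logical(1)))) {
    stop("All assignment columns must be numeric.")
  }

  if (margin == 2) {
    # By column: each NA gets the center of its own assignment.
    for (col in score_cols) {
      x <- gradebook[[col]]
      x[is.na(x)] <- center_fun(x, na.rm = TRUE)
      gradebook[[col]] <- x
    }
  } else {
    # By row: homework and quizzes are processed as separate blocks.
    for (cols in list(hw_cols, quiz_cols)) {
      if (length(cols) == 0L) next
      block       <- as.matrix(gradebook[cols])
      row_centers <- apply(block, 1, center_fun, na.rm = TRUE)
      missing     <- which(is.na(block), arr.ind = TRUE)
      if (nrow(missing) > 0L) {
        block[missing] <- row_centers[missing[, "row"]]
      }
      gradebook[cols] <- as.data.frame(block)
    }
  }
  gradebook
}

# The toy example from the question.
toy <- data.frame(
  UID        = c(111111111, 222222222, 333333333, 444444444, 555555555),
  Homework_1 = c(41, 43, 2, 46, 60),
  Homework_2 = c(27, 94, 79, 45, 23),
  Homework_3 = c(52, 88, 51, 54, NA)
)
messy_impute(toy, "Mean", 1)$Homework_3[5]  # 41.5
messy_impute(toy, "Mean", 2)$Homework_3[5]  # 61.25
```

If a student is missing every score in a block, or a column is entirely NA, there is nothing to average; this sketch leaves such cells as NaN or NA rather than guessing a value.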
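For parts (e) and (f), the sketch below converts a messy gradebook to a long (tidy) layout and imputes directly on that layout with ave(), never converting back. The column names assignment and score are assumptions rather than names given in the question, and the conversion uses base R for self-containment; tidyr::pivot_longer() is the more usual tool if the tidyverse is available. The algorithm is: validate the arguments, pick mean or median, then fill each NA with the center of its group, where the group is the assignment (margin 2) or the student combined with the assignment type (margin 1).

```r
# Toy messy gradebook from the question, used here for illustration only.
toy <- data.frame(
  UID        = c(111111111, 222222222, 333333333, 444444444, 555555555),
  Homework_1 = c(41, 43, 2, 46, 60),
  Homework_2 = c(27, 94, 79, 45, 23),
  Homework_3 = c(52, 88, 51, 54, NA)
)

# Part (e): one row per (student, assignment) pair.
score_cols <- setdiff(names(toy), "UID")
gradebook_tidy <- data.frame(
  UID        = rep(toy$UID, times = length(score_cols)),
  assignment = rep(score_cols, each = nrow(toy)),
  score      = unlist(toy[score_cols], use.names = FALSE)
)

# Part (f): impute on the tidy layout itself.
tidy_impute <- function(gradebook_tidy, center = "Mean", margin = 2L) {
  needed <- c("UID", "assignment", "score")
  if (!is.data.frame(gradebook_tidy) ||
      !all(needed %in% names(gradebook_tidy))) {
    stop("`gradebook_tidy` must be a data frame with UID, assignment, score.")
  }
  if (!is.numeric(gradebook_tidy$score)) stop("`score` must be numeric.")
  if (!is.character(center) || length(center) != 1L ||
      !tolower(center) %in% c("mean", "median")) {
    stop("`center` must be \"Mean\" or \"Median\".")
  }
  if (!is.numeric(margin) || length(margin) != 1L || !margin %in% c(1, 2)) {
    stop("`margin` must be 1 (by student) or 2 (by assignment).")
  }
  center_fun <- if (tolower(center) == "mean") mean else median

  # Replace the NAs within one group by that group's center.
  fill_na <- function(x) {
    x[is.na(x)] <- center_fun(x, na.rm = TRUE)
    x
  }

  # "Homework_3" -> "Homework": keeps homework and quizzes apart by row.
  type <- sub("_.*$", "", gradebook_tidy$assignment)

  if (margin == 2) {
    gradebook_tidy$score <- ave(gradebook_tidy$score,
                                gradebook_tidy$assignment, FUN = fill_na)
  } else {
    gradebook_tidy$score <- ave(gradebook_tidy$score,
                                gradebook_tidy$UID, type, FUN = fill_na)
  }
  gradebook_tidy
}

by_student    <- tidy_impute(gradebook_tidy, "Mean", 1)
by_assignment <- tidy_impute(gradebook_tidy, "Mean", 2)
by_student$score[by_student$UID == 555555555 &
                 by_student$assignment == "Homework_3"]     # 41.5
by_assignment$score[by_assignment$UID == 555555555 &
                    by_assignment$assignment == "Homework_3"]  # 61.25
```

In the tidy layout both margins reduce to the same grouped operation with a different grouping key, which is why no reshaping back to the messy format is needed.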
