
Question


Part 1 - General data preparation and cleaning.

a) Import MLDATASET_PartiallyCleaned.xlsx into RStudio. This dataset is a partially cleaned version of MLDATASET-200000-1612938401.xlsx.

b) Write the appropriate code in RStudio to prepare and clean the MLDATASET_PartiallyCleaned dataset as follows:

i. For How.Many.Times.File.Seen, set all values equal to 65535 to NA.

ii. Convert Threads.Started to a factor whose categories are given by:
1 = 1 thread started
2 = 2 threads started
3 = 3 threads started
4 = 4 threads started
5 = 5 or more threads started
Hint: Replace all values greater than 5 with 5, then use the factor(.) function.

iii. Log-transform Characters.in.URL using the log() function, and remove the original Characters.in.URL column from the dataset (unless you have overwritten it with the log-transformed data).

iv. Select only the complete cases using the na.omit() function, and name the dataset MLDATASET.cleaned.

Briefly outline the preparation and cleaning process in your report and explain why you believe the above steps were necessary.
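One possible way to approach steps i to iv is sketched below. It is not the expert solution from this page, only a minimal illustration under stated assumptions: the helper name clean_mldataset and the small toy data frame are hypothetical, the column names are assumed to be spelled exactly as in the question, and the readxl package is assumed for the import (shown as a comment because the file itself is not available here).

```r
# Hypothetical helper: applies steps i to iv to any data frame that has the
# three columns named in the question.
clean_mldataset <- function(df) {
  # i. Treat 65535 as a missing-value code; which() skips existing NAs.
  df$How.Many.Times.File.Seen[which(df$How.Many.Times.File.Seen == 65535)] <- NA

  # ii. Cap values above 5 at 5, then convert to a factor with 5 categories.
  #     Any value outside 1..5 after capping (e.g. 0) becomes NA.
  df$Threads.Started <- factor(
    pmin(df$Threads.Started, 5),
    levels = 1:5,
    labels = c("1", "2", "3", "4", "5+")
  )

  # iii. Overwrite the column with its natural log, so no separate column
  #      needs to be dropped afterwards.
  df$Characters.in.URL <- log(df$Characters.in.URL)

  # iv. Keep only rows with no NA in any column.
  na.omit(df)
}

# Toy data, invented purely to exercise the function (not from the dataset).
toy <- data.frame(
  How.Many.Times.File.Seen = c(3, 65535, 10, 7),
  Threads.Started          = c(1, 2, 9, NA),
  Characters.in.URL        = c(10, 20, 30, 40)
)
toy.cleaned <- clean_mldataset(toy)
print(toy.cleaned)

# With the real file, assuming readxl is installed and the sheet headers
# match the dotted names above:
# MLDATASET <- readxl::read_excel("MLDATASET_PartiallyCleaned.xlsx")
# MLDATASET.cleaned <- clean_mldataset(MLDATASET)
```

In the toy data, the second row is dropped because 65535 becomes NA, and the fourth because Threads.Started is already missing. Note that na.omit() removes NA and NaN but not -Inf, so a Characters.in.URL value of 0 would survive as -Inf after the log transform and may need separate handling.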

