Question
Data Clean-up
The dataset to be cleaned up can be downloaded from Google Drive:
https://drive.google.com/file/d/0B1D66qK8jxd0YTI3cUU4R1VDa0U/view?usp=sharing
1. Create a separate repository and push the attached dataset (dirty_data.csv)
2. Populate the missing values in the Area variable with appropriate values (Birmingham, Coventry, Dudley, Sandwell, Solihull, Walsall or Wolverhampton)
3. Remove special characters and padding (the white space before and after the text) from the Street 1 and Street 2 variables. Make sure the first letters of street names are capitalized and the street denominations follow the same standard (for example, all streets are indicated as str., avenues as ave., etc.)
4. If the value in Street 2 duplicates the value in Street 1, remove the value in Street 2
5. Remove the Strange HTML column
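Steps 2 to 5 above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not a definitive solution: the toy data frame stands in for dirty_data.csv, the column names (Area, Street 1, Street 2, Strange HTML) are taken from the task wording and may not match the file exactly, the helper names fill_down and clean_street are hypothetical, the fill-down rule assumes a blank Area belongs to the last area listed above it, and the rd. abbreviation is an assumed extension of the str. and ave. examples.

```r
# Toy stand-in for dirty_data.csv. Column names and values are assumptions.
dirty <- data.frame(
  Area = c("Birmingham", "", NA, "Coventry", ""),
  `Street 1` = c("  high street ", "station road##", "park avenue",
                 "mill lane", "church str"),
  `Street 2` = c("High Street", "", "queens road ", "Mill Lane", ""),
  `Strange HTML` = c("<br>", "<p>", "", "<br>", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Step 2 (assumed rule): carry the last non-missing Area down the column.
fill_down <- function(x) {
  x[!is.na(x) & trimws(x) == ""] <- NA   # treat blank strings as missing
  filled <- !is.na(x)
  # cumsum(filled) counts how many filled values have been seen so far;
  # the leading NA covers rows that come before the first filled value.
  c(NA, x[filled])[cumsum(filled) + 1]
}

# Step 3: strip special characters and padding, capitalise, standardise.
clean_street <- function(x) {
  x[is.na(x)] <- ""
  x <- gsub("[^[:alnum:] .]", "", x)      # keep letters, digits, spaces, dots
  x <- gsub("\\s+", " ", trimws(x))       # trim and collapse white space
  x <- tolower(x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)  # capitalise each word
  # Assumed denomination standard; extend the lookup as the data requires.
  abbreviations <- c("street|str" = "str.",
                     "avenue|ave" = "ave.",
                     "road|rd"    = "rd.")
  for (pattern in names(abbreviations)) {
    x <- gsub(paste0("\\b(", pattern, ")\\b\\.?"), abbreviations[[pattern]],
              x, ignore.case = TRUE, perl = TRUE)
  }
  x
}

clean <- dirty
clean[["Area"]] <- fill_down(clean[["Area"]])
clean[["Street 1"]] <- clean_street(clean[["Street 1"]])
clean[["Street 2"]] <- clean_street(clean[["Street 2"]])

# Step 4: blank out Street 2 where it repeats Street 1 after cleaning.
is_duplicate <- clean[["Street 2"]] != "" &
  clean[["Street 2"]] == clean[["Street 1"]]
clean[["Street 2"]][is_duplicate] <- ""

# Step 5: drop the Strange HTML column.
clean[["Strange HTML"]] <- NULL

print(clean)
```

To run this against the real file, the toy data frame could be replaced with read.csv("dirty_data.csv", check.names = FALSE, stringsAsFactors = FALSE), and the result saved with write.csv(clean, "clean_data.csv", row.names = FALSE). The output file name clean_data.csv is an assumption. Comparing the streets only after cleaning means that values differing only in case, padding or abbreviation are still treated as duplicates.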
Complete the cleanup code and push the changes to the repository.
Submit a link to the repository. The repository will contain:
Combined code (.r or .rmd)
Original (dirty) dataset
New (clean) dataset