Question
Use RapidMiner to complete the following:
1. Clean up any bad data in the dataset.
2. Categorize the data type of each column in the dataset.
3. Provide descriptive statistics for each of the 14 columns after the date column.
4. Try to identify the distribution of each column (A - N), and explain why you believe it is the distribution you determined.
5. Try to identify any association rules in columns H - N.
6. Explore and try to run a cluster analysis on any columns that interest you.
7. Resample the dataset down to 10,000 rows using any sampling method discussed in class.
8. Run the descriptive statistics on the sampled dataset, and determine the sampling error caused by the sampling.
9. Describe the value of sampling. When is it worthwhile to run sampling on a dataset?
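Tasks 7 and 8 can be sanity-checked outside RapidMiner. The sketch below is only an illustration in plain Python, not a RapidMiner process: it draws a simple random sample of 10,000 rows without replacement and compares the sample mean with the full-data mean. The function name `sampling_error`, the synthetic `column` data, and the 100,000-row size are all assumptions made for the example, since the actual dataset is not shown in the question.

```python
import random
import statistics


def sampling_error(population, sample_size, seed=0):
    """Draw a simple random sample (without replacement) and compare
    its mean with the mean of the full column."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)

    population_mean = statistics.fmean(population)
    sample_mean = statistics.fmean(sample)

    # Standard error of the mean, estimated from the sample alone.
    standard_error = statistics.stdev(sample) / sample_size ** 0.5

    return {
        "population_mean": population_mean,
        "sample_mean": sample_mean,
        # Sampling error: how far the sample statistic lands from
        # the full-data value it is meant to estimate.
        "sampling_error": sample_mean - population_mean,
        "standard_error": standard_error,
    }


# Hypothetical stand-in for one numeric column of the dataset.
data_rng = random.Random(42)
column = [data_rng.gauss(50.0, 10.0) for _ in range(100_000)]

result = sampling_error(column, 10_000)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The same comparison can be repeated for other statistics (median, standard deviation) and for each of the 14 columns. If the observed sampling error is small relative to the standard error, the 10,000-row sample is behaving as expected for a random draw.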