Question:
There will always be anomalies in data that can create gaps in our analytics. We call these outliers, as they tend to be well outside of the normal distribution of the data.
What steps can we take to handle this data when we see it? Should we simply delete the outlier, or are there other tactics we can use to normalize such a large dispersion?
For example, if a Netflix data set showed that a subscriber was 185 years old, we could safely conclude this is an error. Should we delete that entry entirely, or are there ways to preserve it?
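One common alternative to deletion, which preserves the row while taming the dispersion, is to cap (winsorize) extreme values at statistical fences. Below is a minimal sketch using Tukey's IQR rule; the function name `cap_outliers`, the `k=1.5` multiplier, and the sample ages are illustrative assumptions, not part of the original question.

```python
import statistics

def cap_outliers(values, k=1.5):
    """Clamp values outside Tukey's IQR fences instead of deleting them.

    Points beyond [Q1 - k*IQR, Q3 + k*IQR] are pulled back to the
    nearest fence, so the record survives but no longer dominates
    summary statistics. k=1.5 is the conventional multiplier.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lo), hi) for v in values]

# Hypothetical age column with one obvious data-entry error (185):
ages = [22, 25, 27, 29, 31, 33, 34, 38, 41, 185]
cleaned = cap_outliers(ages)
```

After capping, all ten records remain in the data set, but the impossible age of 185 has been pulled down to the upper fence. Median-imputation (replacing the bad value with the column median) is another preservation tactic when the true value is plainly unrecoverable.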