[EX02-216] Outliers! How often do they occur? What do we do with them? Complete part a to

Question:

[EX02-216] Outliers! How often do they occur? What do we do with them? Complete part a to see how often outliers can occur. Then complete part b to decide what to do with outliers.

a. Use the technology of your choice to take samples of various sizes (10, 30, 100, 300 would be good choices) from a normal distribution (mean of 100 and standard deviation of 20 will work nicely) and see how many outliers a randomly generated sample contains. You will probably be surprised. Generate 10 samples of each size for a more representative result. Describe your results; in particular, comment on the frequency of outliers in your samples.
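For readers whose technology of choice is Python rather than MINITAB, the simulation in part a might be sketched as follows. This is only an illustration under stated assumptions: the function names `count_outliers` and `simulate` are made up for this example, outliers are taken to be values beyond the usual 1.5 × IQR boxplot fences, and quartiles come from `statistics.quantiles` with its default "exclusive" method, which may differ slightly from the quartile rule your textbook or software uses.

```python
import random
import statistics


def count_outliers(sample):
    """Count values lying outside the 1.5 * IQR boxplot fences."""
    # Default method is 'exclusive'; other quartile rules can give other counts.
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(1 for x in sample if x < lower or x > upper)


def simulate(sizes=(10, 30, 100, 300), samples_per_size=10,
             mean=100, sd=20, seed=0):
    """Return {sample size: list of outlier counts, one per generated sample}."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    results = {}
    for n in sizes:
        counts = []
        for _ in range(samples_per_size):
            sample = [rng.gauss(mean, sd) for _ in range(n)]
            counts.append(count_outliers(sample))
        results[n] = counts
    return results


if __name__ == "__main__":
    for n, counts in simulate().items():
        print(f"n = {n:>3}: outliers per sample = {counts}, total = {sum(counts)}")
```

Changing `seed` gives a fresh set of samples; comparing the totals across sample sizes is one way to comment on how often outliers appear.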

MINITAB
Choose: Calc > Random Data > Normal
Enter: Generate 10 rows of data (use n = 10, 30, 100, 300)
Store in column(s): C1–C10
Mean: 100
Stand. Dev.: 20
Choose: Graph > Boxplot > Multiple Y’s Simple > OK
Enter: Graph variables: C1–C10
Choose: Data View
Select: Interquartile range box, Outlier symbols

In practice, we want to do something about the data points that are discovered to be outliers. First, the outlier should be inspected: if there is some obvious reason why it is incorrect, it should be corrected. (For example, a woman’s height of 59 inches may well be entered incorrectly as 95 inches, which would be nearly 8 feet tall and is a very unlikely height.) If the data value can be corrected, fix it! Otherwise, you must weigh the choice between discarding good data (even if they are different) and keeping erroneous data. At this level, it is probably best to make a note about the outlier and continue with the solution. To help understand the effect of removing an outlier value, let’s look at this set of data, randomly generated from a normal distribution N(100, 20).

b. Construct a boxplot and identify any outliers.
74.2 84.5 88.5 110.8 97.6 100.2 116.4 78.3 154.8 144.7 110.6 93.7 113.3 96.1 86.7 97.3 102.8 91.8 58.5 120.1 102.8 82.5 107.6 91.1 95.7 98 98.4 81.9 58.5 118.1
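As a check on a hand-drawn or software boxplot for part b, the quartiles and fences for these 30 values can be computed directly. The sketch below assumes the common 1.5 × IQR fence rule and the default "exclusive" quartile method of Python's `statistics.quantiles`; the variable names are chosen for illustration only.

```python
import statistics

data = [74.2, 84.5, 88.5, 110.8, 97.6, 100.2, 116.4, 78.3, 154.8, 144.7,
        110.6, 93.7, 113.3, 96.1, 86.7, 97.3, 102.8, 91.8, 58.5, 120.1,
        102.8, 82.5, 107.6, 91.1, 95.7, 98, 98.4, 81.9, 58.5, 118.1]

# Quartiles with the default 'exclusive' method (an assumption; other
# quartile conventions can shift the fences slightly).
q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Boxplot fences: values beyond these are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1:.2f}, median = {median:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
print(f"fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
print("outliers:", outliers)
```

The choice of quartile rule matters for this data set: with the "exclusive" method only the largest value falls outside the fences, whereas passing `method="inclusive"` narrows the fences enough that 144.7 is flagged as well. If your result differs from a classmate's, compare quartile conventions first.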

c. Remove the outlier and construct a new boxplot.
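To see numerically what removing the outlier does in part c, the fences can be recomputed on the trimmed data. This is a minimal sketch, again assuming 1.5 × IQR fences and the "exclusive" quartile method; `fences` and `find_outliers` are hypothetical helper names introduced for this example.

```python
import statistics

data = [74.2, 84.5, 88.5, 110.8, 97.6, 100.2, 116.4, 78.3, 154.8, 144.7,
        110.6, 93.7, 113.3, 96.1, 86.7, 97.3, 102.8, 91.8, 58.5, 120.1,
        102.8, 82.5, 107.6, 91.1, 95.7, 98, 98.4, 81.9, 58.5, 118.1]


def fences(values):
    """Return (lower, upper) 1.5 * IQR fences using 'exclusive' quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """List the values that fall outside the fences."""
    lower, upper = fences(values)
    return [x for x in values if x < lower or x > upper]


before = find_outliers(data)
# Drop every value flagged in the first pass, then re-examine what is left.
trimmed = [x for x in data if x not in before]
after = find_outliers(trimmed)

print("fences before:", tuple(round(f, 2) for f in fences(data)), "outliers:", before)
print("fences after: ", tuple(round(f, 2) for f in fences(trimmed)), "outliers:", after)
```

Comparing the two printed lines shows how the fences move once the extreme value is gone and whether a value that looked ordinary before is now flagged, which bears directly on the discussion asked for in part d.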

d. Describe your findings and comment on why it might be best and less confusing while studying introductory statistics not to discard outliers.
