Hi, please help me with the question below on Statistics and Data Analysis.

Problem #1

Following is a list of golf scores from an amateur tournament. There were only a few holes, so low values are possible. For your convenience, the values have been sorted for you: 38, 40, 42, 47, 49, 49, 51, 51, 58, 60, 61, 62, 63, 87, 92

1. Provide the 5-number summary. Do this manually, that is, do not use R.

2. Using the 1.5 IQR rule, indicate which values, if any, would be outliers on the high side.

3. Using the 1.5 IQR rule, indicate which values, if any, would be outliers on the low side.

4. Thought Question to Answer: If you identified outliers in question 2 or 3, do you think you should remove the value(s) from any subsequent analysis? Review the discussion of outliers in this document as well as in your lecture to decide. Be sure to explain why you would or would not remove any observations.

Reminder: for the remaining questions in this problem, do NOT remove any observations -- even if you said earlier that we should. That is, use the original dataset that was provided in the question. (It's to save the grader the headache of trying to grade multiple different datasets from students who may have made different decisions).
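The manual steps in questions 1 to 3 can be cross-checked with a short R sketch. It assumes the hand method taught in lecture is the "median of each half" convention (the overall median is left out of both halves when the count is odd); if your lecture uses a different convention, the quartiles and fences may shift. The variable names are illustrative only.

```r
# Sorted golf scores from the problem statement
golfScores <- c(38, 40, 42, 47, 49, 49, 51, 51, 58, 60, 61, 62, 63, 87, 92)

# Assumption: quartiles are the medians of the lower and upper halves.
# With 15 values the median is the 8th value, so the halves are
# values 1-7 and values 9-15.
q1 <- median(golfScores[1:7])
q3 <- median(golfScores[9:15])
iqr <- q3 - q1

# 1.5 IQR rule: anything outside these fences is flagged as an outlier
lowerFence <- q1 - 1.5 * iqr
upperFence <- q3 + 1.5 * iqr

highOutliers <- golfScores[golfScores > upperFence]
lowOutliers  <- golfScores[golfScores < lowerFence]

print(c(min = min(golfScores), Q1 = q1, median = median(golfScores),
        Q3 = q3, max = max(golfScores)))
print(c(lowerFence = lowerFence, upperFence = upperFence))
print(highOutliers)
print(lowOutliers)
```

This only checks the arithmetic; the written answer still needs the manual working, since the question asks for it without R.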

Use R for the following questions:

Remember, for this and all R code in the course, to paste any R code you create into your assignment document.

5. Create a vector of the results.

6. Calculate the mean, median, and standard deviation of the vector. Again, paste the R code as well as the results into your document.

7. Using R, generate a 5-number summary (use the fivenum() function). Also generate a boxplot. Remember to always include a title in your charts. Including labels (where applicable) is always a good idea as well. Note: It is possible that the 5-number summary you obtain from R may look a little different from the one you did manually above. This is okay. R uses a mathematical formula for this calculation that is slightly different from the manual technique discussed in our lectures.
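A minimal sketch of questions 5 to 7 might look like the following. The vector name, chart title, and axis label are my own choices, not anything the assignment prescribes.

```r
# Question 5: vector of the original, unmodified scores
golfScores <- c(38, 40, 42, 47, 49, 49, 51, 51, 58, 60, 61, 62, 63, 87, 92)

# Question 6: mean, median, and standard deviation
scoreMean   <- mean(golfScores)
scoreMedian <- median(golfScores)
scoreSd     <- sd(golfScores)   # sample standard deviation (n - 1 divisor)
print(c(mean = scoreMean, median = scoreMedian, sd = scoreSd))

# Question 7: 5-number summary and a titled, labelled boxplot
scoreFive <- fivenum(golfScores)
print(scoreFive)
boxplot(golfScores,
        main = "Amateur Golf Tournament Scores",
        ylab = "Score")
```

For this dataset fivenum() reports hinges of 48 and 61.5, while the median-of-halves hand method gives quartiles of 47 and 62. That is the kind of small difference the note in question 7 warns about.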

Problem #2

Let's stick with our golf theme. In a larger sample of a group of people playing mini-golf, the following scores were recorded. This was a tournament in which there were 9 holes, so the lowest score that could possibly be recorded is 9. Here are the scores that were recorded:

61, 30, 61, 31, 78, 51, 93, 111, 58, 69, 58, 107, 62, 57, 53, 65, 57, 82, 64, 61, 35, 94, 48, 49, 49, 63, 44, 41, 48, 33, 64, 51, 31, 81, 38, 82, 86, 57, 63, 60, 54, 76, 40, 83, 62, 124, 20, 64, 8, 45, 78, 51, 28, 47, 69, 46, 73, 60, 102, 80

1. Using R, create a vector of these observations. You should NOT type all of them in! Instead, type the usual command to create a vector, e.g. minigolfScores <- c(), and then paste the above list of observations inside the parentheses. TIP: If you are having trouble doing this, test it by copying only, say, the first 3 observations into your vector, i.e. minigolfScores <- c(61, 30, 61). Check it by outputting things like the sum and mean of those three values. Then, when you are confident that you have the technique figured out, go ahead and paste in the entire vector. As a test, I will tell you that the mean of all of the values in the above dataset should be about 60.6.

2. Draw a histogram (again, always using R) and describe the distribution of this dataset. When you describe the distribution, you must demonstrate that you can use proper terminology. Remember that whenever drawing any kind of chart you should always include a title. Histograms should always include a label for the x-axis (and of course, a title).

3. Using R, draw a boxplot. Again, don't forget to include titles and labels! (Note: I will probably not remind you about titles and labels in the future. It is something an analyst should always do when creating graphs! This reminder is also on the assignment checklist.) You can also use R's sort() function to see all the numbers in order. Examine the data and ask yourself whether this examination suggests the presence of any outliers or questionable data.

4. If any part of your analysis in the previous step suggests the presence of outliers or otherwise questionable data, do you think it would be wiser to include or exclude these observations from your subsequent analysis?

5. Report descriptive statistics of this variable including: mean, median, standard deviation, and five number summary. Note: For this problem, if you answered yes above, that you feel that some of the observations should be excluded from your analysis, then you should create a NEW vector leaving out any removed value(s) and calculate your descriptive statistics using this new vector.
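One way to approach Problem #2 is sketched below. The titles, labels, and the name cleanedScores are illustrative. The decision to drop only scores below 9 is an assumption on my part, based on the statement that 9 is the lowest possible score; whether high values should also go is a judgment call the question leaves to you.

```r
# Question 1: paste the observations into a vector and sanity-check it
minigolfScores <- c(61, 30, 61, 31, 78, 51, 93, 111, 58, 69, 58, 107, 62,
                    57, 53, 65, 57, 82, 64, 61, 35, 94, 48, 49, 49, 63, 44,
                    41, 48, 33, 64, 51, 31, 81, 38, 82, 86, 57, 63, 60, 54,
                    76, 40, 83, 62, 124, 20, 64, 8, 45, 78, 51, 28, 47, 69,
                    46, 73, 60, 102, 80)
print(length(minigolfScores))
print(mean(minigolfScores))   # the question says this should be about 60.6

# Questions 2 and 3: histogram and boxplot, each with a title and label
hist(minigolfScores, main = "Mini-Golf Scores", xlab = "Score")
boxplot(minigolfScores, main = "Mini-Golf Scores", ylab = "Score")
print(sort(minigolfScores))

# A score below 9 cannot occur on a 9-hole course, so flag it
impossibleScores <- minigolfScores[minigolfScores < 9]
print(impossibleScores)

# Questions 4 and 5. Assumption: exclude only the impossible value(s)
# and keep everything else, then summarise the new vector.
cleanedScores <- minigolfScores[minigolfScores >= 9]
print(c(mean = mean(cleanedScores),
        median = median(cleanedScores),
        sd = sd(cleanedScores)))
print(fivenum(cleanedScores))
```

If you decide to keep every observation instead, run the same summary calls on minigolfScores rather than cleanedScores.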

Problem #3

A group of amateur bowlers goes to a tournament. The distribution of scores was found to be approximately normal: N(198, 16), i.e. a mean of 198 and a standard deviation of 16.

Important: For this problem, I do not expect you to come up with accurate answers, but I DO want you to be in the vague ballpark. For example, if some observation has a value that is just below the mean, do not report a quantity of greater than 50%! Instead, you should say: "Just below 50%". Also, if for any of your answers you expect the result to be "very very low" or "very very high", you can simply say so. That is, do not feel the need to specify a number.

1. About what percentage of the bowlers had a result LOWER than 198?

2. About what percentage of the bowlers had a result HIGHER than 214?

3. About what percentage of the bowlers had a result LOWER than 162? About what percentage of the bowlers had a result HIGHER than 182?

4. About what percentage of the bowlers had a result HIGHER than 197?

Note: In all cases, be sure to explain your reasoning. While you DEFINITELY should draw these out when doing the problems, you are not required to include those drawings in your assignment.
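The problem only asks for ballpark answers, which the 68-95-99.7 rule and a sketch of the bell curve can supply. If you want to check your estimates afterwards, a rough R sketch using pnorm() could look like this. It assumes N(198, 16) means a standard deviation of 16, as the problem states; the variable names are mine.

```r
mu    <- 198   # mean from the problem statement
sigma <- 16    # standard deviation from the problem statement

# 198 is the mean itself, so half the distribution lies below it
pLower198 <- pnorm(198, mean = mu, sd = sigma)

# 214 is one standard deviation above the mean
pHigher214 <- pnorm(214, mean = mu, sd = sigma, lower.tail = FALSE)

# 162 is 2.25 standard deviations below the mean
pLower162 <- pnorm(162, mean = mu, sd = sigma)

# 182 is one standard deviation below the mean
pHigher182 <- pnorm(182, mean = mu, sd = sigma, lower.tail = FALSE)

# 197 is just below the mean (1/16 of a standard deviation)
pHigher197 <- pnorm(197, mean = mu, sd = sigma, lower.tail = FALSE)

# Report everything as percentages, rounded to one decimal place
print(round(100 * c(lower198 = pLower198, higher214 = pHigher214,
                    lower162 = pLower162, higher182 = pHigher182,
                    higher197 = pHigher197), 1))
```

The written answer should still explain the reasoning in terms of distance from the mean in standard deviations, since that is what the problem is grading.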

Problem #4

A manufacturing line wants to analyse their production. They begin by looking at the number of products that were produced over the previous 100 days. Here are the results. Note that they are not sorted.

760, 796, 869, 811, 789, 838, 801, 810, 837, 837, 858, 791, 806, 831, 776, 822, 801, 828, 830, 815, 819, 841, 827, 797, 843, 857, 814, 839, 821, 829, 798, 823, 784, 855, 848, 818, 852, 808, 803, 803, 836, 876, 808, 856, 843, 822, 821, 817, 841, 822, 824, 801, 806, 835, 794, 854, 802, 803, 813, 825, 839, 808, 816, 823, 779, 806, 770, 819, 817, 850, 808, 783, 872, 824, 804, 819, 808, 816, 807, 857, 781, 850, 822, 811, 778, 844, 792, 807, 794, 802, 777, 777, 819, 815, 805, 810, 782, 833, 819, 764

1. Paste this into a vector, and print a histogram.

2. Using R, summarize the variable by calculating the descriptive statistics including mean, SD, 5-number summary, and boxplot.

3. Calculate the z-score for the lowest and highest values in the group. Hint: You should NOT need to visually go through the list to determine the highest and lowest values. Look at the various statistics you have already obtained from R. Answer to two decimal places.

4. How many products would be produced with a z-score of 3?
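A possible outline for Problem #4 follows. The vector name, titles, and labels are placeholders of my own. The z-score uses the usual definition z = (x - mean) / sd, and the last step rearranges it to x = mean + z * sd.

```r
# Question 1: paste the 100 daily production counts into a vector
production <- c(760, 796, 869, 811, 789, 838, 801, 810, 837, 837,
                858, 791, 806, 831, 776, 822, 801, 828, 830, 815,
                819, 841, 827, 797, 843, 857, 814, 839, 821, 829,
                798, 823, 784, 855, 848, 818, 852, 808, 803, 803,
                836, 876, 808, 856, 843, 822, 821, 817, 841, 822,
                824, 801, 806, 835, 794, 854, 802, 803, 813, 825,
                839, 808, 816, 823, 779, 806, 770, 819, 817, 850,
                808, 783, 872, 824, 804, 819, 808, 816, 807, 857,
                781, 850, 822, 811, 778, 844, 792, 807, 794, 802,
                777, 777, 819, 815, 805, 810, 782, 833, 819, 764)
hist(production, main = "Daily Production Over 100 Days",
     xlab = "Products produced per day")

# Question 2: descriptive statistics and a boxplot
prodMean <- mean(production)
prodSd   <- sd(production)
prodFive <- fivenum(production)
print(c(mean = prodMean, sd = prodSd))
print(prodFive)
boxplot(production, main = "Daily Production Over 100 Days",
        ylab = "Products produced per day")

# Question 3: the lowest and highest values are the first and last
# entries of the 5-number summary, so no visual search is needed
zLow  <- (prodFive[1] - prodMean) / prodSd
zHigh <- (prodFive[5] - prodMean) / prodSd
print(round(c(zLow = zLow, zHigh = zHigh), 2))

# Question 4: the production count that corresponds to z = 3
valueAtZ3 <- prodMean + 3 * prodSd
print(valueAtZ3)
```

Since products are counted in whole units, you may want to round valueAtZ3 when reporting it; the question does not say which way to round.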
