Question
You have in this comma-delimited file, height_sample.csv, data representing the measured heights of 1000 males. We would like to analyse whether the distribution of heights is "normal".
We won't use any sophisticated statistical measures; we will simply calculate a few statistics: the average height, the standard deviation, and the minimum and maximum heights.
Then you can group the heights into "bins" to create a histogram. Clearly the histogram should range from the minimum height to the maximum. You can play with the bin widths to see what sort of results you get. Create a chart of the histogram. Calculate the average of all the heights in each bin. Included here is a screenshot for a quick solution.
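As a cross-check on the spreadsheet figures, the summary statistics can be sketched in Python. This is only an illustration: the function name `summarize` and the sample values are hypothetical (they are not taken from height_sample.csv), and the sketch assumes the sample standard deviation is wanted, which the question does not specify.

```python
import statistics

def summarize(heights):
    """Return the basic statistics the question asks for."""
    return {
        "mean": statistics.mean(heights),
        # Assumption: sample (n - 1) standard deviation; use
        # statistics.pstdev for the population version instead.
        "stdev": statistics.stdev(heights),
        "min": min(heights),
        "max": max(heights),
    }

# Hypothetical values, not taken from height_sample.csv.
sample = [150.0, 160.0, 170.0, 180.0, 190.0]
stats = summarize(sample)
print(stats)
```

The minimum and maximum returned here are the values that determine the range the histogram bins have to cover.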
Hint:
- Design the spreadsheet so you can vary the bin width. The bin width shown in the screenshot is 4.
- Note that for each bin shown in the example, the counts are inclusive of the lower bin height but exclusive of the upper bin height (i.e. a height equal to the upper bin height is not counted in that bin).
- Import the data from the comma-delimited file and save it as an Excel file, then work on the graph and analysis on a separate sheet from the height sample sheet.
- Name cells and cell ranges where necessary.
- Create BinA and BinB according to the bin width first, then use string concatenation to create the first yellow column, which is used as the x-axis in the graph. An array formula using string concatenation only works in Excel, not Calc. So use a normal formula and drag-fill in Calc, and an array formula in Excel (hint: the TEXT function may be useful).
- The second yellow column counts the number of persons in each range. You should work out at least one way of doing it, but you are encouraged to try more:
  - Use COUNTIF() and drag-fill (consider using & to build the condition from a cell reference).
  - Use an array formula and drag-fill, the hack version (TRUE is represented internally as 1, FALSE as 0): SUM(), relational operators, and multiplication (*) to simulate logical AND.
  - Use an array formula with the FREQUENCY function.
- Instead of using a nested IF in SUM for a conjunction of multiple conditions, you might think of using SUMIFS().
- To work out the average height of each bin, use an array formula, the hack version: use AVERAGE() and relational operators. You may also try AVERAGE() and a nested IF(), or AVERAGEIF().
- Use the two highlighted yellow columns to plot the graph.
- Try changing the bin size to 6. How many of the height measurements fall in the 200-206 bin?
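The counting and averaging hints above can be sketched outside the spreadsheet as follows. This is a minimal Python illustration, not the expected spreadsheet solution: the function name `bin_heights`, the `start` parameter, and the sample values are hypothetical, and starting the first bin at the minimum height rounded down is an assumption, since the question does not say where the first bin begins. Each bin is half-open, including its lower height and excluding its upper height, and the boolean product mirrors the multiplication trick for logical AND.

```python
import math

def bin_heights(heights, width, start=None):
    """Return (label, count, average) rows for half-open bins [lo, hi)."""
    if width <= 0:
        raise ValueError("bin width must be positive")
    # Assumption: the first bin starts at the minimum height rounded down.
    lo = math.floor(min(heights)) if start is None else start
    top = max(heights)
    rows = []
    while lo <= top:
        hi = lo + width
        # True * True == 1, so the product acts as a logical AND.
        flags = [(h >= lo) * (h < hi) for h in heights]
        count = sum(flags)
        total = sum(h * f for h, f in zip(heights, flags))
        average = total / count if count else None  # empty bin has no average
        rows.append((f"{lo}-{hi}", count, average))
        lo = hi
    return rows

# Hypothetical values, not taken from height_sample.csv.
sample = [150.0, 151.5, 154.0, 157.9, 158.0, 161.0]
for label, count, average in bin_heights(sample, 4):
    print(label, count, average)
```

Because the width is a parameter, calling `bin_heights` again with a width of 6 on the real data gives the kind of count the last hint asks about, provided the bins start at the same height as in the spreadsheet.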