Question
The comma-delimited file height_sample.csv contains the measured heights of 1000 males. We would like to analyse whether the distribution of heights is "normal".
We won't use any sophisticated statistical measures; we will simply calculate a few statistics: the average height, the standard deviation, and the minimum and maximum heights.
Then group the heights into "bins" to create a histogram. The histogram should range from the minimum height to the maximum. Experiment with the bin width to see what sort of results you get. Create a chart of the histogram, and calculate the average of all the heights in each bin. A screenshot of a quick solution is included here.
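As a cross-check outside the spreadsheet, the summary statistics can be sketched in Python. This is only an illustration under assumptions: the layout of height_sample.csv is not shown in the question, so the sketch assumes one height per row with no header, and the five values used here are made-up stand-ins for the real data. It uses the sample standard deviation (the counterpart of Excel's STDEV); the question does not say which variant the screenshot uses.

```python
import csv
import io
import statistics


def read_heights(text):
    """Parse CSV text with one height per row (assumed layout, no header)."""
    return [float(row[0]) for row in csv.reader(io.StringIO(text)) if row]


def summarise(heights):
    """Return the statistics the question asks for."""
    return {
        "number": len(heights),
        "tallest": max(heights),
        "shortest": min(heights),
        "average": statistics.mean(heights),
        # Sample standard deviation; an assumption, the source does not specify.
        "st_dev": statistics.stdev(heights),
    }


# Hypothetical stand-in for the contents of height_sample.csv.
sample_text = "170\n180\n175\n165\n190\n"
stats = summarise(read_heights(sample_text))
print(stats)
```

To run it against the real file, the text could be read with `open("height_sample.csv").read()` and passed to `read_heights`, provided the file really has the assumed layout.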
Hint:
- Design the spreadsheet so you can vary the bin width. The bin width shown in the screenshot is 4.
- Note that for each bin shown in the example, the count includes heights equal to the lower bin height but excludes heights equal to the upper bin height.
- Import the data from the comma-delimited file and save it as an Excel file, then work on the graph and analysis on a separate sheet from the height sample sheet.
- Name cells and cell ranges where necessary.
- Create BinA and BinB according to the bin width first, then use string concatenation to create the first yellow column, which is used as the x-axis in the graph. An array formula using string concatenation only works in Excel, not Calc, so use a normal formula with drag-fill in Calc and an array formula in Excel (hint: the TEXT function may be useful).
- The second yellow column counts the number of persons in each range. Work out at least one way of doing this; you are encouraged to try several:
  - Use COUNTIF() and drag-fill (consider using & to build a cell reference into the condition).
  - Use an array formula and drag-fill, the "hack" version: TRUE is represented internally as 1 and FALSE as 0, so combine SUM() with relational operators and use multiplication (*) to simulate logical AND.
  - Use an array formula with the FREQUENCY function.
  - Instead of a nested IF inside SUM for a conjunction of multiple conditions, consider using SUMIFS().
- To work out the average height of each bin, use an array formula, again the "hack" version, with AVERAGE() and relational operators. You may also try AVERAGE() with a nested IF(), or AVERAGEIF().
- Use the two highlighted yellow columns to plot the graph.
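The binning logic in the hints can also be sketched in Python to check spreadsheet results. This is a minimal sketch, not the spreadsheet solution itself: the function names `make_bins` and `histogram` are made up for illustration, and the seven heights are invented test values. It mirrors the "hack" from the hints, where a comparison yields True (1) or False (0) and multiplying two comparisons acts as logical AND, with the lower bound inclusive and the upper bound exclusive.

```python
def make_bins(lowest, highest, width):
    """Return (bin_a, bin_b) pairs starting at lowest and covering highest."""
    bins = []
    a = lowest
    # '<=' ensures a height equal to a bin edge still lands in some bin,
    # because the upper edge of each bin is exclusive.
    while a <= highest:
        bins.append((a, a + width))
        a += width
    return bins


def histogram(heights, width):
    """Count, total and average per bin; lower edge inclusive, upper exclusive."""
    rows = []
    for a, b in make_bins(min(heights), max(heights), width):
        # True counts as 1 and False as 0; '*' plays the role of AND.
        mask = [(h >= a) * (h < b) for h in heights]
        num = sum(mask)
        total = sum(h * m for h, m in zip(heights, mask))
        rows.append({
            "label": f"{a}-{b}",  # string concatenation for the x-axis
            "num": num,
            "total": total,
            "average": total / num if num else None,
        })
    return rows


# Invented test values, not taken from height_sample.csv.
sample = [146, 149, 150, 153, 154, 154, 157]
rows4 = histogram(sample, 4)
rows6 = histogram(sample, 6)
for row in rows6:
    print(row["label"], row["num"], row["average"])
```

Because the bin width is a parameter, rerunning `histogram` with a different width corresponds to changing the bin-width cell in the spreadsheet.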
Try changing the bin size to 6. How many of the height measurements fall in the 200-206 bin?
Screenshot (bin width 4): a histogram chart of Num against the height ranges, with the following table.

| Range | BinA | BinB | Num | Total Heights | Average |
|---|---|---|---|---|---|
| 146-150 | 146 | 150 | 1 | 146 | 146 |
| 150-154 | 150 | 154 | 3 | 455 | 151.7 |
| 154-158 | 154 | 158 | 10 | 1559 | 155.9 |
| 158-162 | 158 | 162 | 17 | 2714 | 159.6 |
| 162-166 | 162 | 166 | 47 | 7697 | 163.8 |
| 166-170 | 166 | 170 | 83 | 13921 | 167.7 |
| 170-174 | 170 | 174 | 98 | 16805 | 171.5 |
| 174-178 | 174 | 178 | 140 | 24587 | 175.6 |
| 178-182 | 178 | 182 | 172 | 30869 | 179.5 |
| 182-186 | 182 | 186 | 143 | 26235 | 183.5 |
| 186-190 | 186 | 190 | 99 | 18536 | 187.2 |
| 190-194 | 190 | 194 | 92 | 17594 | 191.2 |
| 194-198 | 194 | 198 | 55 | 10738 | 195.2 |
| 198-202 | 198 | 202 | 19 | 3787 | 199.3 |
| 202-206 | 202 | 206 | 13 | 2640 | 203.1 |
| 206-210 | 206 | 210 | 5 | 1033 | 206.6 |
| 210-214 | 210 | 214 | 3 | 636 | 212 |

| Number | Tallest | Shortest | Average | St. Devn |
|---|---|---|---|---|
| 1000 | 213 | 146 | 180.0 | 10.2 |
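A quick consistency check on the screenshot figures: the bin counts should add up to the number of measurements, and the bin totals divided by that number should reproduce the overall average. The sketch below simply re-enters the Num and Total Heights columns as read from the screenshot (with the last upper bound taken to be 214, an assumption about the garbled "2.14") and does the arithmetic.

```python
# (bin_a, bin_b, num, total_heights) as read from the screenshot, bin width 4.
# The final upper bound is assumed to be 214.
screenshot_rows = [
    (146, 150, 1, 146),
    (150, 154, 3, 455),
    (154, 158, 10, 1559),
    (158, 162, 17, 2714),
    (162, 166, 47, 7697),
    (166, 170, 83, 13921),
    (170, 174, 98, 16805),
    (174, 178, 140, 24587),
    (178, 182, 172, 30869),
    (182, 186, 143, 26235),
    (186, 190, 99, 18536),
    (190, 194, 92, 17594),
    (194, 198, 55, 10738),
    (198, 202, 19, 3787),
    (202, 206, 13, 2640),
    (206, 210, 5, 1033),
    (210, 214, 3, 636),
]

count = sum(num for _, _, num, _ in screenshot_rows)
total = sum(tot for _, _, _, tot in screenshot_rows)
overall_average = total / count

print(count, total, round(overall_average, 1))
```

The counts sum to 1000 and the totals to 179952, giving an overall average of 179.952, which agrees with the 180.0 shown in the summary. The same check is a useful sanity test after changing the bin width: the counts must still sum to 1000.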