Question
Unit Activity
Unit: Inferences and Conclusions from Data
This activity will help you meet these educational goals:
Mathematical Practices You will make sense of problems and solve them, reason abstractly and quantitatively, use mathematics to model real-world situations, use appropriate tools strategically, and look for and make use of structure.
_________________________________________________________________________
Directions and Analysis
Task 1: The Number of Hispanics (Latinos) in the United States
Figure: US Census Regions (source: Energy Information Administration)
Consider the population of Hispanic (Latino) people in the United States, according to the 2010 US Census. Examine the data for the 2010 US Census in this spreadsheet.
- How do the columns titled Number and % of Total Population relate to the column titled Total?
response here:
- Make a histogram of the state data in the column titled % of Total Population. (If you need help, follow these instructions for using the online probability tools. Note that you can copy a column of data from the spreadsheet and paste it into the Histogram tool.) Set useful limits and intervals, and label the histogram appropriately. Export an image of the histogram, and paste it below.
response here:
- Generate a box plot of the state data in the column titled % of Total Population. (You can copy a column of data from the spreadsheet and paste it into the Box Plot tool.) Be sure to add appropriate labels to your box plot. Export an image of your box plot, and paste it below.
response here:
- Describe the spread, shape, and skewness, if any, of the graph.
response here:
- What information about central tendencies can you determine from the histogram and the box plot?
response here:
- Return to the Box Plot tool, click the Exclude Outliers checkbox, and then click Update. The plot boundaries contract, with outlier data points showing beyond the whiskers of the plot. Outliers are generally considered to be points that lie more than 1.5 times the interquartile range (IQR = Q3 - Q1) below Q1 or above Q3. What are the minimum and maximum values for the box plot once you exclude outliers? Based on your box plot, how many outliers do you have?
response here:
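The 1.5 × IQR rule described above can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical (they are not the census percentages), the function name `iqr_outliers` is made up for this sketch, and the Box Plot tool may compute quartiles with a different convention, so its fences could differ slightly.

```python
import statistics

def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    # "inclusive" treats the data as the whole population; other quartile
    # conventions exist and can shift Q1 and Q3 slightly.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers

# Hypothetical percentages, NOT taken from the census spreadsheet.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 40.0]
low, high, flagged = iqr_outliers(sample)
print(low, high, flagged)
```

Any value outside the two fences is drawn as a separate point, and the whiskers stop at the most extreme values that remain inside the fences.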
- Which states are represented by the outlier data? What do these states have in common that might contribute to making them outliers?
response here:
- According to the US Census data, the Hispanic (Latino) population of the United States as a whole is 16.3% of the total 2010 US population (as shown in cell G5). Where would this percentage fit into the list of the distribution of the individual states on your latest box plot? Does it seem surprising that it would fit there? How might you explain this situation?
response here:
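One thing worth checking when comparing the national figure with the state distribution is that the national percentage weights each state by its population, while the box plot treats every state equally. The sketch below uses three made-up states (the names and counts are hypothetical, not census values) to show how the two summaries can differ.

```python
# Hypothetical (hispanic_population, total_population) pairs for made-up states.
states = {
    "Large state": (9_000_000, 30_000_000),   # 30% of this state's population
    "Medium state": (500_000, 10_000_000),    # 5%
    "Small state": (20_000, 1_000_000),       # 2%
}

# Each state counts once, regardless of how many people live there.
state_percents = [100 * h / t for h, t in states.values()]
unweighted_mean = sum(state_percents) / len(state_percents)

# The national figure pools all people, so populous states dominate.
total_hispanic = sum(h for h, _ in states.values())
total_population = sum(t for _, t in states.values())
national_percent = 100 * total_hispanic / total_population

print(f"mean of state percentages: {unweighted_mean:.1f}%")
print(f"pooled national percentage: {national_percent:.1f}%")
```

In this invented example the pooled percentage is well above the simple average of the state percentages, because the most populous state also has the highest percentage.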
Task 2: Diamonds
If you were in the diamond business, you would have to know how to price diamonds accurately. Otherwise you would either be losing money (selling too cheaply) or losing customers (selling too expensively). Many people buy diamond engagement rings, and such a ring is often a significant personal purchase. In fact, buying a diamond and buying a house are similar in two ways:
- Both a house and a diamond can be significant investments of your hard-earned money.
- No two are exactly alike (unlike two new cars), so you can't just shop around and clearly see who has the best price.
So, having some sort of "price ruler" can be very useful for a lot of people. In this lesson, you'll practice building that ruler, based on data.
Read about the 4 Cs of diamonds (cut, clarity, color, and carat weight) for some background information. With this data, you will have a pretty good idea of what size diamond you can afford, and you will know whether a diamond you see at the jeweler's is worth the cost.
- "Affordable" Diamond Pricing
Imagine you have set out to purchase a diamond, and you want all the information you can possibly get before the purchase. You run an Internet search on diamonds to learn more about the different cuts, carat ranges, color values, and so on. After having researched a bit, you've decided on the basic cut, quality, color, and weight range that you can probably afford.
This diamond pricing spreadsheet gives three snapshots of a large set of data on diamond pricing. The 0.30-0.40 carats tab gives the 2012 data on diamonds ranging from 0.30 to 0.40 carats.
- Go to the Subset tab, which gives a subset of randomly chosen data from the first tab. Plot the data of weight versus price in this tab using the Scatter Plot tool. (If you need help, follow these instructions for using the online probability tools.) Export an image of the plot, and paste it in the space below. Find the equation of the regression line and the value of the correlation coefficient (r).
response here:
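For reference, the regression line and correlation coefficient that the Scatter Plot tool reports can be reproduced by hand with the standard least-squares formulas. The sketch below is a minimal version under stated assumptions: the function name `linear_fit` is invented here, and the weights and prices are hypothetical values, not rows from the diamond spreadsheet.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical (carat weight, price in dollars) data, invented for illustration.
weights = [0.30, 0.32, 0.35, 0.38, 0.40]
prices = [600, 650, 720, 800, 830]

slope, intercept, r = linear_fit(weights, prices)
print(f"price = {slope:.1f} * weight + {intercept:.1f}, r = {r:.3f}")
```

The slope is in dollars per carat, and the intercept is the price the line would assign to a diamond of zero weight, which lies far outside the 0.30 to 0.40 carat range the data covers.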
- What can you say about the relationship between the price and the weight?
response here:
- What can you say about the slope of the regression line?
response here:
- Does the y-intercept make sense?
response here:
- The VVS1 0.30-0.40 carats tab is also a subset of the data in the first tab; it contains data on diamonds with VVS1 clarity. There are about 200 diamonds in this subset. Plot the data of weight versus price in this tab. Export an image of the plot, and paste it in the space below. Find the equation of the regression line and the value of the correlation coefficient (r).
response here:
- What can you say about the relationship between the price and the weight?
response here:
- How does this graph differ from the one you did before, in which the diamonds' clarity was not specified?
response here:
- More Extravagant Diamonds
Open the VVS1 1.00+ carats tab, which gives data on diamonds with VVS1 clarity that weigh more than 1.00 carat.
- Plot the data of weight versus price in this tab. Export an image of the plot, and paste it in the space below. Also record the equation of the regression line and the value of the correlation coefficient.
response here:
- Looking at the graph, is this line a good fit for predicting the price of a diamond?
response here:
- Complete the equation for the best-fit line. Using that relationship, estimate the prices of the three diamonds in the table, each weighing more than 3.5 carats, and enter the values in the table. Comment on how well these estimates match the actual sale prices for these three diamonds.
response here:
| Weight | Actual Price | Estimated Price: Linear Relationship |
| --- | --- | --- |
| 3.64 carats | $254,392 | |
| 4.51 carats | $301,671 | |
| 4.83 carats | $374,480 | |
- To obtain the regression line, you clicked Line of Best Fit. But the actual data values do not seem to be quite linear. There's an upward curve. Let's investigate another option.
Click Custom Fit. You will see a default quadratic equation. That might be a good choice for this kind of curve. Click Update to see how this curve fits your data. (It will graph a curve in green.) Export an image of the plot, and paste it in the space below. Comment on the data fit as compared with the best-fit line, and record the correlation coefficient value for this quadratic curve.
response here:
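A quadratic fit of the kind the Custom Fit option draws can be computed by least squares as well. The sketch below is one possible way to do it, not necessarily how the tool works internally: it builds the normal equations for y = a*x^2 + b*x + c and solves them by Gaussian elimination. The helper names `solve_3x3` and `quadratic_fit` and the sample data are hypothetical.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def quadratic_fit(xs, ys):
    """Least-squares coefficients (a, b, c) for y = a*x**2 + b*x + c."""
    s = [sum(x ** k for x in xs) for k in range(5)]              # sums of x^0 .. x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    return solve_3x3(normal, [t[2], t[1], t[0]])

# Hypothetical (weight in carats, price in dollars) pairs, invented for illustration.
weights = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [10_000, 22_000, 40_000, 65_000, 95_000]

a, b, c = quadratic_fit(weights, prices)
estimate = a * 3.64 ** 2 + b * 3.64 + c   # extrapolated price for a 3.64-carat stone
print(f"price = {a:.1f}*w^2 + {b:.1f}*w + {c:.1f}; estimate at 3.64 carats: {estimate:.0f}")
```

Estimating a price at 3.64 carats from data that stops at 3.0 carats is an extrapolation, so the result is sensitive to which model is chosen.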
- In the Notes section, the tool displays the coefficients for your best-fit quadratic equation. In the space below:
- Record the quadratic equation (using the coefficients from the Notes section).
- Using this equation, estimate the prices of the three diamonds in the table, each weighing more than 3.5 carats, and record them in the table.
- Comment on which model provides the better estimate for those three diamonds, the quadratic estimate or the linear estimate (from question 3, above).
response here:
| Weight | Actual Price | Estimated Price: Quadratic Relationship |
| --- | --- | --- |
| 3.64 carats | $254,392 | |
| 4.51 carats | $301,671 | |
| 4.83 carats | $374,480 | |
Task 3: Worldwide Health and Wealth
In one of the lessons of this unit, you watched the video Joy of Stats. In this video, Professor Hans Rosling uses a special tool to see 200 years of world history statistics on the relationship between health (life expectancy) and wealth (income per person).
Now it's your turn to use this tool. Open this interactive graph. Click the How to use button to watch a video tutorial on how to change the axes, find the raw data, interpret the circles, and so on.
You'll use the tool to analyze the health and wealth example demonstrated in the video clip. For this example, note that wealth is on a log scale, meaning that powers of 10 are equal distances apart on the horizontal scale.
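To see what "powers of 10 are equal distances apart" means, the short sketch below computes where a few incomes would sit on a base-10 log axis. The income values are arbitrary illustrations, not figures from the interactive graph.

```python
import math

# Arbitrary incomes per person, each 10 times the previous one.
incomes = [100, 1_000, 10_000, 100_000]

# Position on a log axis is proportional to log10 of the value.
positions = [math.log10(v) for v in incomes]

# Distance between neighboring tick marks on the log axis.
gaps = [b - a for a, b in zip(positions, positions[1:])]
print(positions)
print(gaps)
```

Each tenfold increase in income moves the same distance along the log axis (one unit of log10, up to floating-point rounding), whereas on a linear axis the step from 10,000 to 100,000 would be 100 times wider than the step from 100 to 1,000.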
- Mathematical Relationships
- In general, what can you say about the relationship between health (life expectancy at birth) and wealth? Mention possible reasons for this relationship.
response here:
- Set the graph to present time. Switch the Income per person scale from log to lin(ear), using the tab beside the x-axis label. What shape do the spots now seem to make?
response here:
- Switch back to the log scale. Describe the mathematical relationship you observe.
response here:
- Does a nonlinear relationship make sense for this data?
response here:
- Run the graph over time from 1800 to present. You can move the graph manually or press the Play button below the x-axis. There's a lot of change in the data over time. Describe the basic trend in the data.
response here:
- Comment on whether the mathematical relationship is the same over the past 200 years (same basic best-fit line) or whether the relationship changes. Support your statement with reasons.
response here:
- Association and Causation Possibilities
In this section, you will try to explain specific observations. In some cases, this will require research. Please record any resources you use to answer these questions in the Resources section near the end of this document.
- How can you explain health as a result of wealth (using third variables)? What are some reasons that a wealthy individual might live longer? What are some reasons that citizens of a wealthy society might live longer?
response here:
- How can you explain wealth as a result of health (using third variables)? What are some reasons that a healthy individual might accumulate more wealth? What are some reasons that a healthy society might generate more wealth per person?
response here:
- In the video, it was pointed out that there was a worldwide dip in life expectancy from 1917 through 1919 because of the worldwide influenza epidemic. Slowly run through the years from 1900 to the present. (Use the slowest setting on the controller.) Click on any country that appears to go through a notable dip in longevity. Research one such country's history and explain why this dip might have occurred at that time (for example, Russia in the 1930s or China from 1958 through 1960).
response here:
- What are the current outliers? Identify a country that is rich but not so healthy. Interpret the results in terms of average wealth and average health. Research the country online to find out what you can about the wealth and health of its people.
response here:
- Looking at these country dips and current outliers, what factors besides wealth (or more specific than wealth) seem to strongly affect longevity?
response here: