Question

The data we will analyze was collected by a faculty member at the University of Iowa at the time they were looking to buy a house in Iowa City. The faculty member gathered data from the Johnson County government with the goal of using statistical methods to make sure they would get a good deal.

Pose the question:

1. Describe your project topic. You might want to describe the characteristics that were measured for each home you have chosen to include in your project (choose 30 - 50 rows, that is, homes, for your sample). Even though there are several characteristics to choose from, please describe only the main numeric topic (column) of this project. You will not analyze your categorical data until problem #11.

2. For the quantitative variable you have selected, estimate what you believed the population mean was and explain why you chose that number. You are asked to pinpoint a specific value rather than a range of numbers. This estimate is for the population and was originally meant to be found BEFORE any data was collected (and therefore should not come from the data). Note that you are using sample data, but we will be using that sample data later to test our guess about the population mean. Again, do not concern yourself with the categorical data until you get to problem #11. (2 points)

The source of the data and the sampling method:

3. Our lessons and videos gave us many guidelines for taking the "perfect sample" in a perfect world - a world where we can identify everyone or everything in the population, where we have unlimited resources to collect our data, where we can take an unbiased sample with 100% response or 100% usable data, and where the sample is a good representation of the population. Most likely, you do not have access to a perfect sample. However, in a perfect world, how would you have taken a sample if you had all necessary resources? Please provide all relevant details that address the guidelines we have studied regarding population and sampling: describe what resources you would need and how you would use them, which random sampling technique you would use and why, and how your methods would reduce selection bias, nonresponse bias, and response bias. (5 points)

4. Please review the Population provided. Then provide a detailed description of how you chose your 30 - 50 homes for your sample: what method(s) did you use? This should include the following. What were the details within the population that helped you eliminate 747 or more homes from the population to use as your sample? Did you randomly select your 30 - 50 homes, or use a systematic sample, etc.? Since we did not actually collect this data ourselves, we need to explain what details within the given population were appealing to you, as if you were looking to buy a home in Iowa. (5 points)

Issues of representation and sampling bias:

5. What were the challenges you faced in choosing your representative or random sample? What is the difference between a random sample, a representative sample, and a systematic sample? Does random sampling guarantee representative samples? Are you choosing to use the data given to you as the population, or are you choosing to use all homes in Iowa? Cite your sources. (2 points)

6. Depending on what you are treating as the population (the given data list or all homes in Iowa), how well do you think your sample represented the population overall? This analysis should include things such as:
- What subgroups are in your population? Even if a stratified sample was not taken, most populations do contain subgroups.
- Since the sample contains objects, are the proportions from the population subgroups comparable to the proportions of those groups in your sample?
- Do the ages of the objects in your sample accurately reflect the distribution of the ages of the objects in your population? For example, if your population contains 25% of houses older than 1990, does your sample also contain about 25% of homes older than 1990?
(5 points) (Good resources for population information are www.census.gov for US data and http://www.sunyjcc.edu/about/facts-figures for JCC data.)

7. What percentage of experimental units were non-responsive or otherwise not helpful / did not provide useful data? (2 points)

8. To make sure that your sample is a fair representation, without bias, one needs to follow some survey sampling best practices. There are various biases that are possible when choosing a representative sample or random sample. Explain possible sources of bias or any other inaccuracies and the likelihood of them. In particular, name the type(s) of bias that might have occurred in your sampling (selection, nonresponse, response), and be very specific about how that bias may have occurred. (For example, "Since I asked only students in my night class how many hours per week they work, I have selection bias because daytime students did not have a chance to be in my survey. Night classes have more students who work full-time during the day, so I likely obtained higher numbers than if I had asked students during a variety of times and places on campus.") (5 points)

PART II (Descriptive Statistics)

For Part II, please use Minitab for all computations and graphs. All Minitab output (descriptive statistics and graphs) must be copied and neatly pasted into your word-processed document. The following steps describe how to use Minitab for computation and copy and paste.

a. Your numerical data must first be in one column in Minitab.

b. Using copy and paste: Highlight whatever you would like to copy, and click Edit | Copy. Go to your word processing document and click Edit | Paste. If you are having any trouble with copy and paste, please contact me!

c. Five-Number Summary, Mean, and Standard Deviation: You can find these components using the Stat | Basic Statistics | Display Descriptive Statistics function.

The data - numerical summary:

9. Delete the homes you are not using and save your sample as a new Minitab file. On another worksheet within this new file, enter your main numeric data set into one column in Minitab. If the column of data does not match the sample size you have chosen, STOP! You have copied and pasted in error and will need to go back, make sure you select the entire column, and copy and paste again. In a separate column, enter the categorical data you collected. Copy this list of the complete data set into your project report, followed by the Descriptive Statistics of the numeric data from Minitab. (1 point)

10. In a few short paragraphs, write up a numerical summary of the data, and interpret those numbers in the context of your study (e.g., "the mean is 25 hours worked per week" rather than "x̄ = 25"). This analysis should include things such as:
- The measures of center of the data set (including the mean, the median, and the mode, if there is one), and an interpretation of what the similarities or differences of those numbers tell you about the data.
- The range of your data and the standard deviation. Explain/interpret the meaning of those numbers in terms of your data set, and explain why you think these measures of variability are "big" or "small" for your data set. In other words, what do you think caused an excessive amount of variability in your data, or why do you think your data is homogeneous?
- A clearly labeled 5-number summary (Minimum, Q1, Median (Q2), Q3, Maximum). Interpret what this 5-number summary tells you about your data.
- Any outliers (include a statistical justification of your answer). What do you think caused these outliers? Was there an unusual subject in your sample, for example?
(15 points)

The distribution of data - graphical summaries:

11. For the categorical data, use Minitab to construct a bar chart, Pareto chart, or pie chart that you think best represents the data. This graph should be appropriately labeled and scaled. Paste the graph into your report, explain why you chose this type of graph, and explain what this graph tells you about the categorical data you collected. (5 points)

12. For the numeric data, choose two types of graphs that you think best display the overall distribution of the data and the dispersion of the data. These graphs should be appropriately labeled and scaled. Paste the graphs into your report, explain why you chose these types of graphs, and explain what each of these graphs tells you about the numeric data you collected (e.g., where you see clusters of data, regions of greatest spread, outliers, what the graphs say about overall dispersion, modes, etc.). You will make some more specific interpretations/conjectures in the next question. (5 points)

13. Describe the general shape of your distribution using the terminology we learned in class (e.g., unimodal, bimodal, skewed, symmetric, uniform). Explain your answer and give any real-life reasons/factors/causes (not definitions for the shapes) you think might have caused this particular shape. [For example, do not tell me the distribution is bimodal "because there are two lumps in the histogram" or "because the data is grouped over 15-20 and 35-40" - explain what characteristics of your sample would cause a bimodal graph. You might say something like, "The bimodal shape of the graph, with a mode over 15-20 hours per week and another over 35-40 hours per week, is due to the fact that most of the students in the sample were either working part time (first mode) or full time (second mode)."] (5 points)

PART III (Inferential Statistics)

Confidence Interval for the numeric data:

14. Find the 95% confidence interval for the population average using Minitab (3 points) and explain/interpret its meaning in terms of your data. (5 points) Paste all of the Minitab output for the CI (including the name of the test that was used) into the project.

Hypothesis Test for the numeric data:

15. Using your data, you will perform a hypothesis test to determine whether or not the true population mean differs from the population mean value that you conjectured in problem #2. Let α = 0.05.

a. State (mathematically) and label the hypotheses for this test. These should be consistent with your answer to problem #2 of this project. (2 points)

b. Compute the test statistic. Show your work using Microsoft Equation Editor or the symbols menu. Please check your work with Minitab, and paste all of the Minitab output into your project document. (2 points)

c. Find the p-value using the Minitab output from part (b) above. (2 points)

d. Using α = 0.05, state whether you reject or retain H0 and why, and also state a "plain English" conclusion in the context of your project topic. (4 points)

Conclusion (5 points)

16. What are the requirements/assumptions that must be satisfied in order for a confidence interval or hypothesis test to be valid? Did your sampling techniques and data satisfy those requirements? Based on that and on the answers you gave in earlier components of this project about population representation, bias, etc., what do you believe about the validity of your inferences about the population in problems 14-15? Consequently, what overall conclusions can you draw about whether it is reasonable to use your data to make an inference about the population? (Note: Most students do not take a truly random sample. As a result, most of you have biased data and cannot rely on any inferences made about the larger population.)
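
The quantities asked for in problems 10, 14, and 15 can be cross-checked outside Minitab. The following is only a minimal Python sketch under stated assumptions, not part of the assignment and not a substitute for the required Minitab output: the `prices` list is made-up illustrative data (not the Johnson County data), `mu0` is a hypothetical guess standing in for the answer to problem #2, and `T_CRIT` is the two-sided 95% critical value read from a t-table for 29 degrees of freedom. Quartile conventions vary between packages, so the quartiles may differ slightly from Minitab's.

```python
import math
import statistics

# Hypothetical sale prices in thousands of dollars (illustration only,
# NOT taken from the Johnson County data set).
prices = [112, 125, 131, 138, 140, 145, 149, 152, 155, 158,
          160, 163, 165, 168, 170, 172, 175, 178, 181, 185,
          188, 192, 197, 203, 210, 218, 229, 245, 262, 340]

n = len(prices)
mean = statistics.mean(prices)
sd = statistics.stdev(prices)  # sample standard deviation (divides by n - 1)

# Five-number summary (problem 10).
q1, median, q3 = statistics.quantiles(prices, n=4)
five_number = (min(prices), q1, median, q3, max(prices))

# Outliers by the 1.5 * IQR rule: values beyond the lower or upper fence.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in prices if x < lower_fence or x > upper_fence]

# One-sample t statistic (problem 15b): t = (x_bar - mu0) / (s / sqrt(n)).
mu0 = 170            # hypothetical population-mean guess from problem #2
se = sd / math.sqrt(n)
t_stat = (mean - mu0) / se

# 95% confidence interval (problem 14): x_bar +/- t* * s / sqrt(n).
T_CRIT = 2.045       # assumed t-table value for df = n - 1 = 29, two-sided 95%
ci_low = mean - T_CRIT * se
ci_high = mean + T_CRIT * se

# Two-sided decision at alpha = 0.05: reject H0 when |t| exceeds t*.
reject_h0 = abs(t_stat) > T_CRIT

print("five-number summary:", five_number)
print("outliers:", outliers)
print("mean:", mean, "sd:", round(sd, 2))
print("t statistic:", round(t_stat, 3))
print("95% CI:", (round(ci_low, 2), round(ci_high, 2)))
print("reject H0:", reject_h0)
```

With a different sample size, `T_CRIT` has to be replaced by the table value for the new degrees of freedom. The exact p-value for problem 15c still comes from the Minitab output, since the sketch only compares the test statistic with the critical value.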


