Question
STAT 250 Fall 2022 Data Analysis Assignment 3

You may not upload this file to any online homework help sites. Please see our course syllabus for honor code rules. Thank you.

Your solutions document should include the following items. Points will be deducted if the following are not included.

1. Type your name and STAT 250 with your correct section number (e.g. STAT 250xxx) right justified, and then Data Analysis Assignment #3 centered on the top of page 1 below your name to begin your solutions document.
2. Number your pages across your entire solutions document.
3. Your solutions document should include the ANSWERS ONLY, with each answer labeled by its corresponding number and subpart. Keep the answers in order.
4. Generate all requested graphs and tables using StatKey or Rguroo, where stated.
5. Upload your solutions document onto Blackboard as a PDF file using the link provided by your instructor. It is your responsibility to upload a readable file.
6. You may not work with other individuals on this assignment. It is an honor code violation if you do.

Please note: all StatKey or Rguroo instructions provided in the parts of the problems will be presented in italics.

Elements of good technical writing: Use complete and coherent sentences to answer the questions. Graphs must be appropriately titled and should refer to the context of the question. Graphical displays must include labels with units if appropriate for each axis. Units should always be included when referring to numerical values.
When making a comparison you must use comparative language, such as "greater than" …

Investigation 1: Appropriateness of Inference: Price of a Haircut

For the following scenario, answer the questions below. Please note, do not conduct inference in this problem; just answer each question.

A random sample of 24 Mason students was collected and each student was asked, among other things, the total cost of their last hair service (including cuts, styling, etc.). The researcher does not have any information about the population from which the sample was collected. The data set is called HairPrice.

a) If we attempt to conduct statistical inference using the collected sample, what is the parameter of interest? Use the correct symbol and describe the parameter in context in one sentence.
b) Check the specific conditions necessary to consider conducting inference using theory-based methods using the t-distribution. There are three to consider: (1) Was a random sample collected; (2) Is the population where the sample comes from normal; and (3) Is the sample size greater than or equal to 30? Answer each of these questions in one sentence.
c) What could be checked using the sample data if our sample size is less than 30? Answer this question in one sentence.
d) Depending on your answer to part (a), construct one or two frequency histograms in Rguroo. Remember to properly title and label the graph(s). Copy and paste this graph (or these graphs) into your document.
e) Describe the shape of the histogram(s) in one sentence.
f) Depending on your answer to part (a), construct one or two horizontal boxplots in Rguroo. Remember to properly title and label the graph(s).
Copy and paste this graph (or these graphs) into your document.
g) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one sentence and identify any outliers if they are present.
h) Considering your answers to parts (e) and (g), is theory-based inference appropriate in this case? If you respond "yes," provide a reason for your response. If you respond "no," state the reason why not and present another possibility if the researcher still wanted to conduct statistical inference. Use one or two complete sentences in your response.

Investigation 2: Earnings among Voters

A political scientist wondered if there is a significant difference between the proportion of Democrats and Republicans earning over $100,000. To obtain the data, she used the National Election Pool. The NEP is a consortium of major news networks (ABC, CBS, CNN, and NBC) that pools together resources to gather voting and exit poll data from a random sample of voters. On Election Day, November 8, 2022, exit poll data showed that among a random sample of 774 voters, 408 were registered as Republican and 366 were registered as Democrat. Of the 408 Republican voters, 216 earn over $100,000 a year. Of the 366 Democrat voters, 162 earn over $100,000 a year. Use α = 0.05.

a) Define the population parameter using context and symbols in one complete sentence.
b) State the hypotheses using the political scientist's claim.
c) Check the specific conditions necessary to consider conducting inference using theory-based (or distribution-based) methods. There are two to consider: (1) Was the data collected randomly from the population; and (2) Are there at least ten successes and failures in each group?
Answer each of these questions in one sentence and show that condition (2) is true or false using calculation to obtain the failures (note the successes are given).
d) Calculate and label the two sample proportions separately and round the values to four decimal places. Next, calculate the difference between these sample proportions by subtracting (Republican - Democrat). Type all of these calculations and label each of them.
e) Calculate the pooled proportion estimate needed in the calculation of the standard error of the test statistic. Type this calculation and round to four decimal places.
f) Calculate the test statistic value using your proportions obtained in parts (d) and (e) and type your work. Round your test statistic value to three decimal places.
g) Obtain your p-value using your test statistic calculated in part (f) in StatKey using Theoretical Distribution > Normal. Copy and paste the image of the standard normal distribution and type the value of the p-value below the image.
h) Verify your test statistic and p-value using Rguroo. Go to Analytics > Analysis > Proportion Inference > Two Populations. See the image below to fill in each box correctly. Then, click the Test of Hypothesis tab and choose Large Sample Z under method. Finally, correctly set your alternative hypothesis and significance level and click Preview. Copy and paste only the output and table displayed under the title "Two Population Proportion Test of Hypothesis."

[Image: Rguroo "Two Population Proportion Inference" dialog]
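The hand calculations in parts (c) through (g) of Investigation 2 can be sketched in a few lines of Python. This is only an illustrative check, not the StatKey or Rguroo workflow the assignment requires: the function name `two_proportion_z_test` and the dictionary keys are made up for this sketch, a two-sided alternative is assumed because the scenario asks about a "difference," and the intermediate values are left unrounded, so results may differ slightly from work that rounds to four decimal places first.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test of H0: p1 = p2 against Ha: p1 != p2.

    x1, x2 are success counts; n1, n2 are group sizes.
    (Hypothetical helper written for this sketch.)
    """
    # Condition (2): at least ten successes and ten failures in each group.
    counts = [x1, n1 - x1, x2, n2 - x2]
    conditions_met = all(c >= 10 for c in counts)

    # Sample proportions and their difference (group 1 minus group 2).
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    diff = p1_hat - p2_hat

    # Pooled proportion, used in the standard error under H0.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

    # Test statistic and two-sided p-value from the standard normal.
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "conditions_met": conditions_met,
        "p1_hat": p1_hat,
        "p2_hat": p2_hat,
        "diff": diff,
        "pooled": pooled,
        "z": z,
        "p_value": p_value,
    }


# Counts from the scenario: Republicans first, Democrats second.
result = two_proportion_z_test(216, 408, 162, 366)
for name, value in result.items():
    print(name, value if isinstance(value, bool) else round(value, 4))
```

Comparing the printed p-value with the significance level of 0.05 gives the decision; the context sentence and the required software output still have to come from StatKey and Rguroo.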
… Create Plot -> Scatterplot. In the Dataset dropdown, select Running Time. Change the predictor (explanatory) and response variables accordingly. Properly title and label your graph and axes. Copy and paste your scatterplots into your solutions document.
b) Interpret the scatterplot of VO2 Max and Time using trend, strength, and shape (form) in one complete sentence.
c) Interpret the scatterplot of Age and Time using trend, strength, and shape (form) in one complete sentence.
d) Provide both correlation coefficients. Go to Analytics > Analysis > Linear Regression > Simple Regression. In the Dataset dropdown, select Running Time. Change the predictor (explanatory) and response variables accordingly. Click Preview. You will need to complete this twice, once for each explanatory variable. The correlation is presented as "Pearson Correlation Coefficient (r)." Please state both correlation values in your solutions.
e) Use StatKey to create a bootstrap distribution of correlations. From the main page, select CI for Slope, Correlation. Edit the data by copying only two columns (explanatory and response) into the box (repeat for each explanatory variable). Then, generate 10,000 samples and use each standard error to produce 2SE confidence intervals to estimate the population correlation. Present each confidence interval and comment on whether 0 is captured in each interval.
f) Which of the two explanatory variables would be the better predictor of Time? Base your answer on the scatterplots, the correlation coefficients, and their confidence intervals. State your answer in one or two complete sentences, including an explanation for your variable choice.
g) To find the fitted line for the explanatory variable VO2 Max and the response variable Time, run a simple linear regression analysis.
You may use the same output as in part (d), but look at "Equation of Least Squares Line" to help you state the fitted line equation in your solutions document.
h) Produce the fitted line plot for VO2 Max and Time and copy it into your solutions document. Scroll down the page in your output from part (d) and copy and paste the graph labelled "Response Versus Numerical Predictor."
i) Interpret the slope of the regression line for VO2 Max and Time in context of the problem.
j) Would it be meaningful to interpret the y-intercept for VO2 Max and Time? Explain why or why not in one sentence.
k) Provide r² for VO2 Max and Time and explain what this value means in context of the problem. Again, refer to the output from part (d), but look at "Coefficient of Determination (R-Squared)."
l) Test whether the slope is significant using theory-based inference (assuming all conditions hold). Go to Analytics > Analysis > Linear Regression > Simple Regression. In the Dataset dropdown, select Running Time. Change the predictor (explanatory) and response variables accordingly. Under the tab "Test of Association," check Slope under Alternative Hypothesis and leave Not Zero selected. Under "Methods" choose Theoretical t-statistic. Keep the Significance Level at 0.05. State the hypotheses, show work to obtain the t-test statistic using the output, and use the p-value provided in the output to make your decision. Finally, draw a conclusion in one complete sentence.
m) If a randomly selected runner had a VO2 Max of 53, predict their 5K finishing time.
Use the regression equation from part (g) and show all work and calculations.
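The 2SE bootstrap interval for a correlation described in part (e) can be sketched as follows. The Running Time data set is not reproduced in the question, so the `vo2_max` and `time_min` values below are hypothetical numbers invented purely for illustration, and the helper names `pearson_r` and `bootstrap_2se_interval` are likewise assumptions of this sketch. StatKey's own resampling will give slightly different standard errors from run to run.

```python
import random
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bootstrap_2se_interval(xs, ys, n_boot=10_000, seed=250):
    """Return (sample r, bootstrap SE, (lower, upper)) using r +/- 2*SE."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pairs = list(zip(xs, ys))
    boot_rs = []
    while len(boot_rs) < n_boot:
        # Resample (x, y) pairs with replacement, keeping pairs together.
        sample = [rng.choice(pairs) for _ in pairs]
        bx, by = zip(*sample)
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # correlation is undefined for a constant column; redraw
        boot_rs.append(pearson_r(bx, by))
    r_hat = pearson_r(xs, ys)
    se = stdev(boot_rs)  # standard error = SD of the bootstrap correlations
    return r_hat, se, (r_hat - 2 * se, r_hat + 2 * se)


# HYPOTHETICAL data, not the assignment's Running Time data set.
vo2_max = [38, 42, 45, 47, 50, 52, 55, 58, 60, 63]
time_min = [31.5, 29.8, 28.9, 27.5, 26.0, 25.8, 24.1, 23.5, 22.0, 21.2]

r_hat, se, (lower, upper) = bootstrap_2se_interval(vo2_max, time_min)
print("sample r:", round(r_hat, 4))
print("bootstrap SE:", round(se, 4))
print("2SE interval:", (round(lower, 4), round(upper, 4)))
print("interval captures 0:", lower <= 0 <= upper)
```

Resampling whole (explanatory, response) pairs rather than each column separately is what preserves the association being estimated; running the same function on the Age and Time columns would give the second interval the part asks for.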