Question
Problem 1 - NHANES

The American National Health and Nutrition Examination Surveys (NHANES) are collected by the US National Center for Health Statistics, which has conducted a series of health and nutrition surveys since the early 1960s. Since 1999, approximately 5,000 individuals of all ages have been interviewed each year. For this problem you will need to install the NHANES package in RStudio, which provides a built-in data frame called NHANES.

library(NHANES)
library(mosaic)
data(NHANES)

Part A: Create a histogram for the distribution of SleepHrsNight for individuals aged 18-22 (inclusive) via the bootstrap. Use at least 10000 iterations. Include the plot and report the mean sleep hours for this age group. Optional: how does your sleep compare?

Part B: Now we want to build a confidence interval for the proportion of women we think are pregnant at any given time. Bootstrap a confidence interval with 10000 iterations. Include in your write-up a histogram of your simulation results, along with a 95% confidence interval for the proportion. To speed things up, you can use this code to subset the NHANES data frame to one with only women. Let's get rid of the N/A values for our variable of interest (PregnantNow) in our filtered data frame:

NHANES_women <- NHANES %>% filter(Gender == "female", !is.na(PregnantNow))

Problem 2 - Iron Bank

The Securities and Exchange Commission (SEC) is investigating the Iron Bank, where a cluster of employees have recently been identified in various suspicious patterns of securities trading. Of the last 2021 trades, 70 were flagged by the SEC's detection algorithm. Trades are flagged periodically even when no illicit market activity has taken place. For that reason, the SEC often monitors individual and institutional trading but does not investigate detected incidents that may be consistent with random variability in trading patterns. SEC data suggest that the overall baseline rate of suspicious securities trades is 2.4%.

Are the observed data (70 flagged trades out of 2021) consistent with the SEC's null hypothesis that, over the long run, securities trades from the Iron Bank are flagged at the same baseline rate as that of other traders? Use Monte Carlo simulation (with at least 100000 simulations) to calculate a p-value under this null hypothesis. Include the following items in your write-up:
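One possible way to approach the two bootstrap parts of Problem 1 is sketched below. It uses plain base-R resampling rather than the mosaic helpers the assignment loads, and `bootstrap_stat` is a hypothetical helper name introduced here for illustration. The sketch also assumes that pregnancy is coded as the level "Yes" in PregnantNow and that any other non-missing level counts as not pregnant; check the package documentation before relying on that.

```r
# Hypothetical helper (not part of NHANES or mosaic): resample x with
# replacement n_boot times and return the statistic from each resample.
bootstrap_stat <- function(x, stat = mean, n_boot = 10000) {
  x <- x[!is.na(x)]  # drop missing values before resampling
  replicate(n_boot, stat(x[sample.int(length(x), replace = TRUE)]))
}

set.seed(1)  # arbitrary seed, only so the run is reproducible

# Only touch the data if the NHANES package is installed.
if (requireNamespace("NHANES", quietly = TRUE)) {
  data("NHANES", package = "NHANES")

  # Part A: bootstrap the mean of SleepHrsNight for ages 18-22 (inclusive).
  young <- subset(NHANES, Age >= 18 & Age <= 22)
  boot_sleep <- bootstrap_stat(young$SleepHrsNight)
  hist(boot_sleep,
       main = "Bootstrapped mean SleepHrsNight, ages 18-22",
       xlab = "Mean sleep hours per night")
  print(mean(young$SleepHrsNight, na.rm = TRUE))

  # Part B: bootstrap the proportion of women with PregnantNow == "Yes".
  # Assumption: every other non-missing level is treated as not pregnant.
  women <- subset(NHANES, Gender == "female" & !is.na(PregnantNow))
  boot_preg <- bootstrap_stat(women$PregnantNow == "Yes")
  hist(boot_preg,
       main = "Bootstrapped proportion pregnant",
       xlab = "Proportion of women with PregnantNow == \"Yes\"")
  print(quantile(boot_preg, c(0.025, 0.975)))  # percentile 95% interval
}
```

The interval printed at the end is a percentile interval (the 2.5% and 97.5% quantiles of the bootstrap proportions), which is one common choice; the assignment does not say which interval construction it expects.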
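For Problem 2, the Monte Carlo test could be sketched as follows. This is a minimal sketch that assumes each trade is flagged independently at the baseline rate under the null hypothesis, so the number of flagged trades is binomial, and that a one-sided p-value (70 or more flags) is the quantity of interest. The variable names and the seed are choices made here, not part of the assignment.

```r
set.seed(2021)  # arbitrary seed, only so the run is reproducible

n_trades       <- 2021    # trades examined
baseline_rate  <- 0.024   # SEC baseline flag rate under the null
observed_flags <- 70      # flagged trades actually observed
n_sims         <- 100000  # number of Monte Carlo simulations

# Each simulated value is the number of flagged trades out of 2021 when
# every trade is flagged independently with probability 0.024.
sim_flags <- rbinom(n_sims, size = n_trades, prob = baseline_rate)

# One-sided p-value: share of simulations with at least as many flags
# as were observed.
p_value <- mean(sim_flags >= observed_flags)

hist(sim_flags, breaks = 50,
     main = "Flagged trades under the null hypothesis",
     xlab = "Number of flagged trades out of 2021")
abline(v = observed_flags, lty = 2)  # mark the observed count
print(p_value)
```

Under the null the expected count is 2021 × 0.024 ≈ 48.5 flagged trades, so the dashed line at 70 shows how far into the right tail of the simulated distribution the observed count falls.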