Question
age,sex,bmi,children,smoker,region,charges
19,female,27.9,0,yes,southwest,16884.924
18,male,33.77,1,no,southeast,1725.5523
28,male,33,3,no,southeast,4449.462
33,male,22.705,0,no,northwest,21984.47061
32,male,28.88,0,no,northwest,3866.8552
31,female,25.74,0,no,southeast,3756.6216
46,female,33.44,1,no,southeast,8240.5896
37,female,27.74,3,no,northwest,7281.5056
37,male,29.83,2,no,northeast,6406.4107
60,female,25.84,0,no,northwest,28923.13692
25,male,26.22,0,no,northeast,2721.3208
62,female,26.29,0,yes,southeast,27808.7251
23,male,34.4,0,no,southwest,1826.843
56,female,39.82,0,no,southeast,11090.7178
27,male,42.13,0,yes,southeast,39611.7577
a. The body mass index (BMI) of a policyholder is given in the variable "bmi." Construct a histogram of the data. Does it look normally distributed to you? Why or why not?
b. Simulate five sets (using rnorm()) of normally distributed data with the same sample size, mean, and standard deviation as the BMI data. Construct a histogram for each.
c. The data simulated in part (b) is normal. How many of your truly normal simulated datasets are similar to your actual data? Consider the shape of the distribution, skewness, kurtosis, etc.
d. What we've done in this problem is an implicit version of hypothesis testing (which we will learn about soon). We assume (H0) that our data are normally distributed. Consider your answer to part (c). You can use it to estimate the probability that your data reflect the null hypothesis (how many normal datasets out of five resembled your data?), that is, the p-value. What is your p-value, and what can you conclude from it? (If p < 0.05, we have sufficient evidence to reject the null hypothesis that our data are normally distributed.)
Please show how to solve using R.
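One possible way to work through parts (a) to (d) in base R is sketched below. It is a minimal sketch under a few assumptions, not a definitive solution: the bmi vector is typed in by hand from the 15 rows shown above (if you have the complete data file, load it with read.csv() and use its bmi column instead), the seed is arbitrary, and the helpers skewness(), excess_kurtosis(), and p_from_count() are illustrative names defined here, not functions from R or any package. Judging which simulated histograms "look like" the real one is left to you; the code only draws the plots and prints summary numbers to support that judgment.

```r
# BMI values transcribed by hand from the 15 rows shown in the question.
bmi <- c(27.9, 33.77, 33, 22.705, 28.88, 25.74, 33.44, 27.74,
         29.83, 25.84, 26.22, 26.29, 34.4, 39.82, 42.13)

n <- length(bmi)  # sample size
m <- mean(bmi)    # sample mean
s <- sd(bmi)      # sample standard deviation

# Illustrative helpers (not built into base R): moment-based skewness
# and excess kurtosis. Both are roughly 0 for normal data.
skewness <- function(x) {
  z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
  mean(z^3)
}
excess_kurtosis <- function(x) {
  z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
  mean(z^4) - 3
}

# Part (a): histogram of the observed BMI values.
hist(bmi, main = "Observed BMI", xlab = "BMI")

# Part (b): five simulated normal samples with the same n, mean, and sd.
set.seed(1)  # arbitrary seed, used only so the plots are reproducible
sims <- replicate(5, rnorm(n, mean = m, sd = s), simplify = FALSE)

# Draw the observed data and the five simulations side by side.
old_par <- par(mfrow = c(2, 3))
hist(bmi, main = "Observed BMI", xlab = "BMI")
for (i in seq_along(sims)) {
  hist(sims[[i]], main = paste("Simulated set", i), xlab = "BMI")
}
par(old_par)

# Part (c): numeric shape summaries to go with the visual comparison.
shape_summary <- data.frame(
  dataset         = c("observed", paste0("sim", 1:5)),
  skewness        = c(skewness(bmi), sapply(sims, skewness)),
  excess_kurtosis = c(excess_kurtosis(bmi), sapply(sims, excess_kurtosis))
)
print(shape_summary)

# Part (d): the informal p-value is the share of simulated normal sets
# that you judged to resemble the observed data in part (c).
p_from_count <- function(n_similar, n_sims = 5) {
  n_similar / n_sims
}
```

For example, if you decided that three of the five simulated histograms resemble the BMI histogram (a made-up count, used only to show the arithmetic), p_from_count(3) returns 0.6, which is above 0.05, so you would not reject the assumption of normality. With only five simulated sets the estimate can only be 0, 0.2, 0.4, 0.6, 0.8, or 1, so it falls below 0.05 only when none of the simulated sets resemble the data.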