
Question



Biostatistics

1. *(1 point) The one-sample binomial test can be used to
A. compare rates or proportions from two independent populations or samples.
B. compare rates or proportions from two paired populations or samples.
C. compare a rate or proportion from one population or sample.
D. compare rates or proportions from three or more independent populations or samples.

2. *(1 point) The one sample t test is used to test hypotheses about what?

3. *(1 point) When reporting a percentage from a survey, instead of reporting a confidence interval, typically a margin of error is reported. The margin of error refers to a 95% confidence interval expressed as the observed proportion ± the margin of error, and the margin of error is equal to 1.96 times the standard error of the proportion. The standard error of the proportion will depend on the observed proportion and the sample size. Instead of reporting a different margin of error based on each observed response or proportion, what value is typically used when computing the margin of error?

4. *(1 point) Literature Example. The association between weekly hours of physical activity and mental health: A three-year follow-up study of 15-16 year-old students in the city of Oslo, Norway (Aase Sagatun, et al. BMC Public Health, 2007, 7:155. Open Access)
Background: Mental health problems are a worldwide public health burden. The literature concerning the mental health benefits from physical activity among adults has grown. Adolescents are less studied, and especially longitudinal studies are lacking. This paper investigates the associations between weekly hours of physical activity at age 15-16 and mental health three years later.
The paired t test was used to compare baseline and follow-up scores within boys and girls. Below are the results for boys for SDQ - Emotional Symptoms.
SDQ - Emotional Symptoms, Boys

             N      Mean   SD
Baseline     1074   1.64   1.69
Follow up    1082   1.73   1.69
Paired t-test p-value: 0.183

What was the sample size used in the paired t test?
A. 1082
B. 1074
C. Cannot be determined

5. *(1 point) To use the paired t test with small sample sizes
A. the distribution of the differences should be (approximately) Normal.
B. the distributions of the two populations from which the samples are obtained should be (approximately) Normal.
C. the distributions of the two populations from which the samples are obtained, as well as the differences, should be (approximately) Normal.

*(1 point) A study (2012) examined changes in attitudes, experiences, readiness, and confidence levels of medical residents on performing screening, brief intervention, and referral to treatment (SBIRT) and factors that moderate these changes. A cohort of 121 medical residents received an educational intervention. Self-reported experience, readiness, attitude, and confidence toward SBIRT-related skills were measured at baseline and at follow-up. A test was used to compare resident experience, confidence, and readiness before and after SBIRT training.

6. *(1 point) Standing and supine systolic blood pressures are taken on 100 subjects and the mean (SD) systolic blood pressure when standing was 140 (10) and when supine was 150 (10). If you want to test the hypothesis that the means for standing and supine pressure are the same, an appropriate test to use is?
A. One-sample binomial test
B. Two-sample t test
C. Paired t test
D. Chi-square test

7. *(1 point) When using the chi-square test involving a 2 x 2 contingency table, it is recommended that the expected cell counts in each cell be greater than or equal to what value?

8. Consider the following study results comparing a new experimental treatment to a placebo for survival 5 years after treatment:

Group       Died   Survived   Total
Treatment   0      1000       1000
Placebo     10     990        1000
Total       10     1990       2000

A.
*(1 point) The chi-square test (for a 2 x 2 contingency table) would not be appropriate to use because there is a 0 cell in the 2 x 2 table (i.e., there are no patients in the treatment group who died). TRUE or FALSE?

9. Rubella (German measles) can cause severe birth defects if a woman acquires it during pregnancy. Preblud et al (1981) assessed the risk to fetuses when mothers received the rubella vaccine (which is a live attenuated virus) shortly before or shortly after they became pregnant (before they knew they were pregnant). They found birth defects in 0 of 112 children born to such women. Is this enough information to convince you the vaccine is "safe"? Specifically,
a) *(1 point) Based on the analyses you did in (b), do these data suggest that the rate of birth defects in women receiving the rubella vaccine is higher, lower, or about the same as the background rate you found in (a). Explain your reasoning.
b) Note any concerns, limitations, or cautions you would attach to the conclusion you came to in (a).
Confidence intervals

By providing a range of values for the summary of interest (e.g. for a proportion or a mean) with a given level of confidence (a confidence interval), we can incorporate sampling variability in our estimate.

E.g. if, in a sample of 100 singleton births with nutritional supplementation, we observe 12 low birth weight babies, we might estimate that the low birth weight fraction in the population is 0.12, and the 95% confidence interval is (0.06, 0.20).

The estimate from the sample (0.12 in this example) is often referred to as a "point estimate" to distinguish it from the "interval estimate" provided by the confidence interval (0.06, 0.20 in this example).

Spring 2017 Biostat 310

Confidence intervals

If many samples are drawn randomly from the same population, each sample will have its own estimate of the true parameter (e.g. the mean) in the underlying population. [In practice, of course, one is typically only able to draw a single sample.]

If, say, a 95% confidence interval for the parameter (e.g. the mean) is calculated for each sample, then 95% of these intervals will, in fact, contain the true population value.

Confidence level   z multiplier
90%                1.65
95%                1.96   (we've been using 2.0)
99%                2.58

The 95% CI is most commonly used; other CIs are possible.

As the confidence level increases, the confidence interval gets wider.
As the sample size increases, the confidence interval gets narrower.
o A 95% CI for p is given by $\hat{p} \pm 2\sqrt{\hat{p}(1-\hat{p})/n}$ (for large n)
o A 95% CI for μ is $\bar{X} \pm 2\,s/\sqrt{n}$ (for large n)

Confidence intervals

[Figure: 95% confidence intervals from 20 different study samples from the same population, plotted against the true population proportion.]

If the true population value lies within the confidence interval, it can be
anywhere in the interval, not necessarily close to the point estimate.

Confidence intervals - Interpretation

Rolling a die:
- You are going to roll a six-sided die. What is the probability of rolling a 4?
- You rolled the die and it came up as a 3. What is the probability you rolled a 4? What is the probability you rolled a 3?

Running a study:
- You plan to do a study to estimate the long-term CVD risk among young women. What is the probability that the 95% confidence interval for the risk of CVD will contain the true population risk?
- You performed the study and the 30-year risk for general CVD among women was 9.2% (95% CI = 8.9%, 9.5%). What is the probability this 95% CI contains the true population risk?

[Figure from Essentials of Biostatistics in Public Health, Sullivan]

Confidence intervals

Interpretation: If I repeat the sampling procedure many times, the confidence intervals will include the true population parameter 95% of the time.
- Avoid "There is a 95% probability that the confidence interval includes the true population parameter"
- The jargon is "I am 95% confident that the confidence interval includes the true population parameter"

The confidence interval (and margin of error) accounts for sampling variability, but not bias.

Confidence Intervals for μ

Suppose we have continuous observations with population mean, μ, and population standard deviation, σ.

The sampling distribution of $\bar{X}$ is $N(\mu, \sigma^2/n)$.

$\sqrt{\sigma^2/n} = \sigma/\sqrt{n}$ is the standard deviation of $\bar{X}$ over repeated samples: it is often referred to as the "standard error".

A 95% CI for μ is $\bar{X} \pm 2\,\sigma/\sqrt{n}$

Confidence Intervals for μ

In practice, σ is usually not known. But we do have an estimate of σ available from the data - the sample standard deviation, s.

We can use s in place of σ in the CI calculation...
o For large n (> 30), nothing changes - a 95% CI for μ is $\bar{X} \pm 2\,s/\sqrt{n}$
o For small n (< 30), we need to account for increased uncertainty by using the "t multiplier" instead of the z multiplier - a 95% CI for μ is $\bar{X} \pm t_{n-1}\,s/\sqrt{n}$, where $t_{n-1}$ is taken from the t distribution with n-1 degrees of freedom (df)
o The "t multiplier" is determined by the (Student's) t distribution

Student's t-distribution

t distributions, like the standard Normal (N(0,1)), are centered at 0.

You can see that a t distribution with 30 "degrees of freedom" (df) is close to a N(0,1) distribution but that t-distributions with lower df have fatter tails.

[Figure: densities of the t(2), t(5), and t(30) distributions and the standard Normal distribution, plotted for x from -4 to 4. Std Normal distribution = standard Normal distribution.]

Confidence Intervals for μ

The t distribution looks very similar to the Z distribution, just a bit wider (fatter tails), which gives wider confidence intervals (this is a "penalty" we pay for using s in place of σ in the calculation).

The t distribution is indexed by "degrees of freedom".

Only applies to CI for μ, not CI for p.

Use t instead of Z when (i) σ is unknown and (ii) n is small. (Note, in practice it is okay to always use t instead of Z.)

For a 95% confidence interval:

n     t_{n-1}
5     2.78
10    2.26
20    2.09
30    2.05
50    2.01
∞     1.96

For illustration only; you don't need to remember these t values, except that for large n, t is approximately 2 (≈ 1.96).

As n gets large, the "t" multiplier gets very close to the "z" multiplier.
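The t multipliers in the table above can be checked numerically. The sketch below is one way to do it with only the Python standard library: it integrates the t density with Simpson's rule and finds the 97.5th percentile by bisection. The function names (`t_pdf`, `t_cdf`, `t_multiplier`) are illustrative, not part of the course material, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_multiplier(n: int, level: float = 0.95) -> float:
    """Two-sided t multiplier for a sample of size n (df = n - 1), by bisection."""
    df = n - 1
    target = 1 - (1 - level) / 2  # e.g. 0.975 for a 95% CI
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Multipliers for the sample sizes listed in the table, rounded to two decimals.
multipliers = {n: round(t_multiplier(n), 2) for n in (5, 10, 20, 30, 50)}
print(multipliers)
```

The multipliers should land close to the tabulated values and drift toward 1.96 as n grows, which is the point of the last row of the table.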
Confidence Intervals for μ

Problem: Suppose we measure birth weights of the infants of a random sample of 21 teen moms and find that $\bar{X}$ = 7.0 lb and s = 1.1 lb. Find a 95% CI for the true population mean birth weight of infants of teen mothers.

Answer: The 95% CI is $7.0 \pm 2.09 \times 1.1/\sqrt{21}$ = (6.5, 7.5) lb, where 2.09 is the $t_{20}$ multiplier.

Interpretation: We are 95% confident the true population mean is between 6.5 and 7.5 lb.

FYI - Are you confident your coin is fair?

You need to make sure the coin for the coin toss is fair. You toss the coin 20 times and get 9 heads. Is this enough to give you confidence the coin is fair?

Based on observing 9 heads out of 20 tosses (0.45), what is the true probability likely to be? I.e., maybe the coin is not perfectly fair, but is it good enough?

$n\hat{p}$ = 20 x 0.45 = 9, which is less than 10, so we cannot use the large sample size method.

Because the number of heads in a series of coin tosses follows a binomial distribution, we can use the binomial distribution to compute a 95% confidence interval for p.

[Figure: sampling distribution for $\hat{p}$, if p = 0.45 and n = 20.]

To construct a 95% confidence interval for the true probability of tossing a head, we determine the number of heads in 20 tosses such that
P[# heads or fewer] ≤ .025 (or 2.5%)
and
P[# heads or more] ≤ .025 (or 2.5%)

The 95% confidence interval is 4 to 14 heads, or 0.2 to 0.7.

FYI - An "exact" CI based on the binomial distribution

The 95% confidence interval for p based on the binomial distribution is valid no matter the sample size.
Confidence intervals (as well as hypothesis tests) that do not assume the sample size is large are often called "exact" methods (versus large sample size or asymptotic methods).

But the methods ARE NOT "exact" in the way you would normally think of. For a statistician, an exact 95% CI means that no matter the sample size, the confidence level of the 95% CI must be at least 95%, not exactly equal to 95%!

In the previous example, we found that P[4 heads or fewer] ≤ .025, but the actual value of P[4 heads or fewer] is 0.019. And P[14 heads or more] = 0.021.

So, the interval 4 to 14, or 0.2 to 0.7, has a confidence level of 1 - (0.019 + 0.021) = .96 or 96%.

The problem with finding a confidence interval with a confidence level exactly equal to 95% is that there are "gaps" in the binomial distribution (i.e., it is a probability mass function and not a density function, such as the Normal distribution).
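The tail probabilities quoted in the coin example can be reproduced directly from the binomial probability mass function. This is a minimal sketch using the standard library; it follows the slides in evaluating the tails of a Binomial(n = 20, p = 0.45) distribution, and the helper names are illustrative.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n, p = 20, 0.45
low = lower_tail(4, n, p)     # P[4 heads or fewer]
high = upper_tail(14, n, p)   # P[14 heads or more]
coverage = 1 - (low + high)   # confidence level of the 4-to-14 interval

print(f"P[X <= 4]  = {low:.3f}")
print(f"P[X >= 14] = {high:.3f}")
print(f"confidence level = {coverage:.2f}")
```

Because the distribution is discrete, moving a cutoff by one head changes the tail probability in a jump, so the two tails cannot be tuned to exactly 2.5% each.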
If we tried using 5 to 13, the confidence interval would have a confidence level of 88.7% (P[5 heads or fewer] = .055 and P[13 heads or more] = 0.058).

Summary

Interpretation: If I repeat the sampling procedure many times, the confidence intervals will include the true population parameter X% of the time (e.g., 95% of the time for 95% CIs).

The jargon is "I am 95% confident that the confidence interval includes the true population parameter".

As the confidence level increases, the confidence interval gets wider.
As the sample size increases, the confidence interval gets narrower.
o A CI for p is given by $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$ (for large n)
o An "exact" CI for p is based on the binomial distribution (for small n)
o A CI for μ is given by $\bar{X} \pm z\,s/\sqrt{n}$ (for large n)
o A CI for μ is given by $\bar{X} \pm t_{n-1}\,s/\sqrt{n}$ (for small n)

The Quest to Prevent HIV Infection

Case Study: Introduction to hypothesis testing & One sample test of a proportion

HIV and STIs

HIV = Human immunodeficiency virus
STI = Sexually transmitted infection

In the late 1980's a number of observational studies suggested that infection with various STIs increased the risk of acquiring HIV. Biologically, it made sense.

This suggested that improved STI diagnosis and treatment (especially in areas with poor health care) might reduce the risk of acquiring HIV.

A "community randomized trial" was initiated in Mwanza, Tanzania to test this idea.
Intervention: Enhanced syndromic treatment of STI versus standard of care (SOC)

6 paired communities - in each pair, 1 intervention and 1 SOC

Outcome: HIV incidence in cohorts formed in each community at the beginning of the trial
- Specifically, in how many of the 6 pairs did the treatment community have a lower HIV incidence than the SOC community?

Mwanza HIV Prevention Trial

Pair   Tx community   SOC community   Which has lower HIV incidence?
1      Cohote         Mbulula         Tx
2      Mbenza         Vulensala       SOC
3      :              :               :
4
5
6

What would it take to convince you that the treatment is truly effective in reducing HIV incidence?

Hypothesis Testing - Overview

Form a hypothesis
Collect relevant data
If data are not consistent with hypothesis, reject the hypothesis

Hypothesis Testing - A brief summary

Form a (null) hypothesis

We want to prove STI treatment reduces HIV risk.
Assume the opposite (!)
H0: STI treatment does not reduce HIV risk

By assuming the opposite of what we want to prove we are playing the skeptic, the "devil's advocate". We are assuming that the thing we want to prove is not true ("null hypothesis").

Only if the data are so overwhelming that they would convince a skeptic are we willing to reject this null hypothesis and accept that what we are trying to prove is true.

Hypothesis Testing - A brief summary

Collect relevant data
o Collect data that will help us decide if we should reject the null hypothesis, or not.
In the Mwanza trial, the treatment communities had lower HIV incidence rates in 6 out of 6 paired communities.

Hypothesis Testing - A brief summary

If data are not consistent with hypothesis, reject the null hypothesis in favor of the alternative
o What do we mean by "not consistent"?
o Can we quantify "not consistent"?

Hypothesis Testing

A formal analysis of the Mwanza data ...

1. State the scientific question
Does the intervention reduce HIV incidence compared to SOC?

2. Convert the scientific question into statistical hypotheses (null H0 and alternative HA) appropriate for the study design
This study is like a Binomial experiment. A "success" occurs when the intervention community has lower HIV incidence than its paired control community.
Let p be the probability of a "success"
H0: p = 0.5 (Treatment has no impact)
What about HA?
We might consider HA: p > 0.5 (Treatment is effective)

Hypothesis Testing

3. Choose a reasonable test statistic that summarizes how well the data agree with the null (and alternative) hypothesis
X = number of community pairs in which HIV incidence is lower in the intervention arm
If H0 true, then we'd expect X = 3
If HA true, then we'd expect X > 3

4. Determine the distribution of that test statistic when H0 is true (i.e., under the null distribution)
X is Binomial with n = 6
H0 true means p = 0.5
X ~ Binomial(n=6, p=0.5) if H0 is true

Hypothesis Testing

[Figure: bar chart of the Binomial distribution with n = 6 and p = 0.5, for 0 to 6 successes.]

The probability of 6 "successes" is about 0.016 (or 1.6%).

Hypothesis Testing

5.
Calculate the p-value - probability of data (test statistic) "at least as extreme" as was observed, assuming H0 true

X = 6 is what was observed and it is the most extreme (as far from the null) possible observation we could have made. There are no possible values "more extreme".

p-value = 0.016

[Figure: Binomial(n = 6, p = 0.5) distribution with the bar at X = 6 highlighted.]

The p-value tells us how often we would see data like this (or more extreme), when H0 is true.

Hypothesis Testing: One-sided vs Two-sided Test

But, what if the treatment had increased HIV infection?

Choice of one-sided or two-sided test depends on context.

Typically use a two-sided test unless
o there are very strong reasons for believing that one alternative direction is not possible
o there is no concern or interest in the consequence of one of the alternative directions

Default: two-sided test

For the Mwanza study, we might instead consider
HA: p ≠ 0.5 (Treatment has an impact)

FYI - Null & Alternative Hypothesis

Null hypothesis - is what you want to show is not true. It is the assumption you start with and the P-value is calculated assuming this assumption is true.

Alternative hypothesis - is the negation of the null hypothesis, "more or less" what you want to show is true.

Typically the alternative hypothesis is two-sided, even though the research hypothesis may be one-sided (e.g., that the treatment will reduce STIs).

Always set up the null hypothesis as what you want to show is not true.

Superiority study - Null: No difference; Alternative: Difference
Equivalence study - Null: Difference; Alternative: No difference
Non-inferiority study - Null: Inferior; Alternative: Non-inferiority

Hypothesis Testing

5.
Calculate the p-value: for a two-sided test, we need to consider the chance (under H0) of observing extreme values in either direction

We observed 6 "successes". As extreme a result in the other direction would be 0 "successes".

p-value = 0.032

The p-value tells us how often, under a two-sided alternative, we would see data like this (or more extreme), when H0 is true.

Hypothesis Testing

6. Interpret the p-value by giving statistical (reject or fail to reject H0) and scientific conclusions

Reject H0 if the data are improbable, assuming H0 true. In other words, reject H0 if the p-value is small.

A common convention - reject H0 if p-value < 0.05* (More on choice of "0.05" later)

"I reject H0 and conclude that providing syndromic treatment of STI's does reduce the risk of HIV (p-value = 0.032)"

Hypothesis Testing for Decision Making

Hypothesis testing deals with making decisions
H0: STI treatment does not prevent HIV
HA: STI treatment does prevent HIV

The decision is to reject H0 or not reject H0. There are two kinds of errors one can make ...
1. Reject H0 when H0 is true
2. Fail to reject H0 when H0 is false (HA is true)

We would like to control (at an acceptably small level) the probability of a wrong decision...

Hypothesis Testing for Decision Making

When you plan a hypothesis test, you pre-specify the maximum acceptable Type I error rate, α.
(I.e., the probability of making a Type I error)

Then you collect data, which leads to a p-value - the probability of observing data like this (or more extreme), assuming H0 is true.

The decision rule is to
Reject H0 if p-value < α
Fail to reject H0 if p-value > α

This procedure guarantees that the Type I error rate will be no greater than α.

Hypothesis Testing for Decision Making

What is the probability of a wrong decision?

Type I error = Reject H0 when it is true
α = significance level = P(Type I error)
I.e., α is the probability of making a Type I error
Hypothesis testing controls the Type I error

Type II error = Fail to reject H0 when it is false (HA true)
β = P(Type II error) (i.e., probability of making a Type II error)
Power = 1 - β (i.e., probability of rejecting H0 when it is false)
Hypothesis testing does not control the Type II error

                    H0 True   HA True
Fail to Reject H0   1 - α     β
Reject H0           α         1 - β

Hypothesis Testing - Important notes

The p-value is not the probability that the null hypothesis is true (or false).

With a large p-value do not say "we accept H0"; rather we say "we fail to reject H0".
E.g., we "failed to show an association"; not there is "no association"
We failed to show there is a difference; not there is "no difference"

Recall the p-value is the probability of data (test statistic) "at least as extreme" as was observed, assuming H0 is true.

Why "at least as extreme"?

Hypothesis Testing

Suppose the result of the Mwanza trial had been 5 out of 6...

If we decided to reject H0 with X = 5, then surely we would reject with X = 6.
Thus, the probability of making a Type I error by rejecting with observed data - the p-value - is the probability of "as extreme" (X = 5) or "more extreme" (X = 6). And, for a two-sided test, (X = 0) and (X = 1).

p-value would be: P(X=0) + P(X=1) + P(X=5) + P(X=6) = 0.22

P-value Example

Average cholesterol level for US women, aged 21-40 years, is 190 mg/dl.

In a sample of 100 female Asian immigrants, the average cholesterol level was 181 mg/dl with a standard deviation of 40 mg/dl.

Is this evidence that the cholesterol levels for Asian immigrants are different from the general US female population?

The standard error is 40/√100 = 4 mg/dl, so the observed mean of 181 lies (181 - 190)/4 = -2.25 standard errors from 190.

[Figure: sampling distribution of $\bar{X}$ if μ = 190, centered at 190, with shaded tail areas of 0.012 below 181 and 0.012 above 199.]

P-value = 2 x 0.012 = 0.024

Mwanza HIV prevention trial

Impact of improved treatment of sexually transmitted diseases on HIV infection in rural Tanzania: randomized controlled trial. Grosskurth H. et al. Lancet. 1995.

Intervention reduced HIV incidence in 6 out of 6 community pairs.

In addition, a comparison of HIV risk in intervention vs. control resulted in
RR = 0.58
95% CI: 0.42 - 0.79
p-value = 0.007

A happy ending?

HIV and Sexually Transmitted Infections (STIs)

Unfortunately, 4 subsequent trials have failed to replicate the Mwanza result.

Was it ...
- The details of the intervention?
- Stage of the (HIV) epidemic?
- Types of STI's?
- Risk behaviors?

Or could it have been ... chance?

[Figure: Nature, March 2016 (except the danger sign was added by the instructor)]

"Highly" statistically significant

In general, the phrase "highly statistically significant" or "highly significant" is shorthand for saying the p-value was a lot smaller than 0.05.
It does not imply that there was a very small chance of making a type I error.

What is a p-value?

P-value < .05 "statistically significant"
- Eureka!
P-value < .001 "highly statistically significant"
- Eureka! Eureka! Eureka!
P-value > .05 "no difference"
- No worries, I wanted to show there was no difference

[Figure: Nature, February 2014]

Typical scenario in research

We want to know: Probability (hypothesis true given observed data)

But the P-value tells us: Probability (observed study data or more extreme given hypothesis not true)

[Figure: PLoS Medicine, August 2005]

Hypothesis Testing for Decision Making

                    H0 True       HA True
Fail to Reject H0   1 - α = .95   β = .20
Reject H0           α = .05       1 - β = .80
Suppose out of
200 studies         100           100

The probability of making a correct decision is?
If the null hypothesis is rejected, the probability of making a correct decision is?

The same table, read as a diagnostic test:

                                      "No disease"                  "Disease"
                                      H0 True                       HA True
"Negative test"  Fail to Reject H0    1 - α = .95 ("Specificity")   β = .20
"Positive test"  Reject H0            α = .05                       1 - β = .80 ("Sensitivity")
Suppose out of 200 studies            100                           100

"Prevalence" = 100/200

"PPV": If the null hypothesis is rejected, the probability of making a correct decision is?

Hypothesis Testing for Decision Making

                    H0 True       HA True
Fail to Reject H0   1 - α = .95   β = .20
Reject H0           α = .05       1 - β = .80
Suppose out of
210 studies         200           10

The probability of making a correct decision is?
If the null hypothesis is rejected, the probability of making a correct decision is?
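The two questions on these slides come down to counting expected outcomes in each cell of the table. The sketch below does that arithmetic for both scenarios (100/100 and 200/10 studies) under the stated α = .05 and power = .80; the function and key names are illustrative, and the counts are expected values rather than results from real studies.

```python
def study_outcomes(n_null_true: int, n_alt_true: int,
                   alpha: float = 0.05, power: float = 0.80) -> dict:
    """Expected share of correct decisions, overall and among rejections of H0."""
    false_pos = alpha * n_null_true        # H0 true, rejected (Type I error)
    true_neg = (1 - alpha) * n_null_true   # H0 true, not rejected
    true_pos = power * n_alt_true          # HA true, rejected
    total = n_null_true + n_alt_true
    return {
        "p_correct": (true_neg + true_pos) / total,
        # Analogue of the positive predictive value (PPV) of a diagnostic test.
        "p_correct_given_reject": true_pos / (true_pos + false_pos),
    }

balanced = study_outcomes(100, 100)     # half of the studies have HA true
mostly_null = study_outcomes(200, 10)   # few studies have HA true

print(balanced)
print(mostly_null)
```

With the same α and power, the share of rejections that are correct drops sharply when true effects are rare, just as the PPV of a diagnostic test falls with low prevalence.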
[Figure: Science, August 28, 2015]

Confidence interval versus Hypothesis test

A confidence interval focuses on the question: "How large is the difference"
A hypothesis test focuses on the question: "How sure are we that there is a difference"

Although they focus on different questions, they typically give complementary conclusions.

Is the rate of facial pain (self-report) similar for females and males?

Hypothesis test: P-value = 0.0016
Confidence interval: 95% confidence interval for the difference in the proportion of females versus males with facial pain is 3 to 11%.

[Figure: bar chart of the proportion reporting facial pain: Female (N=593) 15%, Male (N=432) 8%.]

Summary

Hypothesis testing
- Null hypothesis
- Alternative hypothesis
- Test statistic
- Null distribution
- p-value
- Type I and II errors
- Significance level, α
- Power, 1 - β
- One-tailed (one-sided) versus two-tailed (two-sided)
- Statistically significant

Next

One-sample & two-sample tests of a proportion & of a mean
One-sample binomial test
Chi-square test
One-sample t test & Paired t test
Two-sample t test

One-sample test of a proportion

The test in the Mwanza HIV study example is an example of a one-sample test of a proportion.

A one sample test of a proportion tests the null hypothesis H0: p = p0

Question: Does the intervention impact HIV incidence?
Hypotheses: H0: p = 0.5
HA: p ≠ 0.5
Data: X = 6 (out of 6)
Null Distribution: Binomial(n = 6, p = 0.5)
p-value: P(X ≥ 6 or X ≤ 0 | n = 6, p = 0.5) = 0.032
Conclusion: Using a significance level (α) of .05, reject H0 and conclude intervention does reduce HIV incidence (p-value = .032)

Revisiting Mendel's peas

Gregor Mendel

Case Study: Another one-sample test of a proportion

Another one-sample test of a proportion

Mendel's data: In one experiment, Mendel reported that 882 of 1181 pea pods were "inflated" (vs. "constricted"). His theory predicted that 75% of the pods should be inflated.

Question: Is there evidence to contradict Mendel's theory that 75% of pods should be inflated?

Hypotheses: H0: p = 0.75
HA: p ≠ 0.75
Data: X = 882 (out of 1181); $\hat{p}$ = .747
Null Distribution: X ~ Binomial(n = 1181, p = 0.75), or,
$\hat{p} \sim N(0.75,\ 0.75 \times 0.25/1181)$ by the central limit theorem (CLT) since n is large ($n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$)

Another one-sample test of a proportion

Compute the p-value: Probability of observing a test statistic "at least as extreme" as was observed, assuming H0 is true

If $\hat{p} \sim N(p_0,\ p_0(1-p_0)/n)$ then $Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \sim N(0,1)$

Here $p_0$ = 0.75

If we assume H0 true, then $Z = \dfrac{\hat{p} - 0.75}{\sqrt{0.75(1-0.75)/1181}}$

$\hat{p}$ = 0.747 and we can calculate Z = -0.238

So the p-value for a test of H0: p = 0.75 vs HA: p ≠ 0.75 is ...

Another one-sample test of a proportion

[Figure: standard Normal density for Z from -3 to 3, with the areas below -.238 and above .238 shaded.]

The two-sided p-value is given by the sum of the areas under the Normal curve for Z < -.238 and Z > .238

Based on this Z statistic, is the two-sided p-value < 0.05?
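Both one-sample tests of a proportion above can be sketched in a few lines of standard-library Python. The function names are illustrative. The exact test below uses one common convention for a two-sided p-value (summing the null probabilities of all outcomes no more likely than the observed one), which for p0 = 0.5 matches the symmetric "both tails" calculation in the slides. The Z test uses $\hat{p}$ rounded to 0.747, as the slides do; the unrounded 882/1181 gives a Z of slightly larger magnitude.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_binomial_two_sided(x: int, n: int, p0: float) -> float:
    """Two-sided exact p-value: total null probability of outcomes
    that are no more likely than the observed count x."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[x] * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(pr for pr in pmf if pr <= cutoff)

def z_test_proportion(p_hat: float, n: int, p0: float) -> tuple:
    """Large-sample Z statistic and two-sided p-value for H0: p = p0."""
    se = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Mwanza: 6 of 6 pairs favoured the intervention (and the 5-of-6 what-if).
mwanza_p = exact_binomial_two_sided(6, 6, 0.5)
mwanza_5_of_6 = exact_binomial_two_sided(5, 6, 0.5)

# Mendel: 882 of 1181 pods inflated, p-hat rounded to 0.747.
mendel_z, mendel_p = z_test_proportion(0.747, 1181, 0.75)

print(f"Mwanza 6/6: p = {mwanza_p:.3f}; 5/6: p = {mwanza_5_of_6:.2f}")
print(f"Mendel: Z = {mendel_z:.3f}, two-sided p = {mendel_p:.2f}")
```

The exact Mwanza value is 2/64, which the slides report as 0.032 after rounding each tail to 0.016.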
Another one-sample test of a proportion

Assuming a 2-sided test, because the (standardized) test statistic is Z, we know that p-value < 0.05 whenever Z < -2 or Z > 2.

-2 and +2 are called the "critical values".

Rejecting H0 when p-value < 0.05 is equivalent to rejecting H0 when Z < -2 or Z > 2.

Test statistic: Z = -0.238 (compare to critical values)

P-value = 0.81 {using a computer program for the Normal distribution to compute P(Z < -.238) + P(Z > .238)}

Conclusion: Using a significance level (α) of 0.05, we fail to reject H0 and conclude there is not sufficient evidence to reject that p = 0.75 (p-value = 0.81)

Prevention of Mother to Child Transmission of HIV

Case study:
Two-sample test of proportions
Chi-square test and distribution
R x C contingency tables

HIV and PMTCT

An (untreated) HIV-infected pregnant woman has a 20-40% chance of infecting her infant in utero or during birth.

This gave rise to a generation of HIV-infected babies in Africa and efforts to develop strategies to prevent MTCT of HIV.

One proposed strategy was to give the HIV-infected women anti-retrovirals (ARVs) during pregnancy. In theory, this should reduce the amount of virus circulating in her body and reduce the probability of HIV transmission to the infant.

The ACTG 076 trial, initiated in 1991, was designed to test this theory.

HIV and PMTCT

ACTG 076
Initiated in 1991
477 HIV-infected pregnant women enrolled; these women did not yet need AZT for their own health
Half randomized to AZT (2nd-3rd trimester through delivery); half randomized to placebo
Infant outcome data available on 402 mother-infant pairs; 61 (15%) infants infected with HIV
Infant's status, by mother's treatment:

Mother's treatment   HIV-infected   HIV-negative
AZT                  15             183
Placebo              46             158

Estimated risk of HIV in AZT group = 15/198 = 0.076
Estimated risk of HIV in Placebo group = 46/204 = 0.225
Estimated RR = 0.076 / 0.225 = 0.34

Chi-square Test (two-sample test of proportions)
1. State the scientific question
   Does AZT reduce MTCT of HIV compared to placebo?
2. Convert the scientific question into statistical hypotheses (null and alternative). Let p denote the probability that the infant is HIV+
   H0: pAZT = pcontrol (RD = 0, RR = 1, OR = 1)
   HA: pAZT ≠ pcontrol (RD ≠ 0, RR ≠ 1, OR ≠ 1)
3. Choose a reasonable test statistic, where we know how it should behave if H0 is true
   $X^2$ = Chi-square statistic = $\sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
   $O_{ij}$ = observed counts for "cell" in ith row and jth column (excluding row with totals and columns with total)
   $E_{ij}$ = expected counts (under H0) for "cell" in ith row and jth column

Calculating "Eij" for the Chi-square Test
Suppose H0: pAZT = pcontrol is true. What numbers are expected in the table?

           HIV-infected   HIV-negative   Total
AZT                                      R1 = 198
Placebo                                  R2 = 204
Total      C1 = 61        C2 = 341       N = 402

In general, Eij = (Ri / N) x (Cj / N) x N = (Ri x Cj) / N
E.g., under H0 (i.e., if treatment and HIV infection are independent)
E11 = Expected number of AZT and HIV-infected
    = Probability(AZT and HIV-infected) x Number of subjects
    = Probability(AZT) x Probability(HIV-infected) x Number of subjects
    = (R1 / N) x (C1 / N) x N
    = (198 / 402) x (61 / 402) x 402 = (198 x 61) / 402 = 30.0

Calculating "Eij" for the Chi-square Test
Suppose H0: pAZT = pcontrol is true. What numbers are expected in the table?
           HIV-infected   HIV-negative   Total
AZT        30             168            R1 = 198
Placebo    31             173            R2 = 204
Total      C1 = 61        C2 = 341       N = 402

Recall the observed numbers:

           HIV-infected   HIV-negative
AZT        15             183
Placebo    46             158

$X^2$ = Chi-square statistic = $\sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$ = 17.5 for the PMTCT study

Chi-square Test
4. Determine the sampling distribution of that test statistic when H0 is true (null distribution)
   $X^2 \sim \chi^2_1$, if H0 is true
   $\chi^2_1$ denotes the chi-square distribution with 1 degree of freedom ("yet another probability distribution")
5. Compute the p-value - probability of test statistic "at least as extreme" as was observed (here $X^2$ = 17.5), assuming H0 true
   $P(\chi^2_1 \ge 17.5) \approx 0.00003$ (p-value always two-sided for chi-square test)
   FYI - using Excel: =1-CHISQ.DIST(17.5,1,TRUE)
6. Interpret p-value (reject or fail to reject H0) using α = 0.05
   Reject H0, conclude AZT is effective at preventing MTCT of HIV

Chi-square Distribution
The Chi-square distribution is used to find the p-value of the Chi-squared test statistic
Like the t distribution, it depends on "degrees of freedom" - i.e. $\chi^2_{df}$
For a contingency table with R rows and C columns, the degrees of freedom for the Chi-squared test are (R-1) x (C-1)
The chi-square distribution is skewed and only takes positive values
[Figure: chi-square densities for 1 df, 3 df and 5 df. For illustration; nothing to remember/memorize here]

Chi-square Test
The Chi-square test can be used to compare the proportion of "successes" (binary outcome) between two groups
- Test the association between 2 binary measures (2 x 2 table)
- The following null hypotheses are all equivalent: p1 = p0, RD = 0, RR = 1, OR = 1
More generally, Chi-square tests can be used to test for an association between any
two categorical measures (R x C table with R rows and C columns)
- E.g., Cancer (yes/no) and number of cigarettes smoked per day (0, <5, 5-14, 15-24, 25-49 or 50+)
- E.g., Hair color and eye color
p-values are computed using the Chi-square distribution
- This method assumes sample size(s) are large, e.g. all expected cell counts should be > 5
- "Exact" tests (e.g., "Fisher's Exact Test") can be used for small sample sizes, but can be computer intensive & "conservative"

Chi-square Test - more examples
Example 1: Remember the Vitamin C and Colds Study

             Cold-Yes   Cold-No   TOTAL
Vitamin C    17         122       139
Placebo      31         109       140
TOTAL        48         231       279

RR = (17/139) / (31/140) = 0.55
H0: pVitC = pPlac (RR = 1 or RD = 0)
HA: pVitC ≠ pPlac (RR ≠ 1 or RD ≠ 0)

Chi-square Test - more examples
Example 1: The Vitamin C and Colds Study, observed counts with expected cell counts in parentheses

             Cold-Yes    Cold-No       TOTAL
Vitamin C    17 (23.9)   122 (115.1)   139
Placebo      31 (24.1)   109 (115.9)   140
TOTAL        48          231           279

$X^2$ = 4.81
$X^2 \sim \chi^2_1$ if H0 is true
p-value = 0.028
Conclude: At the α = 0.05 level, we reject the null hypothesis and conclude that there is a significant association between Vitamin C and colds.
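The expected counts and the Chi-square statistic for a 2 x 2 table can be reproduced with a few lines of code. This is a minimal Python sketch under stated assumptions: `chi_square_2x2` is a hypothetical helper name, not something from the slides, and the p-value shortcut is valid only for 1 degree of freedom, where a chi-square variable is the square of a standard Normal variable.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test for a 2 x 2 table of observed counts.

    Hypothetical helper for illustration; returns (X2, p-value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = R_i x C_j / N
            x2 += (observed - expected) ** 2 / expected
    # With 1 df: P(chi2 >= x2) = P(|Z| >= sqrt(x2)) = erfc(sqrt(x2 / 2))
    p_value = math.erfc(math.sqrt(x2 / 2))
    return x2, p_value

vitc_x2, vitc_p = chi_square_2x2([[17, 122], [31, 109]])  # Vitamin C and colds
azt_x2, azt_p = chi_square_2x2([[15, 183], [46, 158]])    # ACTG 076 (AZT vs placebo)
print(f"Vitamin C: X2 = {vitc_x2:.2f}, p-value = {vitc_p:.3f}")
print(f"ACTG 076:  X2 = {azt_x2:.1f}, p-value = {azt_p:.5f}")
```

The same function applies to any 2 x 2 table entered as two rows of observed counts, without the totals.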
Chi-square Test - more examples
Example 2: From Doll and Hill (1952). The table displays the daily average number of cigarettes for lung cancer patients and control patients (case-control study)

Daily # cigarettes   Cancer          Control         Total
None                 7 (0.5%)        61 (4.5%)       68
<5                   55 (4.1%)       129 (9.5%)      184
5-14                 489 (36.0%)     570 (42.0%)     1059
15-24                475 (35.0%)     431 (31.8%)     906
25-49                293 (21.6%)     154 (11.3%)     447
50+                  38 (2.8%)       12 (0.9%)       50
Total                1357            1357            2714

H0: no association between lung cancer and cigarette smoking
HA: association between lung cancer and cigarette smoking
$X^2 \sim \chi^2_5$ if H0 is true
p-value < .001
Conclude: At the α = 0.05 level, we reject the null hypothesis and conclude that there is a significant association between smoking and lung cancer.

Chi-square Test - examples
Example 3: Is a particular fruit fly mutation sex linked? Collect sex and mutation information on 100 fruit flies (cross-sectional study)

             Male       Female   Total
Wild type    40 (57%)   30       70
Mutant       15 (50%)   15       30
Total        55         45       100

RR = 1.14   OR = 1.33
H0: no association between sex and mutation
HA: association between sex and mutation
$X^2 \sim \chi^2_1$ if H0 is true
p-value = .51
Conclude: At the α = 0.05 level, we fail to reject the null hypothesis and conclude that there is not enough evidence to support an association between sex and mutation.
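For an R x C table the statistic is computed the same way, with (R-1) x (C-1) degrees of freedom. The sketch below is illustrative Python, not from the slides: `chi_square_statistic` is a hypothetical helper, and instead of computing a p-value it compares the statistic with the standard upper 5% critical values of the chi-square distribution (11.07 for 5 df, 3.84 for 1 df).

```python
def chi_square_statistic(table):
    """Return (X2, df) for an R x C table of observed counts (no totals).

    Hypothetical helper for illustration.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = R_i x C_j / N
            x2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df

# Doll and Hill (1952): daily cigarettes (None, <5, 5-14, 15-24, 25-49, 50+)
doll_hill = [
    [7, 55, 489, 475, 293, 38],    # lung cancer patients
    [61, 129, 570, 431, 154, 12],  # control patients
]
dh_x2, dh_df = chi_square_statistic(doll_hill)

# Fruit flies: rows = wild type, mutant; columns = male, female
fly_x2, fly_df = chi_square_statistic([[40, 30], [15, 15]])

CRITICAL_05 = {1: 3.84, 5: 11.07}  # upper 5% points of the chi-square distribution
print(f"Doll and Hill: X2 = {dh_x2:.1f} on {dh_df} df, reject H0: {dh_x2 > CRITICAL_05[dh_df]}")
print(f"Fruit flies:   X2 = {fly_x2:.2f} on {fly_df} df, reject H0: {fly_x2 > CRITICAL_05[fly_df]}")
```

The first comparison rejects H0 and the second does not, in line with the p-values reported for the two examples.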
Summary
- One-sample test of a proportion
- Two-sample test of proportions
- R x C contingency tables
- Chi-square test

Energy Drinks
Case study: One-sample t-test and paired t-test

In February 2014, a group of German researchers presented results of a meta-analysis of 7 studies showing that consumption of energy drinks was linked to changes in heart measurements like "QT interval" and blood pressure.
In one study (Phan and Shah, 2014), blood pressure was measured before and 3 hours after consumption of a caffeinated "energy shot" in 10 participants.
The data are the differences in systolic blood pressure (after - before) for each participant.

One-sample t-test
1. State the scientific question
   Does consumption of a caffeinated energy shot affect systolic blood pressure?
2. Convert the scientific question into statistical hypotheses (null and alternative)
   H0: $\mu_{BPdiff}$ = 0   HA: $\mu_{BPdiff}$ ≠ 0
   $\mu_{BPdiff}$ is the population mean change (difference) in blood pressure after consumption of a caffeinated energy shot
3. Choose a reasonable test statistic
   $t = \dfrac{\bar{x}_{BPdiff} - 0}{\sqrt{s^2 / n}}$
   t is the "standardized" version of $\bar{x}_{BPdiff}$
   $\bar{x}_{BPdiff}$ is the sample mean change in blood pressure (3 hours after versus before consumption of a caffeinated energy shot)
   $s^2$ is the sample variance of the change in blood pressure
   n is the sample size
   If H0 is true, we'd expect $\bar{x}_{BPdiff}$ to be close to zero. How close? We have to consider the sampling variability.
   For these data, t = 6.26
4. Determine the distribution of that test statistic when H0 is true
   t ~ t(9), i.e., the t statistic has a t-distribution with 9 degrees of freedom (when H0 is true)
   In general, the t statistic has a t-distribution with n-1 degrees of freedom
5.
Compute the p-value - probability of test statistic "at least as extreme" as was observed, assuming H0 true
   For t = 6.26 and 9 degrees of freedom, p-value < .001
6. Interpret p-value (reject or fail to reject H0)
   Using α = 0.05, we reject the null hypothesis and conclude that mean blood pressure is significantly higher following consumption of a caffeinated energy shot.

Recall: Student's t-distribution
t distributions, like the standard Normal (N(0,1)), are centered at 0.
You can see that a t distribution with 30 "degrees of freedom" (df) is close to a N(0,1) distribution but that t-distributions with lower df have fatter tails.
[Figure: densities of t(2), t(5), t(30) and the standard Normal distribution, x from -4 to 4]

Meta analysis concluded that energy drinks increased blood pressure, increased heart contraction rates and lengthened QT interval
Caffeine suspected as the culprit
"We don't know exactly how or if this greater contractility of the heart impacts daily activities or athletic performance."
"Long-term risks to the heart from drinking energy drinks remain unknown"

One-sample t-test vs (two-sample) paired t-test
The one-sample t-test is used to test hypotheses about a single population mean
   H0: $\mu = \mu_0$ versus HA: $\mu \ne \mu_0$
The (two-sample) paired t-test is a special case of the one-sample t-test
There are two (paired) measurements per unit of analysis (two paired samples), but we analyze the difference in the measurements on each unit (i.e., one sample of differences) as in the energy drink example
The sample size is the number of differences
   H0: $\mu_{difference} = 0$ versus HA: $\mu_{difference} \ne 0$

One-sample t-test (or paired t-test): a test about a population mean (or mean difference)
Assumptions:
- Each observation (or pair of observations) is independent of other observations (pairs)
- Observations (or differences) are approximately Normally distributed. But if the sample size is large (n ≥ 30), then a Normal distribution for the observations is not necessary
P-values obtained from the t-distribution. Or, if n ≥ 30, the Normal distribution can be used.

Something's Fishy ...
Case study: Two-sample t-test

Fish Oil
In 2008, Koch et al. published results of a study in the British J. of Dermatology
Double-blind, placebo-controlled randomized trial
Intervention: 5.4 g n-3 PUFA docosahexaenoic acid (aka omega-3)
Result: intervention group showed a "significant clinical improvement of atopic eczema in terms of a decreased SCORAD". SCORAD is a measure of severity of atopic eczema (SCORing Atopic Dermatitis)
Conclusion: "Our data suggest that dietary DHA ... might have a beneficial impact on the outcome of atopic eczema"

Fish oil study design
N = 44 (21 DHA, 23 placebo)
Participants suffering from atopic eczema, aged 18 - 44
Randomization (1:1) stratified by gender, age, BMI
Daily dosing, DHA 5.4 g, or placebo (7 capsules/day!) for 8 weeks
SCORAD measurement at 0, 4, 8, 20 weeks

Fish oil - summary of the data

Two-sample t-test
1. State the scientific question
   Does dietary DHA supplementation reduce atopic dermatitis severity?
2. Convert the scientific question into statistical hypotheses (null and alternative)
   H0: $\mu_{DHA} = \mu_{placebo}$   HA: $\mu_{DHA} \ne \mu_{placebo}$
   What are $\mu_{DHA}$ and $\mu_{placebo}$ here?
   The results in the paper focus on a comparison at week 8
   At week 8: H0: $\mu_{DHA} = \mu_{placebo}$, or H0: $\mu_{DHA} - \mu_{placebo} = 0$
3. Choose a reasonable test statistic
   $t = \dfrac{\bar{x}_{DHA} - \bar{x}_{placebo}}{\sqrt{s^2_{DHA}/n_{DHA} + s^2_{placebo}/n_{placebo}}}$
   a standardized version of $\bar{x}_{DHA} - \bar{x}_{placebo}$
   $\bar{x}_{DHA}$, $\bar{x}_{placebo}$ = sample means at Week 8
4. Determine the distribution of that test statistic when H0 is true
   t ~ t(42), i.e., t has a t-distribution with 42 degrees of freedom
   In general, degrees of freedom = n1 + n2 - 2
5. Compute the p-value - probability of test statistic "at least as extreme" as was observed, assuming H0 true
   $\bar{x}_{DHA} - \bar{x}_{placebo}$ = 28.5 - 33.4 = -4.9
   $s_{DHA} = s_{placebo}$ = 9.6
   $n_{DHA}$ = 21, $n_{placebo}$ = 23 ... (p-value > 0.05)
This was interpreted as evidence that the intervention was effective

Fish Oil

Summary
Two-sample t-test is a test about population means
Assumptions:
- The two samples are independent (possibly different sizes)
- Each observation within a sample is independent of other observations within the sample
- Observations are approximately Normally distributed. But if the sample size(s) are large, then a Normal distribution for the observations is not necessary
P-values obtained from the t-distribution. Or, if n ≥ 30, the Normal distribution can be used
Watch out for paired versus unpaired data
Avoid "p-value shopping"

Summary of Hypothesis tests, so far

Name                            Hypothesis    Test Statistic             Null Distribution
One-sample test of proportion   H0: p = p0    X = number of "successes"  Binomial(n, p0)

Here is the actual question for #9. Can you help me with this one, as with the last one?

14. Rubella (German measles) can cause severe birth defects if a woman acquires it during pregnancy. Preblud et al (1981) assessed the risk to fetuses when mothers received the rubella vaccine (which is a live attenuated virus) shortly before or shortly after they became pregnant (before they knew they were pregnant). They found birth defects in 0 of 112 children born to such women.
Is this enough information to convince you the vaccine is "safe"? Specifically,

a) Do a web search to find the background rate of birth defects in the US population. What did you find?

Birth defects affect one in every 33 babies (about 3% of all babies) born in the US each year.

b) Using the binomial formula, we can show that the probability of 0 birth defects in n women, if the true background rate is p, can be computed as (1-p)^n. Do this computation for the probability, p, you found in (a) using n=112. Repeat the computation for a few values of p that are smaller than the value you found in (a) and a few values of p that are larger than the value you found in (a). Report your results.

When p = 3% (or 0.03) and n = 112: Probability = (1-0.03)^112 ≈ 0.033
For values of p less than 0.03:
For p = 0.025 and n = 112: Probability = (1-0.025)^112 ≈ 0.059
For p = 0.02 and n = 112: Probability = (1-0.020)^112 ≈ 0.104
For values of p greater than 0.03:
Taking p = 0.035 and n = 112: Probability = (1-0.035)^112 ≈ 0.018
Taking p = 0.04: Probability = (1-0.04)^112 ≈ 0.010
For p = 0.05: Probability = (1-0.05)^112 ≈ 0.003

c) *(1 point) Based on the analyses you did in (b), do these data suggest that the rate of birth defects in women receiving the rubella vaccine is higher, lower, or about the same as the background rate you found in (a). Explain your reasoning.

When we use a value of p less than the p we found in (a), i.e. values less than p = 0.03, the probability of observing 0 birth defects is higher. When we use a value of p greater than 0.03, the probability of observing 0 birth defects is lower. As p decreases, the probability increases. This is based on the results we got from the formula.

d) Note any concerns, limitations, or cautions you would attach to the conclusion you came to in (c).

Since we are only varying p while the sample size stays the same, it would be better to vary the sample size too, to see the effect of different sample sizes on that probability.
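The computation in part (b) raises (1-p) to the power n; it does not multiply by n, so every result must be a probability between 0 and 1. A minimal Python sketch of that calculation follows. The function name `prob_zero_events` is hypothetical, and the formula assumes the 112 births are independent with the same probability p of a birth defect.

```python
def prob_zero_events(p, n):
    """P(0 events in n independent trials), each with event probability p."""
    return (1 - p) ** n

n = 112  # children born to vaccinated women in Preblud et al (1981)
for p in (0.02, 0.025, 0.03, 0.035, 0.04, 0.05):
    print(f"p = {p:.3f}: P(0 birth defects in {n}) = {prob_zero_events(p, n):.4f}")
```

The probability of seeing no birth defects shrinks as the assumed rate p grows, which is the pattern parts (c) and (d) ask about.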
