Answered step by step
Verified Expert Solution
Link Copied!

Question

00
1 Approved Answer

QUESTIONS 1. Explain why Figure 1 (along with the shift pattern of Ms. Gilbert) suggests that Ms. Gilbert may be guilty of excess deaths on

image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
image text in transcribed
QUESTIONS 1. Explain why Figure 1 (along with the shift pattern of Ms. Gilbert) suggests that Ms. Gilbert may be guilty of excess deaths on the medical ward. 18 MART 1 Public Policy and Social Science 2. Why was the evidence from Figure 1 (along with the shift pattern of Ms. Gilbert) not conclusive evidence that Ms. Gilbert was guilty of the excess deaths? Suggest an explanation that could have caused the association without the unusual activity of Ms. Gilbert. 3. What is the relevance of the coin-tossing story to the trial of Ms. Gilbert? 4. Cobb argued that a jury would likely fall into the prosecutor's fallacy." What is it, and why was the defense concerned about it? tempting, and so common, that it has become known to statisticians as the prosecutor's fallacy. Because the false logic beckons so seduc- tively, it is often used as the basis for arguing, as the Cobb report did, that the statistical evidence was likely to be misinterpreted by the jury in a way that favored the prosecution and was therefore prejudicial. CONCLUSION Judge Ponsor ruled that the statistical evidence should not be allowed at trial. Nevertheless, the other, nonstatistical evidence proved to be enough to convince the jury, and after many days of deliberation, Gilbert was convicted on three counts of first-degree murder, one count of second-degree murder, and two counts of attempted mur- det. After a penalty phase of the trial, the jury voted 8-4 for a death Cobb & Gehlbach Statistics in the Courtroom 17 sentence, and because the vote was not unanimous, Gilbert's life was spared. She is now serving a sentence of life in prison without possi- bility of parole. The statistical analysis that uncovered the pattern linking Gilbert's presence to the excess deaths was an essential part of the process that brought her to justice. The two juries that Gilbert faced, and their different roles in our system of justice, illustrate neatly the proper interpretation of hypothesis testing. First, a small p-value dias allow you to rule out chancelike variability as a plausible explanation for an observed pattern. It tells you that the observed pattern is so ex- treme as to qualify as a surprise in the eyes of science. Second, if your data are observational. a small p-value does not tell you what has tuned the surprise. Association is not causation. Inferences about cause are much more straightforward with a randomized experiment. ADDITIONAL READINGS A NURSE ACCUSED By the mid-1990s, Kristen Gilbert had been working for several years wa nurse at the Veteran's Administration (VA) hospital in Northamp- ton, Massachusetts. For a time, she had been one of the nurses the ochers most often looked up to as an example of skill and competence She had established a reputation for being particularly good in a cri- sia. If a patient went into cardiac arrest, for example, she was often the first to notice that something was wrong. She would sound a "code blue," the signal that brough the aid of the rocitation team. She stayed calm, and she know how to give a shor of the stimulant epi- nephrine, a synthetic form of adrenaline, to try to restart a patient's heart. Often the adrenaline did its job, the heart began to beat again, and the patient's life was saved, 4 Pin Serience Lately, though, other nurses had become increasingly suspicious that something was not right. To some, it seemed that there were 100 many codes called, too many crise when Gilbert was on the ward. Over time, the suspicions became stronger. Several patients who went into arrest died, and to some of the staff, the number of death was a sinister sign. An investigation was launched. Although an initial report by the VA found that the numbers of deaths were consistent with the parters at other VA hospitals, the suspicions of the staff remained. Eventually, after additional investigation, including a statistical analy sis by one of us (Gehlbach). Assistant U.S. Attorney William Welch convened a grand jury in 1998 to hear the evidence against Gilbert Welch accused her of lailling several patients by giving them focal doses of heart stimulant, and he wanted her indicted for multiple murders Kristen Gilbert was the mother of two young children. Although the war divorced, she had been dating a male friend for some time. She had a steady job, one that paid reasonably well, and her skills a nuese was generally recognised. What could possibly motivate her to commit the murders that she was now suspected of these were not mercy Killing the victims in Welchisindictmen were not old man or in poor health but were middle-aged, and healthy enough that their death wete unexpected. Welch argued that Kristen Gilbers did have a for her actions. She liked the thrill of a crisis, she needed the recognition dar came from her silful handling of a cardiac arrest, and, especially dhe wanted to impras her boyfriend, who also worked at the hospital Purt of the evidence againt Gilbert dealt with her motivation, part of it came from the testimony of coworkers about her access to the epi nephrine des accused of using in the alleged munters, and purt came from a ploysician who died shout the symptoms of the man who had died. Taken zogether, die evidence was certainly the, but would I be convincing No one had seen Gilbert give fatal injections, and al- though the patienu' death were expected, the symptoms could have been comidered content with other possible causes of death. It turned out that a major part of the evidence against Gilbert wat stitical HYPOTHESIS TESTING I bu share w ble Body Cou... HYPOTHESIS TESTING I A key question for the grand juros was this: Was it true that there were more deal when Kristen Gifhert was working. Not just one or two extra deathee or two could easily be due just to coincidence-but Cobb & Gile enough to be truly picious? If not, there might not be enough ev. idence to justify bringing Gilbert to trial. On the other hand, anan wwer of yes would call for an explanation, and enough other evidence pointed to Gilbert to make an indictment all but certain The prosecutors recognised that the key question about ce deutha was one that could only be answered using statistics, and so they asked Stephen Gehlbach, who had done the statistical analysis of the hospital records to presenta ummary of the rules to the grand jury. In what follows, we will present you with a similar sum- mary of the statistical evidence. As you read through the summary Imagine yourself as one of the grand jarors. Do you find the evidence strong enough to bring Gilbert to trial? "The statistical substance involves hypothesis testing, a form of reasoning that is probability calculations to decide whether or not an observed outcome should be regarded as sensual creme-that it qualities as a scientific surprise." The logic and interpretation of hypothesis testing is fundamental to a lot of work in the natural and social sciences, important enough that anyone serious about understanding how science works should undenstand this form of reasoning Unfortunately, in many statistics course, the logic of hypothesis testing is taught at the same time as some of the probability calculations that you need for particular applica tion, and the details of the computations tend to eclipse the un derlying logic. Part of the challenge facing Gehlbach was to make the loke clear to the grand jury without going into the details of the calculations GEHLBACH'S TESTIMONY TO THE GRAND JURY Dr. Gehlbach's testimony was delivered orally, with Gehlbach in the witnes sand, talking to the members of the grand jury. The next sev eral paragraphe summarize three parts of Gehlbach testimony, a fint part about the pattern of death, by shift and by year, on the medical ward where Gilbert worked a second part about variability and values and a third part shout a statical test for whether the er linking the extendech to Gilbert's prence on the ward w 100 extrum arded due to ordinary expectable variability 10 0 1 200 101 102 1903 1904 1906 Night AMD AM4PM) Even 4 - FIGURE The pum of dort by year and the The summaries don't use the exact words from the grand jury testi mony, but they cover some of the same substance Part One: The Pattern of eaths Imagine that at this point in Gehlbach's testimony, the jurors are looking at a gruph like the one in Figure 1: Dr. Gehlbach: "The graph you see shows data from the VA hospital where Kristen Gilbert world. Each set of three bars shows one year's worth of clau, starting in 1988 and running through 1997 Within each set of three bars, there is one bar for each shift. The left bar is for the night lift, midnight to 8 AM the middle bar is for the day shift, 8AM to and the right bar is for the evening shift.4 M. to midnight. The height of each bar tells how many deaths there were on a shift for the year in question. "Now look at the pattern from one year to the next. For the first two years,' and '89, the bars are short, showing roughly to deaths per year on cach life. Then there is a dramatic increase. For the year 1990 through 1995, there is onc shift in each of three with 25 to 35 deaths per year. C Gehabilities Com Then for the last two years, the bars are all short again, a bit under en dels per year on cach shift. "How does this pattern fit with Kristen Gilbert's time at the VA? Irurit that Me Gilbert hegan work on Ward C the medical and, in March of 1990, and sopped working at the VA in February of 1996. Looking at the death by year the pattern trada Me Gilbert wode history small number of death in years when she didn't work at the VA. and large numbers when she was there "We can learn more by looking at the different shifts. You'll potice that in each of the years that M. Gilliers wocodon Ward Cone of the thread always show more death than the other two. For five of the years. 1991 og 1995, the evening shift that stands out. During the five years Ey Cour... Animations :) E League Scoreboar... Pearson CANVAS Then for the last two years, the bars are all short again, a bit under en deaths per year on each shift "How does this partem fit with Kristen Gilben's time at the VA? It turns out that Ms. Gilbere began work on Ward C the medical and, in March of 1990, and stopped working at the VA in February of 1996. Looking at the deaths by year the pattern tracks Me Gilbert work history imall numbers of deaths in years when she didn't work at the VA, and large numbers when she was there "We can learn more by looking at the different shifts. You'll notice that in each of the years that Ms. Gilbert world on Wone of the three shifts always show more deaths than the other two. For five of the six yean. 1991 through 1995, it's the evening shift that shout. During these five years, Ms. Gilbert was signed to the evening shift "What about the exception, 19902 That year it is the night shift, not the evening shift that ands out as having an unually large number of death. Well, it turns out that 1990 was also an exception for Kristen Gilbert work history That year she was signed not to the evening shift but to the night Shift At this point in the argument, there is a des partem acting Gilbert's presence with exceeded lowever, in principle the pattern migliebe nothing more than the result of ordinary expectable variation The goal of a Mattical test in this station would be to determine whether the number of excess death were too extreme to be accounted for by such valtion. In order to prepare the furots to think about satistical ther, Gehlbach fint explained the basic ideas in a more familiar com Part Two: Variability and p-values De Gehlbach "To understand the idea of a statistical test, think about toning a coin. How can you decide whether there's something spicious about of 10 coin flips Ordinarily, we expect a coin to be fair, which means theresa 50-50 chance of heads. This is our hypothesis, the starting point of our reasoning. If you flip 10 times, and the coin in fair, then on average yould expect five heads to show up. But you know that you might person, or even eightheade Things vary, and sometimes the variation is de just recuno "Now suppose yo 10 heads in lollip Is the much to be apicul Huwstrumen cotcome do we need before we should doubt oor hypothesis that the chance of hrade i 50! "Tower thuis Gelon, statisticiam compete a values start with the hypothes 50-50 chance of headed compute the probability of six head, wven heads, and Deltrument that the probability of at least a heads in 10 flipsis about 0.36. This mother of the time when you make 10 fps with air coin, you'll get the If something happens of the time slothing Eor the time you do 10 flips of fair coin, you'o get even heads or more. So, seven out of 10 ian't really surprising, either. "If you pet ning heads in 10 flips, however, that's musul If the con la falt, then you're unlikely to get a rule that extreme. The probability of p-value, for nine or more heade is only about 0.01, IN "For 10 out of 10 the p-value is about 0.001, or one in 1000. This is a result to extreme that you'd almost ever get it from a fair coin. If you saw mc pull a coin out of my pocket, flip it 10 times, and yet heads every time, you'd be justified in thinking there's something going on besides just chance variation "That's how statisticians we a p value. If we see a result with a really low p-value, then either we've seen a beally are outcome or else the hypothesis we used to compute the p valse must be wrong "In many medical trial-testing whether antihistamines relieve your symptoms of allergies, or things like that we compute a p-value suming the drug has no effect, and a probability of one out of 100 is unusual enough, and the evidence would be considered strong cough, to conclude that the medicine actually worked. G&G Sis is the DEATH ON THE GILBERT PRESENT 34 24 TABLE The basis of the states 212 1350 1542 13 1041 De Now, with the basic logic out on the walls was time to print formation. What follows is just one focused part of the set of Gellbach ictually presented Part Three: A Statistical Text Athis point, Geilbudslowed the jury data like that shown in Tablet De Gehlbach "The table summarecords for the Il man lading op so the end of February 1996. (That Febrny was the month when Me Guben coworkers met with their perven express the concema, shortly after cha Me Gilbert took medicale) With 527 days during the period in question, and three shifu per days there Wen 16 shifes in all Out of the 16tif, there were Sep 24 2020 Sep 2009 GILBERT ON TRIAL The grand jury found the evidence pervasive and indicted Gilbert Because the VA hospital is legally the property of the federal govern ment, Gilbert would stand trial in federal district court on four counts of murder and three additional counts of attempted murder The question of jurisdiction was important because although the state of Massachusetts has no death penalty, Gilbert was facing a fed eral Indictment, governed by federal rather than state laws, and Assistant U.S. Attorney Welch decided to ask for the death penalty. Kristen Gilbert would be on trial for her life. Befour the trial got under way, the judge, Michael A. Ponsor, hud to rule on whether the jury should be allowed to hear the statisticale idence. On the one hand this seems like a no brainer. After all, if the evidence was an important part of what was presented to the grand jury, if it was appropriate for them to hear, and if they found it com pelling, what could possibly be wrong with letting the trial jury hear the same timony! On the other hand, a counterargument might be that allowing the statistical evidence would just lead to the anhelpful distraction of "dueling experts. The court system allows expert testi mony when the evidence involves specialised technical or scientific inues that go beyond what member of the jury would ordinarily be familiar with. The purpose of the experts is to provide explanations of the science or of the technical facts involved, along with the appropri ate conclusions. In other words, they help the jury understand the Cobb & Get in the 11 evidence bets and the US Supreme Court has set guidelines aimed at making sure that scientification is not admited. The goal is to help ensure that the verdict will be scientifically sound Nevertheless, attorneys sometimes say that if there is apertestimony on one side, the other side hires another expert who will disagree and the jury, rather than think through the explanation will simply is moet it all. One expert cances the other. Although this view may be everly cynical, no doubt it does have a basis in fact, Rather than rely on the crude strategy of dueling experts, Gilbert's defense atomes asked the other of the two of us (Cobb) to prepare a written port for the joy summarising the reasons why it would no le appropriate for the new jury, the trial jury to hear the wine evidence thatchbach led presenud ale to the grand jury. In the nesterveral paragraph, you will read a summary of the main points is dat report. This time, put yourself in the position of Judge Post Do you find the points permet Would you have allowed the jury to be the statical evidence or not HYPOTHESIS TESTING II So far in the Gehlbach time the interpretation of hypothesis terting has found on what it is that a tiny p-valuede sell you. It telle you that the observed to be explained due to chance variation. This was actly the data for the rary Were there to many cadetle when there be pidou la tha finel The dear away in the Colbwport the focus on things that my plus de tell you Unfortunately for people who need to unde lyphosting the valid con copie The COBB'S REPORT TO JUDGE PONSOR Lexing anide a variety of secondary technical fucs, the Cobb report made three main points. One of them was to agree with the bottom line conclusion in Gehlbach's testimony. The other two dealt with two limitations on what you can learn from a tiny palut. 12 con Social Scimo Point One: The Defense and Prosecution Statisticians Agree As mentioned earlier, often the two experts who provide testimony on scientific evidence disagree. However, that was not what happened in the Gilbert Case Cobbis report agreed with chilbachistestimony before the grand jury. We both cheaght the pattern linking Giller's presence on the ward with cace deaths was far too strong to be re- garded a mere coincidence due to chancelike variation. We both thought too, that in the ance of any innocent explanation for the partern, the association was more than strong enough to justify the indictment. Why then, shouldn't the trial jury hear the testimony? To answer that question we proceed with Cobb's ochet two points Point Two Association Is Not Causation The grand jury and the trial jury have quite different decisions 10 make as they weigh the evidence, and the difference is dosely tied to what a value does and does not tell yon. The grand jury had to de cide whether or not Gilbert should stand trial. Was there enough picion to justify the expense oo the government and the pacho Jopical burden on Gilbert to hold what promised to be a long and pensive trial! A grand jury does not have to decide guilt or innocence beyond a reasonable doubt. For them, the standard is much lower. They are simply asked to determine whether the level of suspicion is high enough. This is precisely the kind of question that logic of hy prochestening is designed to awwe. In statistics, and in since generally, the bar is quite high for what deserve to be considered trong suspicion, typically a p-value of 0.05 or 0.01 A low value establishes picion by long aucun was an explanation Notice that a low value does not provide an explanation, Indoesn't wy. "Here. This is the son for the excess death. What it was is much more limited: "Whatever the explanation may be you can be qulle confident that it is sur mere chance variation." The trial jury luded to decide whether the facts look api- ciou. By the time comes to trial, the decisionalspiciun had been made. The wil jury leaked to decide the reason for dhe aplicou facu Were the dead caused by Gillent giving fatalinjalone? Or were there enough uncertainties that the case could not be determined byondamnable doube Because Cobb & Gehlbach: Statistics in the Courtroom 9 DEATH ON SHIFT? Yes GILBERT PRESENT No Total 257 217 1350 1567 1384 1641 Yes 40 No 34 Total 74 TABLE 1 The basis of the statistical test Note: The table is based on the following data Number of days 541 Number of this 1641 Number of death 74 Deather than 0.045 Shits with pre 257 Expected number of deaths 11.59 Derved number of deaths Now, with the basic logic out on the table, it was time to present a formal test. What follows is just one focused part of the set of tests Gehlbach actually presented. Part Three: A Statistical Test At this point, Gehlbach showed the jury data like that shown in Table 1. Dr. Gehlbach: "The table summarizes records for the Page of 16 evidence betterand the U.S. Supreme Court has set guidelines aimed at making sure that unscientific testimony is not admitted. The goal is to help ensure that the verdict will be scientifically sound. Nevertheless, attorneys sometimes say that if there is expert testimony on one side, the other side hires another expert who will disagree, and the jury, rather than think through the explanations, will simply ig- nore it all. One expert cancels the other. Although this view may be overly cynical, no doubt it does have a basis in fact. Rather than rely on the crude strategy of dueling experts, Gilbert's defense attorneys asked the other of the two of us (Cobb) to prepare a written report for the judge summarizing the reasons why it would not be appropriate for the new jury, the trial jury, to hear the same evidence that Gehlbach had presented earlier to the grand jury. In the next several paragraphs, you will read a summary of the main points in that report. This time, put yourself in the position of Judge Ponsor. Do you find these points persuasive? Would you have allowed the jury to hear the statistical evidence or not? HYPOTHESIS TESTING II So far, in the Gehlbach testimony, the interpretation of hypothesis testing has focused on what it is that a tiny p-value does tell you. It tells you that the observed result is too extreme to be explained as due to chancelike variation. This was exactly the relevant issue for the grand jury: Were there so many excess deaths when Gilbert was present as to be suspicious in the eyes of science? The clear answer was yes. In the Cobb report, the focus was on things that tiny p-values do not tell you. Unfortunately for people who need to understand hypothesis testing, these invalid conclusions are a constant temptation. They seem to make sense intuitively, but they are wrong, and so they have great potential to mislead the unwary. This potential for logical mischief dhe bar forced QUESTIONS 1. Explain why Figure 1 (along with the shift pattern of Ms. Gilbert) suggests that Ms. Gilbert may be guilty of excess deaths on the medical ward. 18 MART 1 Public Policy and Social Science 2. Why was the evidence from Figure 1 (along with the shift pattern of Ms. Gilbert) not conclusive evidence that Ms. Gilbert was guilty of the excess deaths? Suggest an explanation that could have caused the association without the unusual activity of Ms. Gilbert. 3. What is the relevance of the coin-tossing story to the trial of Ms. Gilbert? 4. Cobb argued that a jury would likely fall into the prosecutor's fallacy." What is it, and why was the defense concerned about it? tempting, and so common, that it has become known to statisticians as the prosecutor's fallacy. Because the false logic beckons so seduc- tively, it is often used as the basis for arguing, as the Cobb report did, that the statistical evidence was likely to be misinterpreted by the jury in a way that favored the prosecution and was therefore prejudicial. CONCLUSION Judge Ponsor ruled that the statistical evidence should not be allowed at trial. Nevertheless, the other, nonstatistical evidence proved to be enough to convince the jury, and after many days of deliberation, Gilbert was convicted on three counts of first-degree murder, one count of second-degree murder, and two counts of attempted mur- det. After a penalty phase of the trial, the jury voted 8-4 for a death Cobb & Gehlbach Statistics in the Courtroom 17 sentence, and because the vote was not unanimous, Gilbert's life was spared. She is now serving a sentence of life in prison without possi- bility of parole. The statistical analysis that uncovered the pattern linking Gilbert's presence to the excess deaths was an essential part of the process that brought her to justice. The two juries that Gilbert faced, and their different roles in our system of justice, illustrate neatly the proper interpretation of hypothesis testing. First, a small p-value dias allow you to rule out chancelike variability as a plausible explanation for an observed pattern. It tells you that the observed pattern is so ex- treme as to qualify as a surprise in the eyes of science. Second, if your data are observational. a small p-value does not tell you what has tuned the surprise. Association is not causation. Inferences about cause are much more straightforward with a randomized experiment. ADDITIONAL READINGS A NURSE ACCUSED By the mid-1990s, Kristen Gilbert had been working for several years wa nurse at the Veteran's Administration (VA) hospital in Northamp- ton, Massachusetts. For a time, she had been one of the nurses the ochers most often looked up to as an example of skill and competence She had established a reputation for being particularly good in a cri- sia. If a patient went into cardiac arrest, for example, she was often the first to notice that something was wrong. She would sound a "code blue," the signal that brough the aid of the rocitation team. She stayed calm, and she know how to give a shor of the stimulant epi- nephrine, a synthetic form of adrenaline, to try to restart a patient's heart. Often the adrenaline did its job, the heart began to beat again, and the patient's life was saved, 4 Pin Serience Lately, though, other nurses had become increasingly suspicious that something was not right. To some, it seemed that there were 100 many codes called, too many crise when Gilbert was on the ward. Over time, the suspicions became stronger. Several patients who went into arrest died, and to some of the staff, the number of death was a sinister sign. An investigation was launched. Although an initial report by the VA found that the numbers of deaths were consistent with the parters at other VA hospitals, the suspicions of the staff remained. Eventually, after additional investigation, including a statistical analy sis by one of us (Gehlbach). Assistant U.S. Attorney William Welch convened a grand jury in 1998 to hear the evidence against Gilbert Welch accused her of lailling several patients by giving them focal doses of heart stimulant, and he wanted her indicted for multiple murders Kristen Gilbert was the mother of two young children. Although the war divorced, she had been dating a male friend for some time. She had a steady job, one that paid reasonably well, and her skills a nuese was generally recognised. What could possibly motivate her to commit the murders that she was now suspected of these were not mercy Killing the victims in Welchisindictmen were not old man or in poor health but were middle-aged, and healthy enough that their death wete unexpected. Welch argued that Kristen Gilbers did have a for her actions. She liked the thrill of a crisis, she needed the recognition dar came from her silful handling of a cardiac arrest, and, especially dhe wanted to impras her boyfriend, who also worked at the hospital Purt of the evidence againt Gilbert dealt with her motivation, part of it came from the testimony of coworkers about her access to the epi nephrine des accused of using in the alleged munters, and purt came from a ploysician who died shout the symptoms of the man who had died. Taken zogether, die evidence was certainly the, but would I be convincing No one had seen Gilbert give fatal injections, and al- though the patienu' death were expected, the symptoms could have been comidered content with other possible causes of death. It turned out that a major part of the evidence against Gilbert wat stitical HYPOTHESIS TESTING I bu share w ble Body Cou... HYPOTHESIS TESTING I A key question for the grand juros was this: Was it true that there were more deal when Kristen Gifhert was working. Not just one or two extra deathee or two could easily be due just to coincidence-but Cobb & Gile enough to be truly picious? If not, there might not be enough ev. idence to justify bringing Gilbert to trial. On the other hand, anan wwer of yes would call for an explanation, and enough other evidence pointed to Gilbert to make an indictment all but certain The prosecutors recognised that the key question about ce deutha was one that could only be answered using statistics, and so they asked Stephen Gehlbach, who had done the statistical analysis of the hospital records to presenta ummary of the rules to the grand jury. In what follows, we will present you with a similar sum- mary of the statistical evidence. As you read through the summary Imagine yourself as one of the grand jarors. Do you find the evidence strong enough to bring Gilbert to trial? "The statistical substance involves hypothesis testing, a form of reasoning that is probability calculations to decide whether or not an observed outcome should be regarded as sensual creme-that it qualities as a scientific surprise." The logic and interpretation of hypothesis testing is fundamental to a lot of work in the natural and social sciences, important enough that anyone serious about understanding how science works should undenstand this form of reasoning Unfortunately, in many statistics course, the logic of hypothesis testing is taught at the same time as some of the probability calculations that you need for particular applica tion, and the details of the computations tend to eclipse the un derlying logic. Part of the challenge facing Gehlbach was to make the loke clear to the grand jury without going into the details of the calculations GEHLBACH'S TESTIMONY TO THE GRAND JURY Dr. Gehlbach's testimony was delivered orally, with Gehlbach in the witnes sand, talking to the members of the grand jury. The next sev eral paragraphe summarize three parts of Gehlbach testimony, a fint part about the pattern of death, by shift and by year, on the medical ward where Gilbert worked a second part about variability and values and a third part shout a statical test for whether the er linking the extendech to Gilbert's prence on the ward w 100 extrum arded due to ordinary expectable variability 10 0 1 200 101 102 1903 1904 1906 Night AMD AM4PM) Even 4 - FIGURE The pum of dort by year and the The summaries don't use the exact words from the grand jury testi mony, but they cover some of the same substance Part One: The Pattern of eaths Imagine that at this point in Gehlbach's testimony, the jurors are looking at a gruph like the one in Figure 1: Dr. Gehlbach: "The graph you see shows data from the VA hospital where Kristen Gilbert world. Each set of three bars shows one year's worth of clau, starting in 1988 and running through 1997 Within each set of three bars, there is one bar for each shift. The left bar is for the night lift, midnight to 8 AM the middle bar is for the day shift, 8AM to and the right bar is for the evening shift.4 M. to midnight. The height of each bar tells how many deaths there were on a shift for the year in question. "Now look at the pattern from one year to the next. For the first two years,' and '89, the bars are short, showing roughly to deaths per year on cach life. Then there is a dramatic increase. For the year 1990 through 1995, there is onc shift in each of three with 25 to 35 deaths per year. C Gehabilities Com Then for the last two years, the bars are all short again, a bit under en dels per year on cach shift. "How does this pattern fit with Kristen Gilbert's time at the VA? Irurit that Me Gilbert hegan work on Ward C the medical and, in March of 1990, and sopped working at the VA in February of 1996. Looking at the death by year the pattern trada Me Gilbert wode history small number of death in years when she didn't work at the VA. and large numbers when she was there "We can learn more by looking at the different shifts. You'll potice that in each of the years that M. Gilliers wocodon Ward Cone of the thread always show more death than the other two. For five of the years. 1991 og 1995, the evening shift that stands out. During the five years Ey Cour... Animations :) E League Scoreboar... Pearson CANVAS Then for the last two years, the bars are all short again, a bit under en deaths per year on each shift "How does this partem fit with Kristen Gilben's time at the VA? It turns out that Ms. Gilbere began work on Ward C the medical and, in March of 1990, and stopped working at the VA in February of 1996. Looking at the deaths by year the pattern tracks Me Gilbert work history imall numbers of deaths in years when she didn't work at the VA, and large numbers when she was there "We can learn more by looking at the different shifts. You'll notice that in each of the years that Ms. Gilbert world on Wone of the three shifts always show more deaths than the other two. For five of the six yean. 1991 through 1995, it's the evening shift that shout. During these five years, Ms. Gilbert was signed to the evening shift "What about the exception, 19902 That year it is the night shift, not the evening shift that ands out as having an unually large number of death. Well, it turns out that 1990 was also an exception for Kristen Gilbert work history That year she was signed not to the evening shift but to the night Shift At this point in the argument, there is a des partem acting Gilbert's presence with exceeded lowever, in principle the pattern migliebe nothing more than the result of ordinary expectable variation The goal of a Mattical test in this station would be to determine whether the number of excess death were too extreme to be accounted for by such valtion. In order to prepare the furots to think about satistical ther, Gehlbach fint explained the basic ideas in a more familiar com Part Two: Variability and p-values De Gehlbach "To understand the idea of a statistical test, think about toning a coin. How can you decide whether there's something spicious about of 10 coin flips Ordinarily, we expect a coin to be fair, which means theresa 50-50 chance of heads. This is our hypothesis, the starting point of our reasoning. If you flip 10 times, and the coin in fair, then on average yould expect five heads to show up. But you know that you might person, or even eightheade Things vary, and sometimes the variation is de just recuno "Now suppose yo 10 heads in lollip Is the much to be apicul Huwstrumen cotcome do we need before we should doubt oor hypothesis that the chance of hrade i 50! "Tower thuis Gelon, statisticiam compete a values start with the hypothes 50-50 chance of headed compute the probability of six head, wven heads, and Deltrument that the probability of at least a heads in 10 flipsis about 0.36. This mother of the time when you make 10 fps with air coin, you'll get the If something happens of the time slothing Eor the time you do 10 flips of fair coin, you'o get even heads or more. So, seven out of 10 ian't really surprising, either. "If you pet ning heads in 10 flips, however, that's musul If the con la falt, then you're unlikely to get a rule that extreme. The probability of p-value, for nine or more heade is only about 0.01, IN "For 10 out of 10 the p-value is about 0.001, or one in 1000. This is a result to extreme that you'd almost ever get it from a fair coin. If you saw mc pull a coin out of my pocket, flip it 10 times, and yet heads every time, you'd be justified in thinking there's something going on besides just chance variation "That's how statisticians we a p value. If we see a result with a really low p-value, then either we've seen a beally are outcome or else the hypothesis we used to compute the p valse must be wrong "In many medical trial-testing whether antihistamines relieve your symptoms of allergies, or things like that we compute a p-value suming the drug has no effect, and a probability of one out of 100 is unusual enough, and the evidence would be considered strong cough, to conclude that the medicine actually worked. G&G Sis is the DEATH ON THE GILBERT PRESENT 34 24 TABLE The basis of the states 212 1350 1542 13 1041 De Now, with the basic logic out on the walls was time to print formation. What follows is just one focused part of the set of Gellbach ictually presented Part Three: A Statistical Text Athis point, Geilbudslowed the jury data like that shown in Tablet De Gehlbach "The table summarecords for the Il man lading op so the end of February 1996. (That Febrny was the month when Me Guben coworkers met with their perven express the concema, shortly after cha Me Gilbert took medicale) With 527 days during the period in question, and three shifu per days there Wen 16 shifes in all Out of the 16tif, there were Sep 24 2020 Sep 2009 GILBERT ON TRIAL The grand jury found the evidence pervasive and indicted Gilbert Because the VA hospital is legally the property of the federal govern ment, Gilbert would stand trial in federal district court on four counts of murder and three additional counts of attempted murder The question of jurisdiction was important because although the state of Massachusetts has no death penalty, Gilbert was facing a fed eral Indictment, governed by federal rather than state laws, and Assistant U.S. Attorney Welch decided to ask for the death penalty. Kristen Gilbert would be on trial for her life. Befour the trial got under way, the judge, Michael A. Ponsor, hud to rule on whether the jury should be allowed to hear the statisticale idence. On the one hand this seems like a no brainer. After all, if the evidence was an important part of what was presented to the grand jury, if it was appropriate for them to hear, and if they found it com pelling, what could possibly be wrong with letting the trial jury hear the same timony! On the other hand, a counterargument might be that allowing the statistical evidence would just lead to the anhelpful distraction of "dueling experts. The court system allows expert testi mony when the evidence involves specialised technical or scientific inues that go beyond what member of the jury would ordinarily be familiar with. The purpose of the experts is to provide explanations of the science or of the technical facts involved, along with the appropri ate conclusions. In other words, they help the jury understand the Cobb & Get in the 11 evidence bets and the US Supreme Court has set guidelines aimed at making sure that scientification is not admited. The goal is to help ensure that the verdict will be scientifically sound Nevertheless, attorneys sometimes say that if there is apertestimony on one side, the other side hires another expert who will disagree and the jury, rather than think through the explanation will simply is moet it all. One expert cances the other. Although this view may be everly cynical, no doubt it does have a basis in fact, Rather than rely on the crude strategy of dueling experts, Gilbert's defense atomes asked the other of the two of us (Cobb) to prepare a written port for the joy summarising the reasons why it would no le appropriate for the new jury, the trial jury to hear the wine evidence thatchbach led presenud ale to the grand jury. In the nesterveral paragraph, you will read a summary of the main points is dat report. This time, put yourself in the position of Judge Post Do you find the points permet Would you have allowed the jury to be the statical evidence or not HYPOTHESIS TESTING II So far in the Gehlbach time the interpretation of hypothesis terting has found on what it is that a tiny p-valuede sell you. It telle you that the observed to be explained due to chance variation. This was actly the data for the rary Were there to many cadetle when there be pidou la tha finel The dear away in the Colbwport the focus on things that my plus de tell you Unfortunately for people who need to unde lyphosting the valid con copie The COBB'S REPORT TO JUDGE PONSOR Lexing anide a variety of secondary technical fucs, the Cobb report made three main points. One of them was to agree with the bottom line conclusion in Gehlbach's testimony. The other two dealt with two limitations on what you can learn from a tiny palut. 12 con Social Scimo Point One: The Defense and Prosecution Statisticians Agree As mentioned earlier, often the two experts who provide testimony on scientific evidence disagree. However, that was not what happened in the Gilbert Case Cobbis report agreed with chilbachistestimony before the grand jury. We both cheaght the pattern linking Giller's presence on the ward with cace deaths was far too strong to be re- garded a mere coincidence due to chancelike variation. We both thought too, that in the ance of any innocent explanation for the partern, the association was more than strong enough to justify the indictment. Why then, shouldn't the trial jury hear the testimony? To answer that question we proceed with Cobb's ochet two points Point Two Association Is Not Causation The grand jury and the trial jury have quite different decisions 10 make as they weigh the evidence, and the difference is dosely tied to what a value does and does not tell yon. The grand jury had to de cide whether or not Gilbert should stand trial. Was there enough picion to justify the expense oo the government and the pacho Jopical burden on Gilbert to hold what promised to be a long and pensive trial! A grand jury does not have to decide guilt or innocence beyond a reasonable doubt. For them, the standard is much lower. They are simply asked to determine whether the level of suspicion is high enough. This is precisely the kind of question that logic of hy prochestening is designed to awwe. In statistics, and in since generally, the bar is quite high for what deserve to be considered trong suspicion, typically a p-value of 0.05 or 0.01 A low value establishes picion by long aucun was an explanation Notice that a low value does not provide an explanation, Indoesn't wy. "Here. This is the son for the excess death. What it was is much more limited: "Whatever the explanation may be you can be qulle confident that it is sur mere chance variation." The trial jury luded to decide whether the facts look api- ciou. By the time comes to trial, the decisionalspiciun had been made. The wil jury leaked to decide the reason for dhe aplicou facu Were the dead caused by Gillent giving fatalinjalone? Or were there enough uncertainties that the case could not be determined byondamnable doube Because Cobb & Gehlbach: Statistics in the Courtroom 9 DEATH ON SHIFT? Yes GILBERT PRESENT No Total 257 217 1350 1567 1384 1641 Yes 40 No 34 Total 74 TABLE 1 The basis of the statistical test Note: The table is based on the following data Number of days 541 Number of this 1641 Number of death 74 Deather than 0.045 Shits with pre 257 Expected number of deaths 11.59 Derved number of deaths Now, with the basic logic out on the table, it was time to present a formal test. What follows is just one focused part of the set of tests Gehlbach actually presented. Part Three: A Statistical Test At this point, Gehlbach showed the jury data like that shown in Table 1. Dr. Gehlbach: "The table summarizes records for the Page of 16 evidence betterand the U.S. Supreme Court has set guidelines aimed at making sure that unscientific testimony is not admitted. The goal is to help ensure that the verdict will be scientifically sound. Nevertheless, attorneys sometimes say that if there is expert testimony on one side, the other side hires another expert who will disagree, and the jury, rather than think through the explanations, will simply ig- nore it all. One expert cancels the other. Although this view may be overly cynical, no doubt it does have a basis in fact. Rather than rely on the crude strategy of dueling experts, Gilbert's defense attorneys asked the other of the two of us (Cobb) to prepare a written report for the judge summarizing the reasons why it would not be appropriate for the new jury, the trial jury, to hear the same evidence that Gehlbach had presented earlier to the grand jury. In the next several paragraphs, you will read a summary of the main points in that report. This time, put yourself in the position of Judge Ponsor. Do you find these points persuasive? Would you have allowed the jury to hear the statistical evidence or not? HYPOTHESIS TESTING II So far, in the Gehlbach testimony, the interpretation of hypothesis testing has focused on what it is that a tiny p-value does tell you. It tells you that the observed result is too extreme to be explained as due to chancelike variation. This was exactly the relevant issue for the grand jury: Were there so many excess deaths when Gilbert was present as to be suspicious in the eyes of science? The clear answer was yes. In the Cobb report, the focus was on things that tiny p-values do not tell you. Unfortunately for people who need to understand hypothesis testing, these invalid conclusions are a constant temptation. They seem to make sense intuitively, but they are wrong, and so they have great potential to mislead the unwary. This potential for logical mischief dhe bar forced

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access with AI-Powered Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Financial Accounting

Authors: Robert Libby, Patricia Libby, Frank Hodge

9th edition

290-1259222138, 1259222136, 978-1259222139

Students also viewed these Accounting questions