Sampling distribution
18. There is a screening test for prostate cancer that looks at the level of PSA (prostate-specific antigen) in the blood. There are a number of reasons besides prostate cancer that a man can have elevated PSA levels. In addition, many types of prostate cancer develop so slowly that they are never a problem. Unfortunately, there is currently no test to distinguish the different types, and using the test is controversial because it is hard to quantify the accuracy rates and the harm done by false positives. For this problem we'll call a positive test a true positive if it catches a dangerous type of prostate cancer. We'll assume the following numbers:
Rate of prostate cancer among men over 50 = 0.0005
True positive rate for the test = 0.9
False positive rate for the test = 0.01
Let T be the event a man has a positive test and let D be the event a man has a dangerous type of the disease. Find P(D | T) and P(D | T^c), where T^c is the event of a negative test.

16. Consider the Monty Hall problem. Let's label the door with the car behind it a and the other two doors b and c. In the game the contestant chooses a door and then Monty chooses a door, so we can label each outcome as 'contestant followed by Monty', e.g., ab means the contestant chose a and Monty chose b.
(a) Make a 3 x 3 probability table showing probabilities for all possible outcomes.
(b) Make a probability tree showing all possible outcomes.
(c) Suppose the contestant's strategy is to switch. List all the outcomes in the event 'the contestant wins a car'. What is the probability the contestant wins?
(d) Redo part (c) with the strategy of not switching.

6. (a) A coin is tossed 100 times and lands heads 62 times. What is the maximum likelihood estimate for θ, the probability of heads?
(b) A coin is tossed n times and lands heads k times. What is the maximum likelihood estimate for θ, the probability of heads?

7. Suppose the data set y_1, ..., y_n is drawn from a random sample consisting of i.i.d. discrete uniform distributions with range 1 to N.
Find the maximum likelihood estimate of N.

8. Suppose data x_1, ..., x_n is drawn from an exponential distribution exp(λ). Find the maximum likelihood estimate for λ.

9. Suppose x_1, ..., x_n is a data set drawn from a geometric(1/a) distribution. Find the maximum likelihood estimate of a. Here, geometric(p) means the probability of success is p and we run trials until the first success and report the total number of trials, including the success. For example, the sequence FFFFS is 4 failures followed by a success, which produces x = 5.

10. You want to estimate the size of an MIT class that is closed to visitors. You know that the students are numbered from 1 to n, where n is the number of students. You call three random students out of the classroom and ask for their numbers, which turn out to be 1, 3, 7. Find the maximum likelihood estimate for n. (Hint: the student numbers are drawn from a discrete uniform distribution.)

Exam 2 Practice 2, Spring 2014

5 Bayesian updating: discrete prior, discrete likelihood

11. Twins. Suppose 1/3 of twins are identical and 2/3 of twins are fraternal. If you are pregnant with twins of the same sex, what is the probability that they are identical?

12. Dice. You have a drawer full of 4, 6, 8, 12 and 20-sided dice. You suspect that they are in proportion 1:2:10:2:1. Your friend picks one at random and rolls it twice, getting 5 both times.
(a) What is the probability your friend picked the 8-sided die?
(b) (i) What is the probability the next roll will be a 5?
(ii) What is the probability the next roll will be a 15?

13. Sameer has two coins: one fair coin and one biased coin which lands heads with probability 3/4. He picks one coin at random (50-50) and flips it repeatedly until he gets a tails. Given that he observes 3 heads before the first tails, find the posterior probability that he picked each coin.
(a) What are the prior and posterior odds for the fair coin?
(b) What are the prior and posterior predictive probabilities of heads on the next flip?
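The update asked for in problem 13 can be organized as a Bayesian update table: multiply each prior by its likelihood, then normalize. The following is a minimal sketch, assuming the data is read as the exact sequence HHHT (three heads, then the first tails). The helper names `bayes_update` and `predictive_heads` are illustrative and not part of the problem set.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior: prior times likelihood, normalized to sum to 1."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


def predictive_heads(dist, p_heads):
    """Probability of heads on the next flip, averaged over the hypotheses."""
    return sum(dist[h] * p_heads[h] for h in dist)


# Hypotheses: which coin was picked, with its probability of heads.
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# Prior: the coin is picked 50-50.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# Likelihood of the assumed data HHHT under each hypothesis: p^3 * (1 - p).
likelihood = {h: p**3 * (1 - p) for h, p in p_heads.items()}

posterior = bayes_update(prior, likelihood)

# Odds for the fair coin are P(fair) / P(biased).
prior_odds_fair = prior["fair"] / prior["biased"]
posterior_odds_fair = posterior["fair"] / posterior["biased"]

prior_predictive = predictive_heads(prior, p_heads)
posterior_predictive = predictive_heads(posterior, p_heads)

print("posterior:", posterior)
print("odds for fair (prior, posterior):", prior_odds_fair, posterior_odds_fair)
print("P(heads next) (prior, posterior):", prior_predictive, posterior_predictive)
```

Using `Fraction` keeps every value exact, so the printed results can be compared directly with a hand-computed table.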
In problem 13(b), prior predictive means prior to considering the data of the first four flips.

6 Bayesian Updating: continuous prior, discrete likelihood

14. Peter and Jerry disagree over whether 18.05 students prefer Bayesian or frequentist statistics. They decide to pick a random sample of 10 students from the class and get Shelby to ask each student which they prefer. They agree to start with a prior f(θ) ~ beta(2,2), where θ is the percent that prefer Bayesian.
(a) Let x_1 be the number of people in the sample who prefer Bayesian statistics. What is the pmf of x_1?
(b) Compute the posterior distribution of θ given x_1 = 6.
(c) Use R to compute 50% and 90% probability intervals for θ. Center the intervals so that the leftover probability in both tails is the same.
(d) The maximum a posteriori (MAP) estimate of θ (the peak of the posterior) is given by θ = 7/12, leading Jerry to concede that a majority of students are Bayesians. In light of your answer to part (c), does Jerry have a strong case?
(e) They decide to get another sample of 10 students and ask Neil to poll them. Write down in detail the expression for the posterior predictive probability that the majority of