2. For the previous problem, we were implicitly assuming each chapter had the same length. Remember that for $Y_i \sim \text{Poisson}(\lambda)$, $E[Y_i] = \lambda$ for each chapter, i.e. the average number of occurrences of "dark" is the same in each chapter. Obviously this isn't a great assumption, since the lengths of the chapters vary; longer chapters should be more likely to have more occurrences of the word.

We can augment the model by considering properties of the Poisson distribution. The Poisson is often used to express the probability of a given number of events occurring for a fixed "exposure". As a useful example of the role of the exposure term, when counting the number of events that happen in a set length of time, we need to account for the total time that we observe the process. For this text example, exposure is not time, but rather corresponds to the total length of the chapter.

We will again let $(y_1, \ldots, y_n)$ represent counts of the word "dark". In addition, we now count the total number of words in each chapter $(v_1, \ldots, v_n)$ and use this as our exposure. Let $Y_i$ denote the random variable for the counts of the word "dark" in a chapter with $v_i$ words. Let's assume that the quantities $Y_1, \ldots, Y_n$ are independent, with each $Y_i$ following a Poisson distribution with rate $\lambda \cdot v_i / 1000$, where $\lambda$ is an unknown parameter:

$$p(Y_i = y_i \mid v_i, \lambda) = \text{Poisson}(y_i \mid \lambda \cdot v_i / 1000) \quad \text{for } i = 1, \ldots, n.$$

In the code below, chapter_lengths is a vector storing the length of each chapter in words.

    # NOTE: the name of the input data frame was lost in extraction; `word_counts`
    # is an assumed name for the per-chapter word counts (columns `chapter`, `n`).
    chapter_lengths <- word_counts %>%
      group_by(chapter) %>%
      summarize(chapter_length = sum(n)) %>%
      ungroup() %>%
      select(chapter_length) %>%
      unlist() %>%
      as.numeric()

(a) What is the interpretation of the quantity $v_i / 1000$ in this model? What is the interpretation of $\lambda$ in this model? State the units for these quantities in both of your answers.

(b) Fill in the 2×2 table of known and unknown variables and constants introduced in lecture 2. Make sure your table includes $Y_1, \ldots, Y_n$; $y_1, \ldots, y_n$; $n$; $\lambda$; and $v_i$.
(c) Write down the likelihood in this new model. Use this to calculate the maximum likelihood estimator for $\lambda$. Your answer should include the $v_i$'s.

(d) Plot the log-likelihood from the previous question in R, using the data on the frequency of "dark" and the chapter lengths. Compute the maximum likelihood estimate and interpret its meaning (make sure you include units in your answers!).
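
As a rough illustration of the mechanics in part (d), the sketch below evaluates a Poisson log-likelihood with an exposure term on a grid of candidate rates, plots it, and finds the maximizer numerically. The vectors `toy_counts` and `toy_lengths` are made-up values, not the data from the text, and `log_lik` is a hypothetical helper name; the actual word counts and `chapter_lengths` would take their place.

```r
# Toy data (made up for illustration, NOT the book data)
toy_counts  <- c(3, 1, 4, 0, 2)                 # occurrences of the word per chapter
toy_lengths <- c(2500, 1200, 3100, 800, 1900)   # chapter lengths in words

# Log-likelihood of lambda: each count is Poisson with mean lambda * v_i / 1000
log_lik <- function(lambda, y, v) {
  sum(dpois(y, lambda = lambda * v / 1000, log = TRUE))
}

# Evaluate the log-likelihood on a grid of candidate rates and plot the curve
lambda_grid <- seq(0.05, 5, by = 0.01)
ll_values <- sapply(lambda_grid, log_lik, y = toy_counts, v = toy_lengths)
plot(lambda_grid, ll_values, type = "l",
     xlab = "lambda", ylab = "log-likelihood")

# Numerical maximizer, useful as a check against the closed form from part (c)
fit <- optimize(log_lik, interval = c(0.01, 10),
                y = toy_counts, v = toy_lengths, maximum = TRUE)
abline(v = fit$maximum, lty = 2)   # mark the maximizer on the plot
fit$maximum
```

Working on the log scale with `dpois(..., log = TRUE)` avoids numerical underflow when many chapters are multiplied together, and comparing the numerical maximizer with the estimator derived in part (c) is a quick sanity check on the algebra.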