Only questions 3 and 4:
Problem 1. In class we gave the following equation for the bigram probability of a sequence of words $W^{(1)}, \ldots, W^{(k)}$:

$$\Pr(W^{(1)}, \ldots, W^{(k)}) = \prod_{i=1}^{k} \Pr(W^{(i)} \mid W^{(i-1)} = w^{(i-1)}) \qquad (1)$$

Using this formula, give an expression for the bigram probability of the sentence abab, where each character is treated as a word. Try to simplify the formula as much as possible.

Problem 2. Let us suppose that there are two possible symbols/words in our language, a and b. There are three conditional distributions in the bigram model for this language, $\Pr(W^{(i)} \mid W^{(i-1)} = a)$, $\Pr(W^{(i)} \mid W^{(i-1)} = b)$, and $\Pr(W^{(i)} \mid W^{(i-1)} = \text{start})$, where start is the start symbol which begins any sentence. These conditional distributions are associated with the parameter vectors $\vec{\theta}_a$, $\vec{\theta}_b$, and $\vec{\theta}_{\text{start}}$, respectively (these parameter vectors were implicit in the previous problem). For the current problem, we will assume that these parameters are fixed.

Suppose that we are given a sentence $W^{(1)}, \ldots, W^{(k)}$. We will use the notation $n_{x \to y}$ to denote the number of times that the symbol y occurs immediately following the symbol x in the sentence. For example, $n_{a \to a}$ counts the number of times that symbol a occurs immediately following the symbol a. Using Equation 1, give an expression for the probability of a sentence in our language:

$$\Pr(W^{(1)}, \ldots, W^{(k)} \mid \vec{\theta}_a, \vec{\theta}_b, \vec{\theta}_{\text{start}}) \qquad (2)$$

The expression should make use of the $n_{x \to y}$ notation defined above. (Hint: the expression should be analogous to the formula that we found for the likelihood of a corpus under a bag of words model.)

Problem 3. Let us set the parameter vectors in our bigram model as follows:

$\vec{\theta}_a = (0.7, 0.2, 0.1)$
$\vec{\theta}_b = (0.2, 0.7, 0.1)$
$\vec{\theta}_{\text{start}} = (0.5, 0.5, 0)$

For example, given the current symbol a, there is probability 0.7 of transitioning to the symbol a, and probability 0.2 of transitioning to the symbol b. The third term in each vector is the probability of the sentence ending after that symbol. Thus, given the current symbol a or b, there is probability 0.1 of the sentence ending.
Using your answer to the previous problem and these parameter values, calculate the probability of the string aabb.

Problem 4. In the previous problem we assumed that we knew the exact values of the parameter vectors $\vec{\theta}_a$, $\vec{\theta}_b$, and $\vec{\theta}_{\text{start}}$. In the current problem, we will assume that there are actually two possible sets of parameter vectors, $\theta_1$ and $\theta_2$. We do not know ahead of time which is the correct set of parameters.

The first set of parameters $\theta_1$ is defined by:

$\vec{\theta}_a = (0.7, 0.2, 0.1)$
$\vec{\theta}_b = (0.2, 0.7, 0.1)$
$\vec{\theta}_{\text{start}} = (0.5, 0.5, 0)$

The second set of parameters $\theta_2$ is defined by:

$\vec{\theta}_a = (0.2, 0.7, 0.1)$
$\vec{\theta}_b = (0.7, 0.2, 0.1)$
$\vec{\theta}_{\text{start}} = (0.5, 0.5, 0)$

We will assume that both sets of parameters have equal prior probability: $P(\theta_1) = P(\theta_2) = 0.5$. Compute the marginal probability of the string aabb given these possible sets of parameters.
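
For checking a hand calculation of Problems 3 and 4, a minimal Python sketch could look like the following. It rests on several assumptions that are not stated in the problem text: the string is read as "aabb" (the scanned text is damaged at that point), the final end-of-sentence transition is included as a factor, and the names `transition_counts`, `sentence_probability`, `marginal_probability`, `THETA_1`, and `THETA_2` are hypothetical, chosen only for illustration.

```python
from collections import Counter

START = "start"
END = "end"


def transition_counts(sentence):
    """Count each (previous, next) pair, i.e. the n_{x->y} values.

    Assumption: the start symbol is prepended and an end marker is
    appended, so the end-of-sentence transition is counted as well.
    """
    symbols = [START, *sentence, END]
    return Counter(zip(symbols, symbols[1:]))


def sentence_probability(sentence, params):
    """Multiply theta_{x->y} ** n_{x->y} over all observed transitions."""
    prob = 1.0
    for (prev, nxt), n in transition_counts(sentence).items():
        prob *= params[prev][nxt] ** n
    return prob


def marginal_probability(sentence, param_sets, priors):
    """Average the sentence probability over parameter sets, weighted by prior."""
    return sum(
        prior * sentence_probability(sentence, params)
        for params, prior in zip(param_sets, priors)
    )


# Parameter sets from Problems 3 and 4, stored as nested dictionaries.
THETA_1 = {
    "a": {"a": 0.7, "b": 0.2, END: 0.1},
    "b": {"a": 0.2, "b": 0.7, END: 0.1},
    START: {"a": 0.5, "b": 0.5, END: 0.0},
}
THETA_2 = {
    "a": {"a": 0.2, "b": 0.7, END: 0.1},
    "b": {"a": 0.7, "b": 0.2, END: 0.1},
    START: {"a": 0.5, "b": 0.5, END: 0.0},
}

p1 = sentence_probability("aabb", THETA_1)
p2 = sentence_probability("aabb", THETA_2)
marginal = marginal_probability("aabb", [THETA_1, THETA_2], [0.5, 0.5])
print(p1, p2, marginal)
```

Under these assumptions the three printed values come out to roughly 0.0049, 0.0014, and 0.00315. If the intended string is different, or if the end-of-sentence factor should be left out, change the string argument or drop `END` from `transition_counts` and the numbers change accordingly.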