
Question


The main difference between retrieval functions of this form comes from their different choice of smoothing method that is applied to the unigram language model

The main difference between retrieval functions of this form comes from their different choice of smoothing method that is applied to the unigram language model $\theta_D$. In general, when we smooth with a collection background language model, we can write this probability as

$$
p(w \mid \theta_D) =
\begin{cases}
p_s(w \mid \theta_D) & \text{if } w \in D \\
\alpha_D \, p(w \mid C) & \text{otherwise}
\end{cases}
$$

where $p_s(w \mid \theta_D)$ is the discounted maximum likelihood estimate of observing word $w$ in document $D$, and $\alpha_D$ is a document-specific coefficient that controls the amount of probability mass assigned to unseen words to ensure that all of the probabilities sum to one. Noting that log is a monotonic transform (thus leading to equivalent results under ranking), and using the above smoothing formulation, we can show the following:

$$
\begin{aligned}
\log p(Q \mid D) &= \sum_{i=1}^{|Q|} \log p(q_i \mid \theta_D) \\
&= \sum_{w \in V} c(w, Q) \log p(w \mid \theta_D) \\
&= \sum_{w \in D} c(w, Q) \log p_s(w \mid \theta_D) + \sum_{w \notin D} c(w, Q) \log \alpha_D \, p(w \mid C) \\
&= \sum_{w \in D} c(w, Q) \log p_s(w \mid \theta_D) + \sum_{w \in V} c(w, Q) \log \alpha_D \, p(w \mid C) - \sum_{w \in D} c(w, Q) \log \alpha_D \, p(w \mid C) \\
&= \sum_{w \in D} c(w, Q) \log \frac{p_s(w \mid \theta_D)}{\alpha_D \, p(w \mid C)} + |Q| \log \alpha_D + \sum_{w \in V} c(w, Q) \log p(w \mid C) \\
&\stackrel{\text{rank}}{=} \sum_{w \in D} c(w, Q) \log \frac{p_s(w \mid \theta_D)}{\alpha_D \, p(w \mid C)} + |Q| \log \alpha_D
\end{aligned}
$$

a. [5 pts] Show that if we use the query-likelihood scoring method (i.e., $p(Q \mid D)$) and the Jelinek-Mercer smoothing method (i.e., fixed coefficient interpolation with smoothing parameter $\lambda$) for retrieval, we can rank documents based on the following scoring function:

$$
\text{score}(Q, D) = \sum_{w \in Q \cap D} c(w, Q) \log \left( 1 + \frac{(1 - \lambda) \times c(w, D)}{\lambda \times p(w \mid REF) \times |D|} \right)
$$

where the sum is taken over all the matched query terms in $D$, $|D|$ is the document length, $c(w, D)$ is the count of word $w$ in document $D$ (i.e., how many times $w$ occurs in $D$), $c(w, Q)$ is the count of word $w$ in $Q$, $\lambda$ is the smoothing parameter, and $p(w \mid REF)$ is the probability of word $w$ given by the reference language model estimated using the whole collection.

b. [5 pts] The scoring function above can also be interpreted as a vector space model. If we make this interpretation, what would be the query vector? What would be the document vector? What would be the similarity function? Does the term weight in the document vector capture TF-IDF weighting and document length normalization heuristics? Why?
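As a numerical sanity check for part (a), the sketch below compares the full query log-likelihood under Jelinek-Mercer smoothing with the matched-terms-only scoring function. It assumes the standard Jelinek-Mercer form, in which the smoothed probability of a seen word is (1 - λ) c(w, D) / |D| + λ p(w | REF) and the unseen-word coefficient is λ for every document, so the document-length-independent terms drop out under ranking. The toy documents, the query, the value of λ, and all function names are made up for illustration. This is a check on one small example, not a proof.

```python
import math
from collections import Counter


def jm_log_likelihood(query, doc, p_ref, lam):
    """Full log p(Q|D), assuming p(w|D) = (1-lam)*c(w,D)/|D| + lam*p(w|REF)."""
    d_counts, d_len = Counter(doc), len(doc)
    return sum(
        math.log((1 - lam) * d_counts[w] / d_len + lam * p_ref[w])
        for w in query
    )


def doc_term_weight(w, d_counts, d_len, p_ref, lam):
    """Weight of word w in the document, read directly off the scoring function."""
    return math.log(1 + (1 - lam) * d_counts[w] / (lam * p_ref[w] * d_len))


def jm_score(query, doc, p_ref, lam):
    """Scoring function from part (a): sums over matched query terms only."""
    q_counts, d_counts, d_len = Counter(query), Counter(doc), len(doc)
    return sum(
        c_wq * doc_term_weight(w, d_counts, d_len, p_ref, lam)
        for w, c_wq in q_counts.items()
        if d_counts[w] > 0
    )


# Hypothetical toy collection; p(w|REF) is estimated from all documents pooled.
docs = {
    "d1": "language model smoothing for retrieval".split(),
    "d2": "vector space model for text retrieval retrieval".split(),
    "d3": "smoothing smoothing language language".split(),
}
collection = [w for d in docs.values() for w in d]
p_ref = {w: c / len(collection) for w, c in Counter(collection).items()}

query = "language model retrieval".split()
lam = 0.5  # assumed smoothing parameter

by_likelihood = sorted(
    docs, key=lambda d: jm_log_likelihood(query, docs[d], p_ref, lam), reverse=True
)
by_score = sorted(
    docs, key=lambda d: jm_score(query, docs[d], p_ref, lam), reverse=True
)
print(by_likelihood, by_score)
```

On this toy data the two orderings agree, because the two quantities differ only by the sum over query words of c(w, Q) log(λ p(w | REF)), which does not depend on the document. Written this way, the score is also a dot product between the query counts c(w, Q) and the per-document weights returned by `doc_term_weight`, which is one way to approach the vector space reading in part (b).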

