
Question

Task 2: Design a Jelinek-Mercer based Language Model (JM_LM) that ranks documents in
each data collection using the corresponding topic (query) for all 50 data collections.
Inputs: 50 long queries (topics) in the50Queries.txt and the corresponding 50 data collections
(Data_C101, Data_C102,..., Data_C150).
Output: 50 ranked document files (e.g., for Query R107, the output file name is
JM_LM_R107Ranking.dat) for all 50 data collections and save them in the folder
RankingOutputs.
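
The output naming rule above can be sketched as a small helper. This is only an illustration: the function name `write_ranking` and the line layout (document id, a space, then the score) are assumptions, since the task text gives the file name pattern and folder but not the file contents.

```python
from pathlib import Path


def write_ranking(query_id, ranked, out_dir="RankingOutputs"):
    """Write (doc_id, score) pairs to JM_LM_<query_id>Ranking.dat and return the path.

    `ranked` is expected to be sorted already, best document first.
    """
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create RankingOutputs if missing
    out_path = folder / f"JM_LM_{query_id}Ranking.dat"
    with out_path.open("w", encoding="utf-8") as fh:
        for doc_id, score in ranked:
            # Assumed line layout: document id, a space, then the score.
            fh.write(f"{doc_id} {score}\n")
    return out_path
```

For Query R107, `write_ranking("R107", ranked)` would produce `RankingOutputs/JM_LM_R107Ranking.dat`.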
For each long query (topic) Rx, you need to use the following equation to calculate a conditional
probability for each document D in the corresponding data collection (dataset):
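$$p(R_x \mid D) = \prod_{i=1}^{n} \left( (1-\lambda)\,\frac{f_{q_i,D}}{|D|} + \lambda\,\frac{c_{q_i}}{|Data\_Cx|} \right)$$
where the product runs over the n query words q_i of Rx, $f_{q_i,D}$ is the number of times query word q_i occurs in document D, |D| is the number of
word occurrences in D, $c_{q_i}$ is the number of times query word q_i occurs in the data collection
Data_Cx, |Data_Cx| is the total number of word occurrences in data collection Data_Cx, and
parameter \lambda = 0.4.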
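
A minimal sketch of the scoring step follows. It assumes the standard Jelinek-Mercer form, in which the document estimate is weighted by (1 - \lambda) and the collection estimate by \lambda. The function names, the token-list inputs, and the choice to skip query words that never occur in the collection are illustrative assumptions, not part of the task statement. The sketch sums log probabilities instead of multiplying them, which preserves the ranking and avoids underflow on long queries; `math.exp` of the score recovers the probability.

```python
import math
from collections import Counter

LAMBDA = 0.4  # smoothing parameter given in the task


def jm_lm_score(query_terms, doc_terms, coll_freq, coll_len, lam=LAMBDA):
    """Return the log of the JM-smoothed query likelihood for one document."""
    doc_freq = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_score = 0.0
    for q in query_terms:
        p_doc = doc_freq[q] / doc_len if doc_len else 0.0
        p_coll = coll_freq.get(q, 0) / coll_len if coll_len else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Assumption: a query word absent from the whole collection is
            # skipped, since it cannot separate one document from another.
            continue
        log_score += math.log(p)
    return log_score


def rank_collection(query_terms, docs, lam=LAMBDA):
    """Rank documents for one query.

    `docs` maps a document id to its list of tokens (hypothetical input
    shape). Returns (doc_id, log_score) pairs, best first.
    """
    coll_freq = Counter()
    for tokens in docs.values():
        coll_freq.update(tokens)
    coll_len = sum(coll_freq.values())
    scored = [
        (doc_id, jm_lm_score(query_terms, tokens, coll_freq, coll_len, lam))
        for doc_id, tokens in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Tokenisation, stop-word removal, and stemming are left out here; whatever preprocessing is applied to the documents would need to be applied to the query words as well so that the counts match.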

