
Question



Task 2: Posting List Merging

Algorithm 1 shows an algorithm in pseudocode from Manning et al. for intersecting two posting lists from an inverted index. The algorithm can be used to efficiently implement an AND-query in boolean retrieval: after receiving a query of the form "q1 AND q2", the information retrieval system looks up the posting lists p1 and p2 for the query keywords q1 and q2 in the inverted index and then intersects the two posting lists using the specified algorithm. The result is a posting list of all documents that contain both keywords. This works because set intersection is the set-theoretic equivalent of the boolean AND operation.

Algorithm 1: Algorithm for the intersection of two posting lists p1 and p2.

    function INTERSECT(p1, p2)
        answer ← ⟨ ⟩
        while p1 ≠ NIL and p2 ≠ NIL do
            if DOCID(p1) = DOCID(p2) then
                ADD(answer, DOCID(p1))
                p1 ← NEXT(p1)
                p2 ← NEXT(p2)
            else if DOCID(p1) < DOCID(p2) then
                p1 ← NEXT(p1)
            else
                p2 ← NEXT(p2)
        return answer

Your task is to specify a similar algorithm in pseudocode that implements a BUT-query of the form "q1 BUT q2". The query should return all documents that contain the keyword q1, excluding those that also contain the keyword q2. You can assume that the posting lists p1 and p2 have already been retrieved from the inverted index and that they are sorted by document ID.
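One possible way to approach the task is sketched below in Python. It is not taken from Manning et al. or from the original answer. It assumes that a posting list can be modelled as a sorted Python list of integer document IDs in which each ID occurs at most once, and the names `intersect` and `but_not` are hypothetical. The first function mirrors the AND merge described above; the second adapts the same two-pointer walk so that it keeps the documents found only in p1.

```python
from typing import List


def intersect(p1: List[int], p2: List[int]) -> List[int]:
    """AND-query: document IDs that occur in both sorted posting lists."""
    answer = []
    i, j = 0, 0  # list indices stand in for the posting-list pointers
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer


def but_not(p1: List[int], p2: List[int]) -> List[int]:
    """BUT-query: document IDs in p1 that do not occur in p2.

    Assumes both lists are sorted and contain no duplicate IDs.
    """
    answer = []
    i, j = 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            # Document contains both keywords: exclude it.
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            # p2 has already moved past this ID, so it is only in p1.
            answer.append(p1[i])
            i += 1
        else:
            j += 1
    # Once p2 is exhausted, everything left in p1 belongs to the result.
    answer.extend(p1[i:])
    return answer


if __name__ == "__main__":
    p1 = [1, 2, 4, 11, 31, 45, 173, 174]
    p2 = [2, 31, 54, 101]
    print(intersect(p1, p2))  # [2, 31]
    print(but_not(p1, p2))    # [1, 4, 11, 45, 173, 174]
```

Compared with the intersection, two things change in this sketch: an ID is added when it appears only in p1 rather than when it matches, and the remainder of p1 is appended after the loop because p2 running out no longer ends the result. Like the intersection, it visits each posting at most once.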

