Question
1. (a) You are given a collection of 3 documents, listed below. You are asked to create an inverted index for this collection of documents.
- D1: SPMS offers Master of Science
- D2: SPMS offers many courses
- D3: MH6301 is a master course
(i) Suppose a whitespace tokenizer is used to identify the tokens. Briefly describe THREE (3) types of processing that can be applied to the tokens before creating the index.
(ii) Write down the resultant documents after applying the three types of processing in Q1(a)(i).
(iii) Draw the inverted index that would be built for the three documents, after applying the three types of processing on their tokens.
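A minimal sketch of how part (a) could be worked through in code. It assumes one possible choice of the three processing steps (case folding, stop-word removal, and a toy plural-stripping rule standing in for a real stemmer); the stop list and the `normalize` rule are illustrative assumptions, not part of the question, and other choices are equally valid.

```python
from collections import defaultdict

# Hypothetical stop list chosen for this collection only.
STOP_WORDS = {"of", "is", "a"}

DOCS = {
    1: "SPMS offers Master of Science",
    2: "SPMS offers many courses",
    3: "MH6301 is a master course",
}


def normalize(token):
    """Case-fold, then apply a toy plural-stripping rule (not a real stemmer)."""
    token = token.lower()
    if len(token) > 4 and token.endswith("s") and not token.endswith("ss"):
        token = token[:-1]
    return token


def preprocess(text):
    """Whitespace-tokenize, normalize each token, and drop stop words."""
    tokens = (normalize(t) for t in text.split())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Map each term to a sorted list of the IDs of documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(preprocess(docs[doc_id])):
            index[term].append(doc_id)
    return dict(sorted(index.items()))


PROCESSED = {doc_id: preprocess(text) for doc_id, text in DOCS.items()}
INDEX = build_index(DOCS)

for term, postings in INDEX.items():
    print(f"{term:8} -> {postings}")
```

Under these assumptions D1 becomes "spms offer master science", and the term "master" ends up with the postings list [1, 3]. A different stop list or stemmer would change both the processed documents and the dictionary.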
(b) Briefly describe the following concepts.
- Information need
- Query
- Document
- Relevant document
- TFIDF
- Edit distance
- A/B Testing
(c) The table below gives the sizes of postings lists for tokens e, k, m, and s, respectively.
Term | e | k | m | s |
Postings size | 313 | 27 | 107 | 271 |
(i) Recommend a query processing order for a Boolean query: (m OR s) AND (k OR e).
(ii) Estimate the minimum and maximum possible number of results for a Boolean query: (e OR k) AND (NOT m)
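One way to reason about part (c) is sketched below. It uses the common heuristic of estimating each OR group by the sum of its postings sizes and processing the smaller estimate first, and it bounds the second query from the sizes alone. The collection size is not given in the question, so the maximum assumes the collection is large enough for m to share no documents with e or k; the function names are illustrative.

```python
POSTINGS = {"e": 313, "k": 27, "m": 107, "s": 271}


def or_upper_bound(terms):
    """Conservative size estimate for an OR group: the sum of its postings sizes."""
    return sum(POSTINGS[t] for t in terms)


def order_and_of_ors(groups):
    """Process the OR group with the smallest estimated size first."""
    return sorted(groups, key=or_upper_bound)


def bounds_or_and_not(size_a, size_b, size_c):
    """Bounds on |(A OR B) AND NOT C| using postings sizes only.

    Assumes the collection is large enough for C to be disjoint from A and B.
    """
    union_min = max(size_a, size_b)      # smaller list fully contained in the larger
    union_max = size_a + size_b          # the two lists share no documents
    lowest = max(0, union_min - size_c)  # C removes as many documents as it can
    highest = union_max                  # C removes nothing
    return lowest, highest


ORDER = order_and_of_ors([("m", "s"), ("k", "e")])
BOUNDS = bounds_or_and_not(POSTINGS["e"], POSTINGS["k"], POSTINGS["m"])

print(ORDER)   # [('k', 'e'), ('m', 's')]
print(BOUNDS)  # (206, 340)
```

Here (k OR e) is estimated at 340 and (m OR s) at 378, so the sketch evaluates (k OR e) first. For the second query, the minimum of 206 arises when k is contained in e and m lies entirely inside e; the maximum of 340 arises when e, k, and m are pairwise disjoint.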
(d) Discuss TWO (2) techniques to process a phrase query such as "information retrieval".
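Two techniques commonly discussed for phrase queries are biword (phrase) indexes and positional indexes. The sketch below illustrates only the positional approach: each term stores the positions at which it occurs in each document, and a phrase matches when its terms appear at consecutive positions. The sample documents and function names are hypothetical and serve only to exercise the idea.

```python
from collections import defaultdict

# Hypothetical sample collection, not taken from the question.
SAMPLE_DOCS = {
    1: "information retrieval is fun",
    2: "retrieval of information",
    3: "modern information retrieval systems",
}


def build_positional_index(docs):
    """Map term -> {doc_id -> [positions]} using whitespace tokens, lowercased."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return sorted IDs of documents containing the phrase as consecutive terms."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Only documents containing every term can contain the phrase.
    candidates = set(index[terms[0]])
    for term in terms[1:]:
        candidates &= set(index[term])
    hits = []
    for doc_id in sorted(candidates):
        starts = index[terms[0]][doc_id]
        # The k-th term of the phrase must occur at position start + k.
        if any(
            all(start + k in index[t][doc_id] for k, t in enumerate(terms))
            for start in starts
        ):
            hits.append(doc_id)
    return hits


POS_INDEX = build_positional_index(SAMPLE_DOCS)
RESULT = phrase_search(POS_INDEX, "information retrieval")
print(RESULT)  # [1, 3]
```

Document 2 contains both terms but in the wrong order, so it is rejected. A biword index would instead store each adjacent pair of tokens (for example "information retrieval") as a single dictionary term, trading a larger dictionary for simpler lookups.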