Question

2. Consider the following three documents:

Document 1: Glimpse is an indexing and query system that allows for search through a file system or document collection quickly. Glimpse is the default search engine in a larger information retrieval system. It has also been used as part of some web based search engines.

Document 2: The main processes in a retrieval system are document indexing, query processing, query evaluation and relevance feedback. Among these, efficient updating of the index is critical in large scale systems.

Document 3: Clusters are created from short snippets of documents retrieved by web search engines which are as good as clusters created from the full text of web documents.

(a) Remove stop words and punctuation, and apply Porter's stemming algorithm to the three documents. (Use an online stemming application for this purpose to save time, e.g., https:/l9oles/porter is demo.html or https://text-processing.com/demo/stem/ or http://textanalysisonline.com/nltk-porter-stemmer. Note: the scripts only stem the documents; you need to remove the stop words afterwards.)

(b) Create an inverted index of the three documents, including the dictionary and the postings. The dictionary should also contain, for each term, statistics such as the total number of occurrences in the collection and the document frequency. The postings for each term should contain the document ids and the term frequencies. Depict multiple postings for a term as a linked list, similar to Figure 1.3 in the IR Book.

(c) What are the search results for the following Boolean queries? In each case explain how you obtained them from the inverted index.
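As a rough illustration of parts (b) and (c), the sketch below builds a small inverted index and answers one Boolean AND query by merging postings lists. It is only a sketch under stated assumptions: the stop list is a tiny hypothetical one, the `stem` argument defaults to the identity function (so no Porter stemming is applied here; a real stemmer would be passed in), the sample documents are shortened fragments of the three documents, and the query `search AND retrieval` is made up for the example because the original queries are not shown.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop list (an assumption, not the course's list).
STOP_WORDS = {"a", "an", "and", "are", "as", "by", "for", "from", "in", "is",
              "it", "of", "or", "that", "the", "which"}


def tokenize(text, stem=lambda t: t):
    """Lowercase, drop punctuation and stop words, then apply `stem`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def build_index(docs, stem=lambda t: t):
    """Map each term to its collection frequency (cf), document
    frequency (df), and a postings list of (doc_id, tf) sorted by doc_id."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        counts = Counter(tokenize(docs[doc_id], stem))
        for term, tf in counts.items():
            postings[term].append((doc_id, tf))
    return {
        term: {"cf": sum(tf for _, tf in plist),
               "df": len(plist),
               "postings": plist}
        for term, plist in sorted(postings.items())
    }


def intersect(p1, p2):
    """Boolean AND: walk two doc_id-sorted postings lists in step."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i][0] == p2[j][0]:
            result.append(p1[i][0])
            i += 1
            j += 1
        elif p1[i][0] < p2[j][0]:
            i += 1
        else:
            j += 1
    return result


# Shortened fragments of the three documents, used only for illustration.
docs = {
    1: "Glimpse is the default search engine in a larger information retrieval system.",
    2: "The main processes in a retrieval system are document indexing, "
       "query processing, query evaluation and relevance feedback.",
    3: "Clusters are created from short snippets of documents retrieved by web search engines",
}

index = build_index(docs)
print(index["query"])      # {'cf': 2, 'df': 1, 'postings': [(2, 2)]}
print(index["retrieval"])  # {'cf': 2, 'df': 2, 'postings': [(1, 1), (2, 1)]}

# Hypothetical query: search AND retrieval
hits = intersect(index["search"]["postings"], index["retrieval"]["postings"])
print(hits)                # [1]
```

Because the stemmer here is the identity function, "engine" and "engines" stay separate terms. Passing a Porter stemmer as `stem` would merge such variants, which is the point of step (a).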
