
Question


Question B-4

Melbourne 4 Doc-id 1 2 3 4 house 39 19 19 12 for 11 19 20 sale 32 3 1 14 in 22 15 3 1 Geelong 22 16 21 21 9 13 20 13

(houses OR for OR sale OR in OR Geelong OR Melbourne)
(houses AND for AND sale AND in AND Geelong OR Melbourne)

Suppose these are issued to a search engine that uses the ranked Boolean retrieval model. Assume, for simplicity, only four documents in the collection (with document ids 1-4). Answer the following questions. The above table gives the number of times each query-term occurs in each document.

i) Compute the document scores and the ranking associated with the query (houses OR for OR sale OR in OR Geelong OR Melbourne).
ii) How is the ranking produced probably sub-optimal and why does this happen?
iii) Compute the document scores and the ranking associated with the query (houses AND for AND sale AND in AND Geelong OR Melbourne).
iv) How is the ranking produced probably sub-optimal and why does this happen?
v) How would you extend the Boolean retrieval model to handle AND NOT constraints (e.g., houses AND NOT Geelong)? Your proposed solution should give a higher score to documents that contain fewer occurrences of the term to the right of the AND NOT (e.g., Geelong). Please be as mathematical as possible. In other words, saying: "I would reduce the score for documents that contain the word to the right of AND NOT." is too vague.
vi) Using the index, what would be the Boolean retrieval model scores given to documents 1-4 by your proposed scoring method for the query "houses AND NOT Geelong"?
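
As a rough illustration of how a ranked Boolean engine might score such queries, the sketch below assumes one common convention that the question itself does not state: OR sums the term counts and AND takes their minimum. It uses only the house and sale rows as they read in the table, and the names COUNTS, score_or, score_and and rank are hypothetical, chosen for this example.

```python
from typing import Dict, List

# Term counts for documents 1-4, copied from the "house" and "sale" rows
# of the table in the question.
COUNTS: Dict[str, List[int]] = {
    "house": [39, 19, 19, 12],
    "sale": [32, 3, 1, 14],
}


def score_or(terms: List[str], counts: Dict[str, List[int]]) -> List[int]:
    """Assumed convention: an OR query scores a document by the SUM of its term counts."""
    return [sum(per_doc) for per_doc in zip(*(counts[t] for t in terms))]


def score_and(terms: List[str], counts: Dict[str, List[int]]) -> List[int]:
    """Assumed convention: an AND query scores a document by the MINIMUM of its term counts."""
    return [min(per_doc) for per_doc in zip(*(counts[t] for t in terms))]


def rank(scores: List[int]) -> List[int]:
    """Return document ids (1-based) by descending score; ties go to the lower id."""
    return sorted(range(1, len(scores) + 1), key=lambda d: (-scores[d - 1], d))


if __name__ == "__main__":
    terms = ["house", "sale"]
    or_scores = score_or(terms, COUNTS)
    and_scores = score_and(terms, COUNTS)
    print("OR scores: ", or_scores, "ranking:", rank(or_scores))
    print("AND scores:", and_scores, "ranking:", rank(and_scores))
```

Under these assumptions, a single very frequent term can dominate an OR score, while an AND score is capped by the rarest term in each document. A different convention (for example, taking the maximum for OR) would give different scores.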

