Question
3. Assume that the total number of documents in a corpus is 128 and that the following words occur in the following number of documents:
Computer occurs in 32 documents
software occurs in 8 documents
intelligent occurs in 16 documents
robust occurs in 128 documents
1) Calculate the TF-IDF weighted term vector (W_{T,D} = TF_{T,D} × IDF_T) for the following document D:
Computer intelligent software robust computer software
(Hint: all the numbers above are powers of 2, and the log in the IDF weight is taken to the base 2).
2) Suppose one has a query Q specified as:
intelligent software
Assuming that the query vector is computed just in terms of TF weights (no IDF weights), and that similarity is measured by the cosine metric, what is the similarity between Q and D?
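One possible way to work through both parts is sketched below. It rests on several assumptions the question does not spell out: TF is the raw count of a term in the text, IDF is log2(N / df) as the hint suggests, matching is case-insensitive (so "Computer" and "computer" are the same term), and D keeps its TF-IDF weights when compared against the TF-only query. The names `N_DOCS`, `DOC_FREQ`, `term_freq`, `idf`, `tfidf_vector`, and `cosine` are illustrative only.

```python
import math
from collections import Counter

# Corpus statistics taken from the question.
N_DOCS = 128
DOC_FREQ = {"computer": 32, "software": 8, "intelligent": 16, "robust": 128}


def term_freq(text):
    """Raw term counts; lowercasing is an assumption about how terms are matched."""
    return Counter(text.lower().split())


def idf(term):
    """IDF as log base 2 of (total documents / documents containing the term)."""
    return math.log2(N_DOCS / DOC_FREQ[term])


def tfidf_vector(text):
    """TF-IDF weight for every vocabulary term (0 for absent terms)."""
    tf = term_freq(text)
    return {term: tf[term] * idf(term) for term in DOC_FREQ}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u.get(t, 0) * v.get(t, 0) for t in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


# Part 1: TF-IDF vector for D.
d_vec = tfidf_vector("Computer intelligent software robust computer software")

# Part 2: the query uses TF weights only, with no IDF factor.
q_vec = dict(term_freq("intelligent software"))
similarity = cosine(d_vec, q_vec)

print(d_vec)
print(round(similarity, 4))
```

Under these assumptions the IDF values are 2 (computer), 4 (software), 3 (intelligent), and 0 (robust), so D has weights computer = 4, software = 8, intelligent = 3, robust = 0. The dot product with Q is 3 + 8 = 11, the norms are sqrt(89) and sqrt(2), and the cosine similarity is 11 / sqrt(178), roughly 0.8245. A different TF convention (for example, length-normalized counts) would change the individual weights but not the cosine value.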