
Question



HELP ASAP!!!!
Assignment:

Write a program that implements the vector space model. You will test this program on the Cranfield dataset, which is a standard Information Retrieval text collection consisting of 1400 documents from the aerodynamics field. The dataset is available on the Blackboard.

Tasks:

1. Write a program that preprocesses the collection. This step includes the following tasks:
   1. Stop word removal
   2. Normalization
   3. Tokenization
   For this task please use your own implementation of a tokenizer. Integrate the Porter stemmer and a stop-word eliminator into your code. You are encouraged to use the functions you implemented for Assignment 1. (10 points)

2. Write a program that indexes the collection and returns a ranked list of documents for each query in a list of queries. To do so you need to calculate the cosine similarity between each query and each document, which means calculating TF-IDF and following all the steps of calculating cosine similarity (read the slides for more clarification). The cosine similarity algorithm has to be your original code. No points will be given for using available libraries such as NLTK. (55 points)

Return a ranked list of documents based on their similarity to each query. It should produce a list consisting of pairs of query ids, along with the ids of the documents that are relevant and their similarity score (for each query, list in reverse order of similarity score):

queryid1 documentid1 similarityScore1
queryid1 documentid2 similarityScore2
queryid1 documentidx similarityScorex
queryid2 documentid1 similarityScore1

(35 points)

Note: It is highly recommended that your code is as modularized as possible; many of the functions that you implement during this assignment will be needed in future assignments or in the term project.

Submission instructions:

1. Write a README file including a detailed note about the functionality of each of the above programs and complete instructions on how to run them.
2. Include all the files for this assignment in a folder called your unique-name Assignment2.
3. Archive the folder using tgz or zip and submit on the Blackboard by the due date.
4. Make sure you include your name in each program and in the README file.
5. Make sure all your programs run correctly on the CS machines.

Files:

Queries.txt

what investigations have been made of the wave system created by a static pressure distribution over a liquid surface.
has anyone investigated the effect of shock generated vorticity on heat transfer to a blunt body.
what is the heat transfer to a blunt body in the absence of vorticity
what are the general effects on flow fields when the reynolds number is small .
find a calculation procedure applicable to all incompressible laminar boundary layer flow problems having good accuracy and reasonable computation time.
papers applicable to this problem (calculation procedures for laminar incompressible flow with arbitrary pressure gradient).
has anyone investigated the shear buckling of stiffened plates
papers on shear buckling of unstiffened rectangular plates under shear
in practice, how close to reality are the assumptions that the flow in a hypersonic shock tube using nitrogen is non-viscous and in thermodynamic equilibrium .
what design factors can be used to control lift-drag ratios at mach numbers above 5
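
The preprocessing, indexing, and ranking steps the assignment describes could be sketched roughly as below. This is only an illustration under several assumptions, not the course's reference solution: the stop-word list is a tiny sample, `stem` is a hypothetical placeholder standing in for a real Porter stemmer, the weighting is (1 + log10 tf) * log10(N / df) because the slides that define the exact scheme are not available here, and "reverse order of similarity score" is read as highest score first. All function names are made up for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; a real run would load a full list).
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "be", "by", "for", "has",
              "have", "in", "is", "it", "of", "on", "that", "the", "to",
              "what", "when", "with"}


def stem(token):
    """Hypothetical placeholder for the Porter stemmer; returns the token unchanged."""
    return token


def preprocess(text):
    """Normalize to lowercase, tokenize on non-alphanumerics, drop stop words, stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf(counts, idf):
    """Weight each term as (1 + log10 tf) * idf; terms missing from the index are skipped."""
    return {t: (1 + math.log10(tf)) * idf[t] for t, tf in counts.items() if t in idf}


def build_index(docs):
    """docs maps document id -> raw text. Returns (TF-IDF vectors, IDF table)."""
    term_counts = {d: Counter(preprocess(text)) for d, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each document counts once per term
    n_docs = len(docs)
    idf = {t: math.log10(n_docs / df) for t, df in doc_freq.items()}
    vectors = {d: tf_idf(counts, idf) for d, counts in term_counts.items()}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as term -> weight dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(queries, docs):
    """Return (query id, document id, score) rows, highest score first per query."""
    vectors, idf = build_index(docs)
    results = []
    for qid, text in queries.items():
        qvec = tf_idf(Counter(preprocess(text)), idf)
        scored = [(qid, d, cosine(qvec, dvec)) for d, dvec in vectors.items()]
        scored.sort(key=lambda row: row[2], reverse=True)
        results.extend(scored)
    return results
```

Printing each row of `rank(queries, docs)` as `qid did score` would give the three-column output format listed in the assignment. Keeping `preprocess`, `build_index`, `cosine`, and `rank` separate follows the assignment's note about modular code, since a real Porter stemmer can then replace `stem` without touching the ranking logic.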

