HELP ASAP!!!!
Assignment:
Write a program that implements the vector space model. You will test this program on the Cranfield dataset, which is a standard Information Retrieval text collection consisting of 1400 documents from the aerodynamics field. The dataset is available on the Blackboard.

Tasks:

1. Write a program that preprocesses the collection. This step includes the following tasks:
   1. Stop word removal
   2. Normalization
   3. Tokenization
   For this task please use your own implementation of a tokenizer. Integrate the Porter stemmer and a stop-word eliminator into your code. You are encouraged to use the functions you implemented for Assignment 1. (10 points)

2. Write a program that indexes the collection and returns a ranked list of documents for each query in a list of queries. To do so you need to calculate the cosine similarity between each query and each document. To do so you need to calculate TF-IDF and follow all the steps of calculating cosine similarity (read the slides for more clarification). The cosine similarity algorithm has to be your original code. No points will be given for using available libraries such as NLTK. (55 points)

Return a ranked list of documents based on their similarity to each query. It should produce a list consisting of pairs of query ids, along with the ids of the documents that are relevant and their similarity score (for each query, list in reverse order of similarity score):

   queryid1 documentid1 similarityScore1
   queryid1 documentid2 similarityScore2
   queryid1 documentidX similarityScoreX
   queryid2 documentid1 similarityScore1

(35 points)

Note: It is highly recommended that your code is as modularized as possible, as many of the functions that you implement during this assignment will be needed in future assignments or in the term project.

Submission instructions:

1. Write a README file including a detailed note about the functionality of each of the above programs and complete instructions on how to run them.
2. Include all the files for this assignment in a folder called your unique-name Assignment2.
3. Archive the folder using tgz or zip and submit on the Blackboard by the due date.
4. Make sure you include your name in each program and in the README file.
5. Make sure all your programs run correctly on the CS machines.

Files:
Queries.txt

what investigations have been made of the wave system created by a static pressure distribution over a liquid surface.
has anyone investigated the effect of shock generated vorticity on heat transfer to a blunt body.
what is the heat transfer to a blunt body in the absence of vorticity
what are the general effects on flow fields when the reynolds number is small .
find a calculation procedure applicable to all incompressible laminar boundary layer flow problems having good accuracy and reasonable computation time.
papers applicable to this problem (calculation procedures for laminar incompressible flow with arbitrary pressure gradient).
has anyone investigated the shear buckling of stiffened plates
papers on shear buckling of unstiffened rectangular plates under shear
in practice, how close to reality are the assumptions that the flow in a hypersonic shock tube using nitrogen is non-viscous and in thermodynamic equilibrium .
what design factors can be used to control lift-drag ratios at mach numbers above 5.
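
To make the steps in the assignment concrete, here is a minimal Python sketch of the pipeline: preprocessing, TF-IDF weighting, cosine similarity, and ranked output. It is only one possible shape, not the expected solution. All function names (tokenize, preprocess, weight, build_index, cosine, rank) are made up for illustration. The stop list is a tiny placeholder, and the stem argument is an identity stand-in where your own Porter stemmer would go. The weighting (1 + log10 tf) * log10(N / df) is one common variant; the course slides may specify a different one. How the Cranfield files are parsed into ids and text is not shown, so docs and queries are assumed to be plain dictionaries.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption); a real run would load a full one.
STOP_WORDS = {"a", "an", "and", "are", "be", "by", "in", "is",
              "of", "on", "the", "to", "what"}


def tokenize(text):
    """Normalize to lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def preprocess(text, stem=lambda token: token):
    """Tokenize, drop stop words, then stem.

    `stem` defaults to the identity function; it is a placeholder for
    the Porter stemmer the assignment asks you to integrate.
    """
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def weight(counts, idf):
    """Turn raw term counts into a sparse TF-IDF vector (a dict).

    Terms that never occur in the collection have no idf and are skipped.
    """
    return {t: (1 + math.log10(c)) * idf[t]
            for t, c in counts.items() if t in idf}


def build_index(docs):
    """docs: {doc_id: text}. Returns (idf, {doc_id: TF-IDF vector})."""
    term_counts = {d: Counter(preprocess(text)) for d, text in docs.items()}
    df = Counter()
    for counts in term_counts.values():
        df.update(counts.keys())  # each document counts once per term
    n = len(docs)
    idf = {t: math.log10(n / df_t) for t, df_t in df.items()}
    vectors = {d: weight(counts, idf) for d, counts in term_counts.items()}
    return idf, vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zero."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(queries, docs):
    """Return (query_id, doc_id, score) rows, best score first per query."""
    idf, vectors = build_index(docs)
    rows = []
    for qid, text in queries.items():
        qvec = weight(Counter(preprocess(text)), idf)
        scored = [(qid, d, cosine(qvec, dvec)) for d, dvec in vectors.items()]
        scored.sort(key=lambda row: row[2], reverse=True)
        rows.extend(scored)
    return rows


def format_rows(rows):
    """Render rows as 'queryid documentid score' lines."""
    return [f"{qid} {doc_id} {score:.4f}" for qid, doc_id, score in rows]
```

Calling rank(queries, docs) and printing each line of format_rows(...) gives output in the "queryid documentid similarityScore" layout. This sketch lists every document for every query, including those with a score of 0; if "relevant" is meant to exclude those, filter on score > 0 before printing.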