Question
Please help with my Python code for MapReduce; I get errors. See below:
The input file sample (data.txt) format is:
document id, line, e.g.:
1 .T #this is a stop word and should be removed after executing mapper.py
1 experimental investigation of the aerodynamics of a
...
2 .T #this is a stop word and should be removed after executing mapper.py
2 simple shear flow past a flat plate in an incompressible fluid of small
________________
data.txt:
1 .T
1 experimental investigation of the aerodynamics of a
1 wing in a slipstream .
1 .A
1 brenckman,m.
1 .B
1 j. ae. scs. 25, 1958, 324.
1 .W
1 experimental investigation of the aerodynamics of a
1 wing in a slipstream .
1 an experimental study of a wing in a propeller slipstream was
1 made in order to determine the spanwise distribution of the lift
1 increase due to slipstream at different angles of attack of the wing
1 and at different free stream to slipstream velocity ratios . the
1 results were intended in part as an evaluation basis for different
1 theoretical treatments of this problem .
1 the comparative span loading curves, together with
1 supporting evidence, showed that a substantial part of the lift increment
1 produced by the slipstream was due to a /destalling/ or
1 boundary-layer-control effect . the integrated remaining lift
1 increment, after subtracting this destalling lift, was found to agree
1 well with a potential flow theory .
1 simple one
1 an empirical evaluation of the destalling effects was made for
1 the specific configuration of the experiment .
2 .T
2 simple shear flow past a flat plate in an incompressible fluid of small
2 viscosity .
2 .A
2 ting-yili
2 .B
_________________________________________________
The output should look like this:
The format is: document id, word, 1 (one line for each word that appears in the document)
E.g., 1 experimental 1
1 experimental 1
1 experimental 1
...
1 simple 1
2 simple 1
_________________________
This is my Python code; please correct it:
#!/usr/bin/python3
# Assignment: use the NLTK library to throw away stopwords, plus the Porter stemmer;
# read the input file line by line and generate word, document id, and 1 if the word
# appears in the document.
from nltk.stem.porter import PorterStemmer
from nltk.corpus import stopwords
import sys

stemmer = PorterStemmer()
stop_words = set(stopwords.words('english'))
#print(stemmer.stem("magnificent")) #=> magnific
#print("himself" in stop_words) #=> True

for line in sys.stdin:
    line = line.strip()
    documents = line.split(' ')
    words = line.split('\t')
    for document in documents:
        for word in words:
            print(' %s \t %s \t %s ' % (word, document, 1))
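A possible correction, offered as a sketch rather than a definitive answer. The posted loop splits the same line twice (once on spaces, once on tabs) and then pairs every piece with every other piece, so the document id is never separated from the words, and neither stop_words nor stemmer is ever applied. The version below assumes each input line is "document id, whitespace, text", that marker lines such as .T, .A, .B and .W are a dot followed by one letter, and that only alphabetic tokens count as words. The names map_line, load_tools, MARKER and FALLBACK_STOP_WORDS are made up for this example, and the small fallback stop list is there only so the script still runs where NLTK or its stopwords corpus is not installed.

```python
#!/usr/bin/env python3
"""mapper.py sketch: emit "doc_id<TAB>word<TAB>1" for each kept word."""
import re
import sys

# Assumption: section markers (.T, .A, .B, .W) are a dot plus one letter.
MARKER = re.compile(r"\.[A-Za-z]")
# Hypothetical fallback, used only if NLTK or its stopwords corpus is missing.
FALLBACK_STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "was"}


def load_tools():
    """Return (stop_words, stem), preferring NLTK when it is available."""
    try:
        from nltk.corpus import stopwords
        from nltk.stem.porter import PorterStemmer
        return set(stopwords.words("english")), PorterStemmer().stem
    except (ImportError, LookupError):
        # No NLTK (or no corpus): small stop list and no stemming.
        return FALLBACK_STOP_WORDS, lambda word: word


def map_line(line, stop_words, stem):
    """Turn one 'doc_id text' line into a list of (doc_id, word, 1)."""
    parts = line.strip().split(None, 1)  # split off the document id only
    if len(parts) < 2 or MARKER.fullmatch(parts[1]):
        return []  # blank line, id-only line, or a marker line like "1 .T"
    doc_id, text = parts
    words = re.findall(r"[a-z]+", text.lower())  # alphabetic tokens only
    return [(doc_id, stem(w), 1) for w in words if w not in stop_words]


def main():
    stop_words, stem = load_tools()
    for line in sys.stdin:
        for doc_id, word, count in map_line(line, stop_words, stem):
            print(f"{doc_id}\t{word}\t{count}")


if __name__ == "__main__":
    main()
```

To try it locally, something like `cat data.txt | python3 mapper.py` should print one tab-separated record per kept word. Because the Porter stemmer is applied, the emitted words may be stems rather than the surface forms shown in the sample output; if the unstemmed form is required, replace stem(w) with w in map_line.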