Question
Why does the lemmatizer fail?
def wrangling_doc(doc):
    tokens = doc.split()
    re_punc = re.compile('[%s]' % re.escape(string.punctuation))
    # remove punctuation from each word
    tokens = [re_punc.sub('', w) for w in tokens]
    # remove remaining tokens that are not alphabetic
    tokens = [word for word in tokens if word.isalpha()]
    # filter out short tokens
    tokens = [word for word in tokens if len(word) > 4]
    # lowercase all words
    tokens = [word.lower() for word in tokens]
    # lemmatize the tokens
    tokens = [word.lemmatizer() for word in tokens]
    return tokens
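One likely cause, going only by the code shown: each `word` is a plain Python `str`, and `str` has no `lemmatizer` method, so `word.lemmatizer()` raises `AttributeError`. Lemmatization has to come from a separate object or function that takes the word as an argument. The sketch below is one possible fix, not necessarily what the asker's course expects. It assumes NLTK is installed and its WordNet corpus has been downloaded (for example with `nltk.download('wordnet')`). The `lemmatize` parameter is a hypothetical addition, included only so a different lemmatizer can be swapped in.

```python
import re
import string


def wrangling_doc(doc, lemmatize=None):
    """Tokenize and clean a document, then lemmatize each token.

    `lemmatize` is a hypothetical hook (not in the original code): any
    callable that maps a word to its lemma. If omitted, NLTK's
    WordNetLemmatizer is used, which is an assumption about the intended tool.
    """
    if lemmatize is None:
        # Assumes NLTK is installed and the WordNet corpus is available.
        from nltk.stem import WordNetLemmatizer
        lemmatize = WordNetLemmatizer().lemmatize

    tokens = doc.split()
    re_punc = re.compile('[%s]' % re.escape(string.punctuation))
    # Remove punctuation from each word.
    tokens = [re_punc.sub('', w) for w in tokens]
    # Keep only purely alphabetic tokens.
    tokens = [word for word in tokens if word.isalpha()]
    # Drop tokens of four characters or fewer.
    tokens = [word for word in tokens if len(word) > 4]
    # Lowercase all words.
    tokens = [word.lower() for word in tokens]
    # Pass each word to the lemmatizer instead of calling a method on str.
    tokens = [lemmatize(word) for word in tokens]
    return tokens
```

Calling `wrangling_doc(text)` with a single argument uses the NLTK lemmatizer. Note that `WordNetLemmatizer.lemmatize` treats every word as a noun unless a part-of-speech tag is passed, so verb forms such as "running" may come back unchanged.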