Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

why lemmatizer fails? def wrangling_doc(doc): tokens=doc.split() re_punc = re.compile('[%s]' % re.escape(string.punctuation)) # remove punctuation from each word tokens = [re_punc.sub('', w) for w in tokens]

why lemmatizer fails?

def wrangling_doc(doc): tokens=doc.split() re_punc = re.compile('[%s]' % re.escape(string.punctuation)) # remove punctuation from each word tokens = [re_punc.sub('', w) for w in tokens] # remove remaining tokens that are not alphabetic tokens = [word for word in tokens if word.isalpha()] # filter out short tokens tokens = [word for word in tokens if len(word) > 4] #lowercase all words tokens = [word.lower() for word in tokens] # Lemmatize the tokens tokens = [word.lemmatizer() for word in tokens]

return tokens

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Management With Website Development Applications

Authors: Greg Riccardi

1st Edition

0201743876, 978-0201743876

More Books

Students also viewed these Databases questions

Question

How do you add two harmonic motions having different frequencies?

Answered: 1 week ago