
Question



# use Penn Treebank P.O.S for POS Tagging
import nltk
from nltk import word_tokenize
from nltk.corpus import brown

# Question 20: use given words like BTWords (Brown corpus tagged words) or sample text
# 20.a: Print the first 5 words from an alphabetically sorted list of the distinct words tagged as MD. (MD == Modal)
BTWords = nltk.corpus.brown.tagged_words()
ModalWords = [w for (w, t) in BTWords if t == 'MD']
print(sorted(set(ModalWords))[:5])

# 20.c: Identify three-word prepositional phrases of the form IN + DT + NN (e.g., in the lab) using raw_sent sentence.
# Note: Textbook says DET, but current Brown corpus uses DT instead.
# Need to tokenize first, then POS tag, then build trigrams.
# See an example: verb + TO + verb trigrams
for tagged_sent in brown.tagged_sents():
    for (w1, t1), (w2, t2), (w3, t3) in nltk.trigrams(tagged_sent):
        if t1.startswith('V') and t2 == 'TO' and t3.startswith('V'):
            print(w1, w2, w3)
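The verb + TO + verb example can be adapted to the IN + DT + NN pattern that 20.c asks for. The following is a minimal sketch, not the expected solution: `find_prep_phrases` and `sample_tagged` are hypothetical names, and the sample sentence is hand-tagged for illustration so the block runs without downloading any NLTK data. It uses `zip` over staggered slices to produce the same consecutive three-token windows that `nltk.trigrams` yields.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]


def find_prep_phrases(tagged_sent: List[TaggedToken]) -> List[Tuple[str, str, str]]:
    """Return word triples whose tags are exactly IN, DT, NN (Penn Treebank tags)."""
    phrases = []
    # Three staggered views of the sentence give consecutive trigrams,
    # the same windows nltk.trigrams(tagged_sent) would produce.
    windows = zip(tagged_sent, tagged_sent[1:], tagged_sent[2:])
    for (w1, t1), (w2, t2), (w3, t3) in windows:
        if t1 == 'IN' and t2 == 'DT' and t3 == 'NN':
            phrases.append((w1, w2, w3))
    return phrases


# Hypothetical hand-tagged sentence standing in for the tagged form of
# raw_sent; the words and tags here are illustrative, not from the question.
sample_tagged = [
    ('She', 'PRP'), ('worked', 'VBD'), ('in', 'IN'), ('the', 'DT'),
    ('lab', 'NN'), ('on', 'IN'), ('a', 'DT'), ('project', 'NN'), ('.', '.'),
]

for phrase in find_prep_phrases(sample_tagged):
    print(' '.join(phrase))
# prints:
# in the lab
# on a project
```

With NLTK's tokenizer and tagger data installed, the tagged list would presumably come from `nltk.pos_tag(word_tokenize(raw_sent))` instead of the hand-tagged sample, which matches the "tokenize first, then POS tag, then build trigrams" order in the question's comments.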
