
Question


Please implement Step 2.

In [1]:
import nltk
import re
from nltk import pos_tag, word_tokenize, Tree
from nltk.stem import WordNetLemmatizer

Regular expression practice:
In this example, we show one regex pattern example for a Hearst pattern: NP such as {NP,}* {(or | and)} NP (https://docs.python.org/3/library/re.html)

In [2]:
regex = r"(NP_\w+ (, )?such as (NP_\w+ ?(, )?(and |or )?)+)"
test_str = "NP_1 such as NP_2, NP_3 and NP_4"
matches = re.search(regex, test_str)
if matches:
    # Match.group([group1, ...]) returns one or more subgroups of the match.
    # If there is a single argument, the result is a single string;
    # if there are multiple arguments, the result is a tuple with one item per argument.
    # Without arguments, group1 defaults to zero (the whole match is returned).
    print(matches.group())

NP_1 such as NP_2, NP_3 and NP_4

Step 1: Chunking the sentence
Note that the result is not the chunked NP; it is the chunk tree structure.

In [3]:
from nltk import ne_chunk

def np_chunking(sentence):
    # your implementation
    result = ne_chunk(pos_tag(word_tokenize(sentence)))
    return result

print(np_chunking("I like to listen to music from musical genres, such as blues, rock and jazz."))

(S I/PRP like/VBP to/TO listen/VB to/TO music/NN from/IN musical/JJ genres/NNS ,/, such/JJ as/IN blues/NNS ,/, rock/NN and/CC jazz/NN ./.)

Step 2: Prepare the chunked result for subsequent Hearst pattern matching
Traverse the chunked result. If the label is NP, merge all the words in this chunk and add the prefix NP_. All the tokens are separated with a white space (" "). Remember to lemmatize words using WordNetLemmatizer (from nltk.stem import WordNetLemmatizer).

In [4]:
# prepare the chunked sentence by merging words and adding the prefix NP_
def prepare_chunks(chunks):
    # If chunk is NP, start with NP_ and join tokens in chunk with _
    # Else just keep the token as it is
    terms = []
    for chunk in chunks:
        label = None
        try:
            # see if the chunk is simply a word or an NP; non-NP chunks fail on this method call
            label = chunk.label()
        except:
            pass
        # Based on the label, do processing, your implementation here....

In [5]:
raw_text = "I like to listen to music from musical genres, such as blues, rock and jazz."
chunk_res = np_chunking(raw_text)
print(prepare_chunks(chunk_res))

I like to listen to NP_music from NP_musical_genre, such as NP_blue, NP_rock and NP_jazz
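One possible way to fill in Step 2 is sketched below. It follows the skeleton from the question, but several parts are assumptions rather than the assignment's reference solution. The `lemmatize` parameter is a hypothetical hook added so the function can be tried without the WordNet corpus; when omitted it falls back to WordNetLemmatizer. Tokens tagged "." are dropped, which is only inferred from the expected output having no final period. Also note that `ne_chunk` produces named-entity labels (PERSON, GPE, ...) and not NP, so the hypothetical helper `np_chunking_with_grammar` uses `nltk.RegexpParser` with an assumed, very simple NP grammar in order to get NP subtrees at all.

```python
def np_chunking_with_grammar(sentence):
    """Hypothetical Step 1 variant that yields subtrees labelled NP.

    The grammar (optional adjectives followed by one or more nouns) is an
    assumption; the question does not specify one.
    """
    from nltk import RegexpParser, pos_tag, word_tokenize  # needs NLTK data

    grammar = "NP: {<JJ.*>*<NN.*>+}"
    return RegexpParser(grammar).parse(pos_tag(word_tokenize(sentence)))


def prepare_chunks(chunks, lemmatize=None):
    """Flatten a chunk tree into one string, merging each NP into NP_<words>.

    `lemmatize` is a hypothetical hook (not part of the original skeleton):
    any callable that maps a word to its lemma.
    """
    if lemmatize is None:
        # Imported lazily; requires the WordNet corpus to be downloaded.
        from nltk.stem import WordNetLemmatizer

        lemmatize = WordNetLemmatizer().lemmatize

    terms = []
    for chunk in chunks:
        label = None
        try:
            # Subtrees have label(); plain (word, tag) tuples do not.
            label = chunk.label()
        except AttributeError:
            pass

        if label is None:
            word, tag = chunk
            if tag == ".":
                # Assumption: sentence-final punctuation is dropped.
                continue
            terms.append(word)
        elif label == "NP":
            # Lemmatize each word of the NP and join with underscores.
            words = [lemmatize(word) for word, _tag in chunk.leaves()]
            terms.append("NP_" + "_".join(words))
        else:
            # Any other subtree: keep its words unchanged.
            terms.extend(word for word, _tag in chunk.leaves())

    # All tokens are separated by a single white space, as Step 2 requires.
    return " ".join(terms)
```

With NLTK and its data installed, `prepare_chunks(np_chunking_with_grammar(raw_text))` would be the corresponding call. In this sketch commas stay separate tokens (for example "NP_musical_genre , such as"), which the optional space before the comma in the In [2] regex appears to allow for. If the attached form shown in the expected output is required, a final `.replace(" ,", ",")` on the returned string is one way to get it.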

