Question

Posted on Sep 22, 2024

Get to know the lab corpus (i.e. dataset)

Prepare the data by converting it all to lowercase, to ease the string comparisons needed for subsequent steps

    # In these two steps, we lowercase each word of the training and test corpus
    train_set = [[(pair[0].lower(), pair[1])] for pairs in train_set for pair in pairs]
    test_set = [[(pair[0].lower(), pair[1])] for pairs in test_set for pair in pairs]

    # create list of train and test tagged words
    train_tagged_words = [tup for sent in train_set for tup in sent]
    test_tagged_words = [tup for sent in test_set for tup in sent]
    print(len(train_tagged_words))
    print(len(test_tagged_words))

Write the transition probabilities function
Build the transition probability matrix using the function written in step 4
Build the emission probability matrix using the function written in step 5
Extract all possible POS taggings for a test case
9. Compute the HMM probabilities

Please complete the tasks below to predict the correct tag for a sentence. Your understanding of Part-1 is essential to complete these tasks.

1. Delete the start and end symbols from the corpus provided to you in the first part [0.25 mark]
2. Apply the emission and transition functions to compute related probabilities on the updated corpus [1 mark]
3. Compute the probability of the possible tags that can be given to the same test sentence provided in the first part but without the start and end symbols [1 mark]
4. Report the difference [0.25 mark]
5. Now, use the code below to install and prepare the corpus from the library

    import nltk
    from sklearn.model_selection import train_test_split
    import numpy as np
    import pandas as pd
    import random
    import pprint, time
    from itertools import product

    # installing the treebank corpus from library nltk
    nltk.download('treebank')

    # reading the Treebank tagged sentences
    nl_data = list(nltk.corpus.treebank.tagged_sents())
    tr_set, ts_set = train_test_split(nl_data[:100], train_size=0.75, test_size=0.25)

6. Apply the emission and transition functions on the corpus to compute related probabilities [1 mark]
7. Compute the probability of the possible tags for the sentence: "In July, the agency imposed a ban" [1 mark]
8. Confirm your understanding with the lab instructor [0.5 mark]
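The emission and transition functions from Part-1 are not shown in the question, so the following is only a minimal sketch of how such functions are commonly written, not the lab's own code. It assumes a flat list of lowercased (word, tag) pairs in the same shape as train_tagged_words. The toy corpus and the names emission_prob, transition_prob, sequence_prob and best_tagging are hypothetical. There are no start or end symbols here, so the first word contributes only its emission term.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus (not from the lab): flat list of lowercased
# (word, tag) pairs, same shape as train_tagged_words.
toy_tagged_words = [
    ("the", "DET"), ("agency", "NOUN"), ("imposed", "VERB"),
    ("a", "DET"), ("ban", "NOUN"),
    ("the", "DET"), ("ban", "NOUN"), ("ended", "VERB"),
]


def emission_prob(word, tag, tagged_words):
    """Estimate P(word | tag) = count(word, tag) / count(tag)."""
    tag_count = sum(1 for _, t in tagged_words if t == tag)
    pair_count = sum(1 for w, t in tagged_words if w == word and t == tag)
    return pair_count / tag_count if tag_count else 0.0


def transition_prob(tag, prev_tag, tagged_words):
    """Estimate P(tag | prev_tag) = count(prev_tag, tag) / count(prev_tag)."""
    tags = [t for _, t in tagged_words]
    prev_count = sum(1 for t in tags if t == prev_tag)
    bigram_count = sum(
        1 for a, b in zip(tags, tags[1:]) if a == prev_tag and b == tag
    )
    return bigram_count / prev_count if prev_count else 0.0


def sequence_prob(words, tags, tagged_words):
    """Multiply emission and transition terms for one candidate tagging."""
    prob = 1.0
    for i, (word, tag) in enumerate(zip(words, tags)):
        prob *= emission_prob(word, tag, tagged_words)
        if i > 0:  # no start symbol, so the first word has no transition term
            prob *= transition_prob(tag, tags[i - 1], tagged_words)
    return prob


def best_tagging(words, tagged_words):
    """Score every possible tag sequence and return (probability, tags)."""
    tagset = sorted({t for _, t in tagged_words})
    candidates = product(tagset, repeat=len(words))
    return max((sequence_prob(words, c, tagged_words), c) for c in candidates)


if __name__ == "__main__":
    print(best_tagging(["the", "ban", "ended"], toy_tagged_words))
```

Enumerating every tag sequence grows exponentially with sentence length, so this brute-force approach is only practical for short test sentences. A word that never appears in the training data gets an emission probability of zero under every tag, which makes every candidate score zero unless some smoothing is added.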
