Question
Training Data
We'll be using the English news corpus (2018) as our training data. There are around 10… sentences. The data file is engnewsKsentences.txt.
Problem: N-gram language model
You need to build two language models, a unigram model and a bigram model, both using Laplace smoothing. With each model, you will do the following tasks:
Display generated sentences from this model.
Score the provided test sentences with the model and display the average and standard deviation of their probabilities:
once for the provided test set
once for the test set that you curate
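The averaging step can be sketched as follows. This is a minimal illustration, not the required solution: the function name `summarize_scores` is hypothetical, and the choice of the population standard deviation (rather than the sample version) is an assumption the assignment does not settle.

```python
import statistics


def summarize_scores(probabilities):
    """Return (mean, standard deviation) for a list of sentence probabilities."""
    mean = statistics.mean(probabilities)
    # Population standard deviation (assumption); swap in statistics.stdev
    # if the sample standard deviation is wanted instead.
    std = statistics.pstdev(probabilities)
    return mean, std


# Example with made-up probabilities for three test sentences.
mean, std = summarize_scores([0.2, 0.4, 0.6])
print(f"average = {mean:.4f}, standard deviation = {std:.4f}")
```

The same function would be called twice per model: once with the scores of the provided test set and once with the scores of the curated one.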
Part 1: Build an n-gram language model
Preprocessing of the data.
Split the data for training and testing.
You need to develop an n-gram model that can handle n-grams of any order; here we'll use it specifically for unigrams and bigrams. You'll write code that builds the language model from the training data and provides functions that take a sentence formatted the same way as the training data and return the probability your model assigns to that sentence.
Handling of unknown words and smoothing.
Evaluating the language model.
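The steps listed above could be sketched roughly as below. Everything here is an assumption made for illustration: the names `tokenize`, `split_data`, and `LaplaceNgramModel`, the lowercase whitespace tokenizer, the 90/10 split, the rule that words seen only once become `<UNK>`, and the use of perplexity for evaluation are not specified by the assignment.

```python
import math
import random
from collections import Counter

SENT_BEGIN, SENT_END, UNK = "<s>", "</s>", "<UNK>"


def tokenize(line, n):
    """Lowercase, split on whitespace, and pad with sentence markers."""
    pad = max(n - 1, 1)
    return [SENT_BEGIN] * pad + line.lower().split() + [SENT_END] * pad


def split_data(sentences, test_fraction=0.1, seed=0):
    """Shuffle a copy of the sentences and split it into (train, test)."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


class LaplaceNgramModel:
    def __init__(self, n):
        self.n = n
        self.ngram_counts = Counter()    # counts of full n-grams
        self.context_counts = Counter()  # counts of (n-1)-gram contexts
        self.vocab = set()

    def _ngrams(self, tokens):
        return [tuple(tokens[i:i + self.n])
                for i in range(len(tokens) - self.n + 1)]

    def _map_unknown(self, tokens):
        return [t if t in self.vocab else UNK for t in tokens]

    def train(self, sentences):
        tokenized = [tokenize(s, self.n) for s in sentences]
        word_counts = Counter(t for toks in tokenized for t in toks)
        # Assumption: words seen only once are treated as unknown.
        self.vocab = {w for w, c in word_counts.items() if c > 1}
        self.vocab |= {UNK, SENT_BEGIN, SENT_END}
        for toks in tokenized:
            for gram in self._ngrams(self._map_unknown(toks)):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, sentence):
        """Natural log of the Laplace-smoothed sentence probability."""
        toks = self._map_unknown(tokenize(sentence, self.n))
        v = len(self.vocab)
        total = 0.0
        for gram in self._ngrams(toks):
            # Add-one smoothing: (count + 1) / (context count + |V|)
            total += math.log((self.ngram_counts[gram] + 1)
                              / (self.context_counts[gram[:-1]] + v))
        return total

    def score(self, sentence):
        return math.exp(self.log_prob(sentence))

    def perplexity(self, sentences):
        """Perplexity over all n-grams in the given sentences."""
        log_sum, count = 0.0, 0
        for s in sentences:
            log_sum += self.log_prob(s)
            count += len(self._ngrams(tokenize(s, self.n)))
        return math.exp(-log_sum / count)


# Tiny made-up corpus, only to show the calls.
model = LaplaceNgramModel(2)
model.train(["the cat sat", "the cat ran", "the dog sat"])
print(model.score("the cat sat"))
```

For a unigram model (`n = 1`) the context is the empty tuple, so the denominator becomes the total token count plus the vocabulary size. Summing log probabilities rather than multiplying raw probabilities avoids underflow on long sentences.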
Part 2: Implement sentence generation
In this part, you'll implement sentence generation for your language model. Start by generating the <s> token, then sample from the n-grams beginning with <s>. Stop generating words when you hit an </s> token.
Notes:
When generating sentences for unigrams, do not count the pseudoword <s> as part of the unigram probability mass after you've chosen it as the beginning token in a sentence.
All unigram sentences that you generate should start with one <s> and end with one </s>.
For n-grams larger than 2, the sentences you generate should start with n-1 <s> tokens. They should end with n-1 </s> tokens.
Justification of the output obtained for all of the above tasks is mandatory.
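One way to sketch the generation procedure is shown below. The function name `generate_sentence`, the `max_len` cutoff, and the decision to sample in proportion to raw (unsmoothed) n-gram counts are assumptions for illustration; sampling from the Laplace-smoothed distribution would also be a defensible reading of the task. The input is assumed to be a `Counter` keyed by n-gram tuples, with <s> and </s> as the begin and end markers.

```python
import random
from collections import Counter

SENT_BEGIN, SENT_END = "<s>", "</s>"


def generate_sentence(ngram_counts, n, rng=random, max_len=50):
    """Sample one sentence from n-gram counts keyed by tuples of length n."""
    pad = max(n - 1, 1)
    sentence = [SENT_BEGIN] * pad
    while len(sentence) < max_len:
        context = tuple(sentence[-(n - 1):]) if n > 1 else ()
        # Candidate next words: n-grams whose prefix matches the context.
        # The begin marker is excluded, so it takes no probability mass
        # once it has been placed at the start of the sentence.
        candidates = {g[-1]: c for g, c in ngram_counts.items()
                      if g[:-1] == context and g[-1] != SENT_BEGIN}
        if not candidates:
            break
        words, weights = zip(*candidates.items())
        word = rng.choices(words, weights=weights)[0]
        if word == SENT_END:
            break
        sentence.append(word)
    # Close with the same number of end markers as begin markers.
    return sentence + [SENT_END] * pad


# Made-up bigram counts, only to show the call.
counts = Counter({("<s>", "hello"): 1, ("hello", "world"): 1,
                  ("world", "</s>"): 1})
print(" ".join(generate_sentence(counts, 2)))
```

Passing a seeded `random.Random` instance as `rng` makes the generated sentences reproducible, which helps when justifying the output.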