Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Problem 3. (Noisy Message Decoder) Imagine you receive a message where some of the characters have been corrupted by noise. We represent unknown characters by

Problem 3. (Noisy Message Decoder) Imagine you receive a message where some of the characters have been corrupted by noise. We represent unknown characters by the ~ symbol (we assume we dont use ~ in our messages). Implement the method model.replace_unknown() in the markov_model.py data type that decodes a noisy message corrupted by replacing each ~ in it with the most likely character given an order k Markov model and conditional on the surrounding text and returns the decoded message. You may assume that the unknown characters are at least k characters apart and also appear at least k characters away from the start and end of the message. This maximum-likelihood approach doesnt always get it perfect, but it ?xes most of the missing characters correctly. Here are some details on what it means to ?nd the most likely replacement for each ~. For each unknown character, you should 3 of 4 CS110 Project 5 (Markov Model) Swami Iyer consider all possible replacement characters. You want the replacement character that makes sense not only at the unknown position (given the previous characters) but also when the replacement is used in the context of the k subsequent known characters. For example, we expect the unknown character in "was ~he wo" to be t and not simply the most likely character in the context of "was ". You can compute the probability of each hypothesis by multiplying the probabilities of generating each of k + 1 characters in sequence: the missing one, and the k next ones. Write a client program fix_corrupted.py that takes an integer k (model order) and a string s (noisy message) as command-line arguments, reads the input text from standard input (for e?ciency reasons, use sys.stdin.read() to read the text), and prints out the most likely original string. Original : it was the best of times , it was the worst of times. Noisy : it w~s th~ bes~ of tim~s, i~ was ~he wo~st of~times. $ python3 fix_corrupted.py 4 "it w~s th~ bes~ of tim~s, i~ was ~he wo~st of~times." < data/obama.txt it was the best of times , it was the worst of times. $ python3 fix_corrupted.py 2 "it w~s th~ bes~ of tim~s, i~ was ~he wo~st of~times." < data/obama.txt it was the best of times , is was the worst of times. Data The data directory contains several input ?les. Make sure to test your client programs against them. $ python3 text_generator.py 5 50 < data/obama.txt Well , you the last those what weve got to Russian $ python3 fix_corrupted.py 3 "she s~lls sea s~ells on th~ sea s~ore" < data/wiki_100k.txt she sells sea spells on the sea store

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Marketing The New Profit Frontier

Authors: Ed Burnett

1st Edition

0964535629, 978-0964535626

More Books

Students also viewed these Databases questions