
Python: Hi, I need some help completing this assignment:

Create a data type MarkovModel in model.py to represent a Markov model of order k from a given text string. The data type must implement the following API:

MarkovModel(text, k) = create a Markov model of order k from text
- Initialize instance variables appropriately.
- Construct circular text circ_text from text by appending the first k characters to the end; for example, if text = 'gagggagaggcgagaaa' and k = 2, then circ_text = 'gagggagaggcgagaaaga'.
- For each kgram from circ_text, and the character next_char that immediately follows kgram, increment the frequency of next_char in the dictionary _st[kgram] by 1; for the above example, the dictionary _st, at the end of this step, should look like the following:

  {'aa': {'a': 1, 'g': 1},
   'ag': {'a': 3, 'g': 2},
   'cg': {'a': 1},
   'ga': {'a': 1, 'g': 4},
   'gc': {'g': 1},
   'gg': {'a': 1, 'c': 1, 'g': 1}}
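The construction step can be sketched as a standalone function. This is only a minimal sketch, not the assignment's reference solution: the name build_table is hypothetical, and it returns a plain dictionary where the assignment keeps the table in the instance variable _st.

```python
def build_table(text, k):
    """Map each k-gram of the circular text to its next-character counts."""
    # Hypothetical helper; the assignment stores this table in _st instead.
    circ_text = text + text[:k]  # wrap around so the last k-grams have a successor
    st = {}
    for i in range(len(circ_text) - k):
        kgram = circ_text[i:i + k]
        next_char = circ_text[i + k]
        counts = st.setdefault(kgram, {})
        counts[next_char] = counts.get(next_char, 0) + 1
    return st


table = build_table('gagggagaggcgagaaa', 2)
for kgram in sorted(table):
    print(kgram, table[kgram])
```

Sorting the keys only makes the printout easier to compare with the dictionary in the assignment text; the table itself does not depend on any ordering.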

model.order() = order k of Markov model

model.kgram_freq(kgram) = number of occurrences of kgram in text
- Return the frequency of kgram, which is simply the sum of the values of _st[kgram].

model.char_freq(kgram, c) = number of times that character c follows kgram
- Return the number of times c immediately follows kgram, which is simply the value of c in _st[kgram].
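Both frequency lookups reduce to dictionary reads. The sketch below uses the example table from the assignment and two free functions with hypothetical signatures (the real methods take self and read _st). Returning 0 for a character that never follows kgram matches the sample output freq(na, a) = 0; returning 0 for a k-gram that never occurs is an assumption, since the assignment does not say what should happen in that case.

```python
# Example table from the assignment (text 'gagggagaggcgagaaa', k = 2).
st = {
    'aa': {'a': 1, 'g': 1},
    'ag': {'a': 3, 'g': 2},
    'cg': {'a': 1},
    'ga': {'a': 1, 'g': 4},
    'gc': {'g': 1},
    'gg': {'a': 1, 'c': 1, 'g': 1},
}


def kgram_freq(st, kgram):
    """Sum of all next-character counts recorded for kgram."""
    # Assumption: a k-gram that never occurs has frequency 0.
    return sum(st.get(kgram, {}).values())


def char_freq(st, kgram, c):
    """Count of c immediately after kgram, or 0 if c never follows it."""
    return st.get(kgram, {}).get(c, 0)


print(kgram_freq(st, 'ag'))      # 3 + 2
print(char_freq(st, 'ga', 'g'))
print(char_freq(st, 'gg', 't'))  # 't' never follows 'gg'
```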

model.rand(kgram) = a random character following the given kgram
- Use stdrandom.discrete() to randomly select and return a character that immediately follows kgram.
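The idea behind rand() is a weighted draw: each character is chosen with probability proportional to its count in _st[kgram]. Since stdrandom belongs to the course library and may not be installed, the sketch below swaps in random.choices from the standard library, which accepts the raw counts as weights. The function name rand_char is hypothetical; in the assignment itself you would still call stdrandom.discrete() as instructed.

```python
import random


def rand_char(counts):
    """Pick a key of counts with probability proportional to its value."""
    # Stand-in for stdrandom.discrete(); not the course library's API.
    chars = list(counts)
    weights = [counts[c] for c in chars]
    return random.choices(chars, weights=weights, k=1)[0]


# Counts for the k-gram 'ga' in the assignment's example.
counts = {'a': 1, 'g': 4}
sample = [rand_char(counts) for _ in range(1000)]
fraction_g = sample.count('g') / len(sample)
print(fraction_g)  # expected to be close to 4/5
```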

model.gen(kgram, T) = a string of length T characters generated by simulating a trajectory through the corresponding Markov chain, the first k characters of which is kgram
- Initialize a variable text to kgram.
- Perform T - _k iterations, where each iteration involves appending to text a random character obtained using a call to self.rand() and updating kgram to the last _k characters of text.
- Return text.
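The generation loop can be sketched on its own as follows. Everything here is illustrative: build_table and gen are hypothetical free functions rather than the class methods the assignment asks for, and random.choices stands in for self.rand(). Because the text is treated as circular, every k-gram reached during generation already has an entry in the table, so the loop never hits an unknown k-gram as long as the starting one is known.

```python
import random


def build_table(text, k):
    """Hypothetical helper: k-gram -> {next_char: count} over the circular text."""
    circ_text = text + text[:k]
    st = {}
    for i in range(len(circ_text) - k):
        counts = st.setdefault(circ_text[i:i + k], {})
        counts[circ_text[i + k]] = counts.get(circ_text[i + k], 0) + 1
    return st


def gen(st, kgram, T):
    """Return a string of length T that starts with kgram."""
    k = len(kgram)
    text = kgram
    for _ in range(T - k):
        counts = st[kgram]
        chars = list(counts)
        # Weighted draw; stands in for the assignment's self.rand(kgram).
        text += random.choices(chars, weights=[counts[c] for c in chars])[0]
        # Slide the window: the new k-gram is the last k characters of text.
        kgram = text[len(text) - k:]
    return text


table = build_table('gagggagaggcgagaaa', 2)
out = gen(table, 'ga', 20)
print(out)
```

The slice text[len(text) - k:] is used instead of text[-k:] so that the sketch also behaves sensibly for k = 0, where text[-0:] would return the whole string.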

Here is the program

import stdio
import stdrandom
import sys


class MarkovModel(object):
    """
    Represents a Markov model of order k from a given text string.
    """

    def __init__(self, text, k):
        """
        Creates a Markov model of order k from given text. Assumes that text
        has length at least k.
        """

        self.k = k
        self.st = {}
        circ_text = text + text[:k]
        for i in range(len(circ_text) - k):
            ...

    def order(self):
        """
        Returns order k of Markov model.
        """

        ...

    def kgram_freq(self, kgram):
        """
        Returns number of occurrences of kgram in text. Raises an error if
        kgram is not of length k.
        """

        if self.k != len(kgram):
            raise ValueError('kgram ' + kgram + ' not of length ' + str(self.k))
        ...

    def char_freq(self, kgram, c):
        """
        Returns number of times character c follows kgram. Raises an error if
        kgram is not of length k.
        """

        if self.k != len(kgram):
            raise ValueError('kgram ' + kgram + ' not of length ' + str(self.k))
        ...

    def rand(self, kgram):
        """
        Returns a random character following kgram. Raises an error if kgram
        is not of length k or if kgram is unknown.
        """

        if self.k != len(kgram):
            raise ValueError('kgram ' + kgram + ' not of length ' + str(self.k))
        if kgram not in self.st:
            raise ValueError('Unknown kgram ' + kgram)
        ...

    def gen(self, kgram, T):
        """
        Generates and returns a string of length T by simulating a trajectory
        through the corresponding Markov chain. The first k characters of the
        generated string is the argument kgram. Assumes that T is at least k.
        """

        ...

    def replace_unknown(self, corrupted):
        """
        Replaces unknown characters (~) in corrupted with most probable
        characters, and returns that string.
        """

        # Given a list a, argmax returns the index of the maximum element in a.
        def argmax(a):
            return a.index(max(a))

        original = ''
        for i in range(len(corrupted)):
            if corrupted[i] == '~':
                ...
            else:
                original += corrupted[i]
        return original


def _main():
    """
    Test client [DO NOT EDIT].
    """

    text, k = sys.argv[1], int(sys.argv[2])
    model = MarkovModel(text, k)
    a = []
    while not stdio.isEmpty():
        kgram = stdio.readString()
        char = stdio.readString()
        a.append((kgram.replace("-", " "), char.replace("-", " ")))
    for kgram, char in a:
        if char == ' ':
            stdio.writef('freq(%s) = %s\n', kgram, model.kgram_freq(kgram))
        else:
            stdio.writef('freq(%s, %s) = %s\n', kgram, char,
                         model.char_freq(kgram, char))


if __name__ == '__main__':
    _main()

Output should be like this, thank you:

$ python model.py banana 2
an a
na b
na a
na -
freq(an, a) = 2
freq(na, b) = 1
freq(na, a) = 0
freq(na) = 2

$ python model.py gagggagaggcgagaaa 2
aa a
ga g
gg c
ag -
cg -
gc -
freq(aa, a) = 1
freq(ga, g) = 4
freq(gg, c) = 1
freq(ag) = 5
freq(cg) = 1
freq(gc) = 1
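To check the first sample run without the course's stdio module, the query pairs can be replayed against a table built with collections.Counter. This is a rough sketch under stated assumptions: count_table and answer are hypothetical names, and a '-' in the second position is read as "report the k-gram frequency", mirroring how the test client treats it.

```python
from collections import Counter, defaultdict


def count_table(text, k):
    """Hypothetical helper: k-gram -> Counter of next characters (circular text)."""
    circ_text = text + text[:k]
    st = defaultdict(Counter)
    for i in range(len(circ_text) - k):
        st[circ_text[i:i + k]][circ_text[i + k]] += 1
    return st


def answer(st, kgram, char):
    """Format one query the way the test client's output is shown."""
    if char == '-':
        return 'freq(%s) = %s' % (kgram, sum(st[kgram].values()))
    return 'freq(%s, %s) = %s' % (kgram, char, st[kgram][char])


# The whitespace-separated query pairs from the first sample run.
queries = 'an a na b na a na -'.split()
st = count_table('banana', 2)
lines = [answer(st, queries[i], queries[i + 1])
         for i in range(0, len(queries), 2)]
print('\n'.join(lines))
```

A Counter returns 0 for a missing key, which is why the query "na a" needs no special case here.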
