Question
Problem 1. (Markov Model Data Type) Define a data type called MarkovModel in markov_model.py to represent a Markov model of order k from a given text string. The data type must support the following API:
MarkovModel
MarkovModel(text, k)     constructs a Markov model m of order k from text
m.order()                returns the order of m
m.kgram_freq(kgram)      returns the number of occurrences of kgram in m
m.char_freq(kgram, c)    returns the number of times character c follows kgram in m
m.rand(kgram)            using m, finds and returns a random character following kgram
m.gen(kgram, n)          using m, builds and returns a string of length n, the first k characters of which is kgram
Project Markov Model
Constructor. To implement the data type, define two instance variables: an integer k that stores the order of the Markov model, and a symbol table st whose keys are all the kgrams from the given text. The value corresponding to each key (say kgram) in st is a symbol table whose keys are the characters that follow kgram in the text, and the corresponding values are their frequencies. You may assume that the input text is a sequence of characters over the ASCII alphabet, so that all values are between 0 and 127. The frequencies should be tallied as if the text were circular (i.e., as if it repeated the first k characters at the end). For example, if the text is gagggagaggcgagaaa and k = 2, then the symbol table st should store the following information:
aa: a: 1, g: 1
ag: a: 3, g: 2
cg: a: 1
ga: a: 1, g: 4
gc: g: 1
gg: a: 1, c: 1, g: 1
If you are careful enough, the entire symbol table can be built in just one pass through the circular text. Note that there is no reason to save the original text or the circular text as an attribute of the data type. That would be a grossly inefficient waste of space. Your MarkovModel object does not need either of these strings after the symbol table is built.
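The one-pass construction described above can be sketched as follows. This is a minimal illustration, not the assignment's reference solution: it uses a plain Python dict in place of the symbol table type, and the helper name build_symbol_table is hypothetical. It assumes 0 <= k <= len(text).

```python
def build_symbol_table(text, k):
    """Map each kgram to {next_char: count}, tallying the text as circular.

    Hypothetical helper: a plain dict stands in for the symbol table type.
    Assumes 0 <= k <= len(text).
    """
    st = {}
    # Local variable only: the circular text is discarded when we return.
    circular = text + text[:k]
    for i in range(len(text)):
        kgram = circular[i:i + k]
        next_char = circular[i + k]
        followers = st.setdefault(kgram, {})
        followers[next_char] = followers.get(next_char, 0) + 1
    return st


st = build_symbol_table("gagggagaggcgagaaa", 2)
for kgram in sorted(st):
    print(kgram, dict(sorted(st[kgram].items())))
```

Each of the len(text) starting positions contributes exactly one tally, so the counts across all kgrams sum to the length of the text.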
Order. Return the order k of the Markov Model.
Frequency. There are two frequency methods.
kgram_freq(kgram) returns the number of times kgram was found in the original text. Returns 0 when kgram is not found. Raises an error if kgram is not of length k.
char_freq(kgram, c) returns the number of times kgram was followed by the character c in the original text. Returns 0 when kgram or c is not found. Raises an error if kgram is not of length k.
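The two frequency lookups might look like the sketch below. It is written as free functions over a hard-coded table for the example text (k = 2) so that it runs on its own; in the data type these would be methods reading self.k and self.st. The text only says "raises an error", so the choice of ValueError here is an assumption.

```python
# Tallies for the text "gagggagaggcgagaaa" with k = 2, treated as circular.
K = 2
ST = {
    "aa": {"a": 1, "g": 1},
    "ag": {"a": 3, "g": 2},
    "cg": {"a": 1},
    "ga": {"a": 1, "g": 4},
    "gc": {"g": 1},
    "gg": {"a": 1, "c": 1, "g": 1},
}


def kgram_freq(st, k, kgram):
    """Number of occurrences of kgram; 0 if it never occurs."""
    if len(kgram) != k:
        raise ValueError(f"kgram must be of length {k}")  # error type assumed
    # Every occurrence of kgram is followed by exactly one character,
    # so the follower counts sum to the kgram's own frequency.
    return sum(st.get(kgram, {}).values())


def char_freq(st, k, kgram, c):
    """Number of times c follows kgram; 0 if kgram or c is not found."""
    if len(kgram) != k:
        raise ValueError(f"kgram must be of length {k}")  # error type assumed
    return st.get(kgram, {}).get(c, 0)


print(kgram_freq(ST, K, "ag"), char_freq(ST, K, "ga", "g"))
```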
Randomly generate a character. rand(kgram) returns a character that followed kgram in the original text. The character should be chosen randomly, but the results of calling rand(kgram) several times should mirror the frequencies of characters that followed the kgram in the original text. Raises an error if kgram is not of length k or if kgram is unknown.
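One way to draw a character in proportion to its frequency is weighted sampling with random.choices from the Python standard library, as in the sketch below. The course may expect a different library routine; the free-function form, the rng parameter, and the ValueError are assumptions made for illustration, and ST is only an excerpt of the example table.

```python
import random

# Excerpt of the tallies for "gagggagaggcgagaaa" with k = 2.
ST = {"ga": {"a": 1, "g": 4}, "cg": {"a": 1}}


def rand(st, k, kgram, rng=random):
    """Return a random follower of kgram, weighted by observed frequency."""
    if len(kgram) != k:
        raise ValueError(f"kgram must be of length {k}")  # error type assumed
    if kgram not in st:
        raise ValueError(f"unknown kgram: {kgram!r}")  # error type assumed
    followers = st[kgram]
    chars = list(followers)
    weights = list(followers.values())
    # choices() returns a list of samples; we ask for one and unwrap it.
    return rng.choices(chars, weights=weights, k=1)[0]


print(rand(ST, 2, "ga"))
```

Over many calls with kgram "ga", roughly four in five results should be g, since g follows ga four times out of five in the example text.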
Generate pseudorandom text. Return a string of length n that is a randomly generated stream of characters whose first k characters are the argument kgram. Starting with the argument kgram, repeatedly call rand to generate the next character. Successive kgrams should be formed by using the most recent k characters in the newly generated text.
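The generation loop can be sketched as below. To keep the example runnable on its own, the next character is drawn inline with random.choices rather than through the rand method, and the table for the example text is hard-coded; the free-function form and the ValueError are assumptions. It also assumes n >= k.

```python
import random

# Tallies for the text "gagggagaggcgagaaa" with k = 2, treated as circular.
ST = {
    "aa": {"a": 1, "g": 1},
    "ag": {"a": 3, "g": 2},
    "cg": {"a": 1},
    "ga": {"a": 1, "g": 4},
    "gc": {"g": 1},
    "gg": {"a": 1, "c": 1, "g": 1},
}


def gen(st, k, kgram, n, rng=random):
    """Build a string of length n whose first k characters are kgram."""
    if len(kgram) != k:
        raise ValueError(f"kgram must be of length {k}")  # error type assumed
    if kgram not in st:
        raise ValueError(f"unknown kgram: {kgram!r}")  # error type assumed
    text = kgram
    while len(text) < n:
        # The current state is the most recent k characters generated.
        current = text[len(text) - k:]
        followers = st[current]
        # Stand-in for rand(current): frequency-weighted draw.
        text += rng.choices(list(followers), weights=list(followers.values()))[0]
    return text


print(gen(ST, 2, "ga", 20))
```

Because the table was tallied circularly, every kgram reached this way has at least one follower, so the lookup st[current] inside the loop does not fail.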
To avoid dead ends, treat the input text as a circular string: the last character is considered to precede the first character. For example, if k = 2 and the text is the 17-character string gagggagaggcgagaaa, then the salient features of the Markov model are captured in the table below:
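              frequency of     probability that
              next char        next char is
kgram  freq   a    c    g      a    c    g
aa     2      1    0    1      1/2  0    1/2
ag     5      3    0    2      3/5  0    2/5
cg     1      1    0    0      1    0    0
ga     5      1    0    4      1/5  0    4/5
gc     1      0    0    1      0    0    1
gg     3      1    1    1      1/3  1/3  1/3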
Note that the frequency of ag is 5 (and not 4) because we are treating the string as circular.
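The effect of the wrap-around on the count of ag, and the conversion from follower counts to probabilities, can be checked with a short sketch. The helper name next_char_probabilities is hypothetical, and the follower counts are hard-coded for two kgrams of the example text.

```python
from fractions import Fraction

text = "gagggagaggcgagaaa"
k = 2

# str.count counts non-overlapping matches, which is exact here because
# "ag" cannot overlap with itself.
linear_count = text.count("ag")
# Appending the first k - 1 characters exposes the kgram that wraps around.
circular_count = (text + text[:k - 1]).count("ag")

# Follower counts for two kgrams, tallied circularly.
followers = {"ag": {"a": 3, "g": 2}, "ga": {"a": 1, "g": 4}}


def next_char_probabilities(counts, alphabet="acg"):
    """Turn follower counts into exact probabilities over the alphabet."""
    total = sum(counts.values())
    return {c: Fraction(counts.get(c, 0), total) for c in alphabet}


print(linear_count, circular_count)
for kgram, counts in followers.items():
    print(kgram, {c: str(p) for c, p in next_char_probabilities(counts).items()})
```

The extra occurrence comes from the final a of the text followed by its first character, g.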
A Markov chain is a stochastic process where the state change depends on only the current state. For text generation, the current state is a kgram. The next character is selected at random, using the probabilities from the Markov model. For example, if the current state is ga in the Markov model of order 2 discussed above, then the next character is a with probability 1/5 and g with probability 4/5. The next state in the Markov chain is obtained by appending the new character to the end of the kgram and discarding the first character. A trajectory through the Markov chain is a sequence of such states. Shown below is a possible trajectory consisting of 9 transitions.
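trajectory:         ga -> ag -> gg -> gc -> cg -> ga -> ag -> ga -> aa -> ag
probability for a:     1/5   3/5   1/3   0     1     1/5   3/5   1/5   1/2
probability for c:     0     0     1/3   0     0     0     0     0     0
probability for g:     4/5   2/5   1/3   1     0     4/5   2/5   4/5   1/2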
Treating the input text as a circular string ensures that the Markov chain never gets stuck in a state with no next characters.
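As a rough check of the trajectory, the sketch below walks the listed states and looks up the probability of each transition actually taken. The function name transition_probability is hypothetical, and the table is hard-coded for the example text with k = 2.

```python
from fractions import Fraction

# Tallies for the text "gagggagaggcgagaaa" with k = 2, treated as circular.
ST = {
    "aa": {"a": 1, "g": 1},
    "ag": {"a": 3, "g": 2},
    "cg": {"a": 1},
    "ga": {"a": 1, "g": 4},
    "gc": {"g": 1},
    "gg": {"a": 1, "c": 1, "g": 1},
}

trajectory = "ga ag gg gc cg ga ag ga aa ag".split()


def transition_probability(st, state, next_state):
    """Probability of moving from state to next_state in one step."""
    # The next state drops the first character and appends the new one.
    if state[1:] != next_state[:-1]:
        raise ValueError(f"{next_state!r} cannot follow {state!r}")
    counts = st[state]
    return Fraction(counts.get(next_state[-1], 0), sum(counts.values()))


steps = [
    transition_probability(ST, s, t)
    for s, t in zip(trajectory, trajectory[1:])
]
print([str(p) for p in steps])
```

Every step has nonzero probability, which is what makes the sequence a possible trajectory.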
To generate random text from a Markov model of order k set the initial state to k characters from the input text. Then, sim