Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Write program in python that involves hashing , built upon markov chain logarithm The program will be altered to use a hash table ADT rather

Write program in python that involves hashing , built upon markov chain logarithm

  1. The program will be altered to use a hash table ADT rather than a Python built in dictionaries
  2. The input will include the table size for the hash table, as specified under Input Format.
  3. Some error checking is required as specified under Errors.

 

General Requirements

  1. For the long problems, your code should follow the standard python style guide
  2. You may not use concepts  like exceptions, type annotations, and importing libraries (unless a library is explicitly mentioned in the specification.)

Markov Chain Algorith

 

Expected Behavior

Write a program, in a file writer_bot_ht.py, that generates random text from a given source text. Your program should behave as follows:

 

  1. Use input() (without arguments) to read the name of the the source file sfile. Do not prompt the user for input. Do not hard-code the file name into your program.

 

  1. Use input() (without arguments) to read the size of the the hash table M. Do not prompt the user for input.

 

  1. Use input() (without arguments) to read in the prefix size n. Do not prompt the user for input.

 

  1. Use input() (without arguments) to read in the number of words to be generated for the random text. Do not prompt the user for input.

 

  1. Read sfile and build the Markov chain table of prefixes to suffixes according to the description above.

 

  1. Construct the randomly generated text according to the Markov chain algorithm. Construct a list to hold the words of the generated text.

 

  1. Print out the generated text list accoring to the Output format below.

 

Input Fiile

Each line of the input file is a sequence of characters separated by whitespace. The file may consists of any number of lines with any number of words on each line.

Output Format:

Print out the list of generated text ten words per line. Any extra words will be printed on the last line. For example, if the generated text has only nine words, the output will consist of one line of nine words. If the text has 109 words, the output will consist of eleven lines of output, the first ten lines having ten words and the last line having nine.

Programming Requirements

  1. The example discussed above shows a table for prefixes of size two. Your program must work for a prefix of arbitrary size n.

 

  1. Instead of using a Python dictionary to build the table mapping prefixes to suffixes, implement a hash table ADT. Previously, tuples were used to represent the prefexis you will use strings to represent a prefix. For example, instead of using the tuple

    ('Half, 'a')    

  1. you will use the string

    'Half a'    

  1. Since the prefixes will be the keys in the hash table, you will use the hash function specified in the Hash Function section to hash a string representing a prefix to an integer. You are required to use the given hash function. Implement the following class, Hashtable:

class Hashtable

An object of this class is a hash table that uses linear probing to handle collisions. It should contain (at least) the following attributes:

  • _pairs: the underlying Python list implementing the hash table.
  • _size: the size of the hash table.

The class must implement (at least) the following methods:

  • __init__(self, size): initializes the object's attributes as follows: _pairs is set to a list of size size; _size is set to size.
  • put(self, key, value): hashes key and inserts the key/value pair in the hash table. Use linear probing with a decrement of 1 to resolve collisions.
  • get(self, key): looks up key in the hash table and if found, returns the corresonding value. If not found, it returns None.
  • __contains__(self, key): looks up key in the hash table and if found returns True and otherwise returns False.
  • __str__(self) returns a string representation of a Hashtable object.

 

  1. Note: Do not use append() in the put() method.  The code that uses the hashtable ATD will modify a key's corresponding value when needed.

 

  1. As before, a prefix may have one or more suffixes. You must use a list to represent thue possible suffixes. Whenu a new suffix is encountered for an existing prefix, you muuust append the new suffix to the end of the list. This is important for matching the autograder output: the order in which suffixes are stored in the list will affect the choices made during text generation and will impact the output. For example, suppose that 'Half a' hashes to integer i and that 'a bee' hashes to integer j. The following shows the contents of the hash table at those indices for the Eric the Bee example:

 

    hash table at index i is ['Half a', [ 'bee' ]]    ...    hash table at index j is ['a bee', [ 'philosophically', 'be', 'Due' ]]

  1. In addition, during text generation, when a prefix has more than one suffix, the suffix will be randomly chosen from the list. You will use the Python random number to do this, in order for your output to match the tester and grading scripts, you must seed the random number generator. To do this, define the following constant at the top of your program:

   SEED = 8

  1. You must define the constant NONWORD, which must be a word that cannot exist in the original text. In Assignment 9, a space was used for this, however, we need spaces to delineate the words of a prefix. Consequently, define NONWORD as the single character @ as follows:

   NONWORD = '@'

  1. As you can imagine, when generating the output for larger text, it is not useful to print out the random text one word at a time. During the text generation phase, create a list to hold the words of the generated text. When the text generation is complete, print the output as specified in the Output format section.

 

Hash Function

The hash function covered in lecture that hashes a string to an integer multiples the position of each character in the string by its ord value. That is straightforward but not robust enough for use in practice. A better approach is to compute a polynomial whose coefficients are the ord values of the individual characters in the string. Using Horner's rule to compute the polynomial, and using 31 for the value of x, gives us the following hash function:

    def _hash(self, key):        p = 0        for c in key:            p = 31*p + ord(c)        return p % self._size  

Errors

The following are errors:

 

  1. The input value n for the prefix size is less than one.

Program behavior: Use normal program logic to detect this (i.e., if statements). Give the following error message:

Error message: "ERROR: specified prefix size is less than one"

 

  1. The input value for the size of the generated text is less than one.

Program behavior: Use normal program logic to detect this (i.e., if statements). Give the following error message:

Error message: "ERROR: specified size of the generated text is less than one"

 

 

Terminate the program after the errors above are reported. You may use Python's system library. First, include the following import statement at the top of your program:

import sys

After printing the appropriate error message, exit your program like so:

sys.exit(0)

Examples

Some examples of generating random text from different source texts are shown here.

 

Reference

The Markov chain algorithm is used to solve a variety of problems. Using it for random text generation has been described in many places, most notably in The Practice of Programming, by Kernighan and Pike, which can be found here.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Data Structures and Algorithms in Java

Authors: Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser

6th edition

1118771334, 1118771338, 978-1118771334

More Books

Students also viewed these Algorithms questions

Question

Show that n i=1 i/2 i Answered: 1 week ago

Answered: 1 week ago