Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

PYTHON 3.x Huffman Trees HuffmanHeap.py: # Defines a data structure that allows Huffman codes to # be computed efficiently. # # A Huffman Heap is

PYTHON 3.x Huffman Trees

image text in transcribed

HuffmanHeap.py:

# Defines a data structure that allows Huffman codes to # be computed efficiently. # # A Huffman Heap is a queue to store HuffmanTree objects. # The word "heap" implies that the dequeue operation will always # remove and return the HuffmanTree object with the lowest frequency. class HuffmanHeap(object): def __init__(self, leafs): """  Purpose:  Creates a new HuffmanHeap object.  Pre-conditions:  leafs: a list of HuffmanTree leaf nodes.  Post-conditions:  The heap is created and initialized.  Return:  None  """  pass def enqueue(self, tree): """  Purpose:  Adds the tree to the Heap.  This is an O(1) operation.  Pre-conditions:  tree is a HuffmanTree object  Post-conditions:  the heap increases in size by 1  Return:  None  """  return None def dequeue(self): """  Purpose:  Return the smallest value in the queue.  This is an O(1) operation.  Pre-conditions:  None  Post-conditions:  The heap decreases in size by 1  Return:  A HuffmanTree object, with the lowest frequency  """  return None def __len__(self): """  Purpose:  Return the number of data values stored in the heap.  This method allows Python scripts to call len(hh)  if hh is a HuffmanHeap object.  Pre-conditions:  None  Post-conditions:  None  Return:  The number of values stored in the heap  """  return 0 def display(self): """  Purpose:  Display the data for debugging.  Pre-conditions:  None  Post-conditions:  None  Return:  None  """  print('Add code to HuffmanHeap.display() to help you debug.') HuffmanTree.py:
# Defines the Huffman Tree data structure # # A Huffman tree-node has a character and a frequency. # some strings to avoid long lines later _INIT_ASSERT_MESSAGE_NONLEAF = 'Invalid Huffman tree construction attempted.' _GETCH_ASSERT_MESSAGE_NONLEAF = 'Method get_char() called on non-leaf node' class HuffmanTree(object): def __init__(self, freq=0, char=None, left=None, right=None): """  Purpose:  Initializes the HuffmanTree object.  To create a leaf node  aleaf = HuffmanTree(freq=10, char='A')  bleaf = HuffmanTree(freq=15, char='E')  To create an internal node:  node = HuffmanTree(left=aleaf,right=bleaf)   Pre-conditions:  :param freq: a positive integer  :param char: a character  :param left: another Huffman Tree  :param right: another HuffmanTree  """  self.__char = char self.left = left self.right = right if left is None and right is None: # leaf node: assign the frequency as given self.__freq = freq else: assert left is not None and right is not None, _INIT_ASSERT_MESSAGE_NONLEAF # non-leaf node: calculate the frequency from the given subtrees self.__freq = left.__freq + right.__freq def is_leaf(self): """  Purpose:  Check if the node is a leaf.  Simplifies some of the other methods with an abstraction.  Return:  True if the node is a leaf node  """  return self.left is None and self.right is None def get_freq(self): """  Purpose:  Return the frequency data stored in the node.  Return:  :return: the frequency  """  return self.__freq def get_char(self): """  Purpose:  Return the character stored at a leaf node.  Return:  :return: the character at a leaf node  """  assert self.is_leaf(), _GETCH_ASSERT_MESSAGE_NONLEAF return self.__char def display(self, level=0): """  Purpose:  For debugging, display the tree.  The structure of the tree is represented by indentation  No other real purpose.  Preconditions:  :param level: indentation amount for subtrees.  Return  :return: None  """  if self.is_leaf(): print(' '*level+'Leaf:', self.__char, self.__freq) else: print(' '*level+'Node:', self.__freq, 'Children:') self.left.display(level+1) self.right.display(level+1) def build_codec(self): """  Purpose:  Build a dictionary of char-code pairs from the Huffman tree.  Return:  :return: a dictionary with character as key, code as value  """  codes = {} def encoder(tree, code): if tree.is_leaf(): codes[tree.get_char()] = code else: encoder(tree.left, code+'0') encoder(tree.right, code+'1') if self.is_leaf(): codes[self.__char] = '0' else: encoder(self,'') return codes def __lt__(self, other): return self.__freq  

Encoder.py

import sys as sys import HuffmanTree as HT import HuffmanHeap as HH DEFAULT_OUTPUT_FILE = 'encoded.txt' def main(): """  Purpose:  Application to read and encode a file using Huffman codes.  Usage: python3 Encoder.py   Sends output to DEFAULT_OUTPUT_FILE  Return:  :return: None  """  if len(sys.argv) != 2: print('Usage: python3', sys.argv[0], '') print('-- sends output to', DEFAULT_OUTPUT_FILE, '-- ') return fname = sys.argv[1] lines = read_file(fname) freqs = count_characters(lines) codec = build_codec(freqs) coded = encode(lines, codec) write_file(DEFAULT_OUTPUT_FILE, coded) def read_file(fname): """  Purpose:  Read a file and return contents in a list of strings.  Preconditions:  :param fname: the name of a text file  Return:  :return: a list of strings, one string for each line  """  tfile = open(fname) lines = [l.rstrip() for l in tfile] tfile.close() return lines def write_file(fname, lines): """  Purpose:  Write the data in lines to the named file.  Warning: this will over-write any data in the named file!  Preconditions:  :param fname: the name of a text file  :param lines: a list of strings  Post-condition:  The named file is created or over-written.  Return:  :return: None  """  tfile = open(fname, 'w') for line in lines: tfile.write(line + ' ') tfile.close() def count_characters(contents): """  Purpose:  Count the characters in the contents.  Pre-conditions:  :param contents: a list of strings.  Return:  :return: a list of (character,frequency) tuples  """  freqs = dict() for line in contents: for char in line: if char in freqs: freqs[char] += 1 else: freqs[char] = 1 return list(freqs.items()) def build_codec(freq_list): """  Purpose:  Build a codec from the frequency list.  Pre-conditions:  :param freq_list: A list of (char,frequency) tuples.  Return:  :return: a dictionary mapping characters to codes  """  freq_list.sort(key=lambda p: p[1]) hq = HH.HuffmanHeap([HT.HuffmanTree(freq=f,char=c) \ for c,f in freq_list]) while len(hq) > 1: t1 = hq.dequeue() t2 = hq.dequeue() hq.enqueue(HT.HuffmanTree(left=t1, right=t2)) survivor = hq.dequeue() return survivor.build_codec() def encode(strings, codec): """  Purpose:  Use the codec to create the data to be sent to the output file.  Pre-conditions:  :param strings: A list of strings to encode.  :param codec: A dictionary containing character-code pairs  Return:  :return: a list of strings including:  the number of codes the number of coded lines,  the codes  the coded lines  """  output = [] output.append(str(len(codec)) + ' ' + str(len(strings))) for char in codec: output.append(codec[char] + ':' + "'" + char + "'") for s in strings: encoded = [] for char in s: encoded.append(codec[char]) output.append(''.join(encoded)) return output if __name__ == '__main__': main()
Question 1 (17 points): Purpose: To implement an ADT to support efficient execution of a HuffmanTree encoder Degree of Difficulty: Easy to Moderate Phoenix Wright, ace attorney, loves being lazy - erm, efficient. Phoenix is also short on money. He has heard of text compression, and thinks it might be a good idea to compress all his text files to save on valuable disk space (disk space is expensive for him!). Larry Butz, his useless IT friend, wrote most of the code for text compression using Huffman trees, but did not finish the job (surprising no one). On the Assignment 10 Moodle page, you'll find a program named Encoder.py. The program follows a simple design, and is well-documented. All the pieces should be familiar to you, as they are similar to discussion in class. The Encoder.py program imports two supporting ADTs HuffmanTree.py Defines the Huf fmanTree class . HuffmanHeap.py Defines the HuffmanHeap class. The HuffmanTree class was discussed in lecture, and its definition should be familiar. There is one change that you should note: the function build_codec that builds the codec from a HuffmanTree object is novw a method in the Huf fmanTree class The ADT HuffmanHeap, was also discussed in lecture, though the code was not given. A heap is a kind of queue that guarantees that the dequeue operation always removes and returns the smallest value in the queue (this is not the FIFO protoco). The HuffmanHeap stores HuffmanTree objects, and has the following methods . __init_.(self, leafs): Creates a new HuffmanHeap object. . enqueue(self, tree): Adds the given HuffmanTree to the heap . dequeue(self): Removes and returns the smallest HuffmanTree in the heap -len--(self): Returns the number of HuffmanTrees in the heap. Defining this method allows pro- grammers to call len ) on HuffmanHeap objects In lecture we saw that we could implement these operations so that they all have O(n) worst case time complexity, where n is the number of HuffmanTree objects in the HuffmanHeap. However, we also de- scribed in lecture how these operations could be implemented so that they all have worst case time com plexity of O(1). Hint: use 2 lists The HuffmanHeap class is incomplete. Your job is to complete the 4 methods described above, so that the Encoder.py program correctly encodes text files. (When you've completed Questions 1 and 2, you should be able to encode and decode any text files. f you want to test your encoder, you can try decoding with the program given for Question 2.) NOTE: some symbols do not work for our encoding solution. apostrophes, semi-colons and non-standard symbols may have errors

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Learn Mysql The Easy Way A Beginner Friendly Guide

Authors: Kiet Huynh

1st Edition

B0CNY7143T, 979-8869761545

More Books

Students also viewed these Databases questions

Question

Differentiate the function. r(z) = 2-8 - 21/2 r'(z) =

Answered: 1 week ago