Answered step by step
Verified Expert Solution
Question
1 Approved Answer
PYTHON 3.x Huffman Trees HuffmanHeap.py: # Defines a data structure that allows Huffman codes to # be computed efficiently. # # A Huffman Heap is
PYTHON 3.x Huffman Trees
HuffmanHeap.py:
# Defines a data structure that allows Huffman codes to # be computed efficiently. # # A Huffman Heap is a queue to store HuffmanTree objects. # The word "heap" implies that the dequeue operation will always # remove and return the HuffmanTree object with the lowest frequency. class HuffmanHeap(object): def __init__(self, leafs): """ Purpose: Creates a new HuffmanHeap object. Pre-conditions: leafs: a list of HuffmanTree leaf nodes. Post-conditions: The heap is created and initialized. Return: None """ pass def enqueue(self, tree): """ Purpose: Adds the tree to the Heap. This is an O(1) operation. Pre-conditions: tree is a HuffmanTree object Post-conditions: the heap increases in size by 1 Return: None """ return None def dequeue(self): """ Purpose: Return the smallest value in the queue. This is an O(1) operation. Pre-conditions: None Post-conditions: The heap decreases in size by 1 Return: A HuffmanTree object, with the lowest frequency """ return None def __len__(self): """ Purpose: Return the number of data values stored in the heap. This method allows Python scripts to call len(hh) if hh is a HuffmanHeap object. Pre-conditions: None Post-conditions: None Return: The number of values stored in the heap """ return 0 def display(self): """ Purpose: Display the data for debugging. Pre-conditions: None Post-conditions: None Return: None """ print('Add code to HuffmanHeap.display() to help you debug.') HuffmanTree.py:
# Defines the Huffman Tree data structure # # A Huffman tree-node has a character and a frequency. # some strings to avoid long lines later _INIT_ASSERT_MESSAGE_NONLEAF = 'Invalid Huffman tree construction attempted.' _GETCH_ASSERT_MESSAGE_NONLEAF = 'Method get_char() called on non-leaf node' class HuffmanTree(object): def __init__(self, freq=0, char=None, left=None, right=None): """ Purpose: Initializes the HuffmanTree object. To create a leaf node aleaf = HuffmanTree(freq=10, char='A') bleaf = HuffmanTree(freq=15, char='E') To create an internal node: node = HuffmanTree(left=aleaf,right=bleaf) Pre-conditions: :param freq: a positive integer :param char: a character :param left: another Huffman Tree :param right: another HuffmanTree """ self.__char = char self.left = left self.right = right if left is None and right is None: # leaf node: assign the frequency as given self.__freq = freq else: assert left is not None and right is not None, _INIT_ASSERT_MESSAGE_NONLEAF # non-leaf node: calculate the frequency from the given subtrees self.__freq = left.__freq + right.__freq def is_leaf(self): """ Purpose: Check if the node is a leaf. Simplifies some of the other methods with an abstraction. Return: True if the node is a leaf node """ return self.left is None and self.right is None def get_freq(self): """ Purpose: Return the frequency data stored in the node. Return: :return: the frequency """ return self.__freq def get_char(self): """ Purpose: Return the character stored at a leaf node. Return: :return: the character at a leaf node """ assert self.is_leaf(), _GETCH_ASSERT_MESSAGE_NONLEAF return self.__char def display(self, level=0): """ Purpose: For debugging, display the tree. The structure of the tree is represented by indentation No other real purpose. Preconditions: :param level: indentation amount for subtrees. Return :return: None """ if self.is_leaf(): print(' '*level+'Leaf:', self.__char, self.__freq) else: print(' '*level+'Node:', self.__freq, 'Children:') self.left.display(level+1) self.right.display(level+1) def build_codec(self): """ Purpose: Build a dictionary of char-code pairs from the Huffman tree. Return: :return: a dictionary with character as key, code as value """ codes = {} def encoder(tree, code): if tree.is_leaf(): codes[tree.get_char()] = code else: encoder(tree.left, code+'0') encoder(tree.right, code+'1') if self.is_leaf(): codes[self.__char] = '0' else: encoder(self,'') return codes def __lt__(self, other): return self.__freqEncoder.py
import sys as sys import HuffmanTree as HT import HuffmanHeap as HH DEFAULT_OUTPUT_FILE = 'encoded.txt' def main(): """ Purpose: Application to read and encode a file using Huffman codes. Usage: python3 Encoder.pyQuestion 1 (17 points): Purpose: To implement an ADT to support efficient execution of a HuffmanTree encoder Degree of Difficulty: Easy to Moderate Phoenix Wright, ace attorney, loves being lazy - erm, efficient. Phoenix is also short on money. He has heard of text compression, and thinks it might be a good idea to compress all his text files to save on valuable disk space (disk space is expensive for him!). Larry Butz, his useless IT friend, wrote most of the code for text compression using Huffman trees, but did not finish the job (surprising no one). On the Assignment 10 Moodle page, you'll find a program named Encoder.py. The program follows a simple design, and is well-documented. All the pieces should be familiar to you, as they are similar to discussion in class. The Encoder.py program imports two supporting ADTs HuffmanTree.py Defines the Huf fmanTree class . HuffmanHeap.py Defines the HuffmanHeap class. The HuffmanTree class was discussed in lecture, and its definition should be familiar. There is one change that you should note: the function build_codec that builds the codec from a HuffmanTree object is novw a method in the Huf fmanTree class The ADT HuffmanHeap, was also discussed in lecture, though the code was not given. A heap is a kind of queue that guarantees that the dequeue operation always removes and returns the smallest value in the queue (this is not the FIFO protoco). The HuffmanHeap stores HuffmanTree objects, and has the following methods . __init_.(self, leafs): Creates a new HuffmanHeap object. . enqueue(self, tree): Adds the given HuffmanTree to the heap . dequeue(self): Removes and returns the smallest HuffmanTree in the heap -len--(self): Returns the number of HuffmanTrees in the heap. Defining this method allows pro- grammers to call len ) on HuffmanHeap objects In lecture we saw that we could implement these operations so that they all have O(n) worst case time complexity, where n is the number of HuffmanTree objects in the HuffmanHeap. However, we also de- scribed in lecture how these operations could be implemented so that they all have worst case time com plexity of O(1). Hint: use 2 lists The HuffmanHeap class is incomplete. Your job is to complete the 4 methods described above, so that the Encoder.py program correctly encodes text files. (When you've completed Questions 1 and 2, you should be able to encode and decode any text files. f you want to test your encoder, you can try decoding with the program given for Question 2.) NOTE: some symbols do not work for our encoding solution. apostrophes, semi-colons and non-standard symbols may have errorsSends output to DEFAULT_OUTPUT_FILE Return: :return: None """ if len(sys.argv) != 2: print('Usage: python3', sys.argv[0], ' ') print('-- sends output to', DEFAULT_OUTPUT_FILE, '-- ') return fname = sys.argv[1] lines = read_file(fname) freqs = count_characters(lines) codec = build_codec(freqs) coded = encode(lines, codec) write_file(DEFAULT_OUTPUT_FILE, coded) def read_file(fname): """ Purpose: Read a file and return contents in a list of strings. Preconditions: :param fname: the name of a text file Return: :return: a list of strings, one string for each line """ tfile = open(fname) lines = [l.rstrip() for l in tfile] tfile.close() return lines def write_file(fname, lines): """ Purpose: Write the data in lines to the named file. Warning: this will over-write any data in the named file! Preconditions: :param fname: the name of a text file :param lines: a list of strings Post-condition: The named file is created or over-written. Return: :return: None """ tfile = open(fname, 'w') for line in lines: tfile.write(line + ' ') tfile.close() def count_characters(contents): """ Purpose: Count the characters in the contents. Pre-conditions: :param contents: a list of strings. Return: :return: a list of (character,frequency) tuples """ freqs = dict() for line in contents: for char in line: if char in freqs: freqs[char] += 1 else: freqs[char] = 1 return list(freqs.items()) def build_codec(freq_list): """ Purpose: Build a codec from the frequency list. Pre-conditions: :param freq_list: A list of (char,frequency) tuples. Return: :return: a dictionary mapping characters to codes """ freq_list.sort(key=lambda p: p[1]) hq = HH.HuffmanHeap([HT.HuffmanTree(freq=f,char=c) \ for c,f in freq_list]) while len(hq) > 1: t1 = hq.dequeue() t2 = hq.dequeue() hq.enqueue(HT.HuffmanTree(left=t1, right=t2)) survivor = hq.dequeue() return survivor.build_codec() def encode(strings, codec): """ Purpose: Use the codec to create the data to be sent to the output file. Pre-conditions: :param strings: A list of strings to encode. :param codec: A dictionary containing character-code pairs Return: :return: a list of strings including: the number of codes the number of coded lines, the codes the coded lines """ output = [] output.append(str(len(codec)) + ' ' + str(len(strings))) for char in codec: output.append(codec[char] + ':' + "'" + char + "'") for s in strings: encoded = [] for char in s: encoded.append(codec[char]) output.append(''.join(encoded)) return output if __name__ == '__main__': main()
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started