Question
Hello all, I need to create the definitions for huffman encoding and decoding according to the specific details below in PYTHON(v3). It is very important
Hello all, I need to create the definitions for huffman encoding and decoding according to the specific details below in PYTHON(v3). It is very important that it is understood that the code i am also providing isntto be changed and must remain the same, all I am looking for is 3 definitions to add to my code, Huffman encoding, Huffman Decoding and huffman tree preorder ... I will first post the directions given, then i will post my code that i have to add these three functions to...
Huffman encode: Write a function called huffman_encode(in_file, out_file) (use that exact name) that reads an input text file and writes, using the Huffman code, the encoded text into an output file. The function accepts two file names in that order: input file name and output file name, represented as strings. If the specified output file already exists, its old contents will be erased. See example files in the test cases provided to see the format. Writing the generated code for each character into a file will actually enlarge the file size instead compress it. The explanation is simple: although we encoded our input text characters in sequences of 0s and 1s, representing actual single bits, we write them as individual 0 and 1 characters, i.e. 8 bits. To actually obtain a compressed the file you would write the 0 and 1 characters as individual bits.
Huffman decoding: In this assignment you will not actually be storing the header information with the encoded file. In an actual implementation some initial information must be stored in the compressed file that will be used by the decoding program. This information must be sufficient to construct the tree to be used for decoding. There are several alternatives for doing this including: Store the character counts at the beginning of the file. You can store counts for every character, or counts for the non-zero characters. If you do the latter, you must include some method for indicating the character, e.g., store character/count pairs. This is conceptually what is being done in the assignment by passing the frequencies into huffman_decode(). Store the tree at the beginning of the file. One method for doing this is to do a pre-order traversal, writing each node visited. You must differentiate leaf nodes from internal/non-leaf nodes. One way to do this is write a single bit for each node, say 1 for leaf and 0 for non-leaf. For leaf nodes, you will also need to write the character stored. For non-leaf nodes there's no information that needs to be written, just the bit that indicates there's an internal node. Use a "standard" character frequency, e.g., for any English language text you could assume weights/frequencies for every character and use these in constructing the tree for both encoding and decoding. Implementation Write a function tree_preord(hufftree) that takes a Huffman tree and produces the tree description given in the second bullet under Header Information. Namely a string of characters produced by a preorder traversal of a Huffman tree writing 0 for a non-leaf node or 1 for a leaf node followed by the character stored in the leaf node with no spaces or separating characters. Write a function called huffman_decode(freqs, encoded_file, decode_file) (use that exact name) that reads an encoded text file, encoded_file, and writes the decoded text into an output text file, decode_file, using the Huffman Tree produced by your programs cnt_fref() and create_huff_tree(list_of_freqs). If the specified output file already exists, its old contents will be erased.
Here is my code ....
class HuffmanNode: """Implements an efficient last-in first-out Abstract Data Type using a Python List"""
# char is a character # freq is a integer number by the name of frequency # code is the binary code that is output # left is the left child of a node # right is the right child of a node def __init__(self, char, freq): self.char = char # actually the character code it is the numeric ascii code self.freq = freq self.code = None self.left = None self.right = None
# Purpose: to check whether the right or left is none, therefore making it a leaf # Signature: str -> bool def is_leaf(self): if self.left and self.right is None: # checks to see whether the node we are at has children , return True # true if it does else: return False # false if it doesnt
def __repr__(self): return '(char=%r, freq=%r)' % (self.char, self.freq)
#Purpose: call function and pass the file name # Signature: inpt file -> output file def fio(filename): with open(filename, encoding='utf-8-sig') as fin: for line in fin: for ch in range(0, len(line)): print(line[ch], end=', ') fin.close()
def huffman_encode(in_file, out_file): pass
def huffman_decode(freqs, encoded_file, decode_file): pass
def tree_preord (node): pass
# def comes_before (a, b) : # Purpose: New parent node to be created from the two nodes with the lowest occurrence count # that parent node is inserted into the list of sorted nodes # Signature: character and frequency -> bool def comes_before (a, b) : if a.freq == b.freq: # makes a comparison of frequencies of both parameters to compare ascii values if a.char < b.char: # if we have two characters we compare ascii values return True else: return False if a.freq < b.freq: return True else: return False
""" def comes_before(a, b): if a.occurrence_count < b.occurrence_count: return True elif a.occurrence_count == b.occurrence_count: return a.ch < b.ch else: return False """
# Purpose: Converts an infix expression to an equivalent postfix expression. # Signature: string -> int def cnt_freq(filename): # opens a text file try: with open(filename, encoding='utf-8-sig') as file: # opens the file freqCount = [0] * 256 # create a list of 256 size with 0 in each index string = file.read() #read in the string from the file for char in string: # for each character we look through index = ord(char) # using the ordinance function we chaeck the characters freqCount[index] += 1 #then every iteration is incremented by 1 except: # if file doesnt exist raise IOError('File doesnt exist') return freqCount #return the int value of that character
# Purpose: Creates huffman code for nodes by using a sorting loop # Signature: str -> list def create_huff_tree(list_of_freqs): listOfNodes = [] #create an empty list
for i in range(len(list_of_freqs)): #loop through the length of the list if list_of_freqs[i] != 0: #if we are not at the very beginning of the list index node = HuffmanNode(i, list_of_freqs[i]) #we create the node that goes into the huffman listOfNodes.append(node) # we append that new node if len(listOfNodes) == 0: # this would be an empty list return None
while len(listOfNodes) != 1: # we need to make sure that we have more than one character in this list firstNode = find_min(listOfNodes) #so we find the min of the first node in that list listOfNodes.remove(firstNode) #then we remove it secondNode = find_min(listOfNodes) #we repeat the process again listOfNodes.remove(secondNode)
newFrequency = firstNode.freq + secondNode.freq #once the nodes we want to combine are found we add their frequencies as that is what the parent will take itn combinedChar=min(firstNode.char,secondNode.char)
newNode = HuffmanNode(combinedChar,newFrequency) #this is what goes into the new node that is created newNode.left = firstNode #this would be the children from the comparison newNode.right = secondNode listOfNodes.insert(0,newNode) #this is where we insert the new node #print(listOfNodes[0].right) return listOfNodes[0]
# Purpose: Creates huffman code for nodes # Signature: str -> list of integers in binary form def create_code(root_node): if root_node == None: #check if there is anything in the list return ['']*256 # if there is an empty list just return else: last_visitedlst = [''] * 256 # creates an empty list of size 256 newlist = last_visitedlst #creating a temp list binaryConvert(root_node,last_visitedlst) #calling recurrsion on the list so that we may loop through parents and add binary numbers accordingly in the form [character:000 , etc.] return newlist
# Purpose: Creates huffman code for nodes using a recursive pattern to loop through parents # Signature: char -> list of integers in binary form def binaryConvert(node,last_visitedlst,new=""): # this function takes in the node we are using, the list we created that was empty, and a new list where we will append to left = node.left right = node.right if left == None and right == None: # checking if this is a leaf node.code = new last_visitedlst[node.char] = node.code if (left != None): # if we do have a left child pastleft=new # we create a temporary in the list new = new + "0" # we concatenate a 0 to the already character in the list binaryConvert(left,last_visitedlst,new) #and we recursively repeat the process until we reach the leaf new = pastleft if (right != None): # if we do have a right child oldNew = new # we create a temporary in the list new = new + "1" # we concatenate a 1 to the already character in the list binaryConvert(right,last_visitedlst,new) #and we recursively repeat the process until we reach the leaf new = oldNew
# Purpose: Finding the minimum of a set of nodes from a list # Signature: str -> str def find_min(lst): min = lst[0] # we set a temp value as the first index in the list for index in range(0,len(lst)): #loop through our range current = lst[index] #current is the value at the index of this list if comes_before(current,min): # using our comes_before function we make a comparison and see if its true min = current #if it is our temp is the current value return min #return that value
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started