Question: In this problem Huffman codec. The termcodecis short for a compression/decompression (or coding/decoding) scheme. Your codec will implement thea3.Codecinterface, which means it must supply two

In this problem Huffman codec. The termcodecis short for a compression/decompression (or coding/decoding) scheme.

Your codec will implement thea3.Codecinterface, which means it must supply two methods:

void compress(File input, File output) throws IOException;

void decompress(File input, File output) throws IOException;

Implementing this interface allows your codec to be plugged into a client program that wants to do com- pression and decompression. We have supplied such a program, which we will discuss later (Section 3.4).

In the following two sections, we will go over carefully what you have to do for compression and decom-pression.

3.2 Compression

3.2.1 Constructing the Huffman Tree

To construct the Huffman tree from the uncompressedinputFile, you first n eed t o c alculate t he byte frequencies in the file. O pen t he fi le fo r re ading by calling

FileInputStream inStream = new FileInputStream(inputFile);

You can then read bytes sequentially from the file b y c allingi nStream.read()s uccessively. T his method returns anintcontaining the next byte in the file in the low-order 8 b its. For example, if the input byte was 01100001 (binary) representing the ASCII character 'a', theintreturned byinStream.read()would be 97. You can use thisintas an index into an array of length 256 = 28, the total number of possible bytes. Go through the file once and record the byte frequencies in anintarray of length 256.

Now construct the Huffman tree as described above. You should use ajava.util.PriorityQueueofTreeNodes for this. First construct a new priority queue by calling

PriorityQueue pq = new PriorityQueue();

then insert a newTreeNodefor each byte and its frequency. These will be the leaves of the soon-to-be-built Huffman tree.The minimum element ofpqis extracted by callingpq.poll().

Comparable

To implement this interface, you must define a methodint compareTo(TreeNodeobj).

Constructing the Byte-to-Codeword Map

To do the coding, we will need an efficient way to map each incoming byte to its corresponding codeword. We could search the tree every time for the byte, but this would be very inefficient. We will instead do a little preprocessing to construct an arraycodesof length 256 whosekth element is the codeword fork. Then while we are coding, we will be able to map incoming bytes to codewords by just pulling the appropriate codeword out of the array.

We have provided a classBitVectorto represent codewords. This class gives an efficient reresentation of a bitstring with interface similar toArrayListorVector; Javadoc documentation is provided.

construct the codewords by walking the tree recursively, filling in thecodesarray when you encounter a leaf. This should be done with a recursive methodcreateCodes(TreeNode n, BitVector bv). The method should just assignbvto the key ofnifnis a leaf. Otherwise, ifnis an internal node, it should clonebv, addtrueto one copy and call recursively on the right subtree, and addfalseto the other copy and call recursively on the left subtree. If you do this right, it is only a few lines of code.

Saving the Huffman Tree

The compressed file will contain the Huffman tree followed by the sequence of codewords corresponding to the bytes of the input file. You must first write the Huffman tree to the output file.

Open the output file for writing as aBitStreamwithBitStream outStream = new BitStream(outFile, "w");

The"w"means "write". The classBitStreamis a class for reading and writing streams of bits as opposed to bytes. We have provided this class and Javadoc documentation for it.

Write the keys in the tree inpreorderto the output file. Write just the keys, nothing else. Use theoutStream.writeBitsmethod to write each key. You are writing an integer, but you only want to write 9 bits of it. (All keys are between 0 and 511, which only requires 9 bits. We are trying to save space, remember?)

TheoutStream.writeBitsmethod allows you to specify the number of bits to write. Again, this can be done recursively with only a few lines of code.

When decompressing, we will be able to reconstruct the tree from the preorder sequence of keys.

Compressing the Input File

Close the input streaminStreamand reopen it so we can read the bytes again from the beginning. The output stream should still be open. Read the bytes sequentially from the input stream.

When you are done, make sure you close the input and output streams. This is especially important for the output stream, because bits are buffered, and not all of them may have been written to the output file yet. Moreover, since the number of bits is not necessarily a multiple of 8, the actual bit count must be saved in the output file. TheBitStream.close()method takes care of all this for you, but you have to remember to call it when you are done.

3.3 Decompression

Reconstructing the Huffman Tree

Before we can decompress, we must reconstruct the Huffman tree that was saved in the compressed file as a preorder sequence of keys. In the decompression phase, the frequencies are not used.

Open the input file as aBitStreamfor reading. Read in the sequence of 9-bit bitsequences that were the keys of the Huffman tree in preorder into an

Iterablesuch asArrayListorVector.

Now we will reconstruct the tree recursively. The recursive procedure should take anIteratorand return aTreeNode, which will be the root of the tree constructed from the keys in theIterator. The recursive procedure will be called with the initialIteratorobtained from theIterableobject containing the keys that you read in. The recursive procedure does the following:

Get the next key from theIterator.

Make a newTreeNode nwith that key. This is the root of the subtree we are currently visiting.

Returnn.

Decompressing the Input File

Open the output file as aFileOutputStream. We can write bytes to it using thewrite(int)method ofFileOutputStream.

Set a pointer to the root of the Huffman tree. Read the bits of the input file one at a time and follow the path in the tree as determined by those bits (0=left, 1=right). When we arrive at a leaf, that leaf contains the uncompressed byte

corresponding to the codeword we have just seen. Write that byte to the output stream and reset the pointer to the root of the tree for the next codeword. Note that this works since the code is self-delimiting: we know when we are at the end of a codeword, since that occurs exactly when we are at a leaf of the tree. Beautiful!

When done, remember to close the input and output streams.

TreeNode.java,HuffmanTree.java,Huffman.java,compressed.txt

---------------------

package hw9;

import java.util.*;

import java.io.*;

import java.util.BitSet;

/**

* This class takes care of the details of reading and writing bits to a file.

* To be correctly read by a read instance of this class, the file must have been

* written by a write instance of this class, and the output stream that wrote it

* must have been closed by calling {@code close()}.

*

* @author Nikos

*/

public class BitStream {

/**

* Buffer that temporarily stores bits and is flushed to the file when enough

* bits are around

*/

private ArrayList stream;

/** The underlying file */

private RandomAccessFile raf;

/** The mode with which the file was opened: "r" = read, "w" = write */

private String mode;

/** The length in bits of the file (not necessarily a multiple of 8) */

private int bitlength;

/**

* Creates a BitStream object and initializes the fields appropriately.

*

* @param file

* to be read/written

* @param mode

* open mode: can be either "r" (read) or "w" (write)

*

*/

public BitStream(File file, String mode) throws IOException {

bitlength = 0;

this.mode = new String(mode);

stream = new ArrayList();

if (mode.equals("w")) {

raf = new RandomAccessFile(file, "rw");

raf.writeInt(0); // save room for length later

} else if (mode.equals("r")) {

raf = new RandomAccessFile(file, "r");

bitlength = raf.readInt();

//sanity check -- reported file length should not be a lie

long fileLength = file.length();

if ((bitlength + 7)/8 > fileLength - 4)

throw new IllegalArgumentException("Corrupt input file");

} else throw new IllegalArgumentException(

"Mode should be either \"r\" or \"w\"");

}

/**

* Writes a collection of bits to the {@code BitStream}.

*

* @param bits the {@link BitSet} of bits to be written.

* @throws IOException if the bits cannot be written

*/

public void writeBits(BitSet bits) throws IOException {

bitlength += bits.length();

// Iterator iter = bits.iterator();

for (int i = 0; i

stream.add(bits.get(i)? 1:0);

// while (iter.hasNext()) stream.add(iter.next()? 1:0);

while (stream.size() >= 8)

raf.writeByte(bitsToInt(stream, 8));

}

/**

* Writes an integer as bits to the {@code BitStream}.

*

* @param data the int containing the bits to be written

* @param n the number of bits to be written

* @throws IOException if the bits cannot be written

*/

public void writeBits(int data, int n) throws IOException {

bitlength += n;

for (int i = n - 1; i >= 0; i--)

stream.add((data >> i) & 1);

while (stream.size() >= 8)

raf.writeByte(bitsToInt(stream, 8));

}

/**

* Writes all the bits in the buffer to the file, padding with zeros if

* necessary.

*

* @throws IOException if the stream cannot be flushed

*/

private void flushBits() throws IOException {

while (stream.size() % 8 != 0)

stream.add(0);

while (stream.size() >= 8)

raf.writeByte(bitsToInt(stream, 8));

}

/**

* Finalizes a bitstream. This method must be called in order that a

* read instance can subsequently read the bitstream.

* @throws IOException if the stream cannot be closed

*/

public void close() throws IOException {

if (mode.equals("w")) {

flushBits();

raf.seek(0);

raf.writeInt(bitlength);

}

raf.close();

}

/**

* Reads a specified number of bits from the Bitstream.

*

* @param n the number of bits to be read

* @return the bits actually read

* @throws IOException if n bits are not available

*/

public int readBits(int n) throws IOException {

int bits;

try {

while (stream.size() < n) {

byte b = raf.readByte();

for (int i = 7; i >= 0; i--)

stream.add((b >> i) & 1);

}

} catch (EOFException e) {

throw new IOException("Unexpected end of file");

}

bits = bitsToInt(stream, n);

bitlength -= n;

return bits;

}

/**

* Converts a sequence of n bits to an int.

*

* @param bits

* is the sequence of bits.

* @param n

* is the number of bits to use from this sequence.

* @return an n-bit int from the bits of the sequence.

*/

private static int bitsToInt(List bits, int n) {

int ret = 0;

for (int i = 0; i < n; i++) {

ret = (ret << 1) | bits.get(0);

bits.remove(0);

}

return ret;

}

/**

* For a read bitstream, returns {@code true} if the bitstream is not exhausted.

* @return {@code true} if the bitstream is not exhausted

*/

public boolean hasMoreBits() {

return bitlength > 0;

}

}

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!