Question:
Write a program to construct a dictionary of all “words,” defined to be runs of consecutive nonwhitespace, in a given text file. We might then compress the file (ignoring the loss of whitespace information) by representing each word as an index in the dictionary. Retrieve the file rfc791.txt from the RFC repository, and run your program on it.
Give the size of the compressed file, assuming first that each word is encoded with 12 bits (this should be sufficient) and then that the 128 most common words are encoded with 8 bits and the rest with 13 bits. Assume that the dictionary itself can be stored by using, for each word, length(word) + 1 bytes.
Step by Step Answer:
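One possible approach, sketched in Python under a few assumptions: the file is read as raw bytes, "whitespace" means ASCII whitespace (what bytes.split() splits on), and length(word) is counted in bytes. The function names (word_counts, dictionary_bytes, fixed_bits, two_tier_bits, report) are illustrative and do not come from the book. The sketch computes the sizes from whatever file it is given; the actual figures for rfc791.txt come from running it on that file.

```python
import math
import sys
from collections import Counter


def word_counts(data: bytes) -> Counter:
    """Count "words": maximal runs of consecutive non-whitespace bytes."""
    # bytes.split() with no argument splits on runs of ASCII whitespace.
    return Counter(data.split())


def dictionary_bytes(counts: Counter) -> int:
    """Dictionary cost: length(word) + 1 bytes for each distinct word."""
    return sum(len(word) + 1 for word in counts)


def fixed_bits(counts: Counter, bits_per_word: int = 12) -> int:
    """Encoded text size in bits when every word uses the same code length."""
    return sum(counts.values()) * bits_per_word


def two_tier_bits(counts: Counter, top_n: int = 128,
                  short_bits: int = 8, long_bits: int = 13) -> int:
    """Encoded text size in bits when the top_n most common words get
    short_bits each and every other word gets long_bits."""
    total = sum(counts.values())
    common = sum(n for _, n in counts.most_common(top_n))
    return common * short_bits + (total - common) * long_bits


def report(data: bytes) -> dict:
    """Collect word statistics and both compressed sizes (text + dictionary)."""
    counts = word_counts(data)
    dict_size = dictionary_bytes(counts)
    return {
        "total_words": sum(counts.values()),
        "distinct_words": len(counts),
        # 12-bit indices can address at most 2**12 = 4096 distinct words.
        "fits_in_12_bits": len(counts) <= 2 ** 12,
        "dictionary_bytes": dict_size,
        "fixed_total_bytes": math.ceil(fixed_bits(counts) / 8) + dict_size,
        "two_tier_total_bytes": math.ceil(two_tier_bits(counts) / 8) + dict_size,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python thisfile.py rfc791.txt
    with open(sys.argv[1], "rb") as f:
        for key, value in report(f.read()).items():
            print(f"{key}: {value}")
```

Both totals add the dictionary cost to the encoded text, since a decoder needs the dictionary to recover the words. The fits_in_12_bits entry checks the question's claim that 12 bits suffice, which holds only if the file has at most 4096 distinct words. Comparing either total with the original file size in bytes gives the compression achieved.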
Related Book For
Computer Networks A Systems Approach
ISBN: 9780128182000
6th Edition
Authors: Larry L. Peterson, Bruce S. Davie