Question

1 Approved Answer

Posted on Sep 24, 2024

READ THE FULL EXERCICE TEXT/ ADVANCED ALGORITHMICS Algorithm 1 Code-prefix : (T: array of length n using k different characters) Inputs: A text T of

READ THE FULL EXERCICE TEXT/ ADVANCED ALGORITHMICS

Algorithm 1 Code-prefix : (T: array of length n using k different characters) Inputs: A text T of length n in which k different characters appear. Outputs: A binary tree representing a prefix code to encode the text. 1. Calculate the frequencies (F1,..., Fk) of the letters of the text; 2. Create for each j in {1,. . . k} a tree with only one root with the value Fj; We obtain a forest of k trees; 3. Sort the trees according to the value of their root; 4. as long as there are at least two trees in the forest do - Choose the two trees A1 and A2 whose roots carry the two smallest values; - Replace A1 and A2 by a new tree such that: the two children of its root are A1 and A2; its root carries as value the sum of the roots of A1 and A2; the edges linking the root to its children A1 and A2 are valued 0 and 1 respectively; - Update the sorted list of trees according to the value of their roots; end while 5. Return the tree;

In order to decompress a text, it is necessary to have the tree produced by algorithm 1 or at least of the encoding table (i.e. for each character of its associated code). Transfer the shaft to the decompressor or this table is therefore counterproductive since their size is added to that of the compressed text! A solution simple would be to use the same table for all the texts, but it is more satisfying to be able to transmit a reduced amount of information that allows the decompressor to infer (i.e. find) the table / tree used specific to the text in question. To achieve this goal, we will impose several rules for the creation of a new prefix code. Suppose we have a prefix code produced by Algorithm 1. From the latter, we Let's build a new one that will be used by the compressor, then reconstituted by the decompressor only thanks to the lengths of the original codes: - The new codes will be sorted by their length. If their length is equal (same number of bits), then they will be ordered by the alphabetical order of the characters they encode. - Once this order is respected, all the bits of the first code are set to "0". - If two codes of the same length follow each other, then the second is equal to the previous one incremented by 1. - If two codes follow each other and the length of the second is strictly greater than that of the first, then the second code is obtained by incrementing the first by 1 and then placing 0 to the right of the result in order to obtain the original length of the second code (reminder: this amounts to multiplying this result by 2^m if we add m 0). Thus, the new codes obtained will perhaps be very different from those of the initial prefix code but their lengths will be preserved for each character, and therefore the compression rate will be the same.

QUESTION/

1-a)Using the previous rules, write an algorithm which from the sole knowledge of the list of lengths of the codes and the characters they encode, generates a new prefix code respecting these rules. Give its complexity with full explanation b) Apply your algorithm to the following instance which gives only the code lengths of each character: {(b, 3), (f, 4), (a, 1), (e, 4), (c, 3), (d, 3)}.