Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

map1.txt contents a b b c c d d e e f f g g h h i i j j k k l l

image text in transcribedimage text in transcribedimage text in transcribedimage text in transcribedimage text in transcribedimage text in transcribedimage text in transcribedmap1.txt contents

a b b c c d d e e f f g g h h i i j j k k l l m m n n o o p p q q r r s s t t u u v v w w x x y y z z a

map2.txt contents

a g b t c q d x e m f b g n h v i y j i k e l w m c n f o k p h q z r p s s t d u u v o w l x j y a z r

Background: Substitution Ciphers One of the oldest - and poorest - types of codes is a "substitution cipher" - where each character in the alphabet is replaced with another. It's incredibly easy to break (with just a little bit of math, and a moderate amount of ciphertext), but we're going to use it to get some more practice with files and dictionaries. For our program, we are going to have a "substitution map," which is a dictionary which maps characters to other characters. We'll read this from a file; you will parse the lines, and build the dictionary. An input file (which defines a map) will be made up of lines like this: b m (This map says that the character 'a' should be replaced with 'x', 'b' with 'm, and so on.) Since the map file has such a simple format, we won't ever try to encrypt (or decrypt) spaces, tabs, or newlines; we also won't try to convert the many other unprintable characters. You will be able to assume that each line in the file (other than blank lines, and comments), will have exactly two words, and each word will have exactly two characters. After you build a map, we will (sometimes) call another function - in another file - which will "encode" a string. We will give you the encryption map (the dictionary), and a string; you will iterate over the string, and build up the encrypted string. If you see any spaces, tabs, or newlines, you will just copy them into the destination string, unchanged. For ordinary characters, you will use the map to find the encrypted character, and copy that into the output. How it Works 5.1 In the DEFLATE format, data is made up of a sequence of items, each of which is one of the following: A single character A reference to previous data in the uncompressed stream. This reference has both a distance and a length For example, suppose that you wanted to encode the text See Jack run. Jack runs up the hil. Jack and Jill roll down. We won't try to write an encode for this assignment - but suppose that you no- ticed that the word "Jack" showed up in the text a number of times. You would leave the first Jack" intact, but would replace the second with a reference: references to previous text, like this: See Jack run. (11,4) runs up the hill. Jack and Jill roll down. (In this example, we see that the 2nd Jack" is 11 characters after the first one, and we wanted to copy 4 characters.) If we were just a little smarter, we would notice that we could copy the spaces before and after the word as well: See Jack run. (11,6)runs up the hill. Jack and Jill roll down. You also want to replace the 2nd Jack." If you want, you can make it point at the first ome: See Jack run. (11,6)runs up the hill. (35,6) and Jill roll down. or at the second one, which works just as well: See Jack run. (11,6)runs up the hill. (24,6) and Jill roll down. Of course, there are lots of other duplicates in the string: See Jack run. (11,6)runs up the hill(24,8)and Jill roll down. See Jack run. (11,6)runs up the hill (24,8)and Ji(16,2) roll down. See Jack run. (11,6)runs up the hil1(24,8)and Ji(16,2) ro(5,3)down. See Jack run. (11,9)s up the hill(24,8)and Ji(16,2) ro(5,3)down. How We'll Represent the Encoding 5.2 We will represent a compressed data stream as a list of items. Each item will be one of two things: either a single character, or a tuple which contains two integers. The single character represents a single character in the output; the tuple represents a backwards reference. So the compressed data above would be encoded like this: 'e', ['S', 'e' 'c', 'J', 'a' 'k' 'r', 'u', 'n', (11,9), 's', and so on 7 Required Function: build map() build map() must take a single parameter, which is the location of a file. You must open the file and build a substitution map from it; return the map. The file format will be as described earlier in this file; as with the previous assignment, you must ignore any blank lines, or lines that contain comments. You may assume that the file exists; you do not have to do any error checking to handle a situation where the file can't be opened. 7.1 asserts Your function must use asserts to make sure that each of the following is true, for every line (except blank lines and comments): The line has exactly two words (we don't care about how much whitespace, or where) Both words on the line must be a single character The same "from" character in the map must not be used twice. Note that there are many other things that might be interesting to check, such as confirming that the "to" characters are not duplicated. Don't do these checks in this function; tolerate silly maps. 1Since this is a function - not the main program itself - do not do the chdir code. Assume that main () gives you a filename which will work, without changing directories. 8 Required Function: is valid map() is valid map () does some more extensive error checking - more than is done by build map(). It is designed either to be used as a second step (sanity-checking something that build map() returned), or as checking a map that was build by hand, using some other process. This function takes a single parameter (the map), and returns either True or False; it only returns True if all of the requirements are met. This function should never assert about anything; instead, return False if you find a problem. This function must confirm that: Every "from" character (the keys) and every to" character (the values) are strings. Also, every such string is exactly one character long. Also, none of those strings are the space, tab, or newline character. HINT: What does this code do? if c in " \t ": There are no duplicate "to" characters. The set of "from" characters is exactly the same as the set of to" char- acters. 9 Required Function: encode () 9. encode (), in subs_cipher.py, must encode a string. It takes two parameters: the substitution map, and the input string. It must return the cipher string. See the description, earlier in this spec, to see how the encryption process works. 9.1 asserts This function must call is-valid map() to check that the map is reasonable, and must assert that the function returns True. You must not implement is valid map() in this file! Instead, you must import the build map.py file, and call the function that is defined in there. 10 Required Function: decode() decode (), in subs.cipher.py, must decode a string. It takes works exactly like encode (), except that it must reverse the mapping. Required Function: unzip() 11 unzip() takes a compressed stream as its only parameter, and returns the uncompressed string. The compressed stream is encoded as described earlier in this spec; the stream is a list, and each element is either a string (with a single character), or a tuple with two integers. 11.1 asserts This function must assert that: Every element in the compressed stream is the proper type (string or tuple) Every string is exactly one character long Every tuple has exactly two elements, both of which are positive integers The "offset" (the first element in the tuple) does not point back any further than the start of the uncompressed string. Background: Substitution Ciphers One of the oldest - and poorest - types of codes is a "substitution cipher" - where each character in the alphabet is replaced with another. It's incredibly easy to break (with just a little bit of math, and a moderate amount of ciphertext), but we're going to use it to get some more practice with files and dictionaries. For our program, we are going to have a "substitution map," which is a dictionary which maps characters to other characters. We'll read this from a file; you will parse the lines, and build the dictionary. An input file (which defines a map) will be made up of lines like this: b m (This map says that the character 'a' should be replaced with 'x', 'b' with 'm, and so on.) Since the map file has such a simple format, we won't ever try to encrypt (or decrypt) spaces, tabs, or newlines; we also won't try to convert the many other unprintable characters. You will be able to assume that each line in the file (other than blank lines, and comments), will have exactly two words, and each word will have exactly two characters. After you build a map, we will (sometimes) call another function - in another file - which will "encode" a string. We will give you the encryption map (the dictionary), and a string; you will iterate over the string, and build up the encrypted string. If you see any spaces, tabs, or newlines, you will just copy them into the destination string, unchanged. For ordinary characters, you will use the map to find the encrypted character, and copy that into the output. How it Works 5.1 In the DEFLATE format, data is made up of a sequence of items, each of which is one of the following: A single character A reference to previous data in the uncompressed stream. This reference has both a distance and a length For example, suppose that you wanted to encode the text See Jack run. Jack runs up the hil. Jack and Jill roll down. We won't try to write an encode for this assignment - but suppose that you no- ticed that the word "Jack" showed up in the text a number of times. You would leave the first Jack" intact, but would replace the second with a reference: references to previous text, like this: See Jack run. (11,4) runs up the hill. Jack and Jill roll down. (In this example, we see that the 2nd Jack" is 11 characters after the first one, and we wanted to copy 4 characters.) If we were just a little smarter, we would notice that we could copy the spaces before and after the word as well: See Jack run. (11,6)runs up the hill. Jack and Jill roll down. You also want to replace the 2nd Jack." If you want, you can make it point at the first ome: See Jack run. (11,6)runs up the hill. (35,6) and Jill roll down. or at the second one, which works just as well: See Jack run. (11,6)runs up the hill. (24,6) and Jill roll down. Of course, there are lots of other duplicates in the string: See Jack run. (11,6)runs up the hill(24,8)and Jill roll down. See Jack run. (11,6)runs up the hill (24,8)and Ji(16,2) roll down. See Jack run. (11,6)runs up the hil1(24,8)and Ji(16,2) ro(5,3)down. See Jack run. (11,9)s up the hill(24,8)and Ji(16,2) ro(5,3)down. How We'll Represent the Encoding 5.2 We will represent a compressed data stream as a list of items. Each item will be one of two things: either a single character, or a tuple which contains two integers. The single character represents a single character in the output; the tuple represents a backwards reference. So the compressed data above would be encoded like this: 'e', ['S', 'e' 'c', 'J', 'a' 'k' 'r', 'u', 'n', (11,9), 's', and so on 7 Required Function: build map() build map() must take a single parameter, which is the location of a file. You must open the file and build a substitution map from it; return the map. The file format will be as described earlier in this file; as with the previous assignment, you must ignore any blank lines, or lines that contain comments. You may assume that the file exists; you do not have to do any error checking to handle a situation where the file can't be opened. 7.1 asserts Your function must use asserts to make sure that each of the following is true, for every line (except blank lines and comments): The line has exactly two words (we don't care about how much whitespace, or where) Both words on the line must be a single character The same "from" character in the map must not be used twice. Note that there are many other things that might be interesting to check, such as confirming that the "to" characters are not duplicated. Don't do these checks in this function; tolerate silly maps. 1Since this is a function - not the main program itself - do not do the chdir code. Assume that main () gives you a filename which will work, without changing directories. 8 Required Function: is valid map() is valid map () does some more extensive error checking - more than is done by build map(). It is designed either to be used as a second step (sanity-checking something that build map() returned), or as checking a map that was build by hand, using some other process. This function takes a single parameter (the map), and returns either True or False; it only returns True if all of the requirements are met. This function should never assert about anything; instead, return False if you find a problem. This function must confirm that: Every "from" character (the keys) and every to" character (the values) are strings. Also, every such string is exactly one character long. Also, none of those strings are the space, tab, or newline character. HINT: What does this code do? if c in " \t ": There are no duplicate "to" characters. The set of "from" characters is exactly the same as the set of to" char- acters. 9 Required Function: encode () 9. encode (), in subs_cipher.py, must encode a string. It takes two parameters: the substitution map, and the input string. It must return the cipher string. See the description, earlier in this spec, to see how the encryption process works. 9.1 asserts This function must call is-valid map() to check that the map is reasonable, and must assert that the function returns True. You must not implement is valid map() in this file! Instead, you must import the build map.py file, and call the function that is defined in there. 10 Required Function: decode() decode (), in subs.cipher.py, must decode a string. It takes works exactly like encode (), except that it must reverse the mapping. Required Function: unzip() 11 unzip() takes a compressed stream as its only parameter, and returns the uncompressed string. The compressed stream is encoded as described earlier in this spec; the stream is a list, and each element is either a string (with a single character), or a tuple with two integers. 11.1 asserts This function must assert that: Every element in the compressed stream is the proper type (string or tuple) Every string is exactly one character long Every tuple has exactly two elements, both of which are positive integers The "offset" (the first element in the tuple) does not point back any further than the start of the uncompressed string

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Data And Information Quality Dimensions, Principles And Techniques

Authors: Carlo Batini, Monica Scannapieco

1st Edition

3319241060, 9783319241067

More Books

Students also viewed these Databases questions

Question

Outline Aquinass methodology.

Answered: 1 week ago