Question
Breaking Substitution Ciphers It is possible to break simple substitution ciphers without even using a computer. This is why, for example, such puzzles appear in
Breaking Substitution Ciphers
It is possible to break simple substitution ciphers without even using a computer. This is why, for example, such puzzles appear in newspapers. In this homework assignment, you will write a program that automatically decodes a variant of these codes without knowing the codestring in advance.
Codeword Substitution
One approach to defining a codestring for a substitution cipher is to specify a codeword instead. The idea is that a codeword (with no repeating letters is given) and that is used as the start of the codestring. The rest of the codestring is filled in with the remaining letters of the alphabet in reverse order. For example, if the codeword is ZEBRA , then the codestring would be:
ZEBRAYXWVUTSQPONMLKJIHGFDC If the codeword is DOG , then the codestring would be:
DOGZYXWVUTSRQPNMLKJIHFECBA
You will want to be able to generate such a codestring from a codeword. Here is the code to extend the codeword into a codestring.
w = 'ZEBRA' cs = w[:] for a in 'ZYXWVUTSRQPONMLKJIHGFEDCBA':
if a not in w: cs = cs + a
print(cs) ZEBRAYXWVUTSQPONMLKJIHGFDC
Hiding Codewords in Plain Sight
In the kingdom of Vulgaria, instead of choosing a cipher in advance, the spies send coded messages
that all start with a string of 30 characters that we'll call the preamble. The codeword for the cipher is a substring of those first 30 characters. They have some secret way of finding the codeword in there, so they can decode messages.
Their idea is that they want anyone deciphering the messages to break a new code for each message. Maybe this would work until word gets out that the codeword is hiding in plain sight. Now code breakers can try all the different substrings of those first 30 characters and see which one produces a message that looks like real words.
Note: a substring is a sequence of consecutive characters in a string. You can get a substring of a string s using slicing, i.e., s[3:7] is the sting of length 4 consisting of the characters starting at index 3 and continuing up to but not including index 7. So, if s = 'abcdefghij' , then s[3:7] isdefg .
Incidentally, this is related to how the early versions of the German Enigma machine was cracked.
Using data
You can get a program to try different substrings and check if it is producing real words. You will be given a list of frequently used words. Use this list to check if you have the correct codeword. Then try all the different choices and see which one produces something the looks like a real message.
If we have a guess at what the codestring is, we can try to decode the message. Then, we can look at it to see if it resembles a real message, for example by checking if the words are real words. If there is a small number of possible codestrings, we can choose between them by checking which produces the largest
There is a file called words.py included in the skeleton. It contains a string with a big collection of words. Technically, I got this list from processing the text of Bram Stoker's Dracula and pulling out the unique words that appeared more than a certain number of times. You can import the variable w ordlist from words.py using the following line of python at the top of your ciphercracker.pyfile.
from words import wordlist
You will then be able to use the wordlist variable in your code. This approach is not considered great style, but it is easy and appropriate for this case.
To Do:
Add support to the Cipher class for codeword substitution ciphers. Implement a new class called CipherCracker that will be saved in a file called ciphercracker.py . A CipherCracker object should store a ciphertext that you are trying to decode. It should separate out the preamble from the beginning of the ciphertext. It should implement methods called decode , quality , mostlikelydecode , and mostlikelycodeword . The description of the desired input and output of each method is given in the skeleton.
TestCyphercracker
import unittest from ciphercracker import CipherCracker
class TestCipherCracker(unittest.TestCase): def setUp(self): # Here is a text encdoded with the codeword "ZEPHYR" self.ciphertext = """IWYO V HYJPYOHYH RKNQ IWY JVHY NR IWY PNZPW, ZJ IWY PZSBPWY DZJ PSNJY ZSNOXJVHY, IWY HKVFYK WYSMVOX QY DVIW Z WZOH DWVPW PZGXWI QB ZKQ VO Z XKVM NR JIYYS; WVJ JIKYOXIW QGJI WZFY EYYO MKNHVXVNGJ. DVIWNGI Z DNKH WY JWNNT WVJ KYVOJ, IWY WNKJYJ IGKOYH, ZOH DY JDYMI VOIN IWY HZKTOYJJ NR IWY MZJJ. ZJ V SNNTYH EZPT V JZD IWY JIYZQ RKNQ IWY WNKJYJ NR IWY PNZPW EB IWY SVXWI NR IWY SZQMJ, ZOH MKNUYPIYH ZXZVOJI VI IWY RVXGKYJ NR QB SZIY PNQMZOVNOJ PKNJJVOX IWYQJYSFYJ. IWYO IWY HKVFYK PKZPTYH WVJ DWVM ZOH PZSSYH IN WVJ WNKJYJ, ZOH NRR IWYB JDYMI NO IWYVK DZB IN EGTNFVOZ. ZJ IWYB JZOT VOIN IWY HZKTOYJJ V RYSI Z JIKZOXY PWVSS, ZOH Z SNOYSB RYYSVOX PZQY NFYK QY; EGI Z PSNZT DZJ IWKNDO NFYK QB JWNGSHYKJ, ZOH Z KGX ZPKNJJ QB TOYYJ, ZOH IWY HKVFYK JZVH VO YCPYSSYOI XYKQZO:-- "IWY OVXWI VJ PWVSS, QYVO WYKK, ZOH QB QZJIYK IWY PNGOI EZHY QY IZTY ZSS PZKY NR BNG. IWYKY VJ Z RSZJT NR JSVFNFVIA (IWY MSGQ EKZOHB NR IWY PNGOIKB) GOHYKOYZIW IWY JYZI, VR BNG JWNGSH KYLGVKY VI." V HVH ONI IZTY ZOB, EGI VI DZJ Z PNQRNKI IN TOND VI DZJ IWYKY ZSS IWY JZQY. V RYSI Z SVIISY JIKZOXYSB, ZOH ONI Z SVIISY RKVXWIYOYH. V IWVOT WZH IWYKY EYYO ZOB ZSIYKOZIVFY V JWNGSH WZFY IZTYO VI, VOJIYZH NR MKNJYPGIVOX IWZI GOTONDO OVXWI UNGKOYB. IWY PZKKVZXY DYOI ZI Z WZKH MZPY JIKZVXWI ZSNOX, IWYO DY QZHY Z PNQMSYIY IGKO ZOH DYOI ZSNOX ZONIWYK JIKZVXWI KNZH. VI JYYQYH IN QY IWZI DY DYKY JVQMSB XNVOX NFYK ZOH NFYK IWY JZQY XKNGOH ZXZVO; ZOH JN V INNT ONIY NR JNQY JZSVYOI MNVOI, ZOH RNGOH IWZI IWVJ DZJ JN. V DNGSH WZFY SVTYH IN WZFY ZJTYH IWY HKVFYK DWZI IWVJ ZSS QYZOI, EGI V KYZSSB RYZKYH IN HN JN, RNK V IWNGXWI IWZI, MSZPYH ZJ V DZJ, ZOB MKNIYJI DNGSH WZFY WZH ON YRRYPI VO PZJY IWYKY WZH EYYO ZO VOIYOIVNO IN HYSZB. EB-ZOH-EB, WNDYFYK, ZJ V DZJ PGKVNGJ IN TOND WND IVQY DZJ MZJJVOX, V JIKGPT Z QZIPW, ZOH EB VIJ RSZQY SNNTYH ZI QB DZIPW; VI DZJ DVIWVO Z RYD QVOGIYJ NR QVHOVXWI. IWVJ XZFY QY Z JNKI NR JWNPT, RNK V JGMMNJY IWY XYOYKZS JGMYKJIVIVNO ZENGI QVHOVXWI DZJ VOPKYZJYH EB QB KYPYOI YCMYKVYOPYJ. V DZVIYH DVIW Z JVPT RYYSVOX NR JGJMYOJY."""
def testconstructor(self): c = CipherCracker("X")
def testdecode(self): preamble = "ABA" + "X" * 27 message = "ABCDF" c = CipherCracker(preamble + message) self.assertEqual(c.decode(0,1), "AZYXV") self.assertEqual(c.decode(1,3), "BAZYW")
def testdecodewithcodewordatend(self): preamble = "X" * 27 + "ZYZ" message = "ABZCD" c = CipherCracker(preamble + message) self.assertEqual(c.decode(0,1), "ZYBXW") self.assertEqual(c.decode(27,28), "ZYAXW") self.assertEqual(c.decode(28,30), "ZYBXW")
def testquality(self): preamble = "DZEPHYRKLMNEDABCEFGHIJKLMNEFGH" c = CipherCracker(preamble + self.ciphertext) for i in range(2,20): self.assertTrue(c.quality(1,7) > c.quality(i, i+6))
def testquality2(self): preamble = "EFGHKLMNEDABCDEFGHIJKLMNZEPHYR" c = CipherCracker(preamble + self.ciphertext) for i in range(1,20): self.assertTrue(c.quality(24,30) > c.quality(i, i+6))
def testmostlikelycodeword(self): preamble = "EFGHKLMNEDABCDEFGHIJKLMNZEPHYR" c = CipherCracker(preamble + self.ciphertext) self.assertEqual(c.mostlikelycodeword(), "ZEPHYR") preamble = "KLMNEFGHIJKLMNZEPHYREDABCDEFGH" c = CipherCracker(preamble + self.ciphertext) self.assertEqual(c.mostlikelycodeword(), "ZEPHYR")
def testmostlikelydecode(self): plaintext = """IN A LITTLE WHILE THE PROFESSOR SIGNALLED TO ME, SO I GOT UP AND JOINED HIM. HE HAD FOUND A WONDERFUL SPOT, A SORT OF NATURAL HOLLOW IN A ROCK, WITH AN ENTRANCE LIKE A DOORWAY BETWEEN TWO BOULDERS. HE TOOK ME BY THE HAND AND DREW ME IN: "SEE!" HE SAID, "HERE YOU WILL BE IN SHELTER; AND IF THE WOLVES DO COME I CAN MEET THEM ONE BY ONE.""" ciphertext = """VO Z SVIISY DWVSY IWY MKNRYJJNK JVXOZSSYH IN QY, JN V XNI GM ZOH UNVOYH WVQ. WY WZH RNGOH Z DNOHYKRGS JMNI, Z JNKI NR OZIGKZS WNSSND VO Z KNPT, DVIW ZO YOIKZOPY SVTY Z HNNKDZB EYIDYYO IDN ENGSHYKJ. WY INNT QY EB IWY WZOH ZOH HKYD QY VO: "JYY!" WY JZVH, "WYKY BNG DVSS EY VO JWYSIYK; ZOH VR IWY DNSFYJ HN PNQY V PZO QYYI IWYQ NOY EB NOY.""" preamble = "KLMNEFGHIJKZEPHYRLMNEDABCDEFGH" c = CipherCracker(preamble + ciphertext) decoded = c.mostlikelydecode() self.assertEqual(decoded[:25], plaintext[:25])
if __name__ == '__main__': unittest.main()
Words examples
_wordstring = """ WARNING RETURNED ME PERSONAL WHILST NOON OCCASION REMEMBERED TEETH CAREFULLY PLEASE BEAT SURPRISED JULY CHANGE DUST SHOT OPEN
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started