Question

Use Python Code only.

Please help me with this project. I have spent two days on it and still can't figure out what to do. I need code in Python 3.3 or 3.6 only; you can disregard the C language. My code is posted below. I have the code for Part 1, but I need code for the next three parts because they are related.

I know I need to change the keys, so please mark in the code where I have to change them. Thank you so much, Sir/Ma'am.

Part 1

Implement a function with C prototype

float bhatt_dist(float D1[],float D2[],int n)

or with Python prototype

def bhatt_dist(D1, D2, n)

which computes the Bhattacharyya distance between D1 and D2, as described above. You should be able to use D1 = [1/2, 1/2] and D2 = [59/100, 41/100] as a test case (the correct output is given above). You will need to include the math library, which is done like this in C:

#include <math.h>

and in python:

import math

The main function you will need from math.h (C) or math (Python) is the natural logarithm, which is denoted log in math.h (C), or as math.log (Python). To compile a program which uses the math library you must append -lm to your compilation command, as in

gcc file.c -lm

for C implementations.

In order to use our new function for the substitution cipher, we need some probability distributions. For us these will just be float arrays of length 26. That is,

float D1[26]; // I will be a probability distribution

in C, or an array called D1 in Python which contains 26 entries. The understanding here is that D1[c] for the letter c will be the relative frequency of c in some given document.

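Here is my code for Part 1:

import math

def bhatt_dist(D1, D2, n):
    # Bhattacharyya coefficient: sum of sqrt(D1[i] * D2[i]) over the first n entries
    BCSum = 0
    for i in range(n):
        BCSum += math.sqrt(D1[i] * D2[i])
    # The distance is the negative natural logarithm of the coefficient
    DBValue = -math.log(BCSum)
    return DBValue

# Test case from the assignment: D1 = [1/2, 1/2], D2 = [59/100, 41/100]
D1 = [1/2, 1/2]
D2 = [59/100, 41/100]
print(bhatt_dist(D1, D2, 2))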

Part 2

Write a function that takes a filename and creates a single-letter frequency distribution for all the chars (i.e. bytes) occurring in the file. Such a function in C might look roughly like the following:

void mkdist(float* D, char* filename)
{
    FILE *fp = fopen(filename, "r");
    /* write some code here to initialize some things */
    while ((c = fgetc(fp)) != EOF) {
        /* write some code here to process the char c */
    }
    /* write some more code here to finish things up */
    fclose(fp);
}

I give you this code mainly to stop you from worrying about how to open a file, read from a file, and close the file. All of this is done for you, but you still need to write the actual code to build the probability distribution. Remember that at the end of the method D[i] should be the number of occurrences of i in the file, divided by the total number of bytes (chars) in the file. Note that we wrote a Python implementation last class which opens a document and saves this distribution in a file. If you are using Python, you may modify this code to return a list of 26 numbers corresponding to the count of each letter in the alphabet.
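A minimal Python sketch of Part 2 could look like the following. It is not the in-class implementation mentioned above, which is not shown here. It assumes that only the letters A to Z are counted (case-insensitively) and that each count is divided by the total number of letters, so the 26 entries sum to 1. The function name mkdist is borrowed from the C skeleton, but the signature (it returns a list instead of filling D) is an assumption.

```python
import string


def mkdist(filename):
    """Return a list of 26 relative letter frequencies for the given file.

    Assumption: only A-Z are counted, ignoring case; every other
    character (digits, spaces, punctuation) is skipped.
    """
    counts = [0] * 26
    with open(filename, "r", encoding="utf-8", errors="ignore") as fp:
        for line in fp:
            for ch in line.upper():
                if ch in string.ascii_uppercase:
                    counts[ord(ch) - ord("A")] += 1

    total = sum(counts)
    if total == 0:
        # No letters at all: return an all-zero list rather than dividing by zero.
        return [0.0] * 26
    return [count / total for count in counts]
```

Calling mkdist("sample.txt") would then give the English letter profile used in the later parts. If the raw counts are wanted instead, returning counts directly is the only change needed.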

Part 3. Statistical Analysis of Files

Write a program that takes two filenames as inputs. The output of the program should be the Bhattacharyya distance between the single-letter frequency distributions resulting from each of the files, respectively. Note that to implement this, you will need to use the two functions defined in the first two challenges.

You have three files in your folder: sample.txt, file1.txt, and file2.txt. The file sample.txt is a large sample of writing in English which you will use to build a statistical profile for letter distribution in the English language. Of the other two files, one is an encryption of a document written in English and one is a random collection of letters.

To see what all this is good for, run the Bhattacharyya distance function on the three text files provided to your group. Find the distance between sample.txt and file1.txt, and the distance between sample.txt and file2.txt. In your written report, write down the results of these two comparisons and analyze what these results mean. Exactly one of these files is an encryption of a document written in English. In your report, explain which file is the file written in English and justify your reasoning.
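One way Part 3 might be put together is sketched below. It repeats compact versions of the two earlier functions so that it runs on its own. The helper name file_distance is hypothetical, and the letters-only counting and the choice to return infinity when the two distributions share no letters are assumptions, not part of the assignment.

```python
import math
import string


def mkdist(filename):
    # Assumption: count letters A-Z only, ignoring case, then normalize.
    counts = [0] * 26
    with open(filename, "r", encoding="utf-8", errors="ignore") as fp:
        for ch in fp.read().upper():
            if ch in string.ascii_uppercase:
                counts[ord(ch) - ord("A")] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * 26


def bhatt_dist(D1, D2, n):
    # Bhattacharyya coefficient over the first n entries.
    bc = sum(math.sqrt(D1[i] * D2[i]) for i in range(n))
    if bc == 0:
        # No overlap at all: log(0) is undefined, so report an infinite distance.
        return float("inf")
    return -math.log(bc)


def file_distance(file_a, file_b):
    """Bhattacharyya distance between the letter distributions of two files."""
    return bhatt_dist(mkdist(file_a), mkdist(file_b), 26)
```

With the three files from the assignment in the working directory, print(file_distance("sample.txt", "file1.txt")) and print(file_distance("sample.txt", "file2.txt")) would produce the two numbers for the report. To take the filenames from the command line instead, sys.argv[1] and sys.argv[2] could be passed to file_distance.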

Part 4: Decrypting the File

Your next project is to decrypt the file which you determined is in English. You do not know the key. However, you have run a statistical analysis of the file. There are multiple methods to deduce the key:

Compare the distribution of the encrypted letters to the distribution of English letters that you deduced from sample.txt. For instance, if E is the most common letter in the English language, then it would make sense to assume that the most common letter in the ciphertext is the value for E.

Look at the ciphertext and observe patterns in the words. Single-letter words are likely to be encryptions of the one-letter English words A or I, while a frequently seen three-letter word ending in the encryption of E deduced above is likely to be THE.

Get as creative as you wish!

Test different keys which you determined using the above methods until you find a decryption of the text. You will see that the text is a classic work of literature. Write the name of the book you decrypted in your report, the key you deduced, and explain the process you used to find the key.
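The frequency-matching idea for Part 4 can be sketched roughly as follows. This is only a starting point under the assumption that the cipher is a simple letter-for-letter substitution: it pairs the most frequent ciphertext letter with the most frequent sample letter, and so on down the ranking. The first guess will usually be partly wrong and needs manual correction. The names rank_letters, guess_key, and decrypt are hypothetical.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def rank_letters(text):
    """Return A-Z ordered from most to least frequent in text."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    # Ties (including letters that never occur) are broken alphabetically.
    return sorted(ALPHABET, key=lambda ch: (-counts[ch], ch))


def guess_key(cipher_text, sample_text):
    """Map each ciphertext letter to the sample letter of the same rank."""
    return dict(zip(rank_letters(cipher_text), rank_letters(sample_text)))


def decrypt(cipher_text, key):
    """Apply the key letter by letter, keeping case and non-letters."""
    out = []
    for ch in cipher_text:
        upper = ch.upper()
        if upper in key:
            plain = key[upper]
            out.append(plain if ch.isupper() else plain.lower())
        else:
            out.append(ch)
    return "".join(out)


# CHANGE KEYS HERE: after building the first guess, override single
# entries as word patterns reveal mistakes, for example (hypothetical):
#     key = guess_key(cipher_text, sample_text)
#     key["X"] = "E"   # ciphertext X decrypts to plaintext E
#     key["Q"] = "T"   # ciphertext Q decrypts to plaintext T
```

When overriding one entry, the letter that previously mapped to the same plaintext letter should be reassigned too, so that the key stays a one-to-one mapping. Reading the contents of sample.txt and the encrypted file into sample_text and cipher_text, then repeatedly printing the first few hundred characters of decrypt(cipher_text, key) after each change, is one way to test different keys.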
