Question

python -hashing

1. Implement a "hashing" function for 3-mers. That is, given a 3-mer, e.g., AAA, return an integer, e.g., 1. Use the natural sorted order of the nucleotides, i.e., AAA -> 1, AAC -> 2, AAG -> 3, ..., TTT -> 64, since there are 4^3 = 64 possible 3-mers.

Hint:

Consider encoding A, C, G, T as [0, 1, 2, 3]. A 3-mer can then be read as a 3-digit base-4 (quaternary) integer. For example, ATC is the base-4 number "031", which converts to decimal as:

0 * 4^2 + 3 * 4^1 + 1 * 4^0 = 13

Since the numbering must start from 1, add 1 to the final result. Hence, your function should return 14 for ATC.
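
The hint above can be sketched in Python as follows. This is one possible reading of the task, not an official solution: the names `NUCLEOTIDE_CODE` and `hash_3mer` are made up for illustration, and the choice to upper-case the input and to raise `ValueError` on bad input is an assumption the question does not specify.

```python
# Encoding taken from the hint: A, C, G, T -> 0, 1, 2, 3.
# The dictionary name is a hypothetical choice for this sketch.
NUCLEOTIDE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def hash_3mer(kmer):
    """Return the 1-based rank of a 3-mer in sorted order (AAA -> 1, TTT -> 64)."""
    if len(kmer) != 3:
        raise ValueError("expected a 3-mer, got %r" % (kmer,))
    value = 0
    for base in kmer.upper():  # accepting lower case is an assumption
        if base not in NUCLEOTIDE_CODE:
            raise ValueError("unexpected nucleotide %r" % (base,))
        # Shift the running value one base-4 digit left, then add the new digit.
        value = value * 4 + NUCLEOTIDE_CODE[base]
    return value + 1  # the question numbers 3-mers from 1, not 0


print(hash_3mer("AAA"))  # 1
print(hash_3mer("ATC"))  # 14
print(hash_3mer("TTT"))  # 64
```

The loop evaluates 0 * 4^2 + 3 * 4^1 + 1 * 4^0 digit by digit, so the same code would work for longer k-mers if the length check were relaxed.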

2. Now suppose we want the 3-mer frequencies of a DNA sequence. Write a function that uses the "hashing" function from problem 1 and returns the 3-mer frequencies in a DNA sequence. Use a sliding window with a step size of 1 to iterate through the sequence. For example, given the input "ATTATTGC", you should return:

"ATT: 2, TTA: 1, TAT: 1, TTG: 1, TGC: 1".

Please submit your code as a single script, and show the output for the example inputs in comments.
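
A minimal single-script sketch for problem 2 follows. It assumes the sequence contains only upper-case A, C, G, T, and that the result should list 3-mers in order of first appearance, as in the example. The names `hash_3mer` and `count_3mers` are hypothetical, and using a 64-slot list indexed by the hash is just one way to "utilize" the hashing function.

```python
# Encoding from the hint in problem 1: A, C, G, T -> 0, 1, 2, 3.
NUCLEOTIDE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def hash_3mer(kmer):
    """Map a 3-mer to 1..64 by reading it as a base-4 number and adding 1."""
    value = 0
    for base in kmer:
        value = value * 4 + NUCLEOTIDE_CODE[base]
    return value + 1


def count_3mers(sequence):
    """Return 3-mer frequencies as a string, in order of first appearance."""
    counts = [0] * 64   # slot i holds the count of the 3-mer whose hash is i + 1
    first_seen = []     # 3-mers in the order the sliding window first meets them
    for start in range(len(sequence) - 2):  # step size 1
        kmer = sequence[start:start + 3]
        slot = hash_3mer(kmer) - 1
        if counts[slot] == 0:
            first_seen.append(kmer)
        counts[slot] += 1
    return ", ".join(
        "%s: %d" % (kmer, counts[hash_3mer(kmer) - 1]) for kmer in first_seen
    )


print(count_3mers("ATTATTGC"))
# ATT: 2, TTA: 1, TAT: 1, TTG: 1, TGC: 1
```

A plain dictionary keyed by the 3-mer string would also work; the fixed-size list is used here only because the hash gives every 3-mer a unique slot between 1 and 64.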
