Question
python -hashing
1. Implement a "hashing" function for 3-mers. That is, given a 3-mer, e.g., AAA, you should return an integer, e.g., 1. We take the natural order of sorted nucleotides, i.e., AAA -> 1, AAC -> 2, AAG -> 3, ..., TTT -> 64, as there are 64 possible combinations.
Hint:
Consider encoding A, C, G, T as [0, 1, 2, 3]. Then a 3-mer can be treated as a 3-digit base-4 (quaternary) integer. For example, ATC would be the base-4 number "031", which can be converted to decimal by:
0 * 4^2 + 3 * 4^1 + 1 * 4^0 = 13
Since we require the numbering to start from 1, you need to add 1 to the final result. Hence, your function should return 14 for ATC.
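A minimal sketch of the hashing function described in the hint, assuming the input contains only uppercase A, C, G, T. The names `NUCLEOTIDE_CODE` and `hash_3mer` are illustrative choices, not part of the assignment. Multiplying the running value by 4 before adding each digit is equivalent to summing digit * 4^position.

```python
# Encoding taken from the hint: A, C, G, T -> 0, 1, 2, 3.
# (The dictionary and function names are hypothetical.)
NUCLEOTIDE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def hash_3mer(kmer):
    """Map a 3-mer to an integer in 1..64, following sorted nucleotide order."""
    if len(kmer) != 3:
        raise ValueError("expected a 3-mer, got %r" % kmer)
    value = 0
    for base in kmer:
        # Shift the base-4 number one digit left, then add the new digit.
        value = value * 4 + NUCLEOTIDE_CODE[base]
    # The assignment numbers 3-mers from 1, so add 1 to the base-4 value.
    return value + 1


print(hash_3mer("AAA"))  # 1
print(hash_3mer("ATC"))  # 14
print(hash_3mer("TTT"))  # 64
```

A character outside A, C, G, T raises a `KeyError` in this sketch; how to handle such input is not specified in the question.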
2. Now suppose we want to know the 3-mer frequencies of a DNA sequence. Write a function that uses the "hashing function" from problem 1 and returns the frequency of each 3-mer in the sequence. You should use a sliding window with a step size of 1 to iterate through the sequence. For example, given the input "ATTATTGC", you should return:
"ATT: 2, TTA: 1, TAT: 1, TTG: 1, TGC: 1".
Please submit your code in a single script, and show your output for the example inputs in comments.
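One possible sketch of the sliding-window count for problem 2, again assuming uppercase A, C, G, T only. It keys a dictionary by the hash value and decodes each hash back to its 3-mer for display. The helper names (`unhash_3mer`, `count_3mers`, `format_counts`) are hypothetical, and the output order relies on Python 3.7+ dictionaries preserving insertion order.

```python
NUCLEOTIDES = "ACGT"
# Same encoding as the hint: A, C, G, T -> 0, 1, 2, 3.
CODE = {base: digit for digit, base in enumerate(NUCLEOTIDES)}


def hash_3mer(kmer):
    """Map a 3-mer to an integer in 1..64 (base-4 value plus 1)."""
    value = 0
    for base in kmer:
        value = value * 4 + CODE[base]
    return value + 1


def unhash_3mer(hashed):
    """Invert hash_3mer: recover the 3-mer from its integer in 1..64."""
    value = hashed - 1
    return (NUCLEOTIDES[value // 16]
            + NUCLEOTIDES[(value // 4) % 4]
            + NUCLEOTIDES[value % 4])


def count_3mers(sequence):
    """Count 3-mers with a sliding window of step 1, keyed by hash value."""
    counts = {}
    # The last window starts at len(sequence) - 3; shorter inputs yield none.
    for start in range(len(sequence) - 2):
        hashed = hash_3mer(sequence[start:start + 3])
        counts[hashed] = counts.get(hashed, 0) + 1
    return counts


def format_counts(counts):
    """Render counts as 'KMER: n' pairs in order of first appearance."""
    return ", ".join(
        "%s: %d" % (unhash_3mer(hashed), n) for hashed, n in counts.items()
    )


print(format_counts(count_3mers("ATTATTGC")))
# Output: ATT: 2, TTA: 1, TAT: 1, TTG: 1, TGC: 1
```

Keying by the integer hash rather than the string is what the question asks for; a fixed list of 64 counters indexed by `hash - 1` would work equally well if first-appearance order is not needed.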