Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

solve it in python plz coronavirus.txt MN908947.3 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAA CGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAAC TAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTG TTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTC CCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTAC GTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAAACGTTCGGAT You

solve it in python plz
image text in transcribed
coronavirus.txt
MN908947.3 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAA
CGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAAC
TAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTG
TTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTC
CCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTAC
GTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG
CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAAACGTTCGGAT
You are provided with a txt file (coronavirus x) containing the complete genome of Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1. The following is a part (substring of the genome: ATTAAAGGTTTATACCTTOCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGT TCTCTA ACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCAC GCAGTATAATTAATAACTAATTACTGTCGTTGA This is a reference: https://www.ncbi.nlm.nih.govinu.com/MN908947.37reporttasta Question Create a dictionary (In Python) for the most frequent words tk-mers) in the genomic String. The number k stands for the length of the word For example, * = 3, the question is then to find the most frequent words composed of 3 characters Given for instance the following Genomic String ACGTTGCATGTCGCATGATGCATGAGAGCT ifk = 4, the most 4-mers are GCAT and CATG ACGTTGCATGTCGCATGATGCATGAGAGCT ACGTTGCATGTCGCATGATGCATGAGAGCT Each word appears 3 times Example of dictionary TOAT ATGE TAG GAGA AGAG erst TCAT, CATCH Note: The coronavirus.txt text fie begins with a header. You should skip it before you start reading the words

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Professional SQL Server 2012 Internals And Troubleshooting

Authors: Christian Bolton, Justin Langford

1st Edition

1118177657, 9781118177655

More Books

Students also viewed these Databases questions

Question

In an Excel Pivot Table, how is a Fact/Measure Column repeated?

Answered: 1 week ago