Question

Part 1 (10 Points)

A DNA string is a sequence of the letters a, c, g, and t in any order. For example, aacgtttgtaaccag is a DNA string of length 15. Each sequence of three consecutive letters is called a codon. For example, in the preceding string the codons are aac, gtt, tgt, aac, and cag. If we ignored the first letter and started listing the codons at the second a, the codons would be acg, ttt, gta, and acc, and we would ignore the last ag. In this exercise, for simplicity, we will assume that we always start reading the codons at the first letter of the string.

A DNA string can be hundreds of thousands of codons long, even millions of codons long, which means that it is infeasible to count them by hand. It would be useful to have a simple script that could count the number of occurrences of a specific codon in such a string. For instance, for the example string above such a script would tell us that aac occurs twice and tgt occurs once.

Your job is to write a script named countcodons that expects two arguments on the command line. The first argument is a three-letter codon string such as aaa or cgt. The second argument is the pathname of a file containing a valid DNA string with no newline characters or white space characters of any kind within it. This file contains nothing but a sequence of the letters a, c, g, and t. If your script is given two valid arguments, it should output a single number, which is the number of occurrences of the codon given as argument 1 in the file given as argument 2. If it finds no occurrences, it should output 0. For example, if the string aacgttgtaaccagaac is in a file named dnafile, then your script should work like this:

countcodons ttt dnafile
countcodons aac dnafile
countcodons ccc dnafile

Warning: if it is given valid arguments, the script is not to output anything but a number. No fancy messages, no words, just a number. The script should check that it has two arguments, and if it does not, it should print a how-to-use-me message and then exit. It is not required to check that the file is in the proper form, or that the string is actually a codon. However, for +3 extra credit, it should print an error message and exit if the file cannot be opened or if it is not a file containing only the four letters a, c, g, and t. It must do both to receive the credit.

Hint: You will not be able to solve this problem using the grep command alone. There are a number of commands that might be useful, such as sort, cut, fold, and uniq. One of these commands is the key that makes this task easy to solve. Find out which one it is and use it.
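One possible approach, offered as a minimal sketch rather than the expected solution: it assumes fold is the command the hint alludes to, using it to cut the string into three-letter lines so that grep can count the lines equal to the codon. The wording of the usage and error messages, the choice to write them to standard error, and the choice to tolerate a trailing newline in the file are all assumptions not stated in the assignment. The logic is wrapped in a function so it can be tried interactively.

```shell
#!/bin/bash
# countcodons: sketch only. Message wording is an assumption, not part of the assignment.

countcodons() {
    # Require exactly two arguments: the codon and the pathname of the DNA file.
    if [ "$#" -ne 2 ]; then
        echo "usage: countcodons <codon> <dna-file>" >&2
        return 1
    fi
    local codon=$1 file=$2
    local count

    # Extra credit, part 1: the file must be a readable regular file.
    if [ ! -f "$file" ] || [ ! -r "$file" ]; then
        echo "countcodons: cannot open '$file'" >&2
        return 1
    fi

    # Extra credit, part 2: delete the four legal letters (and newlines, which
    # this sketch tolerates as an assumption); anything left over is illegal.
    if tr -d 'acgt\n' < "$file" | grep -q .; then
        echo "countcodons: '$file' contains characters other than a, c, g, t" >&2
        return 1
    fi

    # fold -w 3 breaks the string into lines of three letters, starting at the
    # first letter. grep -c counts lines; -x demands a whole-line match and -F
    # treats the codon as a fixed string. grep -c prints 0 when nothing matches.
    count=$(tr -d '\n' < "$file" | fold -w 3 | grep -c -x -F -- "$codon")
    echo "$count"
}

# In a standalone script file, the last line would be:  countcodons "$@"
```

Because every codon ends up on its own line, a leftover one- or two-letter tail at the end of the string can never match a three-letter codon, so it is ignored without special handling.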
