Question

Instructions

Write a script named countmatches that expects at least two arguments on the command line.

The first argument is the pathname of a dna file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It might be terminated with a newline character.) This dna file contains nothing but a sequence of the bases a, c, g, and t in any order.

The remaining arguments are strings containing only the bases a, c, g, and t in any order.

Error checking:

The script should check that the first argument is a file name, and that there is at least one other argument after it. If the first argument is not a file name, or if no argument follows the file name, the script should output to the user

the appropriate user error,

a how-to-use-me message,

and then exit.
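
The error checking described above could be sketched as follows. This is only one possible approach, not the required solution: the function name check_args, the wording of the messages, and the choice to write them to standard error are all assumptions, since the assignment only says to output an error and a how-to-use-me message.

```shell
# Hypothetical helper: validate the arguments passed to countmatches.
# Returns 0 when the arguments look usable, 1 otherwise.
check_args() {
    # Fewer than two arguments: the file or the search strings are missing.
    if [ "$#" -lt 2 ]; then
        echo "countmatches: expected a dna file and at least one string" >&2
        echo "usage: countmatches dnafile string [string ...]" >&2
        return 1
    fi
    # The first argument must name an existing regular file.
    if [ ! -f "$1" ]; then
        echo "countmatches: $1 is not a file" >&2
        echo "usage: countmatches dnafile string [string ...]" >&2
        return 1
    fi
    return 0
}
```

In the script itself, a line such as check_args "$@" || exit 1 placed before any counting would print the messages and then exit, as the instructions require.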

The script is not required to check that the file is in the proper form, or that the strings contain nothing but the letters a, c, g, and t.

The script is not required to check that the dna file contains a number of bases equal to a multiple of 3.

For each valid argument string, the program will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non-overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
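
The ata and atata case can be tried directly at the command line. The sketch below is only an illustration under two assumptions: that grep supports the -o option (GNU and BSD grep do), which prints each match on its own line and resumes searching after the end of the previous match, and that wc -l is an acceptable counting filter, since the assignment does not name the filter. The function name count_in_string is hypothetical.

```shell
# Hypothetical helper: count non-overlapping occurrences of $1 in the string $2.
count_in_string() {
    # grep -o emits one line per match; wc -l counts those lines.
    printf '%s\n' "$2" | grep -o "$1" | wc -l
}

# ata matches at the start of atata; the search then resumes at "ta",
# which cannot match, so the count is 1 (some wc versions pad it with spaces).
count_in_string ata atata
```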

If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count.

For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then your script should work like this:

$ ./countmatches dnafile ttt
ttt 1
$ countmatches dnafile aac ggg aaccg
aac 3
ggg 0
aaccg 2

Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words!

Hint: You can write this script using grep and one filter command that appears in the course material. Although there are many filter commands, you do not need all of them to write the script. You will have to read more about grep to learn how to use it; the other filter command already appears in the course material.
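
One way the hint might play out is sketched below. It is a guess at the intended approach, not a confirmed solution: it assumes the filter is wc, which the assignment never states, and it assumes the arguments have already been validated. The function name print_counts is hypothetical.

```shell
# Hypothetical function holding the counting loop. Assumes $1 is a readable
# dna file and that at least one search string follows it.
print_counts() {
    dnafile=$1
    shift    # the remaining arguments are the search strings
    for pattern in "$@"; do
        # grep -o prints each non-overlapping match on its own line,
        # and wc -l counts those lines (0 when nothing matches).
        count=$(grep -o "$pattern" "$dnafile" | wc -l)
        # $count is deliberately unquoted so any padding from wc is dropped.
        echo "$pattern" $count
    done
}
```

Calling print_counts "$@" after the argument checks would produce one line per search string and nothing else, which matches the warning about not printing extra messages.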

Testing

There are DNA text files in the cs132 course directory,

/data/biocs/b/student.accounts/cs132/data/dna_textfiles

to give to your script as file arguments for testing. Your program should work for any such files.

Your program should be able to accept any kind of command line arguments passed in as

absolute pathnames,

relative pathnames, and

your own testing files,

not just those located in the directory!
