Question
Instructions
Write a script named countmatches that expects at least two arguments on the command line.
The first argument is the pathname of a dna file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It might be terminated with a newline character.) This dna file contains nothing but a sequence of the bases a, c, g, and t in any order.
The remaining arguments are strings containing only the bases a, c, g, and t in any order.
Error checking: The script should check that the first argument is a file name, and that there is at least one other argument after it. If the first argument is not a file name, or if nothing follows the file name, the script should print an appropriate error message and a how-to-use-me (usage) message, and then exit. The script is not required to check that the file is in the proper form, or that the strings contain nothing but the letters a, c, g, and t. The script is not required to check that the dna file contains a number of bases equal to a multiple of 3.
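The two checks above could be sketched as follows. This is only one possible shape: the function names usage and check_args, the wording of the messages, and the choice to send them to standard error are all assumptions, not part of the assignment.

```shell
#!/bin/bash
# Sketch of the argument checks. Message wording is an assumption.

usage() {
    echo "usage: countmatches dnafile string [string ...]" >&2
}

# Returns 0 if the arguments look valid, 1 otherwise.
check_args() {
    # Need the file name plus at least one search string.
    if [ "$#" -lt 2 ]; then
        echo "countmatches: expected a dna file and at least one string" >&2
        usage
        return 1
    fi
    # The first argument must name an existing regular file.
    if [ ! -f "$1" ]; then
        echo "countmatches: $1 is not a file" >&2
        usage
        return 1
    fi
    return 0
}
```

In the script itself, a line such as check_args "$@" || exit 1 placed before any other work would stop execution on bad input.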
For each valid argument string, the script will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non-overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
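One way to get a non-overlapping count, assuming the intended tools are grep with its -o option and wc -l, is sketched below. grep -o prints each match on its own line and resumes searching after the end of the previous match, so overlapping occurrences are not reported twice. The helper name count_matches is hypothetical.

```shell
#!/bin/bash
# Sketch: count non-overlapping occurrences of a string in a file.
# count_matches is a hypothetical helper name.

count_matches() {
    local file=$1 pattern=$2 n
    # grep -o emits one line per match; wc -l counts those lines.
    # When nothing matches, grep prints nothing and wc reports 0.
    n=$(grep -o -e "$pattern" "$file" | wc -l)
    # Arithmetic expansion drops any padding some wc versions add.
    echo $((n))
}
```

With a file containing only atata, count_matches on that file with the string ata reports a single occurrence.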
If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count.
For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then your script should work like this:
    $ ./countmatches dnafile ttt
    ttt 1
    $ countmatches dnafile aac ggg aaccg
    aac 3
    ggg 0
    aaccg 2
Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words!
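The output format described above, one line per argument holding only the string and its count, could be produced by a loop like this sketch. The function name report_matches is hypothetical, and the use of grep -o with wc -l is an assumption about the intended approach.

```shell
#!/bin/bash
# Sketch: print "string count" for every string after the file name.
# report_matches is a hypothetical name.

report_matches() {
    local file=$1 pattern n
    shift    # drop the file name; "$@" now holds only the search strings
    for pattern in "$@"; do
        n=$(grep -o -e "$pattern" "$file" | wc -l)
        # Print nothing but the string and its count.
        echo "$pattern $((n))"
    done
}
```

Because the file name is passed straight to grep, absolute and relative pathnames are handled the same way.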
Hint: You can write this script using grep and one filter command that appears in the course material. Although there are many filter commands, you do not need all of them to write the script. You will have to read more about grep to learn how to use it for this task. The other filter command already appears in the course material.
Testing
There are DNA text files in the cs132 course directory, /data/biocs/b/student.accounts/cs132/data/dna_textfiles to give to your script as file arguments for testing. Your program should work for any such files. Your program should be able to accept any kind of command line arguments passed in as absolute pathnames, relative pathnames, and your own testing files, not just those located in the directory!