Summary of what gets turned in:

  • A bunch of single .c files for each of the utilities below: wcat.c, wgrep.c, wzip.c, and wunzip.c.
  • Each should compile successfully when compiled with the -Wall and -Werror flags.
  • Each should (hopefully) pass the tests we will discuss.

wcat

The program wcat is a simple program. Generally, it reads a file as specified by the user and prints its contents. A typical usage is as follows, in which the user wants to see the contents of main.c, and thus types:

prompt> ./wcat main.c

#include

...

As shown, wcat reads the file main.c and prints out its contents. The "./" before the wcat above is a UNIX thing; it just tells the system which directory to find wcat in (in this case, in the "." (dot) directory, which means the current working directory).

To create the wcat binary, you'll be creating a single source file, wcat.c, and writing a little C code to implement this simplified version of cat. To compile this program, you will do the following:

prompt> gcc -o wcat wcat.c -Wall -Werror

prompt>

This will make a single executable binary called wcat which you can then run as above.

You'll need to learn how to use a few library routines from the C standard library (often called libc) to implement the source code for this program, which we'll assume is in a file called wcat.c. All C code is automatically linked with the C library, which is full of useful functions you can call to implement your program. Learn more about the C library here and perhaps here.

For this project, we recommend using the following routines to do file input and output: fopen(), fgets(), and fclose(). Whenever you use a new function like this, the first thing you should do is read about it -- how else will you learn to use it properly?

On UNIX systems, the best way to read about such functions is to use what are called the man pages (short for manual). In our HTML/web-driven world, the man pages feel a bit antiquated, but they are useful and informative and generally quite easy to use.

To access the man page for fopen(), for example, just type the following at your UNIX shell prompt:

prompt> man fopen

Then, read! Reading man pages effectively takes practice; why not start learning now?

We will also give a simple overview here. The fopen() function "opens" a file, which is a common way in UNIX systems to begin the process of file access. In this case, opening a file just gives you back a pointer to a structure of type FILE, which can then be passed to other routines to read, write, etc.

Here is a typical usage of fopen():

FILE *fp = fopen("main.c", "r");
if (fp == NULL) {
    printf("cannot open file\n");
    exit(1);
}

A couple of points here. First, note that fopen() takes two arguments: the name of the file and the mode. The latter just indicates what we plan to do with the file. In this case, because we wish to read the file, we pass "r" as the second argument. Read the man pages to see what other options are available.

Second, note the critical check of whether fopen() actually succeeded. This is not Java, where an exception will be thrown when things go wrong; rather, it is C, and it is expected (in good programs, i.e., the only kind you'd want to write) that you will always check whether the call succeeded. The man page tells you what is returned when an error is encountered; in this case, the macOS man page says:

Upon successful completion fopen(), fdopen(), freopen() and fmemopen() return a FILE pointer. Otherwise, NULL is returned and the global variable errno is set to indicate the error.

Thus, as the code above does, please check that fopen() does not return NULL before trying to use the FILE pointer it returns.

Third, note that when the error case occurs, the program prints a message and then exits with error status of 1. In UNIX systems, it is traditional to return 0 upon success, and non-zero upon failure. Here, we will use 1 to indicate failure.

Side note: if fopen() does fail, there are many reasons possible as to why. You can use the functions perror() or strerror() to print out more about why the error occurred; learn about those on your own (using ... you guessed it ... the man pages!).
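As a rough illustration of the side note above, the sketch below reports why fopen() failed by passing errno to strerror(). The helper name open_for_reading() is hypothetical and not part of the assignment; the message goes to stderr as a debugging aid only, since the project requires its own exact error message on stdout.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the assignment): open a file for reading
 * and, on failure, report the reason on stderr before returning NULL. */
FILE *open_for_reading(const char *path) {
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* strerror() turns the errno value set by fopen() into text. */
        fprintf(stderr, "%s: %s\n", path, strerror(errno));
    }
    return fp;
}
```

Calling perror(path) in place of the fprintf() line should print much the same information with less typing.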

Once a file is open, there are many different ways to read from it. The one we suggest here is fgets(), which reads input from a file one line at a time.

To print out file contents, just use printf(). For example, after reading in a line with fgets() into a variable buffer, you can just print out the buffer as follows:

printf("%s", buffer);

Note that you should not add a newline (\n) character to the printf(), because that would change the output to have extra newlines. Just print the exact contents of the read-in buffer (which, of course, may include a newline).

Finally, when you are done reading and printing, use fclose() to close the file (thus indicating you no longer need to read from it).
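The read-and-print loop described above might look like the following minimal sketch. The helper name copy_lines() and the 4096-byte buffer are assumptions made for illustration; a line longer than the buffer is simply read and printed in several pieces, so the output is unchanged.

```c
#include <stdio.h>

/* Hypothetical helper: copy every line of `in` to `out` unchanged.
 * The 4096-byte buffer size is an arbitrary choice for this sketch. */
void copy_lines(FILE *in, FILE *out) {
    char buffer[4096];
    /* fgets() returns NULL at end of file (or on a read error). */
    while (fgets(buffer, sizeof(buffer), in) != NULL) {
        fputs(buffer, out);  /* same effect as fprintf(out, "%s", buffer) */
    }
}
```

A caller would fopen() the file, pass the resulting FILE pointer and stdout to copy_lines(), and then fclose() the file.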

Details

  • Your program wcat can be invoked with one or more files on the command line; it should just print out each file in turn.
  • In all non-error cases, wcat should exit with status code 0, usually by returning a 0 from main() (or by calling exit(0)).
  • If no files are specified on the command line, wcat should just exit and return 0. Note that this is slightly different than the behavior of normal UNIX cat (if you'd like to, figure out the difference).
  • If the program tries to fopen() a file and fails, it should print the exact message "wcat: cannot open file" (followed by a newline) and exit with status code 1. If multiple files are specified on the command line, the files should be printed out in order until the end of the file list is reached or an error opening a file is reached (at which point the error message is printed and wcat exits).
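One possible way to organize the rules above is sketched here. The function name cat_files() and its signature are hypothetical; it returns the exit status rather than calling exit() so the logic is easier to test.

```c
#include <stdio.h>

/* Hypothetical helper: print each named file to `out` in order.
 * Returns the exit status the text describes: 0 on success, 1 if a file
 * cannot be opened (after printing the required message). */
int cat_files(int count, char *names[], FILE *out) {
    char buffer[4096];  /* arbitrary size chosen for this sketch */
    for (int i = 0; i < count; i++) {
        FILE *fp = fopen(names[i], "r");
        if (fp == NULL) {
            fprintf(out, "wcat: cannot open file\n");
            return 1;
        }
        while (fgets(buffer, sizeof(buffer), fp) != NULL) {
            fputs(buffer, out);
        }
        fclose(fp);
    }
    return 0;  /* also covers count == 0: nothing to print */
}
```

Under these assumptions, main() could simply return cat_files(argc - 1, argv + 1, stdout).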

wgrep

The second utility you will build is called wgrep, a variant of the UNIX tool grep. This tool looks through a file, line by line, trying to find a user-specified search term in the line. If a line contains the term, the line is printed out; otherwise it is not.

Here is how a user would look for the term foo in the file bar.txt:

prompt> ./wgrep foo bar.txt

this line has foo in it

so does this foolish line; do you see where?

even this line, which has barfood in it, will be printed.

Details

  • Your program wgrep is always passed a search term and zero or more files to grep through (thus, more than one is possible). It should go through each line and see if the search term is in it; if so, the line should be printed, and if not, the line should be skipped.
  • The matching is case sensitive. Thus, if searching for foo, lines with Foo will not match.
  • Lines can be arbitrarily long (that is, you may see many, many characters before you encounter a newline character, \n). wgrep should work as expected even with very long lines. For this, you might want to look into the getline() library call (instead of fgets()), or roll your own.
  • If wgrep is passed no command-line arguments, it should print "wgrep: searchterm [file ...]" (followed by a newline) and exit with status 1.
  • If wgrep encounters a file that it cannot open, it should print "wgrep: cannot open file" (followed by a newline) and exit with status 1.
  • In all other cases, wgrep should exit with return code 0.
  • If a search term, but no file, is specified, wgrep should work, but instead of reading from a file, wgrep should read from standard input. Doing so is easy, because the file stream stdin is already open; you can use fgets() (or similar routines) to read from it.
  • For simplicity, if passed the empty string as a search string, wgrep can either match NO lines or match ALL lines; both are acceptable.
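The matching rules above can be sketched with getline() and strstr(). This is only an illustration: grep_stream() is a hypothetical name, getline() is a POSIX routine rather than standard C, and the feature-test macro on the first line is one common way to expose it.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: write every line of `in` that contains `term` to `out`.
 * Returns the number of matching lines. */
int grep_stream(const char *term, FILE *in, FILE *out) {
    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int matches = 0;
    while (getline(&line, &cap, in) != -1) {
        if (strstr(line, term) != NULL) {  /* case-sensitive substring test */
            fputs(line, out);
            matches++;
        }
    }
    free(line);
    return matches;
}
```

Passing stdin as the input stream covers the case where no file is given. With an empty search term, strstr() matches every line, which is one of the two behaviors the text allows.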

wzip and wunzip

The next tools you will build come in a pair, because one (wzip) is a file compression tool, and the other (wunzip) is a file decompression tool.

The type of compression used here is a simple form of compression called run-length encoding (RLE). RLE is quite simple: when you encounter n characters of the same type in a row, the compression tool (wzip) will turn that into the number n and a single instance of the character.

Thus, if we had a file with the following contents:

aaaaaaaaaabbbb

the tool would turn it (logically) into:

10a4b

However, the exact format of the compressed file is quite important; here, you will write out a 4-byte integer in binary format followed by the single character in ASCII. Thus, a compressed file will consist of some number of 5-byte entries, each of which consists of a 4-byte integer (the run length) and the single character.

To write out an integer in binary format (not ASCII), you should use fwrite(). Read the man page for more details. For wzip, all output should be written to standard output (the stdout file stream, which, as with stdin, is already open when the program starts running).
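A single 5-byte entry could be written as in the sketch below. The helper name write_run() is hypothetical, and the code assumes sizeof(int) is 4 on the target machine and that the file is read back on a machine with the same byte order.

```c
#include <stdio.h>

/* Hypothetical helper: emit one 5-byte entry, a binary run length followed
 * by the character. Assumes sizeof(int) == 4, as the format requires.
 * Returns 0 on success, -1 if either write fails. */
int write_run(FILE *out, int count, char c) {
    if (fwrite(&count, sizeof(count), 1, out) != 1) {
        return -1;
    }
    if (fwrite(&c, sizeof(c), 1, out) != 1) {
        return -1;
    }
    return 0;
}
```

For the example file above, calling write_run(stdout, 10, 'a') and then write_run(stdout, 4, 'b') would produce a 10-byte compressed output.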

Note that typical usage of the wzip tool would thus use shell redirection in order to write the compressed output to a file. For example, to compress the file file.txt into a (hopefully smaller) file.z, you would type:

prompt> ./wzip file.txt > file.z

The "greater than" sign is a UNIX shell redirection; in this case, it ensures that the output from wzip is written to the file file.z (instead of being printed to the screen). You'll learn more about how this works a little later in the course.

The wunzip tool simply does the reverse of the wzip tool, taking in a compressed file and writing (to standard output again) the uncompressed results. For example, to see the contents of file.txt, you would type:

prompt> ./wunzip file.z

wunzip should read in the compressed file (likely using fread()) and print out the uncompressed output to standard output using printf().
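A minimal sketch of that decompression loop follows. The name unzip_stream() is hypothetical, and the code assumes the entries were written as 4-byte ints on the same machine; it uses fputc() on an explicit output stream, which behaves like printf("%c", c) when that stream is stdout.

```c
#include <stdio.h>

/* Hypothetical helper: expand every (count, character) entry read from `in`
 * onto `out`. Assumes 4-byte int entries written on the same machine. */
void unzip_stream(FILE *in, FILE *out) {
    int count;
    char c;
    /* Stop as soon as a full entry can no longer be read. */
    while (fread(&count, sizeof(count), 1, in) == 1 &&
           fread(&c, sizeof(c), 1, in) == 1) {
        for (int i = 0; i < count; i++) {
            fputc(c, out);
        }
    }
}
```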

Details

  • Correct invocation should pass one or more files via the command line to the program; if no files are specified, the program should exit with return code 1 and print "wzip: file1 [file2 ...]" (followed by a newline) or "wunzip: file1 [file2 ...]" (followed by a newline) for wzip and wunzip respectively.
  • The format of the compressed file must match the description above exactly (a 4-byte integer followed by a character for each run).
  • Do note that if multiple files are passed to wzip, they are compressed into a single compressed output, and when unzipped, will turn into a single uncompressed stream of text (thus, the information that multiple files were originally input into wzip is lost). The same thing holds for wunzip.
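Because several input files form one compressed stream, a run that ends one file can continue into the next. One way to handle this, sketched below with hypothetical names (struct rle_state, rle_feed(), rle_flush()), is to keep the pending run in a small state structure that outlives each file and to flush it only once, after the last file. The sketch assumes sizeof(int) is 4.

```c
#include <stdio.h>

/* Hypothetical run state shared across all input files. */
struct rle_state {
    int count;  /* length of the pending run; 0 means no run yet */
    int last;   /* character of the pending run */
};

/* Write the pending run, if any, as a 4-byte int plus one character.
 * Returns 0 on success, -1 if a write fails. */
int rle_flush(struct rle_state *st, FILE *out) {
    if (st->count == 0) {
        return 0;
    }
    char c = (char)st->last;
    int ok = fwrite(&st->count, sizeof(st->count), 1, out) == 1 &&
             fwrite(&c, sizeof(c), 1, out) == 1;
    st->count = 0;
    return ok ? 0 : -1;
}

/* Consume one input stream, extending the run left over from earlier input.
 * Returns 0 on success, -1 if a write fails. */
int rle_feed(struct rle_state *st, FILE *in, FILE *out) {
    int ch;
    while ((ch = fgetc(in)) != EOF) {
        if (st->count > 0 && ch == st->last) {
            st->count++;
        } else {
            if (rle_flush(st, out) != 0) {
                return -1;
            }
            st->last = ch;
            st->count = 1;
        }
    }
    return 0;
}
```

With this design, a file containing "aa" followed by a file containing "ab" yields two entries, (3, 'a') and (1, 'b'), rather than three.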

Wcat

  • Check the value of argc: if no command-line arguments were passed (i.e., argc == 1), simply exit.
  • If there are command-line arguments, iterate through each file name in argv[]:
      • Use fopen() to open each file name in read mode.
      • Use fgets() or getline() to read each line from the file.
      • Print each line to stdout.
      • Close the file before continuing to the next file name, if there is one.

Matthew Majewski

Wgrep

  • Check the value of argc:
      • If it is 1, exit with a status of 1.
      • If it is 2, treat the first argument as the search term, and search stdin instead of an opened FILE*.
      • If it is > 2, treat the first argument as the search term, and treat the rest of the arguments as files to search.
  • Iterate over each file as in wcat:
      • Read each line from the file using getline().
      • Check whether the line contains the search term using strstr(); if it does, print the line.
      • Close the file before continuing to the next, if there is one.

Wzip

  • Check the value of argc: if no command-line arguments were passed, exit with status 1.
  • Iterate over each file as in wcat and wgrep.
  • Read each character using fgetc(), keeping track of the previously read character and a count:
      • If the read character is equal to the last character, increment the count.
      • If the read character is not equal:
          • Write the integer count to the output using fwrite().
          • Write the character using fwrite(), following the count.
          • Set the last character to the current character.
          • Reset the count.

Wunzip

  • Essentially just does the inverse of wzip.
  • Check the value of argc: if no command-line arguments were passed, exit with status 1.
  • Iterate through the files as in the previous programs.
  • For each file, read an integer value using fread(), then read the character value using fread() as well.
  • Print the read character n times, where n is the integer value read from the file.
  • Continue until fread() returns 0.
  • Continue to the next file, if there is one.
