Question

Posted on Sep 29, 2024

This is in C.

The purpose of this programming assignment is to give you some experience with processes and input/output at the system-call level in UNIX. You will need to use at least the access, fork, execve, read, write, wait and exit system calls. Make certain you read and understand all of the material in the assignment. A significant part of the code for this assignment is already provided, and you may use it if you wish; details appear later in the assignment.

The program you write will be a simple command line interpreter, or shell. It will not have nearly as much functionality as the typical shells found on a UNIX system. Instead, it will deal only with running processes and interpreting a subset of the execution sequencing operators provided by a full-featured shell.

The central part of the program involves creating a new process; this is done with the fork system call. Then, in that new process only, the program named in a command is executed with the specified list of command line parameters; this is done with the execve system call. Of course, there's more to the program than this.

The commands your program can execute are simple-commands (using the terminology given in the formal documentation of shell syntax). Here's the definition of a simple-command:

A simple-command is a sequence of one or more words, each containing only printable non-blank characters, with the words separated from each other by blanks (a blank is a tab character '\t' or 0x09 or a space ' ' or 0x20). The first word in the sequence specifies the name of the command to be executed. All of the words are passed as arguments to the invoked command's main function. The command name is passed as argument 0. The value of a simple-command is its exit status if it terminates normally, or 128 + its exit status (non-zero) if it does not terminate normally.

A word is a string of printable ASCII characters; that is, each character in a word is in the range 0x21 to 0x7e ('!' to '~'). In particular, a word does not include a blank or tab or end-of-line character. More detail on the execution of a simple-command is given below.

The input to your program (which is read from the standard input, file descriptor 0) is a sequence of lines, each of which will be no longer than 100 characters and terminated by an end-of-line character, '\n'. Each line contains a list of zero or more simple-commands. Each pair of consecutive simple-commands in an input line is separated by one of these three sequencing operators:

';' (semicolon) Commands separated by a semicolon are executed sequentially, one at a time, with the left command executed first.

'&&' (logical and) The left command is executed first. The right command is executed only if the left command returns a value of 0.

'||' (logical or) The left command is executed first. The right command is executed only if the left command returns a non-zero value.

'&&' and '||' have equal precedence (when treated as operators), and they have higher precedence than ';'. Input examples using these sequencing operators are given later.

Process/Program Return Values

When we talk about a program or process returning a value, we are referring to the value returned by the main function (in a return statement), or the value specified as the argument to the exit system call. Consider the following main function from a simple program:

#include <stdlib.h>   /* atoi */

int main(int argc, char *argv[])
{
    if (argc == 2)
        return atoi(argv[1]);
    else
        return 1;
}

This program first checks to see if it was invoked with two command line arguments: specifically the name of the program (which is always the first command line argument) and another word. If not (that is, if argc is not 2), the process terminates and returns the value 1. Otherwise it uses the atoi (ASCII to integer) function to convert the second command line argument (a character string at argv[1]) to an integer, which is then returned by the program. The return statements in this program could have been replaced by invocations of the exit system call, and the effect would have been identical (the stdlib.h header file provides the declaration of exit that the C compiler needs):

#include <stdlib.h>   /* exit, atoi */

int main(int argc, char *argv[])
{
    if (argc == 2)
        exit(atoi(argv[1]));
    else
        exit(1);
}

The value returned from a process by the main function or by the exit system call is limited to a width of 8 bits by most UNIX systems, so the range of values that can be effectively returned is 0 to 255. If a larger value is specified, only the low-order 8 bits will be retained.

Example Command Lines

Let's consider some examples. Each of these can be executed on Loki (the name of the computer system we will use), and it is recommended that you experiment with the execution of these and additional variations to ensure your understanding of them. In these examples we'll use some simple programs.

true and false are standard UNIX programs that do nothing except return values of 0 and non-zero respectively. Note that a returned value of 0 from a process is used to indicate success; any non-zero value is used to indicate something else.

The echo program just displays the words provided as command-line arguments, so the command echo Hello will just display the word Hello. Note that the standard shells on UNIX systems will replace some command line arguments representing the value of shell variables. In particular, the command line argument $? is replaced by the value returned by the last command executed. Your program won't need to do that, but we'll use that feature in the examples so we can see what a command returns, if we wish.

echo one ; echo two ; echo three
This sequence will cause three lines to be displayed. The first line will contain one, the second line will contain two, and the third line will contain three.

echo one ; echo $?
This will display a line containing one, then display a line containing the exit status of the echo one command, which should be 0.

echo one || echo two
This will just display one. Since the value returned by exit in the echo command is zero, and the or operator (that is, ||) only indicates the execution of the right simple-command if the left simple-command failed (that is, returned a non-zero value), the echo two command is not executed.

echo one && echo two
This will display one, then two. The and (that is, &&) operator indicates that execution of the right simple-command should take place only if the left simple-command succeeds (that is, returns zero), which it does in this case.

false || echo two
Displays two.

false && echo two
Displays nothing.

echo one || echo two || echo three
Displays one.

false ; echo one || echo two
Displays one.

true && echo one ; false && echo two
Displays one.

echo one two three && echo four
Displays two lines, one with one two three, and the other with four.

Make certain you understand what the shell is doing for these cases before you begin to implement your solution to this assignment! If you are brave, or curious, or both, you can use the command man bash to view a reasonably terse description of the full-featured shell you likely are using on Loki.

Problem Statement

Specifically, you are to write a program that will repeatedly do the following things:

1. Read a command line from the standard input (file descriptor 0). Be certain to consider the information on reading the command line given in the Details section. At the end of file (that is, when the read system call returns 0), just quit (using the exit system call).

2. If the line just read contains only blanks and tab characters, ignore it (it contains no simple-command in this case), and return to step 1.

3. Identify the words on the command line. Recall that each word is just a sequence of printable non-blank (space or tab) characters. In our shell we assume at least one space or tab character will separate each word from the next. We also assume that whitespace is used to separate ';', '&&' and '||' from words, and that each input line is syntactically correct. You can choose to separate the words into simple-commands immediately, or do this as part of step 4 (as needed).

4. Attempt to execute the first simple-command in the command line; a description of the steps necessary to do this is given below. There are three possible results of this execution:

a. the simple-command was executed and returned a value of 0 (that is, an exit status of 0)

b. the simple-command was executed and returned a non-zero value (or exit status)

c. the simple-command could not be executed because the program named in the command could not be found.

In cases (a) and (b) your shell should decide (based on the value returned by the execution of the simple-command) which of the remaining simple-commands in the command line is to be executed. Of course, if no additional simple-command exists, then your shell should return to step 1. If there are additional simple-commands that should be executed, then this step (that is, step 4) is repeated for the next one.

In case (c), which is described in more detail later, your shell should just display an appropriate message (for example, "The program could not be found.") and return to step 1.

Details

To read a command line you must use the read system call. Do not use any other input mechanism (such as a function provided by the C library). And remember: don't use any language other than C! To avoid extra complexity, assume you are reading commands from a disk file, and read a single byte at a time. That is, use code that looks something like this to read every input character:

char c;
int status;

status = read(0, &c, 1);
if (status == 0) {
    /* do end of file action */
}

As you can see, the file descriptor to use is 0, which corresponds to the standard input. When read returns 0 you have encountered the end of file, and your shell should terminate immediately. When the single character read is an end-of-line character ('\n'), you have encountered the end of a line. You should echo (that is, write to the standard output, file descriptor 1) the input you read; when you're reading from the keyboard you'll get an extra copy of each input line (since the standard input mechanism echoes each character as it is typed), but when reading from a disk file, you'll get a display of the command line you're about to process. You may assume that no input line contains more than 100 characters (including the end-of-line character).

Note: The code provided for your use will determine if the input is from the keyboard or not, and will not echo the input if it does come from the keyboard. It will also display a prompt character to indicate when a new line is to be entered.

A simple parsing operation is necessary to separate the input command line into words, and to determine if there are any words present. In this simple shell, we require each word to be separated from other words (or sequencing operators) by one or more blanks or tab characters (i.e. whitespace). You may assume there will be no more than 16 words in any command line, including the sequencing operators. You may use standard C library functions like strtok to assist in this step. You may also assume no word will be longer than 64 characters, including the null ('\0') terminating character.

The name of the command to be executed (that is, the first word in each simple-command) can be specified either as a complete path to the command's executable file (e.g. /bin/false), as a path relative to the current working directory (e.g. ./myprog), or as just a filename (e.g. true). If this word includes a '/', then it is either a relative or absolute path to the file that is expected to contain the executable code and data for the command. That path can be used literally without change to identify the executable file.

If the name does not include '/', then it must be the name of a file that appears in one of the directories specified in the value of the PATH environment variable, which is a colon-separated list of the names of directories in which the shell is to search for an executable file. You can see what the PATH environment variable contains by typing the command echo $PATH.

Suppose I enter the command line doit to it. Since doit does not contain a '/', the shell will first search the /usr/local/sbin directory for a file named doit, then it will search /usr/local/bin. If a file doit is not found in any of those directories, then an error should be reported (something like "The program could not be found"), and the shell should repeat from step 1.

Parsing the value of the PATH environment variable is easy using the strtok function. However, you must keep in mind that the string being parsed by strtok is modified during the parsing operation. This will not cause any difficulties during the processing of the first input command, but when you want to locate the directory containing the file named on the second command line you'll want to use the unmodified value of the PATH environment variable. So it is appropriate to make a copy of the PATH environment variable's value before using strtok on it.

We need to be able to determine the value of the PATH environment variable, and the getenv function is provided for that purpose. (The on-line description of this function can be found using the command man getenv.) This function expects a character string argument that identifies the environment variable of interest ("PATH" in our case), and it returns a pointer to a character string containing the value of that environment variable (if it exists), or a NULL pointer (if it does not exist). Since we're interested in the value of the PATH environment variable, an appropriate function invocation might be getenv("PATH"). Making a copy of the string returned by getenv is simple: use the strdup function. The strtok function can then be used on the copy to identify the individual directory names. (Remember to use the free function to release the dynamically-allocated storage used for the copy.)

You'll also find mention of a caveat about using the strtok function on this value.

To determine if a file exists and is executable you should use the access system call. This call has two arguments. The first is the path to a file, and the second is the desired access mode to the file. In our case, we specifically want to test for executability, so the second argument should be X_OK (which is defined by including the header file unistd.h). If access returns 0, then the file exists and is executable (at least as far as the file's permissions indicate). Otherwise, the file does not exist, or is not executable. (Notice that the first argument to access must be the path to the file, not just the file name. As a result, you'll need to concatenate a directory name from the PATH environment variable, a slash, and a file name (that is, the first word in the command) to yield a suitable string for the first argument to access.) For example, with the PATH environment variable's value mentioned above, and the command named frick, you'd first want to effectively do this:

status = access("/usr/local/sbin/frick", X_OK);

where status is an integer variable. If its value is 0 on return from the access system call, then you know that you should use /usr/local/sbin/frick as the pathname of the file to be executed for the command.

Of course, this must be performed for each command to be executed. If a command is not to be executed, then it makes no difference if the program named in the command exists or not.

Executing a command is not too difficult. Use the fork system call to create a process, being sure to save the value it returns (since on success it will give the process ID of the newly-created child process). Always check to see if the returned value from fork is -1. If it is, then some error occurred (like you got stuck in a loop calling fork!), and you should abandon further execution of the shell (by printing an appropriate error message and invoking the exit system call with a non-zero argument). Note that fork returns twice(!): once in the parent process (that is, the shell process that invoked fork) and once in the child process (that was created by fork).

In the child process, prepare an array containing pointers to the strings containing the words in the command, followed by a null pointer. Then use the execve system call to execute the program identified by the first word in the command and pass its main function the appropriate command line arguments.

In the parent process (that is, the shell), you should execute the wait system call to delay execution until the child process terminates (see Waiting for the child process to terminate, below). Then return to step 1.

The execve system call is used to completely replace the code and data (the memory contents) of the process that executes it (in our case, the child process created by the shell) with the code and data for a specified program (from a disk file), specified as the first argument to execve. If a relative or absolute path was explicitly provided in a command, then that is the path to the file to be executed. Otherwise, the path will be one of the entries from the PATH environment variable, a slash, and the name of the command to be executed (that is, the first word of the command).

The second argument to execve is a pointer to the array of argument pointers (that is, character strings). The first argument pointer (with subscript 0) should point to the word containing the name of the program being executed. The remaining pointers should point to the command line arguments. The final pointer must be a NULL pointer (which, by convention, is zero).

The third and final argument to execve is a pointer to an array of strings representing the environment for the program about to be executed. This should be the value of the external variable environ (which you should declare as extern char **environ; outside any function in your program). Here's a simple program (available, with comments, on Loki in /home/stanw/csci4500/echo_me.c) that illustrates these concepts. It execves the program in the file named /bin/echo to display "Hello, world!".

#include <stdlib.h>   /* exit */
#include <unistd.h>   /* execve, write */

extern char **environ;

int main(int argc, char *argv[])
{
    char *args[4];

    args[0] = "/bin/echo";
    args[1] = "Hello, ";
    args[2] = "world!";
    args[3] = 0;

    execve(args[0], args, environ);
    write(1, "execve failed.\n", 15);   /* reached only if execve fails */
    exit(1);
}

Waiting for the child process to terminate is done in the parent process (the shell) with the wait system call. The single argument to wait is a pointer to an integer; we'll call that integer status. status will receive two pieces of information. In the low-order byte (that is, the low-order 8 bits) of status will be stored an indication of the reason for the termination of the child process. If the termination was normal (i.e. the process terminated by returning or executing the exit system call), this byte will contain 0. If the process terminated abnormally (e.g. it divided by zero or tried to access memory that didn't belong to it, or suffered some other traumatic experience), then this byte will be non-zero; we need not be concerned with the particular values at this time. In the event of a normal termination, the next higher byte (bits 15-8 of status) will contain the low-order eight bits of the value returned by the process, either with the return statement or the exit system call. The return value from wait is normally the process ID of the child process that terminated. If there was no child process then wait will return -1 (this you should not get!).

Here's an example that shows how to use the wait system call. You should try changing the actions taken by the child process to illustrate different exit codes and termination status values.

#include <stdio.h>      /* sprintf */
#include <stdlib.h>     /* exit */
#include <string.h>     /* strlen */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* wait */
#include <unistd.h>     /* fork, write */

extern char **environ;

int main(int argc, char *argv[])
{
    pid_t pid;
    int status, which;
    char msg[100];

    pid = fork();
    if (pid == -1) {
        write(1, "fork failed.\n", 13);
        exit(1);
    }

    if (pid == 0) {             /* child process */
        int x, y, z;
        y = 12;
        z = 0;
        x = y / z;              /* divide by zero! */
        exit(1);
    }

    if (pid != 0) {             /* parent */
        which = wait(&status);
        if (which == -1) {
            write(1, "wait failed.\n", 13);
            exit(1);
        }

        if (status & 0xff) {    /* abnormal termination */
            sprintf(msg, "process %d terminated abnormally for reason %d\n",
                    which, status & 0xff);
        } else {                /* normal termination */
            sprintf(msg, "process %d terminated normally with status %d\n",
                    which, (status >> 8) & 0xff);
        }
        write(1, msg, strlen(msg));
        exit(0);
    }
}

Notes and Restrictions

You may use sprintf (as illustrated in the last example) to prepare messages to be displayed, but no standard C library functions that result in input or output may be used in your solution (for example, do not use printf or scanf). Instead, all input and output must be accomplished using the read and write system calls. You may also use the string functions for things like parsing the command line and computing the length of strings (for the write system call). Likewise, use no other standard C library functions related to creating processes.

Requirements

You must write (and test) a C program that functions as a simple shell as just described.