Question

Implement a simple file examination program. Note that you must use the specified C library calls to implement the commands.
Your program, given a file name on the command line, will examine the contents of the file to see what type of file it really is. As your textbook authors stated, the convention that MS Word filenames end in .doc or .docx, or that PDF filenames end in .pdf, is just that -- a convention. People regularly mistype file names and wind up with a file named file.doc that is actually a PDF file, or a file named file.html that is actually an MS Word file.
Your program will be able to identify files of the specified types listed below. For any file your program finds unidentifiable, just print "Unknown file type".
To figure out what type of file a file is, your program must look at the characters at the start of the file. Files of some file types have predictable characters or strings at or near the start of files of that type, which can be used to identify the file type.
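As a starting point, here is a minimal sketch of pulling the first bytes of a file into a buffer using only open(), read(), and close(). The function name read_header and its parameters are hypothetical and not part of the assignment. It assumes a POSIX system and treats any open or read failure as an error.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy up to `cap` bytes from the start of the file
 * at `path` into `buf`. Returns the number of bytes actually read, or -1
 * if the file could not be opened or read. */
ssize_t read_header(const char *path, unsigned char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    size_t total = 0;
    while (total < cap) {
        /* read() may return fewer bytes than requested, so keep going. */
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0) {
            break; /* end of file reached before the buffer filled up */
        }
        total += (size_t)n;
    }

    close(fd);
    return (ssize_t)total;
}
```

Opening the file read-only means the file under examination cannot be modified by this step. A 256-byte buffer would cover every signature described below, though that size is a choice made here rather than a requirement.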
Make certain to test your program only on COPIES of files you want to keep.
The 'man' Unix shell command is your friend!
GIF:
GIF image files start with either "GIF89a" or "GIF87a"
(For more information see this Wikipedia article on GIF and other files starting with special codes)
EPS:
EPS files start with "%!PS-Adobe-2.0 EPSF-2.0".
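Because nothing from string.h is allowed, the GIF and EPS signatures above have to be compared byte by byte. The following is one possible sketch. The names has_prefix, is_gif, and is_eps are hypothetical, and buf is assumed to already hold the first len bytes of the file.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the first bytes of `buf` match the
 * NUL-terminated string `magic`, 0 otherwise. Written by hand because
 * strncmp() and memcmp() live in string.h, which is off limits. */
int has_prefix(const unsigned char *buf, size_t len, const char *magic)
{
    size_t i = 0;
    while (magic[i] != '\0') {
        if (i >= len || buf[i] != (unsigned char)magic[i]) {
            return 0;
        }
        i++;
    }
    return 1;
}

/* GIF files start with either "GIF89a" or "GIF87a". */
int is_gif(const unsigned char *buf, size_t len)
{
    return has_prefix(buf, len, "GIF89a") || has_prefix(buf, len, "GIF87a");
}

/* EPS files start with "%!PS-Adobe-2.0 EPSF-2.0". */
int is_eps(const unsigned char *buf, size_t len)
{
    return has_prefix(buf, len, "%!PS-Adobe-2.0 EPSF-2.0");
}
```

Passing the buffer length separately keeps the comparison from reading past the end of a file that is shorter than the signature.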
ODT (OpenDocument Text):
ODT files start with the two characters "PK" and, some number of characters (less than 200) further along in the file, contain the string "mimetypeapplication".
Hint: Google Docs can save documents into ODT file format.
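The ODT rule needs a small substring search in addition to the "PK" check. The sketch below uses the hypothetical names contains_within and is_odt. It interprets "less than 200 characters further along" as "the whole string lies within the first 200 bytes of the file", which is an assumption about the assignment's wording.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string `needle`
 * occurs entirely within the first `limit` bytes of `buf` (or within
 * `len` bytes, if the buffer is shorter than `limit`). */
int contains_within(const unsigned char *buf, size_t len,
                    const char *needle, size_t limit)
{
    size_t end = len < limit ? len : limit;

    for (size_t start = 0; start < end; start++) {
        size_t i = 0;
        while (needle[i] != '\0' && start + i < end &&
               buf[start + i] == (unsigned char)needle[i]) {
            i++;
        }
        if (needle[i] == '\0') {
            return 1; /* every character of the needle matched */
        }
    }
    return 0;
}

/* ODT: starts with "PK" and contains "mimetypeapplication" early on.
 * The 200-byte window is an interpretation of the assignment text. */
int is_odt(const unsigned char *buf, size_t len)
{
    if (len < 2 || buf[0] != 'P' || buf[1] != 'K') {
        return 0;
    }
    return contains_within(buf, len, "mimetypeapplication", 200);
}
```

The search is a plain nested loop, which is fast enough for a 200-byte window. Checking "PK" first matters because other ZIP-based formats also begin with those two characters.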
TIFF:
TIFF image files start with "MM" or "II". TIFF files that use big-endian storage start with MM; TIFF files that use little-endian storage start with II. Lab 4 only requires you to correctly identify the files that start with MM.
After the MM beginning, there will be a 42 in the file, stored as a two-byte whole number. However, to detect the 42, you will need to read two bytes into one whole number variable of the appropriate size. (Here is a tutorial on C data types, containing information about their sizes.)
Hint: Remember C allows you to "cast" memory locations into any data type convenient to you. Also remember that an array variable simply stores the location in memory of the data stored within the array. doublearray[0] and doublearray[1] are memory locations 8 bytes apart in memory, because doubles are 8 bytes in size. characterarray[0] and characterarray[1] are 1 byte apart in memory, because chars are 1 byte in size.
Yes, code that correctly looks for and detects the 'MM' AND 42 in a TIFF file will be worth more points than code that simply looks for 'MM'. Partial credit will be given for simply detecting 'MM'.
System calls you may use:
The system calls described in the Files and Directories chapter of 'Three Easy Pieces'. open(), close(), read(), lseek(), etc.
getchar(), isprint()
If you have another system call you'd like to have added to this list, contact me.
Nothing in the string.h library is on this list.
Example of a Lab 4 program in action:
prompt>./lab4program reallyaodtfile.pdf
reallyaodtfile.pdf is a ODT document.
prompt>./lab4program reallyaepsfile.txt
reallyaepsfile.txt is a EPS image file.
prompt>./lab4program reallyatifffile.docx
reallyatifffile.docx is a TIFF image file.
