Question

This program prepares a simple index on a text file. It reads a text file containing data and builds the index, treating the first n columns as the key, where n is specified by the user. Writing this program requires knowledge of binary I/O in the language of your choice. Allowable languages are Java and C/C++. You will be provided two data sets for testing, the same ones the grader will use. The first (CS6360Asg5TestDataA.txt) should be indexed on the first 15 bytes. The second (CS6360Asg5TestDataB.txt) should be indexed on the first 4 bytes.

This program must run from the command line and takes four parameters:

1. One of two strings: the string c indicates that you should create an index; the string l (lower-case L) indicates that you should list the file using the index.

2. The name of an input text file.

3. The name of the output file.

4. The number of characters, starting with the first one, that comprise the key.

For example, if your program is called INDEX, the following command line creates an index on a file called DATA.TXT and produces an output file DATA.IDX, with a key length of 4:

INDEX -c DATA.TXT DATA.IDX 4
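The four parameters could be validated along the following lines. This is only a sketch: the class name IndexArgs and its field names are hypothetical, and it assumes the mode string may be given either bare (c, l) or with a leading dash (-c, -l), since the parameter description and the example command differ on that point.

```java
// Hypothetical holder for the four command-line parameters (names are illustrative).
final class IndexArgs {
    final boolean create;     // true for "create an index", false for "list"
    final String firstFile;   // second parameter, exactly as typed
    final String secondFile;  // third parameter, exactly as typed
    final int keyLength;      // number of leading bytes that form the key

    private IndexArgs(boolean create, String firstFile, String secondFile, int keyLength) {
        this.create = create;
        this.firstFile = firstFile;
        this.secondFile = secondFile;
        this.keyLength = keyLength;
    }

    static IndexArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException("usage: INDEX <c|l> <file> <file> <keyLength>");
        }
        // Assumption: accept both "c" and "-c" (likewise "l" and "-l").
        String mode = args[0].startsWith("-") ? args[0].substring(1) : args[0];
        boolean create;
        if (mode.equals("c")) {
            create = true;
        } else if (mode.equals("l")) {
            create = false;
        } else {
            throw new IllegalArgumentException("mode must be c or l, got: " + args[0]);
        }
        int keyLength;
        try {
            keyLength = Integer.parseInt(args[3]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("key length must be an integer: " + args[3]);
        }
        if (keyLength <= 0) {
            throw new IllegalArgumentException("key length must be positive: " + args[3]);
        }
        return new IndexArgs(create, args[1], args[2], keyLength);
    }

    public static void main(String[] args) {
        IndexArgs a = parse(args);
        System.out.println((a.create ? "create" : "list") + " " + a.firstFile
                + " " + a.secondFile + " key=" + a.keyLength);
    }
}
```

Keeping the two file names as neutral fields leaves it to the caller to decide which one is the data file and which one is the index in each mode.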

The output file must contain, for each record, the key and the 8-byte binary offset of the record within the data file. Records in your output file are therefore fixed-length: the number of bytes in the key plus 8 bytes of pointer, or offset. In the example, each index record would be 12 bytes long. The output file must be sorted by key. While a real system would use an external sort, you may create an array of objects and sort that in memory. Thus if your input file is this:

AAAATest data 1
ZZZZAnother test record
CCCCThird test record

Your output file would look like this, with the brackets meaning that what is in them is a binary number, not decimal digits:

AAAA[0]
CCCC[39]
ZZZZ[15]

Thus the first record is at position 0 in the input file; the record with key CCCC, third in the input, is at position 39; and the record with key ZZZZ, second in the input, is at position 15. Each index record is fixed length: with a four-byte key and 8 bytes of offset, records in this example are 12 bytes. There is no delimiter; this is a binary file, not text.
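The index-building step described above might be sketched as follows. The class and method names (IndexBuilder, scan, build) are hypothetical. The sketch assumes the whole data file fits in memory, that keys compare as unsigned bytes, that blank lines are skipped, that lines shorter than the key are padded with spaces, and that the 8-byte offset is stored big-endian (Java's default byte order); the assignment text does not specify any of these.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IndexBuilder {
    /** One index entry: the key bytes plus the record's starting byte offset. */
    static final class Entry {
        final byte[] key;
        final long offset;
        Entry(byte[] key, long offset) { this.key = key; this.offset = offset; }
    }

    /** Walks the raw bytes and records key and start offset of every non-empty line. */
    static List<Entry> scan(byte[] data, int keyLength) {
        List<Entry> entries = new ArrayList<>();
        int start = 0;
        while (start < data.length) {
            int end = start;
            while (end < data.length && data[end] != '\n') end++;
            int lineLen = end - start;
            if (lineLen > 0 && data[end - 1] == '\r') lineLen--; // tolerate CRLF
            if (lineLen > 0) {                                   // assumption: skip blank lines
                byte[] key = new byte[keyLength];
                Arrays.fill(key, (byte) ' ');                    // assumption: pad short lines
                System.arraycopy(data, start, key, 0, Math.min(keyLength, lineLen));
                entries.add(new Entry(key, start));
            }
            start = end + 1;                                     // step past the newline
        }
        return entries;
    }

    /** Compares two keys byte by byte, treating bytes as unsigned. */
    static int compareKeys(byte[] a, byte[] b) {
        for (int i = 0; i < a.length && i < b.length; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    /** Returns the index file contents: sorted records of key + 8-byte offset, no delimiter. */
    static byte[] build(byte[] data, int keyLength) {
        List<Entry> entries = scan(data, keyLength);
        entries.sort((x, y) -> compareKeys(x.key, y.key));       // in-memory sort
        ByteBuffer out = ByteBuffer.allocate(entries.size() * (keyLength + 8));
        for (Entry e : entries) {
            out.put(e.key);
            out.putLong(e.offset);                               // big-endian by default
        }
        return out.array();
    }

    public static void main(String[] args) {
        // A real run would read the data file with Files.readAllBytes and write
        // the result with Files.write; the sample is inlined here for illustration.
        byte[] sample = "AAAATest data 1\nZZZZAnother test record\nCCCCThird test record\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] index = build(sample, 4);
        System.out.println(index.length + " bytes in index");
    }
}
```

With one-byte line terminators, as in the inlined sample, this sketch computes offsets 0, 40, and 16 for AAAA, CCCC, and ZZZZ. The exact figures depend on how line endings are stored in the data file, so they may differ from the illustrative numbers above.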

The second way to run this program, to list the contents of a file using the index, takes the following parameters:

1. The string l (lower-case L) for list.

2. The name of an index file.

3. The name of the associated data file.

4. The length of the key.

This usage must list the records in order using the data from the index. Thus you will read a record from the index file, use its pointer to read a record from the data file, and display it. Since the data records are variable length, you should read in 1000 bytes, knowing where the record starts, and look for a newline character. Then display the record.

This command lists all of the records in alphabetical order:

INDEX -l DATA.TXT DATA.IDX 4

Given the data above, you would see:

AAAATest data 1
CCCCThird test record
ZZZZAnother test record
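The listing pass could look roughly like this. The names IndexLister, list, and readUpTo are hypothetical. The sketch assumes the index stores each offset as an 8-byte big-endian value directly after the key, and it follows the suggestion above of reading up to 1000 bytes from the record's start and cutting at the first newline.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class IndexLister {
    static final int MAX_RECORD = 1000; // read window suggested by the assignment

    /** Returns the data records in index order. */
    static List<String> list(String indexPath, String dataPath, int keyLength) throws IOException {
        List<String> records = new ArrayList<>();
        try (RandomAccessFile index = new RandomAccessFile(indexPath, "r");
             RandomAccessFile data = new RandomAccessFile(dataPath, "r")) {
            long entrySize = keyLength + 8L;
            long count = index.length() / entrySize;
            byte[] key = new byte[keyLength];
            byte[] buf = new byte[MAX_RECORD];
            for (long i = 0; i < count; i++) {
                index.readFully(key);            // key is read only to advance past it
                long offset = index.readLong();  // 8-byte big-endian offset
                data.seek(offset);
                int n = readUpTo(data, buf);
                int end = 0;
                while (end < n && buf[end] != '\n') end++;       // find end of record
                if (end > 0 && buf[end - 1] == '\r') end--;      // tolerate CRLF
                records.add(new String(buf, 0, end, StandardCharsets.US_ASCII));
            }
        }
        return records;
    }

    /** Fills buf as far as the file allows and returns the number of bytes read. */
    private static int readUpTo(RandomAccessFile f, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = f.read(buf, total, buf.length - total);
            if (r < 0) break;                    // end of file
            total += r;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical argument order for this sketch only: index file, data file, key length.
        for (String record : list(args[0], args[1], Integer.parseInt(args[2]))) {
            System.out.println(record);
        }
    }
}
```

Because the index is already sorted, reading it sequentially and seeking into the data file for each entry yields the records in key order without sorting the data file itself.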

######################################################################

CS6360Asg5TestDataB.txt contents:

AAAAThis is the first record. Key length is 4
ZZZZThis is the last record, but second in the input file
CCCCThis should list second, but is third in the input
MMMMYet another record

CS6360Asg5TestDataA.txt contents (first 30 lines):

11627047200100A ANESTH, SALIVARY GLAND
98262265900102A ANESTH, REPAIR OF CLEFT LIP
14624323100103A ANESTH, BLEPHAROPLASTY
44136167900104A ANESTH, ELECTROSHOCK
99969473000120A ANESTH, EAR SURGERY
68095631900124A ANESTH, EAR EXAM
46951777700126A ANESTH, TYMPANOTOMY
09563777700140A ANESTH, PROCEDURES ON EYE
45526813100142A ANESTH, LENS SURGERY
06227766800144A ANESTH, CORNEAL TRANSPLANT
45052857200145A ANESTH, VITREORETINAL SURG
12955344600147A ANESTH, IRIDECTOMY
69534673500148A ANESTH, EYE EXAM
14209442900160A ANESTH, NOSE/SINUS SURGERY
44379623500162A ANESTH, NOSE/SINUS SURGERY
64541668700164A ANESTH, BIOPSY OF NOSE
24389647500170A ANESTH, PROCEDURE ON MOUTH
16394184600172A ANESTH, CLEFT PALATE REPAIR
81198194300174A ANESTH, PHARYNGEAL SURGERY
34119395900176A ANESTH, PHARYNGEAL SURGERY
96361983600190A ANESTH, FACE/SKULL BONE SURG
12625076800192A ANESTH, FACIAL BONE SURGERY
74690041800210A ANESTH, OPEN HEAD SURGERY
69126966700212A ANESTH, SKULL DRAINAGE
09682442300214A ANESTH, SKULL DRAINAGE
82583844600215A ANESTH, SKULL REPAIR/FRACT
23762794700216A ANESTH, HEAD VESSEL SURGERY
90855108500218A ANESTH, SPECIAL HEAD SURGERY
58329600300220A ANESTH, INTRCRN NERVE
99117127100222A ANESTH, HEAD NERVE SURGERY
