Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Hashing DATAIN.dat TATUNG CO.EL PR. LONG BEACH CA KAMERMAN LCIRRUS BEAVERTON, OR QUADRAM COLOACH AV NORCROSS GE AST RESEARALTON AV IRVINE CA EXPRESS SYREMING SCHAUMBURG

Hashing

image text in transcribed

image text in transcribed

image text in transcribed

DATAIN.dat

TATUNG CO.EL PR. LONG BEACH CA KAMERMAN LCIRRUS BEAVERTON, OR QUADRAM COLOACH AV NORCROSS GE AST RESEARALTON AV IRVINE CA EXPRESS SYREMING SCHAUMBURG IL DAC SW INCSPRING VAL DALLAS TX SUMMIT TECBABSON WELLESLEY, MA SIGMA DESIUNIVER A SAN JOSE CA HERCULES CNINTH ST BERKELEY CA MICROSTUF,H.W. PKWY ROSWELL GE MAXELL CO.OXFORD MOONACHIE NJ PRINCETON.EWING S PRINCETON NJ IOMEGA CORWESTA SOUTH ROY UTAH VEN-TAL INWALSH SANTA CLARA CA TAXAN CORPCITY OF INDUSTRY, CA ORCHID CORWESTINGHO FREMONT CA SSISOFTWARCENTER ST. OREM UTAH GRAPHIC CO5TH AVE. WALTHAM, MA VICTOR CORSCOTTS VALLEY, CALIF NCR COORPOBOWLING DR DAYTON OH MICROMART.CAMPUS D NORCROSS GE HEWLETT PABERNARD SAN DIEGO CA MICROPRO IBOX 57135 HAYWARD CA BORLAND I.SCOTTS V. DR S.V. CA LOGIQUEST.MONTRE QUEBEC CANADA PROSOFT COBELLAI HOLLEYWOOD CA SPECTRUM SWOLFE R SUNNYVALE CA HONEYWELL.BAKER COSTA MESA, CA EMERSON COSTAN ST SANTA ANA CA SOURCE TELPOBOX 1305 MCLEAN VA MICROGRAFXGREEN RICHARDSON, TX IBM CORPORBOCA RATON, FLORIDA INTEL COOR5 ST MOUNTAINVIEW CA MICRODESIGUNVIE WINTER PARK FL QUBIE CORPCALLE S CAMARILLO CA PARADISE STAYLOR S BRISBANE CA EVEREX SYSMILMONT FREMONT, CA INTERLUDE.RICHMOND HOUSTON, TX EPSON AMERBEDA STR TORRANCE CA CHANNELS IKI ST TORONTO CANADA BORTHER I.THENALA DR IRVINE CA MAYNARD ELSEMOR CASSELBERRY FL DIGITAL REGARDEN C MONTEREY CA CORE INTERFEDERA BOCA RATON FL ROSESOFT CUNIVE WAY SEATTLE WA BUSSIN TOL128 AVE BELLEVUE, WA TALLTREE SSNTONIO PALO ALTO CA GENOA SYSTTRIMBLE SAN JOSE, CA CURITS INCUNIO PETERBOROUGH NH OKIDATA COA MOUNTAIN LAUREL NJ COMPUADD CTECH BLVD. AUSTIN TX AMDEK COR.MAINE GROVE VALLY IL DYSAN CORPPAT H SANTA CLARA CA MICROWAY CTEMPOHOUSE LONDON UK TECMAR INTCOCHRAN RD. SOLON OH QUANTUM SWSTAFFO OTTAWA CANADA PROMETHEUSFREMONT S FREMONT CA FUNK SOFTW3RD ST CAMBRIDGE, MA

SEARCH.dat

KALLTREE S DAC SW INC COMPUADD C MICROWAY C TATUNG CO. DYSAN CORP DIGITAL RE QUANTBM SW PROMETHEUS EVEREX SYS MICROMART. PARADISE S SPECTRUM S MICRODESIG MAXELL CO. AMDEK COR, TECMAR INT IBM CORPOR OHCHID COR FUN SOFTW

In many applications, it is desirable to access data in a sequential manner. Examples of this can be found in payroll and billing processing. However, certain applications, like airline reservations, require immediate, random access to data. This assignment introduces the concept of hashing. Hashing refers to the process of generating a unique address or index for a data value or object. Typically, this unique address or index is used to determine the object's relative position within a data structure or disc file. Processing Requirements "DATAIN" is a text file which contains 59 records whose format is KEYDATA, where KEY is a character (string) field of 10 characters and DATA is a character (string) field of 20 characters (see example below). This file will be provided to you. Read "DATAIN" and hash each record using a hash function, later specified, into the memory data structure. Collision resolution handling is specified later. KE Y (10 bytes) D (20 bytes) VEN-TAL INWALSH SANTA CLARA CA TAXAN CORPCITY OF INDUSTRY, CA ORCHID CORWESTINGHO FREMONT CA SSISOFTWARCENTER ST. OREM UTAH The memory data structure, called a hash table, consists of a primary and overflow area. The primary are has 20 buckets, each bucket containing 3 hash slots, an overflow pointer and count fields. The overflow area contain an additional number of buckets dedicated to handle collisions. Bucket zero's overflow pointer field may be used to point to the next available overflow bucket in the event of a collision. KEY DATA Bucket n, Slot 1 Slot 2 Slot 3 KEY DATA KEY DATA Overflow Pointer 1 Count Hashing (contd) A collision occurs when two or more data keys hash to the same bucket. If a collision occurs and there is a vacant slot in the primary area bucket, the record is stored in the first available slot. If the primary area bucket is full and an overflow bucket has been allocated, store the record in the next available slot within overflow area bucket. If no overflow bucket has been allocated, allocate one (link the primary area bucket to its overflow bucket using the pointer field). After hashing all records contained within the file "DATAIN", write the memory data structure to a file called "HASHTABLE" and display the contents of all hash buckets in the report format shown below. Hash Table Verification Report Before | After Restoration Bucket 1 Slot 1: Slot 2: Overflow Pointer: xx Bucket n Slot 1: Slot 2: Overflow Pointer: xx The second phase of the assignment involves restoring the hash table from a disk file and searching for data keys found in file "SEARCH' which contains 20 records. This file will be provided to you. Sample Entries ("SEARCH" file) MICROWAY C TATUNG CO. DYSAN CORP DIGITAL RE First, restore the memory data structure using the file "HASHTABLE" and produce the restoration report (format shown above). Next, read records from the file "SEARCH" and hash the key. Simulate the retrieval of a matching record from the data structure by producing the report shown on the next page. Hashing (contd) Search and Retrieval Transactions Search Key DYSAN CORP DIGITAL RE FUNK SOFTW Bucket/slot 12/2 4/1 Record (show data image here) PAT H SANTA CLARA CA GARDEN C MONTEREY CA Record not found The final phase of the project consists of computing total and average chain (collision) lengths of the hash function. Compute the total length for each primary bucket and the average of all primary buckets. For computing the average, exclude buckets having zero collisions. Use a report format similar to the before/after restoration report to report total length for each primary bucket. Hash Functions Hash functions are a key to address transformation and their efficiency is usually dependent upon the data set being "hashed". However, the simplest hash algorithm is also among the most efficient. For this assignment, use the following: Intermediate Value = ord (key [2] ) +Ord (key [4] +Ord (key [6]) Hash index = (Intermediate Value) mod table size where table size is number of primary area buckets. Thus, this assignment consists of following major tasks: 1) Hashing the records found in DATAIN' (ftp://165.196.201.8/dixon/cisp430) to a memory data structure. 2) Writing the memory data structure to disc and later, restoring the structure. Displaying the contents of hash buckets before and after restoration. Search of the memory data structure for key values found in file 'SEARCH' (see ftp above). 5) Computing hash efficiency statistics. Submit the following items for grading: 1) A copy of the source file. 2) Copies of all reports produced by the program; there are 3 reports: the hash table contents before and after restoration from disc, and the search/retrieval report. 3) Any project documentation you wish to include. In many applications, it is desirable to access data in a sequential manner. Examples of this can be found in payroll and billing processing. However, certain applications, like airline reservations, require immediate, random access to data. This assignment introduces the concept of hashing. Hashing refers to the process of generating a unique address or index for a data value or object. Typically, this unique address or index is used to determine the object's relative position within a data structure or disc file. Processing Requirements "DATAIN" is a text file which contains 59 records whose format is KEYDATA, where KEY is a character (string) field of 10 characters and DATA is a character (string) field of 20 characters (see example below). This file will be provided to you. Read "DATAIN" and hash each record using a hash function, later specified, into the memory data structure. Collision resolution handling is specified later. KE Y (10 bytes) D (20 bytes) VEN-TAL INWALSH SANTA CLARA CA TAXAN CORPCITY OF INDUSTRY, CA ORCHID CORWESTINGHO FREMONT CA SSISOFTWARCENTER ST. OREM UTAH The memory data structure, called a hash table, consists of a primary and overflow area. The primary are has 20 buckets, each bucket containing 3 hash slots, an overflow pointer and count fields. The overflow area contain an additional number of buckets dedicated to handle collisions. Bucket zero's overflow pointer field may be used to point to the next available overflow bucket in the event of a collision. KEY DATA Bucket n, Slot 1 Slot 2 Slot 3 KEY DATA KEY DATA Overflow Pointer 1 Count Hashing (contd) A collision occurs when two or more data keys hash to the same bucket. If a collision occurs and there is a vacant slot in the primary area bucket, the record is stored in the first available slot. If the primary area bucket is full and an overflow bucket has been allocated, store the record in the next available slot within overflow area bucket. If no overflow bucket has been allocated, allocate one (link the primary area bucket to its overflow bucket using the pointer field). After hashing all records contained within the file "DATAIN", write the memory data structure to a file called "HASHTABLE" and display the contents of all hash buckets in the report format shown below. Hash Table Verification Report Before | After Restoration Bucket 1 Slot 1: Slot 2: Overflow Pointer: xx Bucket n Slot 1: Slot 2: Overflow Pointer: xx The second phase of the assignment involves restoring the hash table from a disk file and searching for data keys found in file "SEARCH' which contains 20 records. This file will be provided to you. Sample Entries ("SEARCH" file) MICROWAY C TATUNG CO. DYSAN CORP DIGITAL RE First, restore the memory data structure using the file "HASHTABLE" and produce the restoration report (format shown above). Next, read records from the file "SEARCH" and hash the key. Simulate the retrieval of a matching record from the data structure by producing the report shown on the next page. Hashing (contd) Search and Retrieval Transactions Search Key DYSAN CORP DIGITAL RE FUNK SOFTW Bucket/slot 12/2 4/1 Record (show data image here) PAT H SANTA CLARA CA GARDEN C MONTEREY CA Record not found The final phase of the project consists of computing total and average chain (collision) lengths of the hash function. Compute the total length for each primary bucket and the average of all primary buckets. For computing the average, exclude buckets having zero collisions. Use a report format similar to the before/after restoration report to report total length for each primary bucket. Hash Functions Hash functions are a key to address transformation and their efficiency is usually dependent upon the data set being "hashed". However, the simplest hash algorithm is also among the most efficient. For this assignment, use the following: Intermediate Value = ord (key [2] ) +Ord (key [4] +Ord (key [6]) Hash index = (Intermediate Value) mod table size where table size is number of primary area buckets. Thus, this assignment consists of following major tasks: 1) Hashing the records found in DATAIN' (ftp://165.196.201.8/dixon/cisp430) to a memory data structure. 2) Writing the memory data structure to disc and later, restoring the structure. Displaying the contents of hash buckets before and after restoration. Search of the memory data structure for key values found in file 'SEARCH' (see ftp above). 5) Computing hash efficiency statistics. Submit the following items for grading: 1) A copy of the source file. 2) Copies of all reports produced by the program; there are 3 reports: the hash table contents before and after restoration from disc, and the search/retrieval report. 3) Any project documentation you wish to include

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access with AI-Powered Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Students also viewed these Databases questions