Question


Posted on Sep 25, 2024

Implement a C program to sort a set of 4-byte floating-point values in ascending order using radix sort. The values are saved in a file.

Implement a C program to sort a set of 4-byte floating-point values in ascending order using radix sort. The values are saved in a file. After sorting, the original file should hold the sorted values. Your program must access the file using memory mapping. No calls to functions such as read(), write(), fread(), fwrite(), etc. are allowed.

1. Objectives

To further improve C programming skills
To learn how to read/write files through memory mapping
To learn how radix sort works

2. Detailed Requirements and Instructions

Name the program in the pattern SECTION#_NJITID#_2.c. SECTION# is the three-digit section number of the CS288 section you registered for (e.g., 001, 003; don't miss the leading 0s). NJITID# is the eight-digit NJIT ID (not your UCID; Rutgers students also have NJIT IDs). 2 means this is the second problem. So your file name is something like 001_00123456_2.c (DO NOT COPY THIS AS YOUR FILE NAME!). The grader may use a script to find, compile, and test your program. The script will not find your program if it has a different name.

Your radix sort implementation should be flexible enough to support sorting the values as binaries or hexadecimals. When the values are sorted as binaries, your program uses two lists, and in each round of the sort your program adds every value to one of the lists based on the bit being examined in this round. When the values are sorted as hexadecimals, your program uses 16 lists, and in each round of the sort your program adds every value to one of the lists based on the hexadecimal digit (i.e., 4 binary digits) being examined in this round.

Your program should take two arguments. The first is the number of bits: 1 for sorting the values as binaries and 4 for sorting the values as hexadecimals. Thus, it represents the number of bits in each digit used in the sort. The second argument is the pathname of the file containing the data to be sorted.

The number of floating-point values saved in the file can be calculated from the file size and the size of each floating-point value (i.e., 4 bytes). Thus, there is no need to specify the number of values. For example, to sort the floating-point values saved in ./file5k as hexadecimals, you can use the following command:

./your_program 4 ./file5k
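The memory-mapping and file-size steps could be sketched roughly as follows. This is only an illustration under stated assumptions, not the assignment's reference solution: the helper names map_floats and unmap_floats are hypothetical, and the sketch assumes a POSIX system where open(), fstat(), mmap(), msync(), and munmap() are available.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the whole file read/write and reports how many
 * 4-byte values it holds. Returns NULL on failure (or for an empty file). */
float *map_floats(const char *path, size_t *count)
{
    assert(sizeof(float) == 4);      /* the task assumes 4-byte floats */

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED so that stores to the mapping reach the file itself. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    /* Number of values = file size / size of one value. */
    *count = (size_t)st.st_size / sizeof(float);
    return (float *)p;
}

/* Hypothetical helper: flushes the changes and releases the mapping. */
int unmap_floats(float *data, size_t count)
{
    size_t len = count * sizeof(float);
    if (msync(data, len, MS_SYNC) < 0)
        return -1;
    return munmap(data, len);
}
```

With helpers like these, the array returned by map_floats can be sorted in place and the file holds the sorted values once unmap_floats returns, without any read() or write() calls.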

The implementation of radix sort needs to use bitwise operations. Consult online articles or a C programming book to learn how to use bitwise operations. Bitwise operations cannot be applied directly to floating-point values. You may need a union with two members, a float and an int, so that your program can access the same value through either type. Special attention is needed to handle the problem caused by sign bits. If sign bits are handled in the same way as other bits, then after the last round of the sort all the negative values will be placed after the positive values, and the negative values will be in descending order. Thus, you need to reverse the order of the negative values and put them before the positive values in the file.
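The union trick, the per-round digit extraction, and the sign-bit fix-up could look roughly like the sketch below. Several things here are assumptions rather than part of the assignment: the names fbits, digit_of, and radix_sort_floats are hypothetical, uint32_t is used in place of int for the integer view, and the "lists" are replaced by counting-based bucket offsets into a temporary array (a common variant, not necessarily what the course expects). NaN values are not treated specially.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The same 4 bytes, viewed as a float or as an unsigned 32-bit integer. */
union fbits {
    float f;
    uint32_t u;
};

/* Digit number `round` (0 = least significant) of v, each digit `bits` wide. */
static unsigned digit_of(float v, int bits, int round)
{
    union fbits x;
    x.f = v;
    return (x.u >> (round * bits)) & ((1u << bits) - 1u);
}

/* Hypothetical sketch: sorts a[0..n-1] in ascending order with bits = 1
 * (2 buckets, 32 rounds) or bits = 4 (16 buckets, 8 rounds).
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int radix_sort_floats(float *a, size_t n, int bits)
{
    assert(bits == 1 || bits == 4);
    if (n < 2)
        return 0;

    size_t nbuckets = (size_t)1 << bits;
    float *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    float *src = a, *dst = tmp;
    for (int round = 0; round < 32 / bits; round++) {
        size_t start[17] = {0};

        /* Count each digit, then turn the counts into bucket start offsets. */
        for (size_t i = 0; i < n; i++)
            start[digit_of(src[i], bits, round) + 1]++;
        for (size_t b = 1; b <= nbuckets; b++)
            start[b] += start[b - 1];

        /* Stable scatter: equal digits keep their relative order. */
        for (size_t i = 0; i < n; i++)
            dst[start[digit_of(src[i], bits, round)]++] = src[i];

        float *t = src; src = dst; dst = t;
    }
    /* 32 and 8 rounds are both even, so the data is back in `a` here:
     * non-negative values first (ascending), then negatives (descending). */

    size_t pos = 0;                  /* number of values with sign bit 0 */
    while (pos < n && digit_of(a[pos], 1, 31) == 0)
        pos++;
    size_t neg = n - pos;

    for (size_t i = 0; i < neg; i++) /* reversed negatives go to the front */
        tmp[i] = a[n - 1 - i];
    memcpy(tmp + neg, a, pos * sizeof *a);
    memcpy(a, tmp, n * sizeof *a);

    free(tmp);
    return 0;
}
```

Because the number of rounds is 32 divided by the digit width, the binary mode makes 32 passes over the data and the hexadecimal mode makes 8.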

You can compile gendata.c attached with this assignment and use it to generate random values and save them into a file. The program also reports the sum of the values. For example, to generate 5000 random values and save them into ./file5kvalues, you can use the following command:

./gendata 5000 ./file5kvalues

You can compile checkdata.c attached with this assignment and use it to check whether the floating-point values have been sorted in ascending order. The tool also calculates the sum of the values in the file. Thus, you can compare this sum with the sum reported by gendata. The two sums should be very close, differing only by minor numerical error caused by limited precision.

Optimize your implementation. For example, to copy a large number of values, you can use memcpy instead of copying the values one by one.

Grading:

1. Your program can finish within 1 minute without an error when it is run against a file containing 1 million floating-point values ---- 10 points

2. Your program can correctly sort 100 million floating-point values as binaries in a file within 1 minute, and the file can pass the test with the checkdata program (i.e., sorted; the sum of sorted values is close to the sum of unsorted values given by gendata when the file was created (<5% difference)) ---- 10 points

time ./your_program 1 ./file100millionvalues

3. Your program can correctly sort 100 million floating-point values as hexadecimals in a file within 1 minute, and the file can pass the test with the checkdata program (i.e., sorted; the sum of sorted values is close to the sum of unsorted values given by gendata when the file was created (<5% difference)) ---- 10 points

time ./your_program 4 ./file100millionvalues

4. Your program passes the tests in 2 and 3, and the time used by your program in test 2 is roughly 4x the time used in test 3 (must be within 3x~5x). ---- 10 points

5. Your source code accesses the file containing the values through memory mapping. ---- 10 points
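The kind of check described for checkdata (values in ascending order, and a sum within 5% of the sum reported at generation time) could be approximated as in the sketch below. The attached checkdata.c is not shown in this post, so this is not its actual code: the function names is_ascending, sum_values, and sums_close are hypothetical, and the relative-tolerance comparison is an assumption about how the "<5% difference" is measured.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[0..n-1] is in ascending (non-decreasing) order, else 0. */
int is_ascending(const float *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}

/* Sums the values in double precision to limit accumulated rounding error. */
double sum_values(const float *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Returns 1 if `after` differs from `before` by at most tol * |before|,
 * e.g. tol = 0.05 for a 5% tolerance (assumed interpretation). */
int sums_close(double before, double after, double tol)
{
    assert(tol >= 0.0);
    double diff = after - before;
    if (diff < 0.0)
        diff = -diff;
    double scale = before < 0.0 ? -before : before;
    return diff <= tol * scale;
}
```

A self-check like this can be run on the mapped array right after sorting, before handing the file to the real checkdata tool.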