
Question


Abstract

This is the second in a series of projects that will involve sorting a large amount of data. In this second phase, you will write a multi-process C program to sort a list of records of movies from IMDb alphabetically by the data in a given column. You will make use of the concepts learned in lecture, including file/directory access, forking processes, etc.

File formats for this part of the project are the same as in the first. The CSV file with movie metadata will remain the same. The sorting algorithm will also remain the same. If you properly modularized your code in Project 0, you should be able to reuse almost all of your code.

In this project, you will read in a directory name and walk through the directory structure to find all .csv files. There may be multiple levels of directories that you will need to recurse through. You will then fork child processes to sort each of the files and output the results to a different file. You should NOT use exec for this project. You should write one program that, when copied from the parent to the child process, will continue running. You can use the return value of fork() in a conditional to choose different blocks of code to run within your code.

You will want to make sure to prevent zombie and orphan child processes. You will also want to make sure you do not create fork bombs that will bring down machines. In all cases of bad input, you should fail gracefully (e.g. no segfaults).

You will output metadata about your processes to STDOUT. This metadata will show the total number of processes created and the pids of all created processes.

Methodology

a. Parameters

Your code will read in a set of parameters via the command line. Records will be stored in .csv files in the provided directory. As mentioned above, directory structures may have multiple levels, and you must find all .csv files. Your code should ignore non-.csv files and .csv files that do not have the correct format of the movie metadata csv (e.g. csv files that have other random data in them).

Remember, the first record (line) is the column headings and should not be sorted as data.

Your code must take in a command-line parameter to determine which value type (column) to sort on. If that parameter is not present, throw an error or fall back to a default behavior. The first argument to your program will be '-c' to indicate sorting by column, and the second will be the column name:

    ./sorter -c food

Be sure to check that the arguments are there and that they correspond to a listed value type (column heading) in the CSV.

For this phase you'll extend your flags from one to three. The second parameter to your program will be '-d', indicating the directory the program should search for .csv files. This parameter is optional. The default behavior will search the current directory.

    ./sorter -c food -d thisdir/thatdir

The third parameter to your program will be '-o', indicating the output directory for the sorted versions of the input file. This parameter is optional. The default behavior will be to output in the same directory as the source file.

    ./sorter -c movie_title -d thisdir -o thatdir

b. Operation

Your code will be reading in and traversing the entire directory. In order to run your code to test it, you will need to open each CSV and read it for processing:

    ./sorter -c movie_title -d thisdir -o thatdir
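The fork-without-exec requirement can be illustrated with a minimal sketch in C. It assumes the list of .csv paths has already been collected; `fork_sorters` and `sort_one_file` are hypothetical names, and `sort_one_file` is only a stand-in for the Project 0 sorting code, not part of the assignment. Each child handles one file and calls `_exit`, so it never falls back into the forking loop (which is one way to avoid an accidental fork bomb), and the parent waits on every child so no zombies remain.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the Project 0 sorting code. */
static int sort_one_file(const char *path, const char *column)
{
    (void)path;
    (void)column;
    return 0; /* pretend the sort succeeded */
}

/*
 * Fork one child per path, then reap every child.
 * Stores the child pids in pids[] (the caller provides room for n entries)
 * and returns the number of children actually created.
 */
int fork_sorters(const char *const *paths, int n,
                 const char *column, pid_t *pids)
{
    int created = 0;

    for (int i = 0; i < n; i++) {
        fflush(stdout); /* keep buffered output from being duplicated in the child */
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break; /* stop forking, but still reap the children that exist */
        }
        if (pid == 0) {
            /* Child: sort exactly one file, then exit without returning
               to the loop, so the child never forks again. */
            _exit(sort_one_file(paths[i], column) == 0 ? 0 : 1);
        }
        pids[created++] = pid; /* parent records the child's pid */
    }

    /* Parent: wait for each child so none is left behind as a zombie. */
    for (int i = 0; i < created; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) < 0)
            perror("waitpid");
    }
    return created;
}
```

After the call, the parent can print the contents of `pids[]` and the returned count to STDOUT to produce the process metadata the assignment asks for. A recursive directory walk that forks per subdirectory would need the children to report their own counts back, which this sketch does not attempt.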
