
Question


Phase 1 - Basic Functionality: Placing the files into a new directory structure

Create a directory called filesToSort in your $HOME directory and 'cd' into it. From there, run the command 'tar -xf /var/tmp/fts.tar'. This will create the files you need to process. There are around 200 of them, all with names starting 'ff_'. The files have similar data, in the same format throughout. Here is an example (please note that, throughout this document, examples are provided for illustration only; they do not necessarily reflect the actual data you will be processing):

    [~/filesToSort] $ cat ./ff_1683451198
    2015-07-21
    Tuesday
    Leeds
    3416293

Each data file (ff_xxxx.xx) should be placed in one of a set of subdirectories based on the date in the first line of the file. For example, the file above should be placed in the subdirectory '2015/07/21' underneath the current directory (in your case, filesToSort). This means it should be in a directory named '21' within a directory named '07' within a directory named '2015' within the current directory. Directories should only be created when needed, so that only the minimum number of directories needed to house the files is actually created.

Write a script called makeStructure that moves any files beginning with 'ff_' into the structure described (and creates that structure if required).

Phase 2 - Data Verification & Correction

Scenario

Unfortunately, a glitch has caused corruption to some of the files that have been placed in the new structure. The data corruption is not severe and only affects the date stamp on line 1 of the file. This means the date stamp can be regenerated from the file pathname.

Putting the damaged data in place

From the directory 'filesToSort' run the following commands:

    pwd                                 # To make sure you are in filesToSort
    rm -r 20*                           # Delete any unneeded files. Ignore any error messages
    tar -xf /var/tmp/damagedData.tar    # Get the files

This will put the damaged files in place and in the appropriate directory structure as discussed in Phase 1.

Phase 2A - Error Detection

Write a script called checkData that identifies the files with damaged date entries. The script should produce a list of filenames with the non-matching date stamps. For example:

    [~/filesToSort] $ checkData
    ./2014/06/14/ff_1147710362
    ./2004/05/27/ff_1525617511
    ./2004/01/16/ff_1658910840
    ./2005/01/18/ff_1664330311
    ./2005/02/16/ff_1467231391

This must not be done by simply checking to see if the date in the file is in a date format.

Phase 2B - Error Correction

Write a script called correctData that corrects the date entry using information in the file's pathname. Prove this has worked by running checkData and seeing no output.

Phase 3 - Reporting Population Statistics by Year

The last line of the data files is an estimated town population. Write a script called statPopnYear that calculates statistics for qualifying files. The script should accept a year number followed by a keyword that shows the information wanted. The year number should correspond to one of those available in the tree. The keyword should be one of average, minimum, maximum or all.

The script should terminate with an error message and error status if:

- The number of arguments supplied is not two
- The year number does not correspond to one available in the tree
- The keyword is not one of those listed

The output should be similar to the following examples (remember that the data might be different in the sample data you are provided, so do not use these figures to check your calculations are correct):

Phase 4 - Creating and Processing Indexed Data

Phase 4A - Creating an Index File

Index files are often used in file systems as a quicker, more efficient way of searching through large numbers of files for predetermined data. The third line of the data files contains a town name on which the population data is based. Write a script called createTownIndex that will create a file index called townFileIndex for the towns in the data. This file will contain a list of towns and the files that are associated with each town. It should output the data in the format town:filepath, and the data should be stored alphabetically by town name. It should look something like this:

    [~/filesToSort] $ cat townIndex
    Aberdeen:./2016/12/23/ff_2182327097
    Aldershot:./2017/08/23/ff_349812947
    Altrincham:./2006/01/23/ff_298983010

Note - Some towns may be repeated due to there being data for multiple dates. It is acceptable to have repeated town names in your index file.
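One possible way to approach Phases 1 and 2 is sketched below in POSIX sh. This is an illustration under stated assumptions, not a reference solution: it assumes line 1 of every data file is a date in YYYY-MM-DD form, that it is run from inside filesToSort, and that file names contain no newlines. The three scripts are written as shell functions so they can be read together, and the helpers pathDate and listDataFiles are hypothetical names introduced only for this sketch.

```shell
#!/bin/sh
# Sketch only: makeStructure, checkData and correctData as functions.
# pathDate and listDataFiles are hypothetical helpers, not part of the brief.

# Turn "./2015/07/21/ff_123" into "2015-07-21".
pathDate() (
    dir=${1%/*}
    printf '%s\n' "${dir#./}" | tr / -
)

# List data files that sit exactly at ./YYYY/MM/DD/ff_*.
listDataFiles() {
    find . -type f -path './[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/ff_*' | sort
}

# Phase 1: move ./ff_* into ./YYYY/MM/DD, creating directories only as needed.
makeStructure() (
    for f in ./ff_*; do
        [ -f "$f" ] || continue               # glob matched nothing, or not a file
        stamp=$(head -n 1 "$f")
        case $stamp in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
            *) echo "makeStructure: skipping $f (unexpected line 1)" >&2
               continue ;;
        esac
        dir=./$(printf '%s' "$stamp" | tr - /)
        mkdir -p "$dir" && mv "$f" "$dir/"
    done
)

# Phase 2A: print files whose line 1 differs from the date implied by the path.
checkData() (
    listDataFiles | while IFS= read -r f; do
        [ "$(head -n 1 "$f")" = "$(pathDate "$f")" ] || printf '%s\n' "$f"
    done
)

# Phase 2B: rewrite line 1 of each reported file from its path.
correctData() (
    checkData | while IFS= read -r f; do
        tmp=$(mktemp) || exit 1
        { pathDate "$f"; tail -n +2 "$f"; } > "$tmp"
        cat "$tmp" > "$f"                     # keeps the original file's permissions
        rm -f "$tmp"
    done
)
```

In the assignment itself each function body would presumably live in its own executable script of the same name. Note that checkData compares the stored date with the date derived from the pathname rather than testing whether line 1 merely looks like a date, which is what Phase 2A asks for. Running correctData and then checkData should print nothing.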
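For Phases 3 and 4A, a minimal sketch in POSIX sh might look like the following. Several things here are assumptions rather than requirements from the brief: the output labels ("average:", "minimum:", "maximum:"), the two-decimal average, the specific non-zero exit codes, and the choice to write the index to townFileIndex in the current directory. The sample output for Phase 3 is not available in the question, so the format shown is purely illustrative.

```shell
#!/bin/sh
# Sketch only: statPopnYear and createTownIndex as functions.
# Output labels and exit codes are assumptions made for illustration.

# Phase 3: statPopnYear YEAR average|minimum|maximum|all
statPopnYear() (
    if [ "$#" -ne 2 ]; then
        echo "usage: statPopnYear YEAR average|minimum|maximum|all" >&2
        exit 1
    fi
    year=$1
    keyword=$2
    case $year in
        ''|*[!0-9]*) echo "statPopnYear: invalid year: $year" >&2; exit 2 ;;
    esac
    if [ ! -d "./$year" ]; then
        echo "statPopnYear: no data for year $year" >&2
        exit 2
    fi
    case $keyword in
        average|minimum|maximum|all) ;;
        *) echo "statPopnYear: unknown keyword: $keyword" >&2; exit 3 ;;
    esac
    # Emit the last line (population) of every data file under the year.
    find "./$year" -type f -name 'ff_*' | while IFS= read -r f; do
        printf '%s\n' "$(tail -n 1 "$f")"
    done | awk -v want="$keyword" '
        /^[0-9]+$/ {
            v = $1 + 0; n++; sum += v
            if (n == 1 || v < min) min = v
            if (n == 1 || v > max) max = v
        }
        END {
            if (n == 0) exit 1
            if (want == "average" || want == "all") printf "average: %.2f\n", sum / n
            if (want == "minimum" || want == "all") printf "minimum: %d\n", min
            if (want == "maximum" || want == "all") printf "maximum: %d\n", max
        }'
)

# Phase 4A: write town:filepath lines, sorted by town, to townFileIndex.
createTownIndex() (
    find . -type f -name 'ff_*' | while IFS= read -r f; do
        town=$(sed -n '3p' "$f")              # line 3 holds the town name
        if [ -n "$town" ]; then
            printf '%s:%s\n' "$town" "$f"
        fi
    done | LC_ALL=C sort -t : -k 1,1 -k 2 > townFileIndex
)
```

For example, `statPopnYear 2015 all` would print three lines under these assumptions, while `statPopnYear 2015` or `statPopnYear 2015 median` would print a message on standard error and return a non-zero status. Sorting with LC_ALL=C keeps the ordering byte-wise and repeatable; a locale-aware sort may be preferable if town names contain non-ASCII characters.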
