
Question


Assignment

Let's use something from the "real world" to get some practice handling data. This assignment involves writing a bash script that queries air quality readings, using a dataset from the EPA (Environmental Protection Agency). The EPA has a pre-generated CSV (Comma Separated Values) file that contains annual air quality measurements for all 50 states. A CSV file contains a series of rows, with each column separated by a comma. Each row appears on a separate line.

Download this dataset into your virtual machine with

    wget https://aqs.epa.gov/aqsweb/airdata/annual_aqi_by_county_2020.zip

Then uncompress it with the unzip command.

Let's take a look at the format of this file. Use head to preview the first few lines:

    "State","County","Year","Days with AQI","Good Days","Moderate Days","Unhealthy for Sensitive Groups Days","Unhealthy Days","Very Unhealthy Days","Hazardous Days","Max AQI","90th Percentile AQI","Median AQI","Days CO","Days NO2","Days Ozone","Days SO2","Days PM2.5","Days PM10"
    "Alabama","Baldwin",2020,11,11,0,0,0,0,0,48,39,20,0,0,0,0,11,0
    "Alabama","Clay",2020,5,5,0,0,0,0,0,31,31,15,0,0,0,0,5,0
    "Alabama","Dekalb",2020,59,59,0,0,0,0,0,45,40,32,0,0,58,0,1,0
    "Alabama","Etowah",2020,8,8,0,0,0,0,0,40,40,28,0,0,0,0,8,0
    "Alabama","Jefferson",2020,32,26,6,0,0,0,0,63,54,35,1,5,9,0,15,2
    "Alabama","Mobile",2020,30,27,3,0,0,0,0,58,51,27,0,0,0,1,29,0
    "Alabama","Montgomery",2020,29,21,8,0,0,0,0,64,57,41,0,0,0,0,29,0
    "Alabama","Morgan",2020,31,31,0,0,0,0,0,37,28,21,0,0,0,0,31,0
    "Alabama","Russell",2020,31,28,3,0,0,0,0,57,42,25,0,0,0,0,31,0

The first line contains header information: one quoted string per column title, separated by commas. The first column identifies the state, the second identifies the county within that state, and subsequent columns contain air quality data for the county (we will only be looking at a small subset of the fields here).
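One possible way to check the column layout before writing the full script is to strip the quotes with tr and select fields with cut. The sketch below is an illustration only, not part of the assignment text: sample_csv is a hypothetical temporary file holding the header and two of the rows from the preview, and the code assumes the real file has no spaces after the commas and no commas inside quoted fields.

```shell
#!/usr/bin/env bash
# Sketch: extract state, county, and "Good Days" using only tail, tr, and cut.
# sample_csv is a hypothetical stand-in for annual_aqi_by_county_2020.csv.
sample_csv=$(mktemp)
cat > "$sample_csv" <<'EOF'
"State","County","Year","Days with AQI","Good Days","Moderate Days","Unhealthy for Sensitive Groups Days","Unhealthy Days","Very Unhealthy Days","Hazardous Days","Max AQI","90th Percentile AQI","Median AQI","Days CO","Days NO2","Days Ozone","Days SO2","Days PM2.5","Days PM10"
"Alabama","Baldwin",2020,11,11,0,0,0,0,0,48,39,20,0,0,0,0,11,0
"Alabama","Jefferson",2020,32,26,6,0,0,0,0,63,54,35,1,5,9,0,15,2
EOF

# tail -n +2 skips the header line, tr -d '"' removes the double quotes,
# and cut keeps comma-separated fields 1 (state), 2 (county), 5 (Good Days).
extracted=$(tail -n +2 "$sample_csv" | tr -d '"' | cut -d',' -f1,2,5)
echo "$extracted"

rm -f "$sample_csv"
```

Under the same assumptions, fields 5 through 10 hold the six category counts, so cut -d',' -f1,5-10 would select the state plus every category in one pass.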
Write a program that outputs the total number of days across a specific state for each air category (Good, Moderate, Unhealthy for Sensitive Groups, Unhealthy, Very Unhealthy, and finally... just plain Hazardous). The program should take two mandatory arguments:

- the filename
- the state name to find in the file

Example Output

The output of your program should look like this:

Example run #1:

    myprompt> ./aq.sh annual_aqi_by_county_2020.csv California
    Total California counties on file: 53
    Total amount of days in each category:
    Good 8437
    Moderate 2958
    Unhealthy for sensitive groups 432
    Very unhealthy 26
    Hazardous @

Example run #2 (state not in the file):

    myprompt> ./aq.sh annual_aqi_by_county_2020.csv Alberta
    Total Alberta counties in file: 0

Requirements

- Use only bash and standard UNIX utilities such as cut and tr (no awk or other languages allowed here... we'll get to those later!)
- Use a consistent style of indentation (one level for each code block is a good rule: if you go into a loop, indent one level).
- If the number of arguments specified is incorrect, output an error message with usage information and exit with status code 1.
- If the file cannot be opened, output an error message and exit with status code 1.
- Output a count of the total counties in the state requested. If there are no counties in the file, output a 0 count and exit with status code 1 (see example).

Hints

- To pass a state with a space in its name (like North Dakota), put the name in quotes from your shell: myscript.sh file.csv "North Dakota" (depending on your approach, this may not be necessary).
- The first step in writing the script should be to make sure you can extract the lines you want. You can use cut to separate the fields in the file, specifying the delimiter as the comma (,) character.
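The requirements above could be met in several ways. The following is a minimal sketch under stated assumptions, not the expert answer from this page: the function name aq_report is hypothetical, the file is assumed to have no commas inside quoted fields, and the county line uses the "on file" wording from example run #1 (example run #2 says "in file"). It uses only bash builtins plus tr, tail, and cut.

```shell
#!/usr/bin/env bash
# aq_report FILE STATE: sum the six AQI category columns for one state.
# Hypothetical helper; returns 1 on bad usage, unreadable file, or no match.
aq_report() {
    if [[ $# -ne 2 ]]; then
        echo "Usage: aq.sh <csv-file> <state-name>" >&2
        return 1
    fi
    local file=$1 state=$2
    if [[ ! -r "$file" ]]; then
        echo "Error: cannot open file '$file'" >&2
        return 1
    fi

    local counties=0 good=0 moderate=0 sensitive=0 unhealthy=0 very=0 hazardous=0
    local st g m s u v h

    # Drop quotes (and any carriage returns), skip the header line, and keep
    # field 1 (state) plus fields 5-10 (the six category day counts).
    while IFS=, read -r st g m s u v h; do
        [[ "$st" == "$state" ]] || continue
        counties=$((counties + 1))
        good=$((good + ${g:-0}))
        moderate=$((moderate + ${m:-0}))
        sensitive=$((sensitive + ${s:-0}))
        unhealthy=$((unhealthy + ${u:-0}))
        very=$((very + ${v:-0}))
        hazardous=$((hazardous + ${h:-0}))
    done < <(tr -d '"\r' < "$file" | tail -n +2 | cut -d',' -f1,5-10)

    echo "Total $state counties on file: $counties"
    if [[ $counties -eq 0 ]]; then
        return 1
    fi

    echo "Total amount of days in each category:"
    echo "Good $good"
    echo "Moderate $moderate"
    echo "Unhealthy for sensitive groups $sensitive"
    echo "Unhealthy $unhealthy"
    echo "Very unhealthy $very"
    echo "Hazardous $hazardous"
}
```

To run this as ./aq.sh, one option is to end the file with the two lines aq_report "$@" and exit $?, so that the function's return value becomes the script's exit status. Because the state is compared as a whole field rather than searched for as a substring, a quoted argument such as "North Dakota" would not also match other states whose names contain "Dakota".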
