Question
Write a MapReduce job that processes the global weather dataset and returns the records of the country "India". The output should contain 4 different files. Each file would contain weather data of one entire century.
For example, the 4 part files should contain the data in the following pattern:
- File1: Year 1700-1799 (all records of the 18th century will be stored in File1)
- File2: Year 1800-1899 (all records of the 19th century will be stored in File2)
- File3: Year 1900-1999 (all records of the 20th century will be stored in File3)
- File4: Year 2000-Present (all records of the 21st century will be stored in File4)
Input Dataset: hdfs:///bigdatapgp/common_folder/assignment3/weather/weather1.csv
Dataset Description:
| COLUMN NAME | DESCRIPTION |
| --- | --- |
| dt | Date |
| AverageTemperature | Average temperature of that city |
| AverageTemperatureUncertainity | Uncertainty in the average temperature |
| City | Name of the city |
| Country | Name of the country that the city belongs to |
| Latitude | Latitude of the city |
| Longitude | Longitude of the city |
Constraints:
- Skip the header row while reading the file
- Use the concept of a Partitioner
Expected Solution: Paste the MapReduce code, the Hadoop commands, and the path of the final jar used to produce this output.
Step by Step Solution
The solution involves three steps: the Mapper, the Partitioner, and the Reducer together with the Driver and the Hadoop commands.
Step 1: Mapper (skip the header, keep only records for "India", and key each record by its year)
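A minimal Mapper sketch follows. The class name WeatherMapper is an illustrative choice, and it assumes the dt column is in yyyy-MM-dd form and that fields contain no embedded commas; adjust the parsing if your copy of the dataset differs.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WeatherMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

    private final IntWritable yearKey = new IntWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();

        // Skip the header row ("dt,AverageTemperature,...").
        if (line.startsWith("dt,")) {
            return;
        }

        // Expected layout: dt, AverageTemperature, AverageTemperatureUncertainity,
        //                  City, Country, Latitude, Longitude
        String[] fields = line.split(",", -1);
        if (fields.length < 7) {
            return;
        }

        // Keep only records for India.
        if (!fields[4].trim().equalsIgnoreCase("India")) {
            return;
        }

        // dt is assumed to be yyyy-MM-dd, so the year is the first 4 characters.
        try {
            int year = Integer.parseInt(fields[0].trim().substring(0, 4));
            yearKey.set(year);
            context.write(yearKey, value);
        } catch (NumberFormatException | StringIndexOutOfBoundsException e) {
            // Ignore malformed rows.
        }
    }
}
```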
Step 2: Partitioner (route each year to one of four reduce partitions, one per century)
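A sketch of the Partitioner, assuming the Mapper emits the year as an IntWritable key as above (the class name CenturyPartitioner is illustrative):

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes each record to one of four reduce partitions based on the century of
// its year. The job must be configured with exactly 4 reduce tasks so that
// partitions 0-3 become part-r-00000 .. part-r-00003.
public class CenturyPartitioner extends Partitioner<IntWritable, Text> {

    @Override
    public int getPartition(IntWritable year, Text record, int numPartitions) {
        int y = year.get();
        if (y <= 1799) {
            return 0; // File1: 1700-1799
        } else if (y <= 1899) {
            return 1; // File2: 1800-1899
        } else if (y <= 1999) {
            return 2; // File3: 1900-1999
        } else {
            return 3; // File4: 2000-present
        }
    }
}
```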
Step 3: Reducer, Driver, and Hadoop commands
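A sketch of a pass-through Reducer and the Driver that wires the job together; the class names WeatherReducer and IndiaWeatherDriver and the job name are illustrative, not prescribed by the assignment.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WeatherReducer extends Reducer<IntWritable, Text, NullWritable, Text> {

    @Override
    protected void reduce(IntWritable year, Iterable<Text> records, Context context)
            throws IOException, InterruptedException {
        // Drop the year key and write each record unchanged, so the output
        // lines look like the original CSV rows.
        for (Text record : records) {
            context.write(NullWritable.get(), record);
        }
    }
}
```

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IndiaWeatherDriver {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: IndiaWeatherDriver <input path> <output path>");
            System.exit(2);
        }

        Job job = Job.getInstance(new Configuration(), "India weather by century");
        job.setJarByClass(IndiaWeatherDriver.class);

        job.setMapperClass(WeatherMapper.class);
        job.setPartitionerClass(CenturyPartitioner.class);
        job.setReducerClass(WeatherReducer.class);

        // Four reducers -> four part files, one per century.
        job.setNumReduceTasks(4);

        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```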
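Finally, package the classes into a jar and run the job. The jar name india_weather.jar and the output directory below are placeholders; the actual jar path to report is wherever your build places it.

```bash
# Placeholder jar name, main class, and output path; adjust to your build and HDFS home directory.
hadoop jar india_weather.jar IndiaWeatherDriver \
    hdfs:///bigdatapgp/common_folder/assignment3/weather/weather1.csv \
    /user/<your_user>/india_weather_output

# The output directory should contain four part files, one per century.
hdfs dfs -ls /user/<your_user>/india_weather_output
hdfs dfs -cat /user/<your_user>/india_weather_output/part-r-00000 | head
```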