Question
Write a MapReduce job that processes the global weather dataset and returns the records of the country "India". The output should contain 4 different files. Each file would contain weather data of one entire century.
For example, the 4 part files should contain the data in the following pattern:
- File1: Year 1700-1799 (all records of the 18th century will be stored in File1)
- File2: Year 1800-1899 (all records of the 19th century will be stored in File2)
- File3: Year 1900-1999 (all records of the 20th century will be stored in File3)
- File4: Year 2000-Present (all records of the 21st century will be stored in File4)
Input Dataset: hdfs:///bigdatapgp/common_folder/assignment3/weather/weather1.csv
Dataset Description:
| COLUMN NAME | DESCRIPTION |
| --- | --- |
| dt | Date |
| AverageTemperature | Average temperature of that city |
| AverageTemperatureUncertainity | Uncertainty in the average temperature |
| City | Name of the city |
| Country | Name of the country that the city belongs to |
| Latitude | Latitude of the city |
| Longitude | Longitude of the city |
Constraints:
- Skip the header row while reading the file
- Use the concept of a Partitioner
Expected Solution: Paste the MapReduce code, the Hadoop commands, and the path of the final jar used to produce this output.
Step by Step Solution
The solution involves three steps: the Mapper, the Partitioner, and the Reducer together with the Driver and the Hadoop commands.
Step 1: Mapper (skip the header, keep only records for "India", and key each record by its year)
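A minimal Mapper sketch follows. The class name WeatherMapper is an illustrative choice, and it assumes the dt column is in yyyy-MM-dd form and that fields contain no embedded commas; adjust the parsing if your copy of the dataset differs.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WeatherMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

    private final IntWritable yearKey = new IntWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();

        // Skip the header row ("dt,AverageTemperature,...").
        if (line.startsWith("dt,")) {
            return;
        }

        // Expected layout: dt, AverageTemperature, AverageTemperatureUncertainity,
        //                  City, Country, Latitude, Longitude
        String[] fields = line.split(",", -1);
        if (fields.length < 7) {
            return;
        }

        // Keep only records for India.
        if (!fields[4].trim().equalsIgnoreCase("India")) {
            return;
        }

        // dt is assumed to be yyyy-MM-dd, so the year is the first 4 characters.
        try {
            int year = Integer.parseInt(fields[0].trim().substring(0, 4));
            yearKey.set(year);
            context.write(yearKey, value);
        } catch (NumberFormatException | StringIndexOutOfBoundsException e) {
            // Ignore malformed rows.
        }
    }
}
```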
Step 2: Partitioner (route each year to one of four reduce partitions, one per century)
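A sketch of the Partitioner, assuming the Mapper emits the year as an IntWritable key as above (the class name CenturyPartitioner is illustrative):

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes each record to one of four reduce partitions based on the century of
// its year. The job must be configured with exactly 4 reduce tasks so that
// partitions 0-3 become part-r-00000 .. part-r-00003.
public class CenturyPartitioner extends Partitioner<IntWritable, Text> {

    @Override
    public int getPartition(IntWritable year, Text record, int numPartitions) {
        int y = year.get();
        if (y <= 1799) {
            return 0; // File1: 1700-1799
        } else if (y <= 1899) {
            return 1; // File2: 1800-1899
        } else if (y <= 1999) {
            return 2; // File3: 1900-1999
        } else {
            return 3; // File4: 2000-present
        }
    }
}
```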
Step 3: Reducer, Driver, and Hadoop commands
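A sketch of a pass-through Reducer and the Driver that wires the job together; the class names WeatherReducer and IndiaWeatherDriver and the job name are illustrative, not prescribed by the assignment.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WeatherReducer extends Reducer<IntWritable, Text, NullWritable, Text> {

    @Override
    protected void reduce(IntWritable year, Iterable<Text> records, Context context)
            throws IOException, InterruptedException {
        // Drop the year key and write each record unchanged, so the output
        // lines look like the original CSV rows.
        for (Text record : records) {
            context.write(NullWritable.get(), record);
        }
    }
}
```

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IndiaWeatherDriver {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: IndiaWeatherDriver <input path> <output path>");
            System.exit(2);
        }

        Job job = Job.getInstance(new Configuration(), "India weather by century");
        job.setJarByClass(IndiaWeatherDriver.class);

        job.setMapperClass(WeatherMapper.class);
        job.setPartitionerClass(CenturyPartitioner.class);
        job.setReducerClass(WeatherReducer.class);

        // Four reducers -> four part files, one per century.
        job.setNumReduceTasks(4);

        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```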
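Finally, package the classes into a jar and run the job. The jar name india_weather.jar and the output directory below are placeholders; the actual jar path to report is wherever your build places it.

```bash
# Placeholder jar name, main class, and output path; adjust to your build and HDFS home directory.
hadoop jar india_weather.jar IndiaWeatherDriver \
    hdfs:///bigdatapgp/common_folder/assignment3/weather/weather1.csv \
    /user/<your_user>/india_weather_output

# The output directory should contain four part files, one per century.
hdfs dfs -ls /user/<your_user>/india_weather_output
hdfs dfs -cat /user/<your_user>/india_weather_output/part-r-00000 | head
```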