
Question


Introduction: Organizations often gain data from multiple sources, filter and clean it, and then load the data onto different systems for visualization and analysis. Over time, many of these systems may become disconnected and stop performing optimally. As a Big Data Developer, you may be tasked with combining data from multiple sources into one or more other sinks. The go-to technologies for transferring data are often Apache Kafka and Spark Streaming. In this Final Project you will mimic an ETL process (short for extract, transform and load) using Spark Streaming and different data sources and sinks, giving you practice in how such systems may be developed and implemented.

ETL Process

The ETL process will have two sources:

1. A Flume agent writing data into an HDFS source.
2. A Kafka producer writing data into a topic on a Kafka server.

For this assignment, we will use IoT data from a Nexus phone (as used in the lecture exercises for Spark Streaming). You can choose other datasets if you wish.

Spark Streaming will read data from the above two sources. The data will then be aggregated and sent to one of two sinks depending on the type of data. Any data that involves sitting or standing is sent to a Kafka topic called "idle". Any data other than sitting or standing is sent to a Kafka topic called "active". Two consumers should then read the data and display the activity and time according to the type of activity.

Finally, all data must also be sent and saved to a SQL database (for example MySQL).
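The routing rule in the question (sitting or standing goes to "idle", everything else goes to "active") can be sketched in plain Python before wiring it into Spark Streaming. This is only a minimal illustration, not the assignment's solution: the record keys `activity` and `time`, the label spellings in `IDLE_ACTIVITIES`, and the helper names `route_topic` and `split_by_topic` are all assumptions, since the question does not show the dataset's schema.

```python
from collections import defaultdict

# Assumed label spellings; the question only says "sitting or standing".
IDLE_ACTIVITIES = {"sit", "sitting", "stand", "standing"}


def route_topic(activity: str) -> str:
    """Return the Kafka topic name ("idle" or "active") for one activity label."""
    normalized = activity.strip().lower()
    return "idle" if normalized in IDLE_ACTIVITIES else "active"


def split_by_topic(records):
    """Group records by destination topic.

    Each record is a dict with hypothetical 'activity' and 'time' keys.
    """
    batches = defaultdict(list)
    for record in records:
        batches[route_topic(record["activity"])].append(record)
    return dict(batches)


# Hypothetical sample records, standing in for rows read from HDFS or Kafka.
sample = [
    {"activity": "sit", "time": 1},
    {"activity": "walk", "time": 2},
    {"activity": "Stand", "time": 3},
]
batches = split_by_topic(sample)
```

In a streaming job, the same predicate would typically become a derived column or a filter on the activity field, with each resulting subset written to its own topic and the full stream written to the SQL database.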

