Question

Compare the full feed received from the Roster system with the data snapshot created by the AIP_FDC_ETL_System on a per-tenant basis.
After comparing the files, the following analysis is to be derived from the original tenant-specific roster full feed:
1. Tenant_id
2. Roster_Full_Feed_file_timestamp
3. DataSnapshot_file_timestamp
4. The total number of records received in the roster full feed
5. The total number of new records received in the roster full feed
6. The total number of records in the roster full feed whose status differs from the data snapshot
The roster full feed and the data snapshot feed reside in different buckets.
The feeds should be read individually, joined, and compared using a PySpark job.
I need to extract the roster and data snapshot timestamps from the file names. The roster full feed is a CSV file; the data snapshots are Parquet files.
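
The timestamp extraction could look roughly like the sketch below. The file naming convention is not stated in the question, so this assumes, purely for illustration, that each file name contains a 14-digit `YYYYMMDDHHMMSS` run; the helper name `extract_timestamp` and the example paths are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed convention: exactly 14 consecutive digits (YYYYMMDDHHMMSS) in the file name.
_TS_PATTERN = re.compile(r"(?<!\d)(\d{14})(?!\d)")


def extract_timestamp(path: str) -> Optional[datetime]:
    """Parse the timestamp embedded in the file name, or return None if absent."""
    # Only look at the last path segment so digits in bucket or folder names are ignored.
    filename = path.rsplit("/", 1)[-1]
    match = _TS_PATTERN.search(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    except ValueError:
        # Fourteen digits that do not form a valid date and time.
        return None
```

The same function works for both the CSV and the Parquet file names, since it only inspects the path string. Inside a Spark job, the source path of each row can be obtained with `pyspark.sql.functions.input_file_name()` if the timestamp needs to be attached as a column.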

