Question
Restaurant Safety Analytics (Part I - ETL)
I need this to be solved.
Given restaurant inspection datasets for 3 consecutive years (dataset2016, dataset2017, and dataset2018, as uploaded on Canvas), please do the following:
1. Get familiar with the data, merge the three datasets together, and show descriptive statistics for the numerical columns.
2. Find the pairwise correlations of the numerical columns.
3. Check for duplicates and fill missing observations with the column means.
4. Use SparkSQL to find the number of distinct restaurants in each zipcode.
You need to run it on PySpark and show the results.