
Question



[illegible in the transcribed image]

2. Weekly insights
   a. Sales
      - Total sales
      - Total sales in each city
      - Total sales in each state
   b. Orders
      - Total number of orders
      - City-wise order distribution
      - State-wise order distribution
      - Average review score per order
      - Average freight charges per order
      - Average time taken to approve the orders (order approved - order purchased)
      - Average order delivery time
   c. Total freight charges
   d. Distribution of freight charges in each city

Approach

Tasks to perform:

Week 1: Overview and basic configurations
Step 1: Choose a suitable cloud provider and set up a Spark shell environment
Step 2: Configure the necessary dependencies
Step 3: Execute basic Spark commands to make sure Spark is ready
Step 4: Use README.md for details, instructions, and commands

Week 3: Data streaming
Step 1: Connect to Spark shell with all the dependencies (Hive, Hadoop, and HDFS):
   1. Create Schema of the CSV files
   2. Create a Spark session
      - Add Object Storage Service details as per the Cloud provider
      - Add all variables to your environment as they contain sensitive data
Step 2: Read the CSV file and convert the file to a data frame
Step 3: Convert "order_purchase_timestamp" to week and day using UDF
Step 4: Calculate the following data:
   1. Total sales and order distribution per day and week for each city
   2. Total sales and order distribution per day and week for each state
   3. Average review score, average freight value, average order approval, and delivery time
   4. The freight charges per city and total freight charges
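
Steps 3 and 4 of Week 3 could be prototyped in plain Python before moving to Spark. The sketch below is only an illustration under stated assumptions: the timestamp format ("YYYY-MM-DD HH:MM:SS"), the column names customer_city, price, and freight_value, and the sample rows are all hypothetical and do not come from the question. Only order_purchase_timestamp is named in the task. In a Spark job, a function like to_week_and_day could be wrapped as a UDF (for example with pyspark.sql.functions.udf), and the grouping would normally be done with DataFrame aggregations rather than a Python loop.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def to_week_and_day(timestamp: str) -> Tuple[int, str]:
    """Return (ISO week number, weekday name) for a timestamp string.

    Assumes the format 'YYYY-MM-DD HH:MM:SS' (an assumption, not from the task).
    """
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.isocalendar()[1], WEEKDAYS[parsed.weekday()]


def weekly_totals_by_city(rows: List[dict]) -> Dict[Tuple[int, str], dict]:
    """Sum sales, order count, and freight per (ISO week, city)."""
    totals = defaultdict(lambda: {"sales": 0.0, "orders": 0, "freight": 0.0})
    for row in rows:
        week, _day = to_week_and_day(row["order_purchase_timestamp"])
        bucket = totals[(week, row["customer_city"])]
        bucket["sales"] += row["price"]
        bucket["orders"] += 1
        bucket["freight"] += row["freight_value"]
    return dict(totals)


# Hypothetical sample rows; column names other than
# order_purchase_timestamp are assumptions made for this sketch.
SAMPLE_ROWS = [
    {"order_purchase_timestamp": "2018-01-01 10:00:00",
     "customer_city": "sao paulo", "price": 100.0, "freight_value": 10.0},
    {"order_purchase_timestamp": "2018-01-03 15:30:00",
     "customer_city": "sao paulo", "price": 50.0, "freight_value": 5.0},
    {"order_purchase_timestamp": "2018-01-08 09:00:00",
     "customer_city": "rio de janeiro", "price": 80.0, "freight_value": 8.0},
]

totals = weekly_totals_by_city(SAMPLE_ROWS)

if __name__ == "__main__":
    for (week, city), values in sorted(totals.items()):
        print(week, city, values)
```

The same pattern extends to per-state and per-day totals by changing the grouping key, for example to (week, state) or (week, day, city).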

