Question
Insights to generate:

1. Daily insights
a. Sales
- Total sales
- Total sales in each city
- Total sales in each state
b. Orders
- Total number of orders
- City-wise order distribution
- State-wise order distribution
- Average review score per order
- Average freight charges per order
- Average time taken to approve the orders (order approved - order purchased)
- Average order delivery time
c. Total freight charges
d. Distribution of freight charges in each city

2. Weekly insights
a. Sales
- Total sales
- Total sales in each city
- Total sales in each state
b. Orders
- Total number of orders
- City-wise order distribution
- State-wise order distribution
- Average review score per order
- Average freight charges per order
- Average time taken to approve the orders (order approved - order purchased)
- Average order delivery time
c. Total freight charges
d. Distribution of freight charges in each city

Approach

Tasks to perform:

Week 1: Overview and basic configurations
Step 1: Choose a suitable cloud provider and set up a Spark shell environment
Step 2: Configure the necessary dependencies
Step 3: Execute basic Spark commands to make sure Spark is ready
Step 4: Use README.md for details, instructions, and commands

Week 3: Data streaming
Step 1: Connect to the Spark shell with all the dependencies (Hive, Hadoop, and HDFS)
1. Create the schema of the CSV files
2. Create a Spark session
- Add the Object Storage Service details as per the cloud provider
- Keep all credentials in environment variables, as they contain sensitive data
Step 2: Read the CSV file and convert it to a DataFrame
Step 3: Convert "order_purchase_timestamp" to week and day using a UDF
Step 4: Calculate the following:
1. Total sales and order distribution per day and week for each city
2. Total sales and order distribution per day and week for each state
3. Average review score, average freight value, average order approval time, and average delivery time
4. The freight charges per city and total freight charges
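Week 1, Step 3 above calls for running a few basic Spark commands to confirm the shell is ready. A minimal check in the Scala spark-shell could look like the sketch below; nothing in it is specific to the project data, and the `spark` session and `sc` context are the ones the shell creates automatically.

```scala
// Quick sanity checks in the Scala spark-shell: the `spark` session and `sc`
// context are pre-created by the shell itself.
spark.version                                // confirm the Spark version in use
val testDf = spark.range(1, 6).toDF("n")     // tiny in-memory DataFrame
testDf.selectExpr("sum(n) as total").show()  // should print 15
sc.parallelize(1 to 100).count()             // basic RDD action, should return 100
```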
Step by Step Solution
There are three steps involved in the solution.
Step 1: Create the Spark session, define the CSV schema, and load the data into a DataFrame
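A minimal sketch of this step in the Scala spark-shell, assuming an S3-compatible object store and Olist-style column names for the orders file; the bucket path, environment-variable names, and the reduced column list are illustrative assumptions rather than fixed requirements.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

// Build the session with Hive support; in spark-shell a session already exists,
// so this block is mainly relevant when packaging the job for spark-submit.
val spark = SparkSession.builder()
  .appName("EcommerceOrderInsights")
  .enableHiveSupport()
  .getOrCreate()

// Object-storage credentials are read from environment variables rather than
// hard-coded (the variable names here are placeholders).
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3a.access.key", sys.env("OBJECT_STORE_ACCESS_KEY"))
hadoopConf.set("fs.s3a.secret.key", sys.env("OBJECT_STORE_SECRET_KEY"))

// Explicit schema for the orders CSV (a subset of Olist-style columns).
val ordersSchema = StructType(Seq(
  StructField("order_id", StringType),
  StructField("customer_id", StringType),
  StructField("order_status", StringType),
  StructField("order_purchase_timestamp", TimestampType),
  StructField("order_approved_at", TimestampType),
  StructField("order_delivered_customer_date", TimestampType)
))

// Read the CSV into a DataFrame (bucket and prefix are placeholders).
val orders = spark.read
  .option("header", "true")
  .schema(ordersSchema)
  .csv("s3a://<your-bucket>/ecommerce/orders.csv")
```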
Step 2: Convert "order_purchase_timestamp" to week and day with a UDF
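One way to sketch the conversion with Scala UDFs is shown below, reusing the `orders` DataFrame from the previous step. (Spark also has built-in `weekofyear` and `date_format` functions; the UDF route simply follows the wording of the task.)

```scala
import java.sql.Timestamp
import java.time.temporal.WeekFields
import org.apache.spark.sql.functions.{udf, col}

// UDFs mapping a timestamp to its ISO week number and its day-of-week name;
// nulls are passed through via Option.
val weekOfYearUdf = udf((ts: Timestamp) =>
  Option(ts).map(_.toLocalDateTime.get(WeekFields.ISO.weekOfWeekBasedYear())))

val dayOfWeekUdf = udf((ts: Timestamp) =>
  Option(ts).map(_.toLocalDateTime.getDayOfWeek.toString))

val ordersWithTime = orders
  .withColumn("purchase_week", weekOfYearUdf(col("order_purchase_timestamp")))
  .withColumn("purchase_day", dayOfWeekUdf(col("order_purchase_timestamp")))
```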
Step 3: Compute the daily and weekly sales, order, review, and freight metrics
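The metrics reduce to groupBy aggregations. The sketch below assumes a hypothetical `enrichedOrders` DataFrame in which the orders have already been joined with the customer, payment, review, and item data, so that `customer_city`, `customer_state`, `payment_value`, `review_score`, and `freight_value` are available alongside the timestamp columns; that join is not shown here.

```scala
import org.apache.spark.sql.functions._

// Daily, city-level sales and order counts (the same pattern gives the
// state-level and weekly views by grouping on customer_state / purchase_week).
val dailyCitySales = enrichedOrders
  .groupBy(col("customer_city"), col("purchase_day"))
  .agg(
    sum("payment_value").alias("total_sales"),
    countDistinct("order_id").alias("total_orders"))

// Average review score, freight value, approval time, and delivery time.
val orderAverages = enrichedOrders
  .withColumn("approval_hours",
    (unix_timestamp(col("order_approved_at")) -
      unix_timestamp(col("order_purchase_timestamp"))) / 3600.0)
  .withColumn("delivery_days",
    datediff(col("order_delivered_customer_date"), col("order_purchase_timestamp")))
  .agg(
    avg("review_score").alias("avg_review_score"),
    avg("freight_value").alias("avg_freight_value"),
    avg("approval_hours").alias("avg_approval_hours"),
    avg("delivery_days").alias("avg_delivery_days"))

// Freight charges per city, plus the overall total.
val freightByCity = enrichedOrders
  .groupBy(col("customer_city"))
  .agg(sum("freight_value").alias("city_freight_charges"))
val totalFreight = enrichedOrders
  .agg(sum("freight_value").alias("total_freight_charges"))
```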