Question
The data contains the following columns: index, region, country, item_type, sales_channel, order_priority, order_date, order_id, ship_date, units_sold, unit_price, unit_cost, total_revenue, total_cost, total_profit Here are 2 questions
The data contains the following columns:
index, region, country, item_type, sales_channel, order_priority, order_date, order_id, ship_date, units_sold, unit_price, unit_cost, total_revenue, total_cost, total_profit
Here are 2 questions you need to answer from the data :
(a) What is the total units_sold in each region across the entire data set ?
(b) What is the average total_profit per transaction in the entire data set ? Remember that average is not an associative and commutative operation.
Think about how will you partition the data in each of the cases, what computation will happen on each partition of data, how will you "join" all the partial results together and present it on one node ?
You can adopt a master-slave architecture where a specific process is the master that takes the file and performs data partitioning, sends data to slaves, slaves perform parallel computation and send results to the master, master does some last step computation if needed and presents the answer to the user. A master process can also function as a slave. You do not need to adopt any parallel processing technique (e.g. using multi-threading) within the master or the slave to speed up local tasks. Your focus should be on parallel processing at a process level.
You need to use OpenMPI to implement the parallel program. Use the standard functions for communication within the cluster of processes. You can run the processes on the same multi-core machine. Given a large data set, you can use the file system to store data partitions, intermediate results at slaves and the final result. Just name the files prefixed with a process identifier so that processes know which data should be read or written by which process. MPI rank can be used as a process identifier.
Some of the dataset is given below
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started