
Question

In this assignment, we will program and build a distributed master-worker paradigm for distributed execution of tasks and answering queries. In that setup, one or more client programs (hereafter "client") communicate with a "master" process and pass their queries to it. A query here is a client asking about an individual person by name or by other attributes. The system stores each individual's name, residence location, and year of residence in a "data table" (similar to the one shown in class). A sample data record, indexed by name in a dictionary, looks like the following:
"carrie": {
"record_id": 5,
"name": "carrie",
"location": "Los Angeles",
"year": 2004
}
A query can be by name, by location, or by year, and the master responds with all data items that satisfy it.
In finer detail, the master receives a query from the client but does not answer it itself; instead, it passes the query to one of two workers (call them worker 1 and worker 2). To balance the load across the workers, the master splits the workload by name: if the query contains a name that starts with a letter from a to m (all names are lowercase), it is forwarded to worker 1; otherwise, it is forwarded to worker 2. That is, worker 1 handles a-m and worker 2 handles n-z. When the results are returned from the respective worker or workers, the master combines them and returns them to the client as a list.
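The a-m / n-z split can be sketched as a small routing helper. This is a minimal sketch, not the skeleton's code: the function name `pick_worker` and the return labels `"am"` and `"nz"` (borrowed from the worker command-line arguments) are assumptions made for illustration.

```python
def pick_worker(name: str) -> str:
    """Return "am" for names starting with a-m, "nz" for names starting with n-z.

    The labels are hypothetical; a real master would map them to worker proxies.
    """
    if not name:
        raise ValueError("name must be a non-empty string")
    first = name[0].lower()
    if not ("a" <= first <= "z"):
        raise ValueError(f"name must start with a letter: {name!r}")
    return "am" if first <= "m" else "nz"


print(pick_worker("carrie"))  # worker 1 handles this name
print(pick_worker("nora"))    # worker 2 handles this name
```

Rejecting empty or non-alphabetic names up front keeps a malformed query from being silently routed to worker 2.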
The schematic diagram looks like this with the associated port the processes are listening on:
[Figure: master-worker-1.png]
We will use RPC as the mode of communication across client, master, and worker and the data format across processes will be JSON (in case you use Python, you can assume regular python objects/dictionaries are being communicated).
Master and Worker will implement the following RPC calls (at least).
-- getbyname(name): returns the information of the person matching "name"
-- getbylocation(location): returns the information of all persons who lived in the specified location
-- getbyyear(location, year): returns the information of all persons who lived in the specified location in the specified year
Note that for handling queries with location and year, the query needs to go to both the workers.
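For queries that must reach both workers, the master can call the same method on each worker and concatenate the returned lists. The sketch below assumes each worker object exposes the RPC methods as ordinary Python methods (as an `xmlrpc.client.ServerProxy` would, if XML-RPC is the chosen library). `FakeWorker`, `query_all`, and the "nora" record are hypothetical stand-ins used only to make the example runnable without a network.

```python
class FakeWorker:
    """Hypothetical stand-in for an RPC proxy to one worker."""

    def __init__(self, records):
        self.records = records

    def getbylocation(self, location):
        return [r for r in self.records if r["location"] == location]


def query_all(workers, method, *args):
    """Call `method` on every worker and merge the result lists."""
    merged = []
    for worker in workers:
        merged.extend(getattr(worker, method)(*args))
    return merged


# "carrie" comes from the sample record; "nora" is made up for illustration.
worker1 = FakeWorker([{"record_id": 5, "name": "carrie",
                       "location": "Los Angeles", "year": 2004}])
worker2 = FakeWorker([{"record_id": 9, "name": "nora",
                       "location": "Los Angeles", "year": 2010}])

results = query_all([worker1, worker2], "getbylocation", "Los Angeles")
print([r["name"] for r in results])
```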
Two separate JSON data files (data-am.json and data-nz.json) are given that will be loaded by the respective worker to store in their data table. The master process will not contain any data.
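On the worker side, the three queries reduce to lookups and filters over the loaded data table. The following is a minimal sketch that assumes each data file is a single JSON object keyed by name, shaped like the sample record above; the actual layout of data-am.json and data-nz.json may differ. The helper `load_table` and the explicit `table` argument are illustrative choices, not part of the skeleton.

```python
import json


def load_table(path):
    """Load a data file, assumed to be a JSON object keyed by name."""
    with open(path) as f:
        return json.load(f)


def getbyname(table, name):
    record = table.get(name)
    return [record] if record is not None else []


def getbylocation(table, location):
    return [r for r in table.values() if r["location"] == location]


def getbyyear(table, location, year):
    return [r for r in table.values()
            if r["location"] == location and r["year"] == year]


# In-memory table built from the sample record, so no file is needed here.
table = {"carrie": {"record_id": 5, "name": "carrie",
                    "location": "Los Angeles", "year": 2004}}
print(getbyname(table, "carrie"))
print(getbyyear(table, "Los Angeles", 2004))
```

Returning a list from every call, even for `getbyname`, lets the master merge results from either worker the same way.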
The client program is rather simple. It makes a set of RPC calls to the master asking for individual persons by name, by location, and by year. A sample client program (client.py) is given to test your code.
Skeleton code is available here: PA1 Skeleton Code.zip
Tasks:
Implement master and worker program (a skeleton code in Python is given to fill in)
Document your program (put comments in the source code as you implement)
Run them and test your code.
Since there are four programs to run, you will need four terminals/consoles to run them
python3 master.py 23000
python3 worker.py 23001 am
python3 worker.py 23002 nz
python3 client.py 23000
Handle error cases as necessary
Handle failure cases (e.g., kill one of your worker processes, the service should still be ON)
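One way to keep the service up when a worker is killed is to wrap every worker call so that a connection failure becomes an error entry instead of an unhandled exception in the master. The sketch below assumes Python's standard `xmlrpc.client` is the RPC library; `safe_call`, the `{"error": ..., "result": ...}` shape, and `DeadWorker` are assumptions made for illustration.

```python
import xmlrpc.client


def safe_call(worker, method, *args):
    """Call an RPC method; report failure instead of crashing the master."""
    try:
        return {"error": None, "result": getattr(worker, method)(*args)}
    except (OSError, xmlrpc.client.Error) as exc:
        # OSError covers refused connections; xmlrpc.client.Error covers
        # protocol errors and faults raised by the remote side.
        return {"error": f"worker unavailable: {exc}", "result": []}


class DeadWorker:
    """Hypothetical stand-in that behaves like a killed worker."""

    def getbyname(self, name):
        raise ConnectionRefusedError("connection refused")


reply = safe_call(DeadWorker(), "getbyname", "carrie")
print(reply["error"])
```

The master can then return the partial results from the surviving worker together with the error message, so the client sees which part of the answer is missing.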
Submission:
Put your source code and a short document file describing any interesting observation you noticed when you program this into a zip file and upload.
If you want to build a repository (on GitHub, Google Code), you can do so and send a link. You must keep the repo private so that you can claim you worked alone and no one copied from you.
Include a README file describing how to run your code if you implement it in a programming language other than Python.
Record a video demonstrating how your programs run and submit it under the folder "Programming Assignment-1" under "Panopto Video".
[For Grad Students]
Add at least two more functionalities to the system. Describe them in the document and implement them. A couple of possibilities:
Here, the workers are hardcoded in the master. Instead, let workers register with the master when they start and take the workload.
Instead of loading all data at once, have another process (a publisher) publish the records one by one [you may put a random delay between records]. What happens to the client query then?
The master detects the failure of workers and reroutes the queries as needed.
The master balances the load between the workers: workers report to the master how many requests they have handled so far, and the master balances future calls accordingly.
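Two of these possibilities, worker registration and count-based balancing, can share one small bookkeeping structure in the master. The sketch below is only one possible design: `WorkerRegistry`, its method names, and the address strings are hypothetical, and least-loaded routing only makes sense if every registered worker can answer the query (for example, if each holds the full data set), which differs from the a-m / n-z partition of the base assignment.

```python
class WorkerRegistry:
    """Tracks registered workers and the request counts they report."""

    def __init__(self):
        self.handled = {}  # worker address -> requests handled so far

    def register(self, address):
        # Called by a worker at startup; re-registering keeps its count.
        self.handled.setdefault(address, 0)
        return True

    def report(self, address, count):
        # Called by a worker to report its running total.
        self.handled[address] = count
        return True

    def least_loaded(self):
        if not self.handled:
            raise LookupError("no workers registered")
        return min(self.handled, key=self.handled.get)


registry = WorkerRegistry()
registry.register("localhost:23001")
registry.register("localhost:23002")
registry.report("localhost:23001", 7)
registry.report("localhost:23002", 3)
print(registry.least_loaded())
```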
Grading guidelines:
Functionalities (program compiles and runs end to end): 50
Additional functionalities: 20
Proper documentation and good coding style: 20
The document describing observations: 10
