Question
In this assignment, we will program and build a distributed master-worker paradigm for distributed execution of tasks and answering queries. In that setup, one or more client programs (hereafter "client") communicate with a "master" process and pass their queries to the master. A query here refers to a client asking about an individual person by name or by other attributes. The system stores each individual's name, residence location, and year of residence in a "data table" similar to the one shown in class. A sample data record, indexed by name in a dictionary, looks like the following:
"carrie": {
    "recordid":
    "name": "carrie",
    "location": "Los Angeles",
    "year":
}
A query can be by name, by location, or by year, and the master responds with all data items that satisfy it.
In finer detail, the master receives a query from the client but does not answer the query itself. Instead, it passes the query to one of two workers. To balance the load across the workers, the master splits the workload equally: if the query contains a name that starts with a letter from a to m (all names are in lowercase), it is forwarded to the first worker; otherwise, it is forwarded to the second worker. That means one worker handles a-m and the other handles n-z. When the results are returned from the respective workers, the master combines them and returns them as a list to the client.
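The routing rule described above can be sketched as a small pure function. This is only an illustration: the function name `pick_worker` and the labels "am" and "nz" are assumptions (the labels mirror the arguments passed to worker.py in the run instructions), and sending non-letter names to the second worker is a choice the assignment does not specify.

```python
def pick_worker(name: str) -> str:
    """Return the label of the worker responsible for a given name.

    "am" and "nz" are illustrative labels, not names from the skeleton code.
    """
    if not name:
        raise ValueError("name must be a non-empty string")
    # The assignment says names are lowercase; lower() is a defensive extra.
    first = name[0].lower()
    # Anything outside a-m (including non-letters) falls through to "nz".
    return "am" if "a" <= first <= "m" else "nz"


if __name__ == "__main__":
    for sample in ("carrie", "mike", "nancy"):
        print(sample, "->", pick_worker(sample))
```

Keeping the rule in one function means the master has a single place to change if the split is later made dynamic.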
The schematic diagram looks like this with the associated port the processes are listening on:
[Figure: master-worker schematic diagram]
We will use RPC as the mode of communication across client, master, and worker, and the data format across processes will be JSON (if you use Python, you can assume regular Python objects/dictionaries are being communicated).
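The assignment does not name an RPC library. As one possibility, Python's standard `xmlrpc` modules can pass plain dictionaries between processes. The sketch below assumes that choice; the function names `echo_query`, `start_server`, and `call_echo` are hypothetical and not taken from the skeleton code.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def echo_query(query):
    # XML-RPC marshals dictionaries, so a dict can be sent and returned as-is.
    return {"received": query}


def start_server(host="127.0.0.1", port=0):
    """Start an RPC server in a background thread; port 0 picks a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(echo_query, "echo_query")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_echo(server, query):
    """Call the remote function through a client proxy."""
    host, port = server.server_address
    with ServerProxy(f"http://{host}:{port}/", allow_none=True) as proxy:
        return proxy.echo_query(query)


if __name__ == "__main__":
    srv = start_server()
    print(call_echo(srv, {"name": "carrie"}))
    srv.shutdown()
    srv.server_close()
```

In the real setup each process would listen on the fixed port shown in the schematic rather than on a randomly chosen one.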
Master and Worker will implement at least the following RPC calls:
getbyname(name): returns person information matching the "name"
getbylocation(location): returns information on persons who lived in the specific location
getbyyear(location, year): returns information on persons who lived in a specific location in a specific year
Note that for handling queries with location and year, the query needs to go to both the workers.
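A minimal sketch of how the three calls might look on a worker, and how the master might fan a location query out to both workers, is given below. It assumes each worker keeps its shard as a dictionary keyed by name with the fields from the sample record; the `Worker` class and `master_getbylocation` are hypothetical names, and the records are made up for illustration.

```python
class Worker:
    """Holds one shard of the data table, keyed by name."""

    def __init__(self, table):
        self.table = table

    def getbyname(self, name):
        record = self.table.get(name)
        return [record] if record is not None else []

    def getbylocation(self, location):
        return [r for r in self.table.values() if r["location"] == location]

    def getbyyear(self, location, year):
        return [r for r in self.table.values()
                if r["location"] == location and r["year"] == year]


def master_getbylocation(workers, location):
    """Send the query to every worker and concatenate the results."""
    results = []
    for worker in workers:
        results.extend(worker.getbylocation(location))
    return results


if __name__ == "__main__":
    # Made-up records for illustration only; not taken from the assignment data.
    am = Worker({"alice": {"recordid": 1, "name": "alice",
                           "location": "Boston", "year": 2001}})
    nz = Worker({"nora": {"recordid": 2, "name": "nora",
                          "location": "Boston", "year": 2005}})
    print(master_getbylocation([am, nz], "Boston"))
```

Returning a list from every call, even for a single name match, lets the master combine results the same way for all three query types.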
Two separate JSON data files dataamjson and datanzjson are given that will be loaded by the respective worker to store in their data table. The master process will not contain any data.
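Loading a worker's data file could look roughly like the following. This sketch assumes each file holds a single JSON object keyed by name, as in the sample record; the helper names `parse_table` and `load_table` are hypothetical.

```python
import json


def parse_table(text):
    """Parse JSON text into a data table (a dict keyed by name)."""
    table = json.loads(text)
    if not isinstance(table, dict):
        raise ValueError("expected a JSON object keyed by name")
    return table


def load_table(path):
    """Read a worker's data file from disk and parse it."""
    with open(path, encoding="utf-8") as f:
        return parse_table(f.read())
```

A worker would call `load_table` once at startup with the path to its own file and keep the result in memory to answer queries.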
The client program is rather simple. It makes a set of RPC calls to the master asking for individual persons by name, by location, and by year. A sample client program (client.py) is given to test your code.
Skeleton code is available here: PA Skeleton Code.zip
Tasks:
Implement the master and worker programs (skeleton code in Python is given to fill in).
Document your program (put comments in the source code as you implement).
Run them and test your code.
Since there are four programs to run, you will need four terminals/consoles to run them:
python master.py
python worker.py am
python worker.py nz
python client.py
Handle error cases as necessary.
Handle failure cases (e.g., kill one of your worker processes; the service should still be ON).
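One way the master might keep serving when a worker has been killed is to catch connection errors per worker and return whatever the surviving workers provide. The sketch below assumes workers are reachable through objects exposing the RPC methods (for example `xmlrpc.client.ServerProxy` instances); `query_all` is a hypothetical helper name.

```python
def query_all(workers, method, *args):
    """Call `method` on every worker, skipping workers that are unreachable.

    `workers` maps a label to an object exposing the RPC methods.
    Returns a (results, failed_labels) pair.
    """
    results, failed = [], []
    for label, worker in workers.items():
        try:
            results.extend(getattr(worker, method)(*args))
        except OSError:
            # ConnectionRefusedError and socket timeouts are OSError subclasses.
            failed.append(label)
    return results, failed
```

Reporting the failed labels alongside the partial results lets the master tell the client that the answer may be incomplete instead of failing silently.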
Submission:
Put your source code and a short document file describing any interesting observations you noticed while programming this into a zip file and upload it.
If you want to build a repository on GitHub or Google Code, you can do so and send a link. You must keep the repo private so that you can claim you worked alone and no one copied from you.
Put a README file to describe how to run your code if you implement it using a programming language other than Python.
Record a video demonstrating how your programs run and submit it under the folder "Programming Assignment" under "Panopto Video".
For Grad Students
Add at least two more functionalities to the system. Describe them in the document and implement them. A couple of possibilities:
Here, the workers are hardcoded in the master. Instead, let workers register with the master when they start and take the workload.
Instead of loading all data at once, publish the records one by one from another process, a publisher (you may put a random delay between records). What happens to the client query then?
The master detects the failure of workers and reroutes the queries as needed.
The master balances the load between the workers: workers communicate to the master how many requests they have handled so far, and the master balances future calls accordingly.
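The registration and load-balancing ideas above could be combined in a small registry kept by the master. This is a sketch under stated assumptions: the class `WorkerRegistry` and its method names are hypothetical, and choosing the least-loaded worker only makes sense for queries that any worker can answer (for example, if every worker holds all the data).

```python
class WorkerRegistry:
    """Tracks registered workers and how many requests each has handled."""

    def __init__(self):
        self.handled = {}  # worker address -> requests handled so far

    def register(self, address):
        # Called by a worker when it starts; re-registering keeps its count.
        self.handled.setdefault(address, 0)
        return True

    def report(self, address, count):
        # Called by a worker to report its running request total.
        self.handled[address] = count

    def unregister(self, address):
        # Called by the master when it detects that a worker has failed.
        self.handled.pop(address, None)

    def least_loaded(self):
        if not self.handled:
            raise RuntimeError("no workers registered")
        return min(self.handled, key=self.handled.get)
```

Exposing `register` and `report` as RPC calls on the master would let workers join and update their counts without being hardcoded.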
Grading guidelines:
Functionalities (program compiles and runs end to end):
Additional functionalities:
Proper documentation and good coding style:
The document describing observations: