Question
Please do not attempt this question if you are going to copy and paste nonsense from the internet. Any solution that does not answer the question WILL be reported!
Database Systems Question: Please answer ALL parts of the question with FULL explanations. Please specify which part you are answering.
Problem 3: Double Buffering with IO
This problem explores an optimization often referred to as double buffering, which we'll use to speed up the external merge sort algorithm.
Recall that sequential IO (i.e. reading from / writing to consecutive pages) is generally much faster than random access IO (any reading / writing that is not sequential). Additionally, on newer storage technologies like SSDs, reading data can be faster than writing data.
For example, reading 4 consecutive pages from file A should be much faster than reading 1 page from A, then 1 page from file B, then the next page from A.
Assume that 3 out of every 4 sequential READS are "free", i.e. the total cost of 4 sequential reads is 1 IO. We will also assume that a write is always twice as expensive as a read. Sequential writes are never free, so the cost of N writes is always 2N.
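The cost model above can be sketched as two small helper functions. This is only an illustration: the names `read_cost` and `write_cost` are hypothetical, and the sketch assumes that a partial chunk of fewer than 4 sequential pages still costs one full IO (hence the ceiling) and that a random read costs 1 IO per page.

```python
import math

READ_CHUNK = 4   # sequential pages covered by a single IO (from the problem statement)
WRITE_COST = 2   # IOs charged per written page (a write is twice as expensive as a read)


def read_cost(num_pages, sequential=True):
    """IO cost of reading num_pages pages.

    Assumption: a partial chunk is rounded up to one full IO,
    and each random (non-sequential) page read costs 1 IO.
    """
    if sequential:
        return math.ceil(num_pages / READ_CHUNK)
    return num_pages


def write_cost(num_pages):
    """IO cost of writing num_pages pages; sequential writes are never free."""
    return WRITE_COST * num_pages


print(read_cost(4))                    # 4 sequential reads
print(read_cost(4, sequential=False))  # 4 random reads
print(write_cost(4))                   # 4 writes
```

Under these assumptions, 4 sequential reads cost 1 IO, 4 random reads cost 4 IOs, and 4 writes cost 8 IOs.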
NO REPACKING: Consider the external merge sort algorithm using the basic optimizations, but do not use the repacking optimization.
ONE BUFFER PAGE RESERVED FOR OUTPUT: Assume we use one page for output in a merge, e.g. a B-way merge would require B+1 buffer pages.
REMEMBER TO ROUND: Take ceilings (i.e. rounding up to the nearest integer value) into account in this problem for full credit! Note that we have sometimes omitted these (for simplicity) in lecture.
CONSIDER WORST CASE COST: In other words, if 2 reads could happen to be sequential, but in general might not be, treat them as random IO.
Consider a modification of the external merge sort algorithm where pages are always read in 4-page chunks (i.e. 4 pages sequentially at a time) so as to take advantage of sequential reads. Calculate the cost of performing the external merge sort for a setup having B + 1 = 20 buffer pages and an unsorted input file with 160 pages.
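As a rough sketch of how the 4-page chunks interact with the buffer, the snippet below computes the number of initial runs, the merge arity, and the number of merge passes. The function names are hypothetical, and the sketch assumes each input run in a merge is given its own 4-page chunk of buffer space while a single page stays reserved for output.

```python
import math


def initial_runs(file_pages, buffer_pages):
    """Number of sorted runs produced when each run fills the whole buffer."""
    return math.ceil(file_pages / buffer_pages)


def merge_arity(buffer_pages, chunk_pages=4, output_pages=1):
    """How many runs can be merged at once.

    Assumption: every input run needs a full chunk_pages-sized buffer,
    and output_pages pages are reserved for the output.
    """
    return (buffer_pages - output_pages) // chunk_pages


def merge_passes(num_runs, arity):
    """Number of merge passes needed to reduce num_runs runs to one."""
    if arity < 2:
        raise ValueError("need at least a 2-way merge")
    passes = 0
    while num_runs > 1:
        num_runs = math.ceil(num_runs / arity)
        passes += 1
    return passes


runs = initial_runs(160, 20)
arity = merge_arity(20)
print(runs, arity, merge_passes(runs, arity))
```

With 20 buffer pages and a 160-page file, this reading gives 8 initial runs, a 4-way merge, and 2 merge passes.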
Show the steps of your work and make sure to explain your reasoning by writing it as Python comments above the final answers.
a) What is the exact IO cost of splitting and sorting the files? As is standard, we want runs of size B + 1.
b) How many passes of merging are required?
c) What is the IO cost of the first pass of merging? Note: the highest arity merge should always be used.
d) What is the total IO cost of running this external merge sort algorithm? Do not forget to add in the remaining passes (if any) of merging.
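Parts a) through d) can be worked through with a short script. This is a sketch under one reading of the stated assumptions, not an official solution: every 4-page chunk read costs 1 IO, every written page costs 2 IOs, each merge input gets a 4-page chunk buffer, and one page is reserved for output. The function name `external_sort_costs` and its helpers are hypothetical.

```python
import math

CHUNK = 4        # pages per sequential read chunk (1 IO per chunk)
WRITE_COST = 2   # IOs per written page


def pass_cost(run_lengths):
    """Cost of reading every run in 4-page chunks and writing every page once."""
    reads = sum(math.ceil(n / CHUNK) for n in run_lengths)
    writes = WRITE_COST * sum(run_lengths)
    return reads + writes


def external_sort_costs(file_pages, buffer_pages):
    """Return (sort_cost, num_merge_passes, first_merge_cost, total_cost)."""
    # a) Split and sort: runs of B + 1 = buffer_pages pages each.
    full, rest = divmod(file_pages, buffer_pages)
    runs = [buffer_pages] * full + ([rest] if rest else [])
    sort_cost = pass_cost(runs)

    # b) Highest arity: one output page, 4 pages per input run.
    arity = (buffer_pages - 1) // CHUNK
    if arity < 2:
        raise ValueError("not enough buffer pages for a chunked merge")

    # c), d) Each merge pass reads and writes the whole file once.
    merge_costs = []
    while len(runs) > 1:
        merge_costs.append(pass_cost(runs))
        runs = [sum(runs[i:i + arity]) for i in range(0, len(runs), arity)]

    first_merge_cost = merge_costs[0] if merge_costs else 0
    total_cost = sort_cost + sum(merge_costs)
    return sort_cost, len(merge_costs), first_merge_cost, total_cost


print(external_sort_costs(160, 20))
```

Under these assumptions the script reports a split-and-sort cost of 360 IOs (40 chunk reads plus 320 for writes), 2 merge passes, 360 IOs for the first merge pass, and 1080 IOs in total. A different reading of the worst-case rule would change these figures.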