Question
Suppose I have a relation Grades(student_id, assignment_id, score) with 150 students and 20 assignments. I grade all submissions of one assignment in submission order and then insert the records. Because of this insertion pattern, the relation is sorted on assignment_id but not on student_id. I chose a heap file as my file organization. My page is quite small: it can store only 20 records, or 200 bytes, per page. The SearchKeySize is 4 bytes and the PointerSize is 2 bytes. My buffer is also small: 5 pages.
My most frequent query finds the grades of an individual student (e.g., select assignment_id, score from grades where student_id=3347;).
I want to improve its I/O cost. I am debating whether to build an index on student_id or to sort the relation on student_id, so I need to do some estimation. Please help me by answering the following questions.
What is the I/O cost of a multi-way merge sort (a.k.a. external sort) if I sort the relation after I have entered all records? Explain the process.
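As a rough way to check the estimate, here is a minimal sketch of the usual textbook cost model for external merge sort. It assumes that pass 0 fills all 5 buffer pages to create sorted runs, that each later pass is a (B - 1)-way merge (4 input pages plus 1 output page), and that every pass reads and writes each page once, including the final write. The function name external_sort_io is only illustrative, and a different counting convention (for example, not counting the final write) would give a different total.

```python
import math

def external_sort_io(num_pages, buffer_pages):
    """Estimate passes and page I/Os for external merge sort (textbook model)."""
    # Pass 0: read buffer_pages pages at a time, sort them in memory,
    # and write each group out as one sorted run.
    runs = math.ceil(num_pages / buffer_pages)
    passes = 1
    # Merge passes: one buffer page is reserved for output,
    # so at most (buffer_pages - 1) runs are merged at once.
    fan_in = buffer_pages - 1
    while runs > 1:
        runs = math.ceil(runs / fan_in)
        passes += 1
    # Assumption: each pass reads and writes every page exactly once.
    return passes, 2 * num_pages * passes

records = 150 * 20                 # one record per (student, assignment) pair
pages = math.ceil(records / 20)    # 20 records fit in one page
passes, io_cost = external_sort_io(pages, buffer_pages=5)
print(pages, passes, io_cost)
```

Under these assumptions the relation occupies 150 pages. Pass 0 produces 30 runs of 5 pages each, and three merge passes reduce them to 8, then 2, then 1 run. That is 4 passes in total, or 2 * 150 * 4 = 1200 page I/Os.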