
Question

PART A: Short Answer Questions [40 marks]

1) Relation bankCustomer has 50,000 tuples, which are stored as fixed-length, fixed-format records of 350 bytes each. Tuples contain the non-key attribute name, which is 15 bytes long. The tuples are stored sequentially in a number of blocks, ordered by name. Each block is 4,096 bytes and each tuple is fully contained in one block. What is the number of disk blocks needed to store the relation bankCustomer? [2/40]
2) With the same information as in Part A, Question 1), suppose that a primary index using a B+ tree on the name attribute is to be created. A 10-byte pointer to the actual tuples (an 8-byte block ID and a 2-byte offset) is needed for each index entry. Each index entry is also fully contained in one block. If the primary index is sparse, i.e. one index entry per block, what would be the maximum number of blocks needed to store the index? [2/40]
3) With the same information as in Part A, Question 2), what would be the minimum number of blocks needed to store the index? (Hint: in this case, all tuples have the same name.) [2/40]
4) Removal of a search key from a B+ tree may cause a non-leaf node to become underfull. What are the two strategies to restore the balance of the tree? [2/40]
5) Briefly describe a popular application of the R-tree in database systems. [2/40]
6) What is the main purpose of query optimisation in relational database systems? [2/40]
7) Briefly describe how the comparative selection σ_{A>V}(r) can be efficiently evaluated using a secondary index, where A stands for an attribute name of the relation r, and V stands for a constant. [2/40]
8) What is meant by 'pipelining' in query optimisation in relational database systems? [2/40]
9) What are the two simple rules used in heuristic optimisation that transform query evaluation trees to improve execution performance? [2/40]
10) What are the four properties of transactions in relational database systems? [2/40]
11) Briefly describe how to test the conflict serialisability of a schedule. [2/40]
12) Briefly describe how to determine if a schedule is cascadeless. [2/40]
13) Briefly describe the two-phase locking protocol for concurrency control in relational database systems. [2/40]
14) In a log-based recovery algorithm, what is meant by 'undoing' and 'redoing' a log record of the form ' ', where T stands for a transaction, X for the data item, and V for values? [2/40]
15) Describe what a recovery system would do when performing checkpointing. [2/40]
16) In distributed database systems, what are the responsibilities of a transaction coordinator? [2/40]
17) Briefly describe how RDF (Resource Description Framework) graph datasets and linked data can be queried. [2/40]
18) Briefly describe the concept of 'Eventual Consistency' in big data storage systems. [2/40]
19) What are the two categories of consensus algorithms used in public blockchain-based storage systems? [2/40]
20) Suppose that an online company has stored millions of movie reviews in its database. After purchasing a reasonable amount of labelled review data, it wants to analyse whether the reviews in its database are positive or negative, e.g. for sentiment analysis. Propose two algorithms for this task. [2/40]
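
The block arithmetic behind Question 1), and the entry sizing that Questions 2) and 3) build on, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the marked solution: the constants come from the question text, the helper name blocks_needed is made up for this sketch, and the final figure assumes a single fully packed level of index entries, which is not the same thing as the B+ tree maximum or minimum that Questions 2) and 3) ask for (those depend on node occupancy and the number of tree levels).

```python
import math

# Constants taken from the question text.
BLOCK_SIZE = 4096      # bytes per disk block
RECORD_SIZE = 350      # bytes per bankCustomer tuple
NUM_TUPLES = 50_000    # tuples in bankCustomer
KEY_SIZE = 15          # bytes for the name attribute
POINTER_SIZE = 10      # bytes per pointer (8-byte block ID + 2-byte offset)


def blocks_needed(num_items, item_size, block_size=BLOCK_SIZE):
    """Hypothetical helper: blocks required when items may not span blocks."""
    per_block = block_size // item_size      # whole items that fit in one block
    return math.ceil(num_items / per_block)  # round up for a partial last block


# Question 1): tuples per block, then blocks for the whole relation.
records_per_block = BLOCK_SIZE // RECORD_SIZE
data_blocks = blocks_needed(NUM_TUPLES, RECORD_SIZE)

# Sparse index: one (name, pointer) entry per data block.
entry_size = KEY_SIZE + POINTER_SIZE
entries_per_block = BLOCK_SIZE // entry_size
sparse_entries = data_blocks

# Assumption: a single, fully packed level of entries (not a full B+ tree).
packed_index_blocks = blocks_needed(sparse_entries, entry_size)

print(records_per_block, data_blocks, entries_per_block, packed_index_blocks)
```

Under these assumptions, 11 tuples fit in a block, so the relation occupies 4,546 blocks; each index entry takes 25 bytes, so 163 entries fit in a block.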
