Question

1 Approved Answer

Posted on Jun 28, 2024

Database Administration

CHAPTER OBJECTIVES

- Understand the need for and importance of database administration
- Know basic administrative and managerial DBA functions
- Understand the need for concurrency control, security, and backup and recovery
- Learn about typical problems that can occur when multiple users process a database concurrently
- Understand the use of locking and the problem of deadlock
- Learn the difference between optimistic and pessimistic locking
- Know the meaning of ACID transaction
- Learn the four 1992 ANSI standard isolation levels
- Learn different ways of processing a database using cursors
- Understand the need for security and specific tasks for improving database security
- Know the difference between recovery via reprocessing and recovery via rollback/rollforward
- Understand the nature of the tasks required for recovery using rollback/rollforward

This chapter describes the major tasks of an important business function called database administration. This function involves managing a database to maximize its value to an organization. Usually, database administration involves balancing the conflicting goals of protecting the database and maximizing its availability and benefit to users.

Both the terms data administration and database administration are used in the industry. In some cases, the terms are considered to be synonymous; in other cases, they have different meanings. Most commonly, the term data administration refers to a function that applies to an entire organization; it is a management-oriented function that concerns corporate data privacy and security issues. The term database administration refers to a more technical function that is specific to a particular database, including the applications that process that database. This chapter addresses database administration.

Databases vary considerably in size and scope, from single-user personal databases to large interorganizational databases,
such as airline reservation systems. All databases have a need for database administration, although the tasks to be accomplished vary in complexity. When using a personal database, for example, individuals follow simple procedures for backing up their data, and they keep minimal records for documentation. In this case, the person who uses the database also performs the DBA functions, even though he or she is probably unaware of it.

For multiuser database applications, database administration becomes both more important and more difficult. Consequently, it generally has formal recognition. For some applications, one or two people are given this function on a part-time basis. For large Internet or intranet databases, database administration responsibilities are often too time-consuming and too varied to be handled even by a single full-time person. Supporting a database with dozens or hundreds of users requires considerable time as well as both technical knowledge and diplomatic skill, and it is usually handled by an office of database administration. The manager of the office is often known as the database administrator. In this case, DBA refers to either the office or the manager.

The overall responsibility of a DBA is to facilitate the development and use of a database. Usually, this means balancing the conflicting goals of protecting the database and maximizing its availability and benefit to users. The DBA is responsible for the development, operation, and maintenance of the database and its applications. The DBMS software must be kept up to date with the latest fixes. The DBA also monitors database and application performance and adjusts configuration parameters, the physical placement of the data, and the use of indices.

In this chapter, we examine three important database administration functions: concurrency control, security, and backup and recovery. Then we discuss the need for configuration change management.
But before you learn about any of this, we will create the Heather Sweeney Designs database discussed in the previous chapters; you'll use it as an example database for the discussion in this chapter and in Chapter 7.

THE HEATHER SWEENEY DESIGNS DATABASE

The SQL statements to create the Heather Sweeney Designs (HSD) database are shown in Figure 34?. These SQL statements are in MySQL 8.0 syntax. The SQL statements are built from the HSD database design in Figure 5~2?, and the column constraints follow the attribute specifications in Figure 5-26 and Figure 5-28 and the referential integrity constraint specifications outlined in Figure 5-29. The SQL statements to populate the HSD database are shown in Figure 338. The completed HSD database is shown in the MySQL Workbench in Figure 6-1.

FIGURE 6-1 The HSD Database in MySQL Workbench (Oracle MySQL Community Server 8.0, Oracle Corporation.)

THE NEED FOR CONTROL, SECURITY, AND RELIABILITY

Databases come in a variety of sizes and scopes, from single-user databases to huge, interorganizational databases, such as inventory management systems. As shown in Figure 6-2, databases also vary in the way they are processed.

FIGURE 6-2 The Database Processing Environment (forms, reports, queries, Active Server Pages .NET [ASP.NET], Java Server Pages [JSP], and application programs in Visual Basic, C#, Java, etc., all working through the DBMS, which runs triggers and stored procedures)

It is possible for every one of the application elements in Figure 6-2 to be operating at the same time. Queries, forms, and reports can be generated while Web pages (using Active Server Pages [ASP] and Java Server Pages [JSP], PHP, or other options) access the database, possibly invoking stored procedures. Traditional application programs running in Visual Basic, C#, Java, and other programming languages can be processing transactions on the database. All this activity can cause pieces of programming code stored in the DBMS to be invoked. These pieces of code are known as SQL/Persistent Stored Modules (SQL/PSM); they include user-defined functions, triggers, and stored procedures, and they are discussed in the online Extension B, "Advanced SQL." While all this is occurring, constraints, such as those on referential integrity, must be enforced. Finally, hundreds, or even thousands, of people might be using the system, and they might want to process the database 24 hours a day, 7 days a week.

Three database administration functions are necessary to bring order to this potential chaos. First, the actions of concurrent users must be controlled to ensure that results are consistent with what is expected. Second, security measures must be in place and enforced so that only authorized users can take authorized actions at appropriate times. Finally, backup and recovery techniques and procedures must be operating to protect the database in case of failure and to recover it as quickly and accurately as possible when necessary. We will consider each of these in turn.

CONCURRENCY CONTROL

The purpose of concurrency control is to ensure that one user's work does not inappropriately influence another user's work. In some cases, these measures ensure that a user gets the same result when processing with other users as that person would have received if processing alone. In other cases, it means that the user's work is influenced by other users but in an anticipated way. For example, in an order-entry system, a user should be able to enter an order and get the same result, whether there are no other users or hundreds of other users.
However, a user who is printing a report of the most current inventory status might want to obtain in-process data changes from other users, even if those changes might later be canceled.

Unfortunately, no concurrency control technique or mechanism is ideal for all circumstances; they all involve trade-offs. For example, a user can obtain strict concurrency control by locking the entire database, but while that person is processing no other user will be able to do anything. This is robust protection, but it comes at a high cost. As you will see, other measures are available that are more difficult to program and enforce but allow more throughput, which is defined as the maximum rate of processing. Still other measures are available that maximize throughput but for a low level of concurrency control. When designing multiuser database applications, developers need to choose among these trade-offs.

The Need for Atomic Transactions

In most database applications, users submit work in the form of transactions, also known as logical units of work (LUWs). A transaction (or LUW) is a series of actions to be taken on a database such that all of them are performed successfully or none of them are performed at all, in which case the database remains unchanged. Such a transaction is sometimes called atomic because it is performed as a unit. Consider the following sequence of database actions that could occur when recording a new order:

1. Change the customer record, increasing the value of Amount Owed.
2. Change the salesperson record, increasing the value of Commission Due.
3. Insert the new-order record into the database.

Suppose the last step fails, perhaps because of insufficient file space. Imagine the confusion that would ensue if the first two changes were made but the third one was not. The customer would be billed for an order that was never received, and a salesperson would receive a commission on an order that was never sent to the customer.
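The three-step order example above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, and the table names, column names, and amounts are hypothetical, chosen only to mirror the example. The `fail_on_insert` flag is an assumption added here to simulate the "insufficient file space" failure in step 3.

```python
import sqlite3

def make_db():
    # isolation_level=None leaves BEGIN/COMMIT/ROLLBACK entirely to our code.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, amount_owed INTEGER)")
    conn.execute("CREATE TABLE salesperson (name TEXT PRIMARY KEY, commission_due INTEGER)")
    conn.execute("CREATE TABLE orders (order_no INTEGER PRIMARY KEY, customer_id INTEGER)")
    conn.execute("INSERT INTO customer VALUES (123, 2400)")
    conn.execute("INSERT INTO salesperson VALUES ('JONES', 3200)")
    return conn

def record_order(conn, fail_on_insert=False):
    """Apply the three order-entry changes as a unit: all of them or none."""
    try:
        conn.execute("BEGIN")
        conn.execute("UPDATE customer SET amount_owed = amount_owed + 6500 WHERE id = 123")
        conn.execute("UPDATE salesperson SET commission_due = commission_due + 650 "
                     "WHERE name = 'JONES'")
        if fail_on_insert:
            # Hypothetical stand-in for the failure described in the text.
            raise sqlite3.OperationalError("simulated: insufficient file space")
        conn.execute("INSERT INTO orders (order_no, customer_id) VALUES (8000, 123)")
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")   # undo steps 1 and 2 as well
        return False

def snapshot(conn):
    owed = conn.execute("SELECT amount_owed FROM customer WHERE id = 123").fetchone()[0]
    due = conn.execute(
        "SELECT commission_due FROM salesperson WHERE name = 'JONES'").fetchone()[0]
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return owed, due, count
```

Calling `record_order(conn, fail_on_insert=True)` on a fresh database leaves `snapshot(conn)` unchanged, because the rollback also undoes the two updates that had already succeeded.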
Clearly, these three actions need to be taken as a unit: Either all of them should be done or none of them should be done.

Figure 6-3 compares the results of performing these activities as a series of independent steps [Figure 6-3(a)] and as an atomic transaction [Figure 6-3(b)]. Notice that when the steps are carried out atomically and one fails, no changes are made in the database. Also note that the application program must issue the commands equivalent to the Start Transaction (marks the beginning of the transaction), Commit Transaction (saves the new data to the database and ends the transaction), and Rollback Transaction (undoes any data changes and ends the transaction) commands shown in Figure 6-3(b) to mark the boundaries of the transaction logic. The particular syntax of these commands varies from one DBMS product to another. In SQL, this set of commands is known as SQL Transaction Control Language (TCL), and we will discuss it later in this chapter.

FIGURE 6-3 Comparison of the Results of Applying Serial Actions Versus a Multiple-Step Transaction

(a) Two of Three Activities Successfully Completed, Resulting in Database Anomalies
  1. Add new-order data to CUSTOMER.
  2. Add new-order data to SALESPERSON.
  3. Insert new ORDER (fails because ORDERS is full).

(b) No Change Made Because Entire Transaction Not Successful
  Start Transaction
  Change CUSTOMER data
  Change SALESPERSON data
  Insert ORDER data
  If no errors then
    Commit Transaction
  Else
    Rollback Transaction
  End If

Concurrent Transaction Processing

When two transactions are being processed against a database at the same time, they are termed concurrent transactions. Although it might appear to the users that concurrent transactions are being processed simultaneously, this is not necessarily true because modern multi-core central processing units (CPUs) of the machine processing the database can execute only a few instructions at a time. Usually transactions are interleaved, which means the operating system switches CPU services among tasks so that only a portion of each of them is carried out in a given interval. This switching among tasks is done so quickly that two people seated at browsers side by side, processing against the same database, might believe that their two transactions are completed simultaneously. However, in reality the two transactions are interleaved.

FIGURE 6-4 Example of Concurrent Processing of Two Users' Tasks

User A
  1. Read Item 100.
  2. Change Item 100.
  3. Write Item 100.

User B
  1. Read Item 200.
  2. Change Item 200.
  3. Write Item 200.

One possible order of processing at database server
  1. Read Item 100 for A.
  2. Read Item 200 for B.
  3. Change Item 100 for A.
  4. Write Item 100 for A.
  5. Change Item 200 for B.
  6. Write Item 200 for B.

Figure 6-4 shows two concurrent transactions. User A's transaction reads Item 100, changes it, and rewrites it in the database. User B's transaction takes the same actions but on Item 200. The CPU processes User A's transaction until the CPU must wait for a read or write operation to complete or for some other action to finish.
The operating system then shifts control to User B. The CPU processes User B's transaction until a similar interruption in the transaction processing occurs, at which point the operating system passes control back to User A. Again, to the users, the processing appears to be simultaneous, but in reality it is interleaved, or concurrent.

The Lost Update Problem

The concurrent processing illustrated in Figure 6-4 poses no problems because the users are processing different data. Now suppose both users want to process Item 100. For example, User A wants to order 5 units of Item 100, and User B wants to order 3 units of Item 100. Figure 6-5 illustrates this problem.

User A reads Item 100's record (from the database on disk), which is transferred into a user work area (in memory). According to the record, 10 items are in inventory. Then User B reads Item 100's record, and it goes into another user work area. Again, according to the record, 10 items are in inventory. Now, User A takes 5 of them, decrements the count of items in its user work area to 5, and rewrites the record for Item 100. Then User B takes 3, decrements the count in its user work area to 7, and rewrites the record for Item 100. The database now shows, incorrectly, that 7 units of Item 100 remain in inventory. To review, the inventory started at 10, then User A took 5, User B took 3, and the database wound up showing that 7 were left in inventory. Clearly, this is a problem.

Both users obtained data that were correct at the time they obtained the data. However, when User B read the record, User A already had a copy that it was about to update. This situation is called the lost update problem or the concurrent update problem. Another similar problem is called the inconsistent read problem. In this situation, User A reads data that have been processed by only a portion of a transaction from User B.
As a result, User A reads incorrect data.

FIGURE 6-5 Example of the Lost Update Problem

User A
  1. Read Item 100 (assume item count is 10).
  2. Reduce count of items by 5.
  3. Write Item 100.

User B
  1. Read Item 100 (assume item count is 10).
  2. Reduce count of items by 3.
  3. Write Item 100.

Order of processing at database server
  1. Read Item 100 (for A).
  2. Read Item 100 (for B).
  3. Set item count to 5 (for A).
  4. Write Item 100 for A.
  5. Set item count to 7 (for B).
  6. Write Item 100 for B.
  Note: The change and write in steps 3 and 4 are lost.

Resource Locking

One remedy for the inconsistencies caused by concurrent processing is to prevent multiple applications from obtaining copies of the same rows or tables when those rows or tables are about to be changed. This remedy, called resource locking, prevents concurrent processing problems by disallowing sharing by locking data that are retrieved for update. Figure 6-6 shows the order of processing using lock commands.

Because of the lock, User B's transaction must wait until User A is finished with the Item 100 data. Using this strategy, User B can read Item 100's record only after User A has completed the modification. In this case, the final item count stored in the database is 2, which is what it should be. (It started with 10, then A took 5 and B took 3, leaving 2.)

FIGURE 6-6 Example of Concurrent Processing with Explicit Locks

User A
  1. Lock Item 100.
  2. Read Item 100.
  3. Reduce count by 5.
  4. Write Item 100.

User B
  1. Lock Item 100.
  2. Read Item 100.
  3. Reduce count by 3.
  4. Write Item 100.

Order of processing at database server
  A's transaction
  1. Lock Item 100 for A.
  2. Read Item 100 for A.
  3. Lock Item 100 for B; cannot, so place B in wait state.
  4. Set item count to 5 for A.
  5. Write Item 100 for A.
  6. Release A's lock on Item 100.
  B's transaction
  7. Place lock on Item 100 for B.
  8. Read Item 100 for B.
  9. Set item count to 2 for B.
  10. Write Item 100 for B.
  11. Release B's lock on Item 100.

Locks can be placed automatically by the DBMS or by a command issued to the DBMS from the application program or by running a query. Locks placed by the DBMS are called implicit locks; those placed by command are called explicit locks.

In the preceding example, the locks were applied to rows of data; however, not all locks are applied at this level. Some DBMS products lock at the page level (group of rows stored together physically in secondary memory), some at the table level, and some at the database level. The size of a lock is referred to as the lock granularity. Locks with large granularity (table and database levels) are easy for the DBMS to administer but frequently cause conflicts. Locks with small granularity (page, row, or field) are difficult to administer, because the DBMS has many more details to keep track of and check, but conflicts are less common.

Locks also vary by type. An exclusive lock locks an item from access of any type. No other transaction can read or change the data. A shared lock locks an item from being changed but not from being read; that is, other transactions can read the item as long as they do not attempt to alter it.

Serializable Transactions

When two or more transactions are processed concurrently, the results in the database should be logically consistent with the results that would have been achieved had the transactions been processed in an arbitrary serial fashion. Serial means that the currently executing transaction runs to completion before any other transaction, or part of a transaction, is executed: there is no interleaving or concurrency. This differs from parallel, which means that two or more actions can be done at once, or concurrent, which is when actions from different transactions are interleaved.
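The lost update and resource locking examples above can be replayed with a toy scheduler and compared against a serial run. This is a minimal sketch in Python, not how any DBMS schedules work: the scheduler, the step names, and the single exclusive lock are all assumptions made for illustration. Each user gets one step per turn, and a user whose lock request cannot be granted is put in a wait state until the lock is released.

```python
from collections import deque

def run_round_robin(programs, start_count):
    """Give each user one step per turn; a user waiting on the lock loses its turn."""
    queues = {user: deque(steps) for user, steps in programs.items()}
    stored = start_count      # the item count held in the database
    work_area = {}            # each user's private copy of the count
    lock_holder = None        # user holding the exclusive lock, if any
    while any(queues.values()):
        for user, steps in queues.items():
            if not steps:
                continue
            action, amount = steps[0]
            if action == "lock" and lock_holder not in (None, user):
                continue      # wait state: retry on the next turn
            steps.popleft()
            if action == "lock":
                lock_holder = user
            elif action == "unlock":
                lock_holder = None
            elif action == "read":
                work_area[user] = stored
            elif action == "take":
                work_area[user] -= amount
            elif action == "write":
                stored = work_area[user]
    return stored

def order(amount, use_lock):
    """Read, reduce, and write the count, optionally wrapped in lock/unlock."""
    steps = [("read", 0), ("take", amount), ("write", 0)]
    if use_lock:
        steps = [("lock", 0)] + steps + [("unlock", 0)]
    return steps

unlocked_result = run_round_robin({"A": order(5, False), "B": order(3, False)}, 10)
locked_result = run_round_robin({"A": order(5, True), "B": order(3, True)}, 10)
serial_result = 10 - 5 - 3    # A runs to completion, then B
```

Without locks the interleaved run ends at 7, the lost update; with the lock it ends at 2, the same count a serial run produces.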
A scheme for processing concurrent transactions in such a way that the database results are consistent with a serial execution is said to be serializable.

Serializability can be achieved through a number of different means. One way is to process the transaction by using two-phase locking. With this strategy, transactions are allowed to obtain locks as necessary, but once the first lock is released, no new locks can be obtained. Transactions thus have a growing phase in which the locks are obtained and a shrinking phase in which the locks are released.

A special case of two-phase locking, called strict two-phase locking, is used with a number of DBMS products. With it, locks are obtained throughout the transaction, but no lock is released until the COMMIT or ROLLBACK command is issued. This strategy is more restrictive than two-phase locking requires, but it is easier to implement.

Consider an order-entry transaction that involves processing data in the CUSTOMER, SALESPERSON, and ORDER tables. To make sure the database will suffer no anomalies due to concurrency, the order-entry transaction issues locks on CUSTOMER, SALESPERSON, and ORDER as needed; makes all the database changes; and then releases all its locks.

Deadlock

Although locking solves one problem, it causes another. Consider what might happen when two users each want to order two items from inventory. Suppose User A wants to order some paper, and, if she can get the paper, she also wants to order some pencils. In addition, suppose that User B wants to order some pencils, and, if he can get the pencils, he also wants to order some paper. An example of the possible order of processing is shown in Figure 6-7.

In this figure, Users A and B are locked in a condition known as deadlock, sometimes called the deadly embrace. Each is waiting for a resource that the other person has locked.
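One common way to reason about this state is a wait-for graph, in which each waiting user points at the user holding the resource it wants; a cycle in the graph means deadlock. The sketch below is a hypothetical illustration in Python of that idea for the paper and pencils example. It assumes each user waits for at most one other user, and it is not a description of what any particular DBMS does internally.

```python
def find_deadlock(waits_for):
    """Return the users forming a cycle in the wait-for graph, or None."""
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of "is waiting for" links until it ends or repeats.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            return path[path.index(current):]
    return None

# Locks currently held, and the lock each user is now requesting.
held = {"paper": "A", "pencils": "B"}
wanted = {"A": "pencils", "B": "paper"}

# A user waits for whoever holds the item it wants (if that is someone else).
waits_for = {
    user: held[item]
    for user, item in wanted.items()
    if held.get(item) not in (None, user)
}

cycle = find_deadlock(waits_for)
```

Here `waits_for` says that A waits for B and B waits for A, so `cycle` names both users. If only A were waiting, the chain would end at B and no cycle would be reported.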
Two common ways of solving this problem are preventing the deadlock from occurring and allowing the deadlock to occur and then breaking it.

FIGURE 6-7 Example of Deadlock

User A
  1. Lock paper.
  2. Take paper.
  3. Lock pencils.

User B
  1. Lock pencils.
  2. Take pencils.
  3. Lock paper.

Order of processing at database server
  1. Lock paper for User A.
  2. Lock pencils for User B.
  3. Process A's request; write paper record.
  4. Process B's request; write pencil record.
  5. Put A in wait state for pencils.
  6. Put B in wait state for paper.
  ** Locked **

Deadlock can be prevented in several ways. One way is to allow users to issue only one lock request at a time; in essence, users must lock all the resources they want at once. For example, if User A in Figure 6-7 had locked both the paper and the pencil records at the beginning, the deadlock would not have occurred. A second way to prevent deadlock is to require all application programs to lock resources in the same order.

Almost every DBMS has algorithms for detecting deadlock. When deadlock occurs, the normal solution is to roll back one of the transactions to remove its changes from the database.

Optimistic Versus Pessimistic Locking

Locks can be invoked in two basic styles. With optimistic locking, the assumption is made that no conflict will occur. Data are read, the transaction is processed, updates are issued, and then a check is made to see whether conflict occurred. If there was no conflict, the transaction finishes. If there was conflict, the transaction is repeated until it processes with no conflict. With pessimistic locking, the assumption is made that conflict will occur. Locks are issued, the transaction is processed, and then the locks are freed.

Figure 6-8 and Figure 6-9 show examples of both styles of locking for a transaction that is reducing the quantity of the pencil row in the PRODUCT table by 5. Figure 6-8 shows optimistic locking.
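The optimistic style can also be sketched in runnable form. The example below uses Python's built-in sqlite3 module rather than MySQL and is only an approximation of the idea: it relies on the implicit lock the DBMS takes during the UPDATE instead of issuing an explicit lock. The `before_update` hook, the `max_tries` limit, and the starting quantity of 20 are assumptions added here so that a conflicting change can be simulated.

```python
import sqlite3

def take_pencils(conn, amount, before_update=None, max_tries=3):
    """Reduce the pencil quantity optimistically; repeat if a conflict occurred."""
    for attempt in range(1, max_tries + 1):
        old_quantity = conn.execute(
            "SELECT Quantity FROM PRODUCT WHERE Name = 'Pencil'").fetchone()[0]
        new_quantity = old_quantity - amount
        if before_update:
            before_update(attempt)   # hypothetical hook: lets another writer interfere
        # The UPDATE succeeds only if Quantity still equals the value read earlier.
        cur = conn.execute(
            "UPDATE PRODUCT SET Quantity = ? WHERE Name = 'Pencil' AND Quantity = ?",
            (new_quantity, old_quantity))
        conn.commit()
        if cur.rowcount == 1:
            return attempt           # no conflict: the transaction finishes
    raise RuntimeError("gave up after repeated conflicts")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (Name TEXT PRIMARY KEY, Quantity INTEGER)")
conn.execute("INSERT INTO PRODUCT VALUES ('Pencil', 20)")
conn.commit()

def other_user(attempt):
    # On the first attempt only, another user changes the row after it was read.
    if attempt == 1:
        conn.execute("UPDATE PRODUCT SET Quantity = 18 WHERE Name = 'Pencil'")

tries = take_pencils(conn, 5, before_update=other_user)
remaining = conn.execute(
    "SELECT Quantity FROM PRODUCT WHERE Name = 'Pencil'").fetchone()[0]
```

The first attempt updates no rows because the quantity is no longer 20, so the transaction is repeated against the fresh value of 18 and succeeds on the second attempt.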
First, the data are read and the current value of Quantity of pencils is saved in the variable OldQuantity. The transaction is then processed, and, if no errors have occurred, a lock is obtained on PRODUCT. The lock might be only for the pencil row, or it might be at a larger level of granularity. In any case, an SQL statement is issued to update the pencil row with a WHERE condition that the current value of Quantity equals OldQuantity. If no other transaction (or set of transactions) has changed the Quantity of the pencil row, then this UPDATE will be successful. If another transaction (or set of transactions) has changed the Quantity of the pencil row, the UPDATE will fail, and the transaction will need to be repeated.

Figure 6-9 shows the logic for the same transaction using pessimistic locking. In this case, a lock is obtained on PRODUCT (at some level of granularity) before any work is begun. Then values are read, the transaction is processed, the UPDATE occurs, and PRODUCT is unlocked.

The advantage of optimistic locking is that the lock is obtained only after the transaction has been processed. Thus, the lock is held for less time than with pessimistic locking. If the transaction is complicated or if the client is slow (due to transmission delays or to the

FIGURE 6-8 Example of Optimistic Locking

  SELECT PRODUCT.Name, PRODUCT.Quantity
  FROM PRODUCT
  WHERE PRODUCT.Name = 'Pencil'

  OldQuantity = PRODUCT.Quantity
  Set NewQuantity = PRODUCT.Quantity - 5
  (process transaction - take exception action if NewQuantity [...]

[...] 10000;

After an application program opens a cursor, it can place the cursor somewhere in the result set. Most commonly, the cursor is placed on the first or last row, but other possibilities exist. Each row from the result set is retrieved using the MySQL FETCH SQL command.
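Row-at-a-time retrieval through a cursor can be illustrated with a small sketch. It uses the cursor object of Python's built-in sqlite3 module, which is only an analogue of a cursor declared inside a MySQL stored program; the PRODUCT rows here are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (Name TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("Pencil", 20), ("Paper", 500), ("Eraser", 75)],
)

# Opening the cursor defines the result set; no rows have been fetched yet.
cursor = conn.execute("SELECT Name, Quantity FROM PRODUCT ORDER BY Name")

names = []
while True:
    row = cursor.fetchone()   # fetch the next row, or None past the last row
    if row is None:
        break
    names.append(row[0])
cursor.close()
```

This cursor only moves forward: once a row has been fetched, the application cannot step back to it without reopening the query.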
There is an ISO standard for defining cursors, and most DBMSs have their own extensions for declaring and processing cursors.

FIGURE 6-13 Summary of Cursor Types

Static
  Description: Application sees the data as they were at the time the cursor was opened.
  Comments: Changes made by this cursor are visible. Changes from other sources are not visible. Backward and forward scrolling are allowed.

Keyset
  Description: When the cursor is opened, a primary key value is saved for each row in the recordset. When the application accesses a row, the key is used to fetch the current values for the row.
  Comments: Updates from any source are visible. Inserts from sources outside this cursor are not visible (there is no key for them in the keyset). Inserts from this cursor appear at the bottom of the recordset. Deletions from any source are visible. Changes in row order are not visible. If the isolation level is dirty read, then uncommitted updates and deletions are visible; otherwise, only committed updates and deletions are visible.

Dynamic
  Description: Changes of any type and from any source are visible.
  Comments: All inserts, updates, deletions, and changes in recordset order are visible. If the isolation level is dirty read, then uncommitted changes are visible. Otherwise, only committed changes are visible.

A transaction can open several cursors, either sequentially or simultaneously. In addition, two or more cursors may be open on the same table either directly on the table or through an SQL view on that table. Cursors require considerable memory, so having many of them open at the same time (for example, for a thousand concurrent transactions) is costly. One way to reduce cursor burden is to define reduced-capability cursors and use them when a full-capability cursor is not needed.

Figure 6-13 lists three cursor types supported by Microsoft SQL Server. In SQL Server, cursors may be either forward-only cursors or scrollable cursors.
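The visibility rules summarized in Figure 6-13 can be mimicked with a toy model. In the Python sketch below, a table is just a dictionary keyed by primary key, and each "cursor" is a function that returns what it can currently see. This is an illustration of the three behaviors only: the function names are hypothetical, and isolation levels, scrolling, and changes made through the cursor itself are ignored.

```python
def open_static(table):
    snapshot = dict(table)                  # copy taken when the cursor opens
    return lambda: sorted(snapshot.items())

def open_keyset(table):
    keys = sorted(table)                    # only the key values are saved
    # Current values are read through the saved keys; deleted rows drop out.
    return lambda: [(k, table[k]) for k in keys if k in table]

def open_dynamic(table):
    return lambda: sorted(table.items())    # re-reads the table on every access

product = {1: "Pencil", 2: "Paper"}
static_cursor = open_static(product)
keyset_cursor = open_keyset(product)
dynamic_cursor = open_dynamic(product)

# Changes made by another source after the cursors were opened.
product[1] = "Pen"        # update
product[3] = "Eraser"     # insert
del product[2]            # delete

static_view = static_cursor()
keyset_view = keyset_cursor()
dynamic_view = dynamic_cursor()
```

The static view still shows the original two rows, the keyset view shows the update and the deletion but not the insert, and the dynamic view shows all three changes.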
With a forward-only cursor, the application can only move forward through the records, and changes made by other cursors in this transaction and other transactions will be visible only if they occur to the rows ahead of the cursor. With a scrollable cursor, the application can scroll forward and backward through the records.

There are three types of SQL Server cursors, each of which can be implemented as either a forward-only or scrollable cursor. A static cursor takes a snapshot of a relation and processes that snapshot. Changes made using this cursor are visible; changes from other sources are not visible. MySQL supports only static, forward-only cursors.

A dynamic cursor is a fully featured cursor. All inserts, updates, deletions, and changes in row order are visible to a dynamic cursor. Unless the isolation level of the transaction is read uncommitted, only committed changes are visible.

Keyset cursors combine some features of static cursors with some features of dynamic cursors. When the cursor is opened, a primary key value is saved for each row. When the application positions the cursor on a row, the DBMS uses the key value to read the current value of the row. Inserts of new rows by other cursors (in this transaction or in other transactions) are not visible. If the application issues an update on a row that has been deleted