8/13/2016

Data concurrency in DB2 deep dive

DB2 concurrency basics

In a multi-user database environment, transactions are usually executed simultaneously. Each transaction has the potential to interfere with any other transaction that is running. Without any appropriate locking mechanism, the following undesirable effects may occur:

Lost updates: Two applications, A and B, might both read the same row from the database, and both calculate new values for one of its columns based on the data these applications read. If A updates the row with its new value and B then also updates the row, the update performed by A is lost.


Access to uncommitted data: Application A might update a value in the database, and application B might read that value before it was committed. Then, if the value of A is not later committed, but backed out, the calculations performed by B are based on uncommitted (and presumably invalid) data.

Non-repeatable reads: Some applications involve the following sequence of events:
Application A reads a row from the database, then goes on to process other SQL requests.
In the meantime, application B either modifies or deletes the row and commits the change.
Later, when application A attempts to read the original row again, it receives the modified row or discovers that the original row has been deleted.

Phantom reads: The phantom read phenomenon occurs when:
Your application executes a query that reads a set of rows based on some search criterion.
Another application inserts new data or updates existing data that would satisfy your application's query.
Your application repeats the query from the first step (within the same unit of work). Now some additional rows are returned as part of the result set that were not returned when the query was initially executed in the first step.

Isolation levels


DB2 Universal Database provides different levels of protection to isolate the data from each of the database applications while it is being accessed. These levels of protection are known as isolation levels, or locking strategies. Choosing an appropriate isolation level ensures data integrity and also avoids unnecessary locking. The isolation levels supported by DB2 are listed below, ordered in terms of concurrency, starting with the maximum:

Uncommitted Read: The Uncommitted Read (UR) isolation level, also known as "dirty read," is the lowest level of isolation supported by DB2. It can be used to access uncommitted data changes of other applications. For example, an application using the Uncommitted Read isolation level will return all of the matching rows for the query, even if that data is in the process of being modified and may not be committed to the database. You need to be aware that if you are using Uncommitted Read, two identical queries may get different results, even if they are issued within a unit of work, since other concurrent applications can change or modify those rows that the first query retrieves.

Uncommitted Read transactions will hold very few locks. Thus they are not likely to wait for other transactions to release locks. If you are accessing read-only tables or it is acceptable for the application to retrieve uncommitted data updated by another application, use this isolation level, because it is most preferable in terms of performance.

Cursor Stability: The Cursor Stability (CS) isolation level is the default isolation level and locks any row on which the cursor is positioned during a unit of work. The lock on the row is held until the next row is fetched or the unit of work is terminated. If a row has been updated, the lock is held until the unit of work is terminated. A unit of work is terminated when either a COMMIT or ROLLBACK statement is executed.
An application using Cursor Stability cannot read uncommitted data. In addition, the application locks the row that has been currently fetched, and no other application can modify the contents of the current row. As the application locks only the row on which the cursor is positioned, two identical queries may still get different results even if they are issued within a unit of work.
When you want the maximum concurrency while seeing only committed data from concurrent applications, choose this isolation level.

Read Stability: The Read Stability (RS) isolation level locks those rows that are part of a result set. If you have a table containing 100,000 rows and the query returns 10 rows, then only 10 rows are locked until the end of the unit of work.
An application using Read Stability cannot read uncommitted data. Instead of locking a single row, it locks all rows that are part of the result set. No other application can change or modify these rows. This means that if you issue a query twice within a unit of work, the second run can retrieve the same answer set as the first. However, you may get additional rows, as another concurrent application can insert rows that match to the query.

Repeatable Read: The Repeatable Read (RR) isolation level is the highest isolation level available in DB2. It locks all rows that an application references within a unit of work, no matter how large the result set. In some cases, the optimizer decides during plan generation that it may get a table level lock instead of locking individual rows, since an application using Repeatable Read may acquire and hold a considerable number of locks. The values of the LOCKLIST and MAXLOCKS database configuration parameters will affect this decision.
An application using Repeatable Read cannot read uncommitted data of a concurrent application. As the name implies, this isolation level ensures the repeatable read to applications, meaning that a repeated query will get the same record set as long as it is executed in the same unit of work. Since an application using this isolation level holds more locks on rows of a table, or even locks the entire table, the application may decrease concurrency. You should use this isolation level only when changes to your result set within a unit of work are unacceptable.

When to choose which isolation level

When you choose the isolation level for your application, decide which concurrency problems are unacceptable for your application and then choose the isolation level which prevents these problems. Remember that the more protection you have, the less concurrency is available.
  • ·         Use the Uncommitted Read isolation level only if you use queries on read-only tables, or if you are using only SELECT statements and getting uncommitted data from concurrent applications is acceptable. This isolation level provides the maximum concurrency.
  • ·         Use the Cursor Stability isolation level when you want the maximum concurrency while seeing only committed data from concurrent applications.
  • ·         Use the Read Stability isolation level when your application operates in a concurrent environment. This means that qualified rows have to remain stable for the duration of the unit of work.
  • ·         Use the Repeatable Read isolation level if changes to your result set are unacceptable. This isolation level provides minimum concurrency.


How to specify isolation levels:

If no isolation level is specified, DB2 Universal Database will use a default of Cursor Stability (CS). The following options are available to specify the current isolation level:

At the statement level:
SELECT (INTO), DELETE, UPDATE ... WITH {RR, RS, CS, UR}

For dynamic SQL within the current session:
SET CURRENT ISOLATION = {RR, RS, CS, UR} or SET ISOLATION {RR, RS, CS, UR}

At precompile or bind time:
ISOLATION {RR, RS, CS, UR} option of the command line processor PREP or BIND commands

From the DB2 Call Level Interface (DB2 CLI):
For DB2 CLI, change the isolation level as part of the DB2 CLI configuration (db2cli.ini). This can be done either by:
UPDATE CLI CFG FOR SECTION sectionname USING TXNISOLATION {1, 2, 4, 8}
where:
1 = SQL_TXN_READ_UNCOMMITTED - Read Uncommitted (Uncommitted Read)
2 = SQL_TXN_READ_COMMITTED (default) - Read Committed (Cursor Stability)
4 = SQL_TXN_REPEATABLE_READ - Repeatable Read (Read Stability)
8 = SQL_TXN_SERIALIZABLE - Serializable (Repeatable Read)

editing db2cli.ini and setting applTxnIsolation = {1, 2, 4, 8}

Locks, lock wait, deadlock, and lock escalation:

Locks:
DB2 Universal Database isolates transactions from each other through the use of locks. A lock is a mechanism that is used to associate a data resource with a single transaction, with the purpose of controlling how other transactions interact with that resource while it is associated with the owning transaction. (The transaction that a locked resource is associated with is said to "hold" or "own" the lock). The DB2 Database Manager uses locks to prohibit transactions from accessing uncommitted data written by other transactions (unless the Uncommitted Read isolation level is used) and to prohibit the updating of rows by other transactions when the owning transaction is using a restrictive isolation level. Once a lock is acquired, it is held until the owning transaction is terminated (COMMIT or ROLLBACK). At that point, the lock is released and the data resource is made available to other transactions.

Lock Wait:
If one transaction attempts to access a data resource in a way that is incompatible with the lock being held by another transaction, that transaction must wait until the owning transaction has ended. When a lock wait event occurs, the transaction attempting to access the data resource simply stops execution until the owning transaction has terminated and the incompatible lock is released.

Deadlock:
A deadlock occurs when two or more transactions are in a cycle of lock wait for one another, and a lock timeout does not break the cycle.

Lock Escalation:
When the DB2 Database Manager escalates a lock from a smaller lock granularity to a higher one due to insuffient lock memory space (for example, when it converts many row locks into a single table lock), this is called lock escalation. This process is depending on two database parameters: LOCKLIST (amount of memory allocated for the locks) and MAXLOCKS (maximum percent of lock list before escalation).

Monitoring locks and lock information

MONITOR SWITCH -- In order to gain full monitor lock information, turn on the lock monitor switch (or reset it, if it is already turned on):
·         UPDATE MONITOR SWITCHES USING LOCK ON (or OFF)
·         RESET MONITOR ALL (or FOR DB <database>)
           
GET SNAPSHOT FOR -- You can retrieve lock information by issuing a GET SNAPSHOT command for the following objects:
·         ALL DATABASES
·         ALL APPLICATIONS
·         APPLICATION APPLID <appl-id> (or AGENTID <appl-handle>)
·         LOCKS ON <database>
·         LOCKS FOR APPLICATION APPLID <appl-id> (or AGENTID <appl-handle>)

db2pd problem determination tool -- DB2 UDB V8.2 introduced a new monitoring and troubleshooting tool called db2pd. You can use this tool to retrieve lock information:
·         db2pd -db <database> -locks showlocks
·         db2pd -db <database> -locks wait
·         db2pd -db <database> -transactions
·         db2pd -db <database> -agents

Snapshot monitor SQL table functions -- You can use SQL queries to retrieve some lock-related snapshot monitor SQL table functions:
·         SNAPSHOT_LOCK
·         SNAPSHOT_APPL
·         SNAPSHOT_LOCKWAIT
·         SELECT * FROM TABLE(SNAPSHOT_APPL('<database>', -1)) AS SNAPSHOT_APPL
·         SELECT agent_id, appl_id FROM TABLE(SNAPSHOT_APPL(CAST (NULL as VARCHAR), -1)) AS SNAPSHOT_APPL
           

Reducing lock waits and minimizing deadlocks:

You can reduce lock waits and deadlocks by:
·         Issuing COMMIT statements at the right frequency
·         Choosing the appropriate isolation level
·         Specifying the FOR FETCH ONLY clause in the SELECT statement
·         Releasing read locks using the WITH RELEASE option of the CLOSE CURSOR statement, if acceptable
·         Using type-2 indexes to eliminate next key locks
·         Tuning the LOCKLIST and MAXLOCKS database configuration parameters to avoid lock escalations that impact concurrency
           
Summary of lock-related parameters and registry variables

Database configuration parameters:

LOCKLIST: Indicates the amount of storage that is allocated to the lock list, in 4K pages.
MAXLOCKS: Defines the minimum percentage of the lock list held by an application that must be filled before the database manager performs lock escalation.
LOCKTIMEOUT: Specifies the number of seconds that an application will wait to obtain a lock. This parameter helps avoid global deadlocks for applications.
DLCHKTIME: The deadlock check interval defines the frequency at which the database manager checks for deadlocks among all the applications connected to a database.
           
Registry Variables:
DB2LOCK_TO_RB: Specifies whether lock timeouts cause the entire transaction to be rolled back, or only the current statement. If DB2LOCK_TO_RB is set to STATEMENT, locked timeouts cause only the current statement to be rolled back. Any other setting results in transaction rollback.

DB2_KEEPTABLELOCK: Allows DB2 UDB to maintain the table lock when an Uncommitted Read or Cursor Stability isolation level is closed. In this case, the table lock is released at the end of the transaction, just as it would be released for Read Stability and Repeatable Read scans.

DB2_MAX_NON_TABLE_LOCKS: defines the maximum number of NON table locks a transaction can have before it releases all of these locks. NON table locks are table locks that are kept in the hash table and transaction chain even when the transaction has finished using them. Because transactions often access the same table more than once, retaining locks and changing their state to NON can improve performance.

DB2_EVALUNCOMMITTED, DB2_SKIPDELETED and DB2_SKIPINSERTED




0 comments:

Post a Comment

ads