TSM - Road to Deadlocks - Article

Alexandru Crăciun - Senior Database Engineer @ Cognizant Softvision

In this article you will understand what isolation levels are in SQL Server, what the locking and blocking mechanism is about, what a deadlock is, and how to capture and prevent a deadlock in SQL Server.

A deadlock is a situation in which two or more processes are waiting for each other to finish due to locking. The wait is indefinite, so neither of them can make progress anymore.

Deadlocks matter because, if they occur often in a highly transactional database, they can lead to serious performance issues and numerous user errors.

Before we go deeper into what deadlocks are and how to prevent them, we must first understand some basic concepts in relational databases: ACID properties, isolation levels, concurrency issues, and locking and blocking.

ACID Properties

A database transaction is a logical unit of processing in a DBMS that entails one or more database access operations. In a nutshell, database transactions represent the real-world events of any enterprise.

All database access operations executed between the begin-transaction and end-transaction statements are treated as a single logical transaction. While a transaction is in progress, the database may be inconsistent; only when the transaction is committed does the database move from one consistent state to another.

Summary: a transaction is a single logical unit of work in a database. In an RDBMS, a transaction follows the ACID properties: Atomicity, Consistency, Isolation, and Durability.
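As a minimal sketch, a T-SQL transaction groups its statements between BEGIN TRANSACTION and COMMIT/ROLLBACK so they succeed or fail as one unit (the table and column names here are illustrative, not from the article):

```sql
BEGIN TRANSACTION;

-- Transfer money between two accounts as a single logical unit of work
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

-- Either both updates become permanent...
COMMIT TRANSACTION;
-- ...or, on error, neither does:
-- ROLLBACK TRANSACTION;
```

Between BEGIN and COMMIT the database may be temporarily inconsistent (money has left one account but not arrived in the other); atomicity guarantees that no other consistent state is ever exposed.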

Isolation Levels

Isolation levels define the degree to which a transaction must be isolated from the data modifications made by other transactions in SQL Server. Isolation levels isolate the process and protect it from other transactions. To do this, locking is required.

Low isolation levels give high concurrency and shared use of data, but can result in lower consistency, with dirty or phantom reads. SQL Server uses the isolation level to control the read behavior of the database/transaction.

Concurrency Issues

Because multiple transactions can run concurrently in a database system, some of them may read from or write to the same data object (a row, table, or index). This introduces a set of problems that the different transaction isolation levels are designed to address. A transaction isolation level defines how and when the database system presents changes made by one transaction to other user sessions.

In order to select the appropriate transaction isolation level, a good understanding of the common concurrency problems is mandatory. In short, these are: dirty reads (reading uncommitted changes), non-repeatable reads (the same row returns different values when reread), phantom reads (new rows appear when a query is rerun), and lost updates (one transaction silently overwrites another's change).

Types of isolation levels

  1. Read Uncommitted

  2. Read Committed

  3. Repeatable Read

  4. Serializable

  5. Snapshot

Read Uncommitted Transaction Isolation Level

It is the lowest, least restrictive isolation level. A transaction may read not-yet-committed changes made by another transaction; in other words, it allows dirty reads.
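A dirty read can be sketched with two sessions (tbl_test is the table used in the examples later in this article; the fld_name column is assumed for illustration):

```sql
-- Session 1: modifies data but does not commit yet
BEGIN TRANSACTION;
UPDATE tbl_test SET fld_name = 'changed' WHERE fld_id = 1;
-- ... transaction still open ...

-- Session 2: reads the uncommitted change (a dirty read)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT fld_name FROM tbl_test WHERE fld_id = 1;  -- sees 'changed'

-- Session 1: rolls back, so Session 2 has read data that never officially existed
ROLLBACK TRANSACTION;
```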

Read Committed Transaction Isolation Level

It is the SQL Server default isolation level. It prevents reading data from uncommitted transactions, so it does not allow dirty reads. It can still result in non-repeatable reads and phantom reads.

Repeatable Read Transaction Isolation Level

This level ensures that, if data is reread within a transaction, that data does not change: no other transaction can modify it until the reading transaction completes. This comes at the cost that all shared locks acquired by the reads are held until the transaction completes. Dirty and non-repeatable reads are eliminated, but phantom reads may still occur.

Serializable Transaction Isolation Level

This is the most restrictive isolation level and the strongest guarantee of the ACID properties. Read/write locks are held on all of the records affected by the transaction until it completes. It eliminates phantom reads.

Snapshot Transaction Isolation Level

At this isolation level, a transaction sees only data that was committed before the transaction started. This is achieved through row versioning: a separate version of each modified row is maintained in tempdb, which means a high cost in tempdb usage. It eliminates dirty reads, non-repeatable reads, phantom reads, and lost updates.
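Snapshot isolation must first be enabled at the database level before a session can request it; a minimal sketch (the database name is illustrative):

```sql
-- Enable snapshot isolation for the database (row versions are kept in tempdb)
ALTER DATABASE TestDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In a user session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
-- Sees only data committed before the transaction started,
-- without taking shared locks on the rows it reads
SELECT * FROM tbl_test;
COMMIT TRANSACTION;
```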

Locking and blocking

Locking is a mechanism used by SQL Server Database Engine to synchronize access of multiple users to the same piece of data at the same time. Locks are managed internally by Lock Manager. To minimize the cost of locking, SQL Server locks resources automatically at a level appropriate to the task. Locking at a smaller granularity, such as rows, increases concurrency, but has a higher overhead because more locks must be held. The amount of time the lock is held is dependent on the isolation level.

The two basic types of locks are read locks and write locks.

Read Locks

Shared - While a shared lock is held, other transactions can read but cannot modify the locked data. The lock is released as soon as the data has been read, unless the isolation level is Repeatable Read or higher.

Concurrent - This is the behavior when you read data under the Read Uncommitted transaction isolation level or when you specify the NOLOCK hint.

Write Locks

Update - Update locks are a mix of shared and exclusive locks. When an UPDATE statement is executed, SQL Server first has to find the data it wants to modify. To avoid a lock-conversion deadlock, an update lock is used. Only one update lock can be held on a piece of data at a time, similar to an exclusive lock. An update lock by itself does not permit modifying the underlying data, so when the data is actually modified, the lock is converted to an exclusive lock.

Exclusive - Exclusive locks are used to lock data being modified by one transaction, thus preventing modifications by other concurrent transactions. A transaction always gets an exclusive lock on any data it modifies and holds that lock until the transaction completes, regardless of the transaction isolation level set for that transaction.
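The locks currently granted or requested in the instance can be inspected through the sys.dm_tran_locks dynamic management view; a quick sketch:

```sql
-- List each session's locks: the resource being locked, the mode, and the status
SELECT request_session_id,
       resource_type,    -- DATABASE, OBJECT, PAGE, KEY, RID, ...
       request_mode,     -- S, U, X, IS, IX, SIX, ...
       request_status    -- GRANT, WAIT, CONVERT
FROM sys.dm_tran_locks
ORDER BY request_session_id;
```

Sessions whose request_status is WAIT are being blocked by the session that holds the GRANTed lock on the same resource.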

Resources to be locked

RID (Row Identifier) - a single row within a heap

KEY - a row lock within an index, used to protect key ranges

PAGE - an 8 KB data or index page

EXTENT - a contiguous group of eight pages

TABLE - an entire table, including all data and indexes

DATABASE - the entire database

Lock Modes

The main lock modes are Shared (S), Update (U), Exclusive (X), the Intent modes (IS, IX, SIX), the Schema modes (Sch-S, Sch-M), and Bulk Update (BU).

Lock Escalation

SQL Server uses lock escalation to manage the locking granularity. Lock escalation is internally managed and is an optimization technique to control the amount of locks that are held within Lock Manager of SQL Server.

Locking at a lower granularity, such as rows, increases concurrency, as access is only restricted to those rows rather than the whole table. A lock is an in-memory structure of 96 bytes in size.

Lock granularity consists of DATABASE, TABLE, PAGE, and ROW locks. A connection to SQL Server creates a shared lock at the database level, preventing actions such as DROP DATABASE while a connection is still open.

SQL Server applies lock granularity from the top down: it first checks whether a table lock is necessary, then a page lock, then a row lock. An intent lock (intent shared or intent exclusive) is issued on the table and on the page, and a shared or exclusive lock on the row.

Once SQL Server needs more than 5,000 row locks in a single statement, it escalates by default to a table lock.
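Escalation can be observed by modifying more than 5,000 rows in one statement and then counting the locks the session holds (tbl_test and fld_name are illustrative):

```sql
BEGIN TRANSACTION;

-- Touch well over 5,000 rows in a single statement
UPDATE tbl_test SET fld_name = 'x' WHERE fld_id <= 10000;

-- With escalation, a single OBJECT (table) X lock replaces
-- the thousands of individual KEY/RID and PAGE locks
SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
GROUP BY resource_type, request_mode;

ROLLBACK TRANSACTION;
```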

Altering lock escalation manually

The default can be altered by using query hints such as ROWLOCK, PAGLOCK, or TABLOCK:

SELECT * FROM tbl_test WITH (ROWLOCK)
 WHERE fld_id > 1000

You can alter lock escalation for each table using the following syntax:

ALTER TABLE tbl_test
 SET (LOCK_ESCALATION = AUTO)  -- AUTO | TABLE | DISABLE

This is not recommended, as it is best to allow the Database Engine to escalate locks.

Deadlocks

A deadlock occurs when two processes are blocked because each process is holding a resource the other one needs.

Example: Kid_1 holds a lock on Phone_1 and requests a lock on Headphone_2. Kid_2 holds a lock on Headphone_2 and requests a lock on Phone_1. Both kids end up waiting for each other.
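The two-kids scenario maps directly onto two T-SQL sessions updating the same two rows in opposite order (the Phone and Headphone tables are illustrative):

```sql
-- Session 1 (Kid_1)
BEGIN TRANSACTION;
UPDATE Phone     SET Owner = 'Kid_1' WHERE Id = 1;  -- X lock on Phone_1
-- (Session 2 runs its first UPDATE at this point)
UPDATE Headphone SET Owner = 'Kid_1' WHERE Id = 2;  -- waits for Session 2

-- Session 2 (Kid_2)
BEGIN TRANSACTION;
UPDATE Headphone SET Owner = 'Kid_2' WHERE Id = 2;  -- X lock on Headphone_2
UPDATE Phone     SET Owner = 'Kid_2' WHERE Id = 1;  -- waits for Session 1: deadlock
```

Each session holds one exclusive lock and waits for the other's; the cycle is broken only when the engine kills one of them with error 1205.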

Since neither task can continue, the SQL Server Engine decides which one becomes the deadlock victim.

Who is the victim?

The SQL Server Database Engine checks for deadlocks every 5 seconds. The transaction chosen as the victim is the one with the lower DEADLOCK_PRIORITY, which can be set explicitly on the session. If both transactions have the same deadlock priority, the victim is the transaction that is least expensive to roll back. In other words, the transaction that made the fewest changes to the database (the fewest bytes written to the database log) is rolled back.
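The priority can be set per session; the session with the lower value volunteers to be the victim:

```sql
-- Accepts LOW, NORMAL, HIGH, or an integer from -10 to 10 (NORMAL = 0 is the default)
SET DEADLOCK_PRIORITY LOW;  -- if this session deadlocks, it is chosen as the victim

BEGIN TRANSACTION;
-- ... work that may conflict with other sessions ...
COMMIT TRANSACTION;
```

This is useful for long-running reporting or batch jobs that are cheap to retry, so that interactive user transactions survive the deadlock instead.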

Info about Deadlocks in SQL Server

Enable trace flag 1222: DBCC TRACEON(1222, -1), or set it as a startup parameter in SQL Server Configuration Manager. The information can then be found in the SQL Server Error Log.

Use SQL Server Profiler or a server-side trace and add the event Locks: Deadlock Graph.

Use Extended Events: either the default system_health session, or a manually created session with events such as lock_deadlock, lock_deadlock_chain, or xml_deadlock_report.
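Deadlock reports already captured by the default system_health session can be pulled out of its ring buffer with a query along these lines (a sketch of a widely used pattern):

```sql
-- Extract xml_deadlock_report events from the system_health ring buffer
SELECT xed.event_data.value('@timestamp', 'datetime2') AS deadlock_time,
       xed.event_data.query('(data/value/deadlock)[1]') AS deadlock_graph
FROM (
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s
        ON s.address = st.event_session_address
    WHERE s.name = N'system_health'
      AND st.target_name = N'ring_buffer'
) AS tab
CROSS APPLY tab.target_data.nodes(
    'RingBufferTarget/event[@name="xml_deadlock_report"]') AS xed(event_data);
```

The deadlock_graph column contains the same XML that Profiler renders as a deadlock graph, so recent deadlocks can be analyzed without any prior setup.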

Example of a captured deadlock graph in SQL Server:

Common solutions to avoid deadlocks

Now that we understand what a deadlock is, here are some common approaches to preventing or fixing one: