A recent question on the OTN forum asked about narrowing down the cause of deadlocks, and this prompted me to set up a little example. Here’s a deadlock graph of a not-quite-standard type:
Deadlock graph: ---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-00040001-000008EC-00000000-00000000 50 249 X 48 9 X TX-000A001F-000008BC-00000000-00000000 48 9 X 50 249 S
My session (the one that dumped the trace file) is 249, and I was blocked by session 9. The slight anomaly, of course, is that I was waiting on a TX lock in mode 4 (Share) rather than the more common mode 6 (eXclusive).
There are plenty of notes on the web these days to tell you that this wait relates in some way to a unique index (or some associated referential integrity) or an ITL wait. (Inevitably there are a couple of other less frequently occurring and less well documented reasons, such as waits for tablespaces to change state but I’m going to ignore those for now). The question is, how do I tell whether this example is related to uniqueness (indexing) or ITLs ? For recent versions of Oracle the answer is in the rest of the trace file which now hold the recent wait history for the session that dumped the trace file.
Reading down my trace file, past the line which says “Information for THIS session”, I eventually get to this:
Current Wait Stack: 0: waiting for 'enq: TX - allocate ITL entry' name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc wait_id=80 seq_num=81 snap_id=1
So it didn’t take me long to find out I had an ITL problem (which should be a pretty rare occurrence in newer versions of Oracle); but there’s more:
... There is at least one session blocking this session. Dumping 1 direct blocker(s): inst: 1, sid: 9, ser: 40192 Dumping final blocker: inst: 1, sid: 9, ser: 40192 There are 2 sessions blocked by this session. Dumping one waiter: inst: 1, sid: 357, ser: 7531 wait event: 'enq: TX - allocate ITL entry' ... Session Wait History: elapsed time of 0.000035 sec since current wait 0: waited for 'enq: TX - allocate ITL entry' name|mode=0x54580004, usn<<16 | slot=0x5000c, sequence=0xa39 wait_id=79 seq_num=80 snap_id=1 wait times: snap=5.002987 sec, exc=5.002987 sec, total=5.002987 sec wait times: max=5.000000 sec wait counts: calls=2 os=2 occurred after 0.000047 sec of elapsed time 1: waited for 'enq: TX - allocate ITL entry' name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc wait_id=78 seq_num=79 snap_id=1 wait times: snap=1 min 4 sec, exc=1 min 4 sec, total=1 min 4 sec wait times: max=1 min 4 sec wait counts: calls=22 os=22 occurred after 0.000032 sec of elapsed time ... 8: waited for 'enq: TX - allocate ITL entry' name|mode=0x54580004, usn<<16 | slot=0x5000c, sequence=0xa39 wait_id=71 seq_num=72 snap_id=1 wait times: snap=5.001902 sec, exc=5.001902 sec, total=5.001902 sec wait times: max=5.000000 sec wait counts: calls=2 os=2 occurred after 0.000042 sec of elapsed time 9: waited for 'enq: TX - allocate ITL entry' name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc wait_id=70 seq_num=71 snap_id=1 wait times: snap=4.005342 sec, exc=4.005342 sec, total=4.005342 sec wait times: max=4.000000 sec wait counts: calls=2 os=2 occurred after 0.000031 sec of elapsed time ... Sampled Session History of session 249 serial 3931 --------------------------------------------------- The history is displayed in reverse chronological order. sample interval: 1 sec, max history 120 sec --------------------------------------------------- [9 samples, 11:14:50 - 11:14:58] waited for 'enq: TX - allocate ITL entry', seq_num: 81 p1: 'name|mode'=0x54580004 p2: 'usn<= 8 sec (still in wait) [5 samples, 11:14:45 - 11:14:49] waited for 'enq: TX - allocate ITL entry', seq_num: 80 p1: 'name|mode'=0x54580004 p2: 'usn<<16 | slot'=0x5000c p3: 'sequence'=0xa39 time_waited: 5.002987 sec (sample interval: 4 sec) ...
The little report that follows the initial wait state shows that the situation was a little messy – session 9 was my first and last blocker, but there was another session tangled up in the chain of waits, session 357.
Following this there’s a set of entries from my v$session_wait_history - and if you look carefully at the slot and sequence that appears on the second line of each wait you’ll notice that my waits have been alternating between TWO other sessions/transactions before I finally crashed.
Finally there’s a set of entries for my session extracted from v$active_session_history. (Question: I’m only allowed to query v$active_session_history if I’ve licensed the Diagnostic Pack – so should I shut my eyes when I get to this part of the trace file ;) This breakdown also shows my session alternating between waits on the two different blockers, giving me a pretty good post-event breakdown of what was going on around the time of the deadlock.
