Puzzling deadlocks with "on duplicate key update"

Hi Danielle,

One trx1 already holds a lock which trx2 is requesting, but trx2 holds a lock trx1 was requesting. Check the following example:

tx1:begin;INSERT INTO `rankings` (`namespace`,`bucket`,`item`,`score`) VALUES ('nms1','bucket1','item1',100) ON DUPLICATE KEY UPDATE `rankings`.`score`=VALUES(`score`);<br>tx2:begin;INSERT INTO `rankings` (`namespace`,`bucket`,`item`,`score`) VALUES ('nms2','bucket2','item2',200) ON DUPLICATE KEY UPDATE `rankings`.`score`=VALUES(`score`);INSERT INTO `rankings` (`namespace`,`bucket`,`item`,`score`) VALUES ('nms1','bucket1','item1',300) ON DUPLICATE KEY UPDATE `rankings`.`score`=VALUES(`score`);<br><br>tx1:INSERT INTO `rankings` (`namespace`,`bucket`,`item`,`score`) VALUES ('nms2','bucket2','item2',400) ON DUPLICATE KEY UPDATE `rankings`.`score`=VALUES(`score`);&nbsp;

Results in:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction


Checking error.log, I see that:
------------------------

2020-06-09 19:23:46 0x7f279eafa700*** (1) TRANSACTION:TRANSACTION 1815, ACTIVE 4 sec insertingmysql tables in use 1, locked 1LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2

and

*** (2) TRANSACTION:TRANSACTION 1814, ACTIVE 11 sec insertingmysql tables in use 1, locked 1LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2


In your deadlock, more rows and tables were involved (a more complex transaction?):

TRANSACTION:TRANSACTION 1836940429, ACTIVE 0 sec insertingmysql tables in use 1, locked 1LOCK WAIT 7 lock struct(s), heap size 1136, 4 row lock(s)MySQL thread id 78953000

and

*** (2) TRANSACTION:TRANSACTION 1836940430, ACTIVE 0 sec insertingmysql tables in use 1, locked 1LOCK WAIT 6 lock struct(s), heap size 1136, 4 row lock(s), 


Which means that more statements than in my example were executed and doing gap lock compared to my example. Btw, my example cannot be fixed by using isolation level “READ COMMITED” while yours do, and an educated guess is that you are generating some values to be inserted from previous reads which is solved by changing isolation level.

> Then I tried to change the isolation level to COMMITTED READS, and this fixes the problem, I guess because the exclusive lock on the row, in case of updates, is only taken on the record rather than on the gap (although I don’t understand the deep reason behind this deadlock)

> But now I’m not sure if this could introduce some inconsistencies in the data.

In order to fully understand what happened (and since there are more statements apart from the “INSERT ON UPDATE”), you should check the entire transactions that conflict in a deadlock.
You can log deadlocks using https://www.percona.com/doc/percona-toolkit/3.0/pt-deadlock-logger.html , and then you can review which statements were executed by enabling Performance schema event history table: https://dev.mysql.com/doc/refman/8.0/en/events-statements-history-table.html . You can then then identify the queries which were involved in the transaction using queries similar to the one described here: https://www.psce.com/en/blog/2015/01/22/tracking-mysql-query-history-in-long-running-transactions/

> From what I understand COMMITTED READS should only affect reads within the transaction. But what kind of reads are performed in a “insert … on duplicate key update” transaction?
> Doesn’t look like those insert queries should be affected by any concurrent change of data, especially because, as I mentioned, processes work on different namespace-bucket couples.

Yes, REPEATABL READ and READ COMMITED should only affect reads inside the transaction. You can check handlers by executing "show  status like ‘%handler%’; after executing both INSERT ON DUPLICATE KEY vs only INSERT and it should show same number of rows being read, but since the issue is fixed by changing isolation level, then the difference on locking must be because of previous statements inside the transactions (which as previously mentioned, your transaction shows more structures and rows locked).If you use above query to find out which other statements are being executed at the same time, it will provide more information in understanding what is happening.

1 Like