While running the following pt-online-schema-change to replace a table with a newer version of itself we encounter a number of deadlocks that severely impacted query latency, in this case I believe inserts into the source table slowed drastically setting off all our alarms. I am trying to piece together an explanation on why this happened when pt-online-schema-change was designed as a non blocking tool. Management is hard on me to figure this out. So any help would be very much appreciated. Here’s the pt-online-schema-change statement:
sudo pt-online-schema-change --alter='engine=innodb' \
--max-load Threads_running:50,Threads_connected:300 \
--critical-load Threads_running:250,Threads_connected:400 \
--set-vars lock_wait_timeout=3 --tries create_triggers:5:5,drop_triggers:5:5 \
--no-check-replication-filters --recursion-method dsn=t=percona.dsns \
--execute --statistics --print --ask-pass D=logs,t=bll_arg,u=<use>
our underlying application detected locks
Deadlock found when trying to get lock; try restarting transaction
Looking at Innodb Engine Status we see the lock type (lock mode AUTO-INC waiting):
LATEST DETECTED DEADLOCK
------------------------
2020-08-11 19:19:51 7fa2a5926700TOO DEEP OR LONG SEARCH IN THE LOCK TABLE WAITS-FOR GRAPH,
WE WILL ROLL BACK FOLLOWING TRANSACTION
*** TRANSACTION:
TRANSACTION 164388054388, ACTIVE 0 sec setting auto-inc lock, thread declared inside InnoDB 5000
mysql tables in use 2, locked 2
3 lock struct(s), heap size 360, 0 row lock(s), undo log entries 2
MySQL thread id 16530486, OS thread handle 0x7fa2a5926700, query id 45930441097 100.##.###.###
logs update
REPLACE INTO `logs`.`_bll_arg_new` (`bll_arg_id`, `bll_trans_id`, `name`, `value`, `modified`,
`created`) VALUES (NEW.`bll_arg_id`, NEW.`bll_trans_id`, NEW.`name`, NEW.`value`, NEW.`modified`,
NEW.`created`)
*** WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `logs`.`_bll_arg_new` trx id 164388054388 lock mode AUTO-INC waiting
*** WE ROLL BACK TRANSACTION (2)
In short we’re using pt-online-schema-change to copy an old table to a newer version of itself to regain disk space after we deleted over 5B rows.
My environment is as follows:
MySQL:
mysql> show global variables like '%version%';
+-------------------------+------------------------------------------------------+
| Variable_name | Value |
+-------------------------+------------------------------------------------------+
| innodb_version | 5.6.23-72.1 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.23-72.1-log |
| version_comment | Percona Server (GPL), Release 72.1, Revision 0503478 |
| version_compile_machine | x86_64 |
| version_compile_os | debian-linux-gnu |
+-------------------------+------------------------------------------------------+
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| innodb_autoinc_lock_mode | 1 |
+--------------------------+-------+
+---------------+-----------------+
| Variable_name | Value |
+---------------+-----------------+
| tx_isolation | REPEATABLE-READ |
+---------------+-----------------+
percona-toolkit:
/var/log# pt-online-schema-change --version
pt-online-schema-change 2.2.13
Table we were attempting to copy over
CREATE TABLE `bll_arg` (
`bll_arg_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`bll_trans_id` bigint(20) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`value` text,
`modified` int(10) unsigned NOT NULL,
`created` int(10) unsigned NOT NULL,
PRIMARY KEY (`bll_arg_id`),
KEY `bll_trans_id` (`bll_trans_id`)
) ENGINE=InnoDB AUTO_INCREMENT=27560394433 DEFAULT CHARSET=utf8
Again any help in understanding this would be greatly appreciated.