We use pt-online-schema-change (ptosc) with Liquibase to apply schema changes. We have a small cluster with thousands of databases, each representing a tenant. This means that each time we need to apply a structural change to the schema, a single process takes a long time to go over all the databases.
Recently we started experimenting, in a test environment with no other traffic, with scaling the number of Liquibase containers so that changes are applied to more than one database at a time. As soon as we went to applying changes to 5 databases simultaneously, we consistently had one or more Liquibase processes fail with deadlocks. A blank tenant needs about 10 structure patches to catch up, and the tables being patched are empty in the test case. Liquibase/ptosc always fails on this exact step:
08:22:04.769 INFO [liquibase.Liquibase]: Error creating triggers: DBD::mysql::db selectall_arrayref failed: WSREP detected deadlock/conflict and aborted the transaction. Try restarting the transaction [for Statement "SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = ? AND ACTION_TIMING = ? AND TRIGGER_SCHEMA = ? AND EVENT_OBJECT_TABLE = ?"] at /usr/bin/pt-online-schema-change line 11444.
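For context, the parallel rollout described above behaves roughly like the following minimal Python sketch. It is an illustration only: in the real setup the workers are separate Liquibase containers rather than threads, and `migrate_all` and `apply_patches` are hypothetical names that are not part of Liquibase or ptosc.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def migrate_all(tenants, apply_patches, max_parallel=5):
    """Call apply_patches(tenant) for every tenant, at most max_parallel at once.

    apply_patches is a hypothetical callable standing in for one
    Liquibase/ptosc run against a single tenant database.
    Returns (succeeded, failed), where failed maps tenant -> error text.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Submit one job per tenant; the pool caps how many run concurrently.
        futures = {pool.submit(apply_patches, t): t for t in tenants}
        for fut in as_completed(futures):
            tenant = futures[fut]
            try:
                fut.result()
                succeeded.append(tenant)
            except Exception as exc:
                # Keep the error text so failures such as the WSREP
                # deadlock message can be inspected per tenant afterwards.
                failed[tenant] = str(exc)
    return succeeded, failed
```

With `max_parallel=1` this matches the original one-at-a-time behaviour; the failures only started appearing at 5.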
While debugging this, we monitored the full processlist on the node, and no process held an active query for any prolonged amount of time (>1s). We looked at the InnoDB engine status, and no deadlock was reported there either. We also tried pt-deadlock-logger; again, no deadlock was logged. There are still some things we will try as time allows, but I wanted to ask the community whether there is anything obvious that I'm missing. The lock wait timeout is not configured in my.cnf, and the node reports the default value of 50s.
This is on MySQL 5.7.40-31.63.1.el7 with ptosc 3.5.4-2.el7. The cluster runs 3 nodes, one of which is the writer node.
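The processlist check mentioned above amounts to something like this sketch. It assumes the rows have already been fetched as dictionaries with the column names that `SHOW FULL PROCESSLIST` returns (`Id`, `Command`, `Time`, `Info`, and so on); the function name `long_running` and the way the rows are obtained are assumptions made for illustration.

```python
def long_running(rows, threshold_s=1):
    """Return processlist rows that have been active longer than threshold_s.

    rows: iterable of dicts keyed like SHOW FULL PROCESSLIST columns.
    Idle connections (Command == "Sleep") are ignored, since their Time
    value is idle time rather than query time.
    """
    return [
        r for r in rows
        if r.get("Command") != "Sleep"
        and r.get("Time") is not None
        and r["Time"] > threshold_s
    ]
```

In our runs a filter like this stayed empty the whole time, which is why a long-held lock did not look like the cause.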