We have a cluster MySQL 8.0.32-24 for Linux on x86_64 (Percona Server, Release 24, Revision e5c6e9d2) Master/Slave. Database size is 916GB. Relation is lagging, in show slave Status\G shows Slave_SQL_Running_State: Waiting for dependent transaction to commit.
The lag floating for the last 24 hours rose to 16k and dropped to Seconds_Behind_Master: 8021.
Hello mag1kan1n,
It’s expected to see in process list Waiting for dependent transaction to commit when using multithreaded replication, as not all transactions may be applied in parallel ( it’s consistency cost )
The first thing I would like you to suggest is to relax ACID settings to reduce load on IO system and speedup the replication process:
SET GLOBAL innodb_flush_log_at_trx_commit=2,sync_binlog=0;
But it would help if you considered the risk of the losing all transactions for the last 1 minute in case of crash of mysql, so it’s not good option to set on writer node in case if you are using this node as replica and the writer as well. However if it’s pure replica, it’s one of the best way to increase replication speed.
There are multiple causes that can make a replica fall behind. You should check if there is CPU starvation, memory swapping , disk saturation else replication lagging behind might be due to contention. Make sure no subsystem is saturated because I see some unusually large buffer configuration and it’s possible there is swapping or high memory pressure (I am specially looking at you join_buffer_size = 320M ).
Note that even if you set multiple parallel workers, at times only 1 (or a few) transactions can be concurrently executing and might make a replica fall behind. For example DDLs and very large transactions will make the replica be single threaded for a while. You can find out parallel replication efficiency with the above blogpost
If replication is mostly single threaded then you should tune your workload rather than tune MySQL configuration.
For running DDLs on the primary you should use pt-online-schema-change pt-online-schema-change — Percona Toolkit Documentation
if you have very large transactions then you should split them into smaller chunks. I.e rather than deleting 1M rows in 1 transaction, commit every 10k rows
You can also try setting innodb_flush_method = O_DIRECT rather than O_DSYNC
Last, remember that every parameter that you configure in the database will be a tradeoff between using more resources or relaxing consistency at the cost of performance. Relaxing consistency (disabling innodb_doublewrite like you did) can cause corruption in the event of a crash so you should not change parameters lightly as they can cause more harm than good
innodb_io_capacity = 100 <-- change to default 200
tmp_table_size = 4G <-- The actual value used will be the smaller
max_heap_table_size = 8G <-- of these two. You should make them the same.
sync_binlog = 1 <-- causes LOTS of disk IO. You should set to 1000 so it syncs less often
Hi all, we were helped by including innodb_flush_method = O_DIRECT instead of O_DSYNC. Now there is no lag on the replica. But as far as I understand from the description, this is a tradeoff between fault tolerance and speed? Here’s a 24-hour graph -
The different between O_DSYNC and O_DIRECT is that one uses filesystem buffers, and the other does not. Technically, yes, this can be less fault tolerance because you are skipping the filesystem’s journal, but InnoDB has its own protection against this which is the doublewrite buffer. Ideally, you would turn the dblwb on and use O_DIRECT for maximum performance with data reliability.