MySQL group replication - how to find queries causing replication lag?

aurelien_shz · March 3, 2021, 8:42pm

We have a MySQL Group Replication cluster with 9 servers, and we are currently experiencing replication lag (i.e., high values in performance_schema.replication_group_member_stats.COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE on several servers).

How can we systematically determine which queries are causing replication lag on a set of MySQL servers running Group Replication? We noticed that neither slow queries nor frequent queries are necessarily correlated with lag across servers.
We have also noticed that some specific joins were causing lag spikes and that the lag dropped once we removed them. This is, however, not the case for all joins.

Thanks for any help!

matthewb · March 3, 2021, 9:19pm

Hi @aurelien_shz ,
Do you have PMM set up and monitoring all 9 servers? Without something like this, which lets you correlate query load, CPU/disk/memory, and MySQL stats, it will be very difficult to figure out what is causing the issue. Are all 9 servers exactly the same, in both hardware and software config?

9 servers is quite high and quite unusual. Have you tried running with fewer, say 5? Do you experience the same issue with fewer members?
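
If PMM is not in place yet, a rough way to at least see which members are lagging is to poll the counter mentioned in the question and rank the members by it. The sketch below is only an illustration, assuming MySQL 8.0 performance_schema table and column names; `lagging_members`, the threshold of 100, and the sample host names are hypothetical, and the rows would come from whatever client you already use to run the query.

```python
# Query to run on any group member. Table and column names are assumed to
# match performance_schema in MySQL 8.0.
APPLIER_QUEUE_SQL = """
SELECT m.MEMBER_HOST,
       s.COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE AS applier_queue
FROM performance_schema.replication_group_member_stats AS s
JOIN performance_schema.replication_group_members AS m USING (MEMBER_ID)
"""


def lagging_members(rows, threshold=100):
    """Return (host, queue_depth) pairs whose applier queue exceeds
    `threshold`, largest queue first.

    `rows` is an iterable of (host, applier_queue) tuples, i.e. the result
    set of APPLIER_QUEUE_SQL. The default threshold is an arbitrary
    placeholder, not a recommended value.
    """
    lagging = [(host, int(queue)) for host, queue in rows if int(queue) > threshold]
    return sorted(lagging, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample rows standing in for a real result set.
sample_rows = [("db1", 0), ("db2", 5400), ("db3", 120)]
print(lagging_members(sample_rows))
```

Sampling this at regular intervals and lining the timestamps up against the query log is a manual version of the correlation PMM would give you. It shows which members fall behind and when, not which queries are responsible.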
