MySQL 8.0.37 with 2 Replicas, one with replica lag, the other isn't

CRBML · July 5, 2024, 8:09am

Sorry this forum must be full of this.

But I have been struggling with this for a long time now. I have one primary instance with 2 replicas, and one is having delay every morning when the load is heavy.

The Data is a mix of InnoDB and MyISAM tables each on separated disks

Here are the config files (of both replicas)

slave_preserve_commit_order=off
slave_parallel_type=DATABASE
binlog_format   = MIXED
sync_binlog     = 0
max_allowed_packet      = 128M
thread_stack            = 192K
thread_cache_size       = 8


table_definition_cache  = 8192
table_open_cache        = 1500
open_files_limit        = 2500
replica_parallel_workers = 8
relay_log_recovery      = ON
relay_log_space_limit   = 50G
max_heap_table_size     = 128M
tmp_table_size          = 128M

innodb_file_per_table = 1
innodb_buffer_pool_size=32G
innodb_read_io_threads=12
innodb_write_io_threads=12
innodb_flush_neighbors=0
innodb_log_file_size=1G
innodb_flush_log_at_trx_commit = 2

The replicas are on 2 differents data centers, but I don’t think the hardware is the issue there. The replication was working fine before upgrading to MySQL 8.

On PMM, I noticed that the hourly written relay log is much more heavy on the laggy replica than on the functionnal one (<40G vs 8G ).

As the replicas have the same parameters for replication, I don’t understand this difference in replicated logs.

matthewb · July 5, 2024, 2:25pm

If the config files are exactly the same, and the data being replicated is the same (ie: no replication filters), then we need to examine the hardware. Please setup Percona Monitoring and Management (PMM) to monitor all 3 hosts. We need to see CPU/Disk/Memory/Network for each host in order to make a proper diagnosis.

CRBML · July 8, 2024, 8:10am

Thank you for your response,

The Primary instance (32vCPU 128GB RAM) and the Problematic replica (16vCPU 64GB Ram) are on AMD EPYC 7571 inside VMs (in the same DC).

The Replica that doesn’t show issues is on bare metal 32vCPU 128GB RAM on AMD EPYC 7371 16-Core Processor

Here are some summary graphs and replication status on the machines :

Primary :

Replica Working correctly :

Replica with Lag :

The replication lag starts at 7:30 when the load is heavy.
and you can see that the relay logs written/h are heavy on the problematic replica

Please tell me if you need other data.

matthewb · July 9, 2024, 1:12pm

Ok. The replica which is lagging behind is 1/2 as powerful as the source. Best practice is that source and replica should have the same hardware/configuration.

Also, looking at your source graph, ‘MySQL Handlers’. You have 2 million op/s for READ_RND_NEXT which represents FULL TABLE SCAN. The next highest operations are Index Reads at 674K op/s.

You have a much bigger problem with bad queries than replication lag. Improving queries will not only reduce your near-max-cpu problem, but will help replication lag as well.

Topic		Replies	Views
replication lag Other MySQL® Questions	1	451	October 26, 2007
Replication lag after upgrade to Percona Server 8.0	9	2851	January 25, 2023
MySQL8 replication lagging MySQL & MariaDB mysql	6	848	June 27, 2024
Replication lagging badly Other MySQL® Questions	2	1427	August 24, 2012
Reduce Replication Lag \| Percona Community Community website	5	1218	April 22, 2025

MySQL 8.0.37 with 2 Replicas, one with replica lag, the other isn't

Related topics