But I have been struggling with this for a long time now. I have one primary instance with 2 replicas, and one is having delay every morning when the load is heavy.
The Data is a mix of InnoDB and MyISAM tables each on separated disks
The replicas are on 2 differents data centers, but I don’t think the hardware is the issue there. The replication was working fine before upgrading to MySQL 8.
On PMM, I noticed that the hourly written relay log is much more heavy on the laggy replica than on the functionnal one (<40G vs 8G ).
As the replicas have the same parameters for replication, I don’t understand this difference in replicated logs.
If the config files are exactly the same, and the data being replicated is the same (ie: no replication filters), then we need to examine the hardware. Please setup Percona Monitoring and Management (PMM) to monitor all 3 hosts. We need to see CPU/Disk/Memory/Network for each host in order to make a proper diagnosis.
Ok. The replica which is lagging behind is 1/2 as powerful as the source. Best practice is that source and replica should have the same hardware/configuration.
Also, looking at your source graph, ‘MySQL Handlers’. You have 2 million op/s for READ_RND_NEXT which represents FULL TABLE SCAN. The next highest operations are Index Reads at 674K op/s.
You have a much bigger problem with bad queries than replication lag. Improving queries will not only reduce your near-max-cpu problem, but will help replication lag as well.