We’ve got a replication lag issue with one of our databases.
The setup consists of one master (Debian 7 - MySQL 5.6, innodb version 5.6.29-76.2, Percona Server (GPL), Release 76.2, Revision ddf26fe) replicating to four slaves (Ubuntu 20 - MySQL 5.7, innodb version 5.7.39-42, Percona Server (GPL), Release ‘42’, Revision ‘b0a7dc2da2e’).
The master is a physical server, and the slaves are virtual machines (VMware).
Our goal is to get rid of the physical machine and work only with virtual machines (promoting one of these slaves to master), but until we solve this lag issue, we cannot move forward.
The replication lag shows up mainly on one particular slave (let’s call it “damned_server”), which is in turn the master of a fifth slave. That is to say: one 5.6 master replicating to four 5.7 slaves, one of which is also the master of a fifth 5.7 slave. And this one (“damned_server”), the ‘slave-and-master-at-the-same-time’, is the one that lags so often.
Any ideas? Have you encountered a similar scenario that you have been able to solve in some way?
I think your topology looks like this?
5.6 -> 5.7 -> 5.7
What does “suffers lag so often” mean? Is it just 1 or 2 s of lag? That’s nothing to worry about. Is it consistently many minutes of lag? If so, are you using ROW replication everywhere? Do you have multi-threaded replication enabled on all 5.7 machines? What other metrics are you monitoring? Does the “D_S” server have high disk IO? High CPU? “Lag” can come from many sources.
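To make those questions easier to answer, here is a rough Python sketch of the checks, not a definitive tool. It assumes you paste in the text of SHOW SLAVE STATUS and the values of binlog_format and slave_parallel_workers from each server yourself. The function names and the thresholds (2 s as "negligible", 60 s as "sustained") are my own assumptions for illustration, so adjust them to your workload.

```python
import re


def parse_seconds_behind(status_text):
    # Pull Seconds_Behind_Master out of pasted SHOW SLAVE STATUS output.
    # Returns None when the value is NULL or the field is missing.
    match = re.search(r"^\s*Seconds_Behind_Master:\s*(\S+)", status_text, re.MULTILINE)
    if match is None or not match.group(1).isdigit():
        return None
    return int(match.group(1))


def classify_lag(samples, negligible_s=2, sustained_s=60):
    # samples: lag readings in seconds taken over some time window.
    # Thresholds are assumed values, not anything MySQL defines.
    known = [s for s in samples if s is not None]
    if not known:
        return "unknown"
    if max(known) <= negligible_s:
        return "negligible"
    if min(known) >= sustained_s:
        return "sustained"
    return "intermittent"


def check_replication_settings(variables, is_replica=True):
    # variables: dict of server variable name -> value, e.g. collected
    # with SHOW GLOBAL VARIABLES on each server in the chain.
    findings = []
    if str(variables.get("binlog_format", "")).upper() != "ROW":
        findings.append("binlog_format is not ROW")
    if is_replica and int(variables.get("slave_parallel_workers", 0)) == 0:
        findings.append("multi-threaded replication is disabled (slave_parallel_workers = 0)")
    return findings
```

Run the settings check against every server in the chain, passing is_replica=False for the 5.6 master, and feed classify_lag a series of readings rather than a single one, since one sample cannot distinguish a brief spike from consistent lag.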