Replica out of sync - Percona MySQL version 5.7.32-35

I have a Master-Replica setup (row-based replication); both servers run version 5.7.32-35. The server configurations are as follows:
OS for Master/Replica : CentOS Linux release 7.5.1804 (Core).
Both of the servers are on the same VLAN.
Master’s memory:

root@va-perconam-1:/data/binlogs$ free -g
              total        used        free      shared  buff/cache   available
Mem:             15          11           0           0           4           4
Swap:             3           0           3

Replica’s memory:

[root@va-perconas-1 ~]$ free -g
              total        used        free      shared  buff/cache   available
Mem:             11           8           0           0           3           3
Swap:             3           0           3

Show Master Status :

mysql> show master status \G
*************************** 1. row ***************************
             File: va-perconam-1-bin.057625
         Position: 487678218
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 8def4d8c-d0ba-11e8-bfbf-0050568072a9:1-670865000:671415329-677646507,
c98de176-3826-11eb-8ee4-00505680b00e:1-41624798665
1 row in set (0.00 sec)

Show Replica Status :

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.64.1xx.xx
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: va-perconam-1-bin.057625
          Read_Master_Log_Pos: 488104483
               Relay_Log_File: va-perconas-1-relay-bin.000301
                Relay_Log_Pos: 290353657
        Relay_Master_Log_File: va-perconam-1-bin.057603
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 290353424
              Relay_Log_Space: 24113238353
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 13989
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 11
                  Master_UUID: c98de176-3826-11eb-8ee4-00505680b00e
             Master_Info_File: /data/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 8def4d8c-d0ba-11e8-bfbf-0050568072a9:1-670865000:671415329-677646507,
c98de176-3826-11eb-8ee4-00505680b00e:1-41624798665
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.01 sec)

ERROR: 
No query specified
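
Reading the two status outputs together: the I/O thread is already on the master's current file (057625), while the SQL thread is still applying 057603, so the delay appears to be on the apply side rather than in fetching events. A minimal sketch of that arithmetic, with the numbers copied by hand from the output above (the column aliases are only illustrative):

```sql
-- Values copied from SHOW MASTER STATUS / SHOW SLAVE STATUS above.
SELECT
  57625 - 57603                        AS binlog_files_behind,   -- Master_Log_File vs Relay_Master_Log_File
  ROUND(24113238353 / POW(1024, 3), 1) AS relay_log_space_gib,   -- Relay_Log_Space converted to GiB
  SEC_TO_TIME(13989)                   AS lag_as_time;           -- Seconds_Behind_Master as hh:mm:ss
```

Since each binary log is about 1.1G, the number of files behind and the relay log space should roughly agree.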

Show full processlist on Replica:

mysql> show full processlist;
+-------+-------------+-----------+------+---------+-------+----------------------------------+-----------------------+-----------+---------------+
| Id    | User        | Host      | db   | Command | Time  | State                            | Info                  | Rows_sent | Rows_examined |
+-------+-------------+-----------+------+---------+-------+----------------------------------+-----------------------+-----------+---------------+
| 19932 | system user |           | NULL | Connect | 42246 | Waiting for master to send event | NULL                  |         0 |             0 |
| 19933 | system user |           | NULL | Connect | 14057 | Reading event from the relay log | NULL                  |         0 |             0 |
| 22721 | root        | localhost | NULL | Query   |     0 | starting                         | show full processlist |         0 |             0 |
+-------+-------------+-----------+------+---------+-------+----------------------------------+-----------------------+-----------+---------------+
3 rows in set (0.00 sec)

select * from information_schema.innodb_trx; returns no rows on both Master and Replica.

Binary logs generation on Master:

-rw-r----- 1 mysql mysql 1.1G Feb  6 09:19 va-daperconam-1-bin.057595
-rw-r----- 1 mysql mysql 1.1G Feb  6 09:21 va-daperconam-1-bin.057596
-rw-r----- 1 mysql mysql 1.1G Feb  6 09:22 va-daperconam-1-bin.057597
-rw-r----- 1 mysql mysql 1.1G Feb  6 09:23 va-daperconam-1-bin.057598
-rw-r----- 1 mysql mysql 1.1G Feb  6 09:25 va-daperconam-1-bin.057599
-rw-r----- 1 mysql mysql 1.1G Feb  6 19:00 va-daperconam-1-bin.057600
-rw-r----- 1 mysql mysql 1.1G Feb  6 19:16 va-daperconam-1-bin.057601
-rw-r----- 1 mysql mysql 1.1G Feb  6 19:17 va-daperconam-1-bin.057602
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:23 va-daperconam-1-bin.057603
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:24 va-daperconam-1-bin.057604
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:25 va-daperconam-1-bin.057605
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:26 va-daperconam-1-bin.057606
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:27 va-daperconam-1-bin.057607
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:29 va-daperconam-1-bin.057608
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:30 va-daperconam-1-bin.057609
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:31 va-daperconam-1-bin.057610
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:32 va-daperconam-1-bin.057611
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:33 va-daperconam-1-bin.057612
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:34 va-daperconam-1-bin.057613
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:35 va-daperconam-1-bin.057614
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:37 va-daperconam-1-bin.057615
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:38 va-daperconam-1-bin.057616
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:39 va-daperconam-1-bin.057617
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:40 va-daperconam-1-bin.057618
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:41 va-daperconam-1-bin.057619
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:42 va-daperconam-1-bin.057620
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:44 va-daperconam-1-bin.057621
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:45 va-daperconam-1-bin.057622
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:46 va-daperconam-1-bin.057623
-rw-r----- 1 mysql mysql 1.1G Feb  6 20:47 va-daperconam-1-bin.057624
-rw-r----- 1 mysql mysql 3.1K Feb  6 20:47 va-daperconam-1-bin-idx.index
-rw-r----- 1 mysql mysql 468M Feb  6 23:15 va-daperconam-1-bin.057625

Relay log on Replica:

-rw-r----- 1 mysql mysql   274 Feb  6 19:17 va-daperconas-1-relay-bin.000300
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:23 va-daperconas-1-relay-bin.000301
-rw-r----- 1 mysql mysql   76K Feb  6 20:23 va-daperconas-1-relay-bin.000302
-rw-r----- 1 mysql mysql   274 Feb  6 20:23 va-daperconas-1-relay-bin.000303
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:24 va-daperconas-1-relay-bin.000304
-rw-r----- 1 mysql mysql   93K Feb  6 20:24 va-daperconas-1-relay-bin.000305
-rw-r----- 1 mysql mysql   274 Feb  6 20:24 va-daperconas-1-relay-bin.000306
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:25 va-daperconas-1-relay-bin.000307
-rw-r----- 1 mysql mysql  172K Feb  6 20:25 va-daperconas-1-relay-bin.000308
-rw-r----- 1 mysql mysql   274 Feb  6 20:25 va-daperconas-1-relay-bin.000309
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:26 va-daperconas-1-relay-bin.000310
-rw-r----- 1 mysql mysql  150K Feb  6 20:26 va-daperconas-1-relay-bin.000311
-rw-r----- 1 mysql mysql   274 Feb  6 20:26 va-daperconas-1-relay-bin.000312
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:27 va-daperconas-1-relay-bin.000313
-rw-r----- 1 mysql mysql  142K Feb  6 20:27 va-daperconas-1-relay-bin.000314
-rw-r----- 1 mysql mysql   274 Feb  6 20:27 va-daperconas-1-relay-bin.000315
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:29 va-daperconas-1-relay-bin.000316
-rw-r----- 1 mysql mysql  126K Feb  6 20:29 va-daperconas-1-relay-bin.000317
-rw-r----- 1 mysql mysql   274 Feb  6 20:29 va-daperconas-1-relay-bin.000318
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:30 va-daperconas-1-relay-bin.000319
-rw-r----- 1 mysql mysql  101K Feb  6 20:30 va-daperconas-1-relay-bin.000320
-rw-r----- 1 mysql mysql   274 Feb  6 20:30 va-daperconas-1-relay-bin.000321
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:31 va-daperconas-1-relay-bin.000322
-rw-r----- 1 mysql mysql  165K Feb  6 20:31 va-daperconas-1-relay-bin.000323
-rw-r----- 1 mysql mysql   274 Feb  6 20:31 va-daperconas-1-relay-bin.000324
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:32 va-daperconas-1-relay-bin.000325
-rw-r----- 1 mysql mysql   93K Feb  6 20:32 va-daperconas-1-relay-bin.000326
-rw-r----- 1 mysql mysql   274 Feb  6 20:32 va-daperconas-1-relay-bin.000327
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:33 va-daperconas-1-relay-bin.000328
-rw-r----- 1 mysql mysql  117K Feb  6 20:33 va-daperconas-1-relay-bin.000329
-rw-r----- 1 mysql mysql   274 Feb  6 20:33 va-daperconas-1-relay-bin.000330
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:34 va-daperconas-1-relay-bin.000331
-rw-r----- 1 mysql mysql   60K Feb  6 20:34 va-daperconas-1-relay-bin.000332
-rw-r----- 1 mysql mysql   274 Feb  6 20:34 va-daperconas-1-relay-bin.000333
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:35 va-daperconas-1-relay-bin.000334
-rw-r----- 1 mysql mysql  154K Feb  6 20:35 va-daperconas-1-relay-bin.000335
-rw-r----- 1 mysql mysql   274 Feb  6 20:35 va-daperconas-1-relay-bin.000336
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:37 va-daperconas-1-relay-bin.000337
-rw-r----- 1 mysql mysql  183K Feb  6 20:37 va-daperconas-1-relay-bin.000338
-rw-r----- 1 mysql mysql   274 Feb  6 20:37 va-daperconas-1-relay-bin.000339
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:38 va-daperconas-1-relay-bin.000340
-rw-r----- 1 mysql mysql  107K Feb  6 20:38 va-daperconas-1-relay-bin.000341
-rw-r----- 1 mysql mysql   274 Feb  6 20:38 va-daperconas-1-relay-bin.000342
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:39 va-daperconas-1-relay-bin.000343
-rw-r----- 1 mysql mysql  110K Feb  6 20:39 va-daperconas-1-relay-bin.000344
-rw-r----- 1 mysql mysql   274 Feb  6 20:39 va-daperconas-1-relay-bin.000345
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:40 va-daperconas-1-relay-bin.000346
-rw-r----- 1 mysql mysql  114K Feb  6 20:40 va-daperconas-1-relay-bin.000347
-rw-r----- 1 mysql mysql   274 Feb  6 20:40 va-daperconas-1-relay-bin.000348
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:41 va-daperconas-1-relay-bin.000349
-rw-r----- 1 mysql mysql  169K Feb  6 20:41 va-daperconas-1-relay-bin.000350
-rw-r----- 1 mysql mysql   274 Feb  6 20:41 va-daperconas-1-relay-bin.000351
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:42 va-daperconas-1-relay-bin.000352
-rw-r----- 1 mysql mysql  100K Feb  6 20:42 va-daperconas-1-relay-bin.000353
-rw-r----- 1 mysql mysql   274 Feb  6 20:42 va-daperconas-1-relay-bin.000354
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:44 va-daperconas-1-relay-bin.000355
-rw-r----- 1 mysql mysql  107K Feb  6 20:44 va-daperconas-1-relay-bin.000356
-rw-r----- 1 mysql mysql   274 Feb  6 20:44 va-daperconas-1-relay-bin.000357
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:45 va-daperconas-1-relay-bin.000358
-rw-r----- 1 mysql mysql  151K Feb  6 20:45 va-daperconas-1-relay-bin.000359
-rw-r----- 1 mysql mysql   274 Feb  6 20:45 va-daperconas-1-relay-bin.000360
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:46 va-daperconas-1-relay-bin.000361
-rw-r----- 1 mysql mysql  141K Feb  6 20:46 va-daperconas-1-relay-bin.000362
-rw-r----- 1 mysql mysql   274 Feb  6 20:46 va-daperconas-1-relay-bin.000363
-rw-r----- 1 mysql mysql  1.1G Feb  6 20:47 va-daperconas-1-relay-bin.000364
-rw-r----- 1 mysql mysql  101K Feb  6 20:47 va-daperconas-1-relay-bin.000365
-rw-r----- 1 mysql mysql   274 Feb  6 20:47 va-daperconas-1-relay-bin.000366
-rw-r----- 1 mysql mysql  2.4K Feb  6 23:02 va-daperconas-1-relay-bin.index
-rw-r----- 1 mysql mysql  469M Feb  6 23:16 va-daperconas-1-relay-bin.000367

Flush Variables on Master:

mysql> show variables like '%flush%';
+-------------------------------------------+-------+
| Variable_name                             | Value |
+-------------------------------------------+-------+
| binlog_max_flush_queue_time               | 0     |
| binlog_skip_flush_commands                | OFF   |
| flush                                     | OFF   |
| flush_time                                | 0     |
| innodb_adaptive_flushing                  | ON    |
| innodb_adaptive_flushing_lwm              | 10    |
| innodb_flush_log_at_timeout               | 1     |
| innodb_flush_log_at_trx_commit            | 1     |
| innodb_flush_method                       |       |
| innodb_flush_neighbors                    | 1     |
| innodb_flush_sync                         | ON    |
| innodb_flushing_avg_loops                 | 30    |
| innodb_use_global_flush_log_at_trx_commit | ON    |
+-------------------------------------------+-------+
13 rows in set (0.00 sec)

Flush variables on Replica:

mysql> show variables like '%flush%';
+-------------------------------------------+-------+
| Variable_name                             | Value |
+-------------------------------------------+-------+
| binlog_max_flush_queue_time               | 0     |
| binlog_skip_flush_commands                | OFF   |
| flush                                     | OFF   |
| flush_time                                | 0     |
| innodb_adaptive_flushing                  | ON    |
| innodb_adaptive_flushing_lwm              | 10    |
| innodb_flush_log_at_timeout               | 1     |
| innodb_flush_log_at_trx_commit            | 1     |
| innodb_flush_method                       |       |
| innodb_flush_neighbors                    | 1     |
| innodb_flush_sync                         | ON    |
| innodb_flushing_avg_loops                 | 30    |
| innodb_use_global_flush_log_at_trx_commit | ON    |
+-------------------------------------------+-------+
13 rows in set (0.00 sec)

I don’t see any deadlock on the Master:

mysql> show engine innodb status \G:
*************************** 1. row ***************************
  Type: InnoDB
  Name: 
Status: 
=====================================
2024-02-06 23:23:06 0x7efc82e35700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 341766 srv_active, 0 srv_shutdown, 12869 srv_idle
srv_master_thread log flush and writes: 354635
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 68033626
OS WAIT ARRAY INFO: signal count 31466949
RW-shared spins 0, rounds 56170956, OS waits 31055175
RW-excl spins 0, rounds 805515810, OS waits 13771428
RW-sx spins 16231171, rounds 461519294, OS waits 13676399
Spin rounds per wait: 56170956.00 RW-shared, 805515810.00 RW-excl, 28.43 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 46356265297
Purge done for trx's n:o < 46356265297 undo n:o < 0 state: running but idle
History list length 39
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421113355317544, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421113355292728, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421113355223920, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421113355249864, not started
0 lock struct(s), heap size 1136, 0 row lock(s)

There are no latency issues, as the servers are on the same VLAN.

va-perconas-1 (0.0.0.0)                              Tue Feb  6 23:24:39 2024
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                      Packets               Pings
 Host                               Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 10.64.1xx.xx                     0.0%    12    0.2   0.2   0.2   0.4   0.0

I don’t know what else to check, and the lag is still increasing. I stopped and started the slave, but with no luck. The lag increases at random, and it is irritating not to have a root cause.

Can someone please help me here?

There are two solutions. First: reduce the ACID properties on the replica.
Please share the output of the following from the replica:
select @@innodb_flush_log_at_trx_commit,@@sync_binlog,@@binlog_format,@@innodb_max_dirty_pages_pct,@@innodb_lru_scan_depth,@@innodb_io_capacity,@@read_only,@@super_read_only;

You can set innodb_flush_log_at_trx_commit=2 and sync_binlog=0. Also, please read about these two variables: in case of a planned failover, you should set them back to their defaults, wait for the replica to sync, and then do the failover.
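
As a rough sketch of that change, assuming it is applied at runtime on the replica only (both variables are dynamic in 5.7; SET GLOBAL does not survive a restart, so my.cnf would also need updating if the change should persist):

```sql
-- Run on the replica only.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- write the redo log at commit, flush to disk about once per second
SET GLOBAL sync_binlog = 0;                     -- leave binary log syncing to the operating system

-- Before a planned failover: restore the durable settings (1 is the 5.7 default for both),
-- then wait for the replica to catch up.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
SET GLOBAL sync_binlog = 1;
```

With the relaxed values, a crash of the replica host can lose roughly the last second of applied transactions, which is the trade-off being accepted here.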

Also, check whether there is any latency on your disk.

Second: configure multi-threaded replication on the replica and see how it performs.
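
A minimal sketch of enabling it on 5.7, assuming the LOGICAL_CLOCK mode is wanted; the worker count of 4 is only a placeholder and should be sized to the replica's hardware:

```sql
-- Run on the replica. The SQL thread must be stopped before changing slave_parallel_type.
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';  -- parallelize by master group commit instead of per database
SET GLOBAL slave_parallel_workers = 4;             -- placeholder value, not a recommendation
START SLAVE SQL_THREAD;
```

The new worker count only takes effect once the SQL thread has been restarted, which the last statement does.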

Here is the output of the query:

mysql> select @@innodb_flush_log_at_trx_commit,@@sync_binlog,@@binlog_format,@@innodb_max_dirty_pages_pct,@@innodb_lru_scan_depth,@@innodb_io_capacity,@@read_only,@@super_read_only;
+----------------------------------+---------------+-----------------+------------------------------+-------------------------+----------------------+-------------+-------------------+
| @@innodb_flush_log_at_trx_commit | @@sync_binlog | @@binlog_format | @@innodb_max_dirty_pages_pct | @@innodb_lru_scan_depth | @@innodb_io_capacity | @@read_only | @@super_read_only |
+----------------------------------+---------------+-----------------+------------------------------+-------------------------+----------------------+-------------+-------------------+
|                                1 |             1 | ROW             |                    75.000000 |                    1024 |                  200 |           1 |                 0 |
+----------------------------------+---------------+-----------------+------------------------------+-------------------------+----------------------+-------------+-------------------+
1 row in set (0.00 sec)

I got this suggestion from other articles: innodb_flush_log_at_trx_commit=2 and sync_binlog=0. However, I wanted to see whether there are any alternatives, because I am aware that these changes reduce the ACID properties.

Here is the output from iostat for disk latency:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdb               0.00     0.00    5.00  784.00   640.00  6016.50    16.87     0.99    1.26    5.40    1.24   1.19  94.10
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    5.00  784.00   640.00  6023.00    16.89     0.99    1.26    5.40    1.24   1.19  94.10
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Hi Mahendra,
As Mughees pointed out, you can try multi-threaded replication to see if it helps. Try setting slave_parallel_workers according to the number of CPU cores on the replica.
However, looking at the disk utilization, you might have to relax the ACID properties.

More about Multi-threaded replication - A Dive Into MySQL Multi-Threaded Replication - Percona Database Performance Blog
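
Once the workers are configured, one way to check that they are actually running is sketched below. It assumes performance_schema is enabled on the replica:

```sql
-- Run on the replica after restarting the SQL thread.
SHOW GLOBAL VARIABLES LIKE 'slave_parallel%';

-- One row per applier worker; SERVICE_STATE should be ON and the error columns empty.
SELECT WORKER_ID, THREAD_ID, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE
FROM performance_schema.replication_applier_status_by_worker;
```

Watching Seconds_Behind_Master and Exec_Master_Log_Pos in show slave status over a few minutes should then show whether the replica is catching up.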