Not the answer you need?
Register and ask your own question!

Replication lag on MySQL server after increasing the number of inserts on the master

spirit1984spirit1984 EntrantCurrent User Role Beginner
I've ran recently to a problem. We have a MySQL 5.7 Innodb replica with RBR replication (8 slave parallel workers). Now, until this Monday, our master executed 2500 inserts per second (in average), and the replica seemed to be fine with that. But recently the number of inserts went to 2800 inserts per second, and the replica does not seem to catch on (at evenings maybe). I checked the slave status. It seems like the IO thread seems to work just fine (there is no problem with getting events from the master), but the SQL threads don't run that fast. So the question is - how can I improve that - increase of parallel workers does not seem to help at all.

By the way, here is the result of sql show status:
Slave_IO_State: Waiting for master to send event
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.001634
Read_Master_Log_Pos: 268027451
Relay_Log_File: mysql-relay-bin.004672
Relay_Log_Pos: 386106741
Relay_Master_Log_File: mysql-bin.001632
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 386106528
Relay_Log_Space: 2415514767
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 1688
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 75
Master_UUID: 4db1d460-6b9b-11e8-a372-005056956c40
Master_Info_File: /opt/mysql-data/master.info
SQL_Delay: 0
SQL_Remaining_Delay:
Slave_SQL_Running_State: Waiting for dependent transaction to commit
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:

Comments

  • spirit1984spirit1984 Entrant Current User Role Beginner
    I have a MySQL 5.7 innodb replica with row based replication. I've noticed something curious. The master has about 3000 inserts per second, and the replica seems to catch up fine with that. But if I run a long-time query (let's say for a minute or so) scanning the big table with 300 million rows, the replica starts to lag. I've checked the slave status, the IO thread seems to be reading from master just fine, but the slave sql thread seems not to be doing so well. Not sure why is that. Now zabbix shows me, that my select query seems to be causing high disk utilization (since it needs parts of the table that are on disk), I guess that could slow down the reading of relay log file or applying the transactions. Not sure what I have to tune here in order to get rid of this - how come a single select query causes such a heavy replication log (the number of inserts on the replica falls down to 1500 inserts per second instead of 3000).
  • spirit1984spirit1984 Entrant Current User Role Beginner
    I finally found the solution myself. First, I increased the number of slave_parallel_workers to 4, but that gave me nothing (even with LOGICAL_CLOCK), because the transactions were highly dependent. However, it turned out, that after I increased on master binlog_group_commit_sync_delay to 10000 (that is, 10 milliseconds), the lag disappeared. This setting is very important, since it is the setting that actually allows the replication servers to execute something in parallel.
  • lorraine.pocklingtonlorraine.pocklington Percona Community Manager Legacy User Role Patron
    Hi there, thank your for posting your solution it may well help someone else. Sorry we didn't get to you more quickly but really pleased that you found an answer. I'll highlight your solution to the team here too. :)
  • spirit1984spirit1984 Entrant Current User Role Beginner
    Hi there, thank your for posting your solution it may well help someone else. Sorry we didn't get to you more quickly but really pleased that you found an answer. I'll highlight your solution to the team here too. :)

    I actually suggest two things. One - someone from the mysql community should really fix the documentation for slave_parallel_workers giving at least some hint for binlog_group_commit_sync_delay. Second - I would like to write something in the blog about our experience, since it was a really curious one.
  • lorraine.pocklingtonlorraine.pocklington Percona Community Manager Legacy User Role Patron
    Well I would love that, it could be very suitable for our new Community Blog https://www.percona.com/community-blog/
    Could you email me? [EMAIL="lorraine.pocklington@percona.com"][email protected][/EMAIL] reaches me. Thank you!!
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.