Strange Seconds_Behind_Master behavior

Deniska80 · July 2, 2024, 10:51am

Hello, i’ve got one master and two replics from this master (all the same versions 5.7.44-48-log), it’s GTID replication. One replica works fine, and the second one has issue.
It’s “Seconds_Behind_Master” value constantly changes: i mean it’s 0 for several ‘show slave status’ command, and then, after several seconds it’s become much more - 3836 for example. Next call show slave status immediatly shows 0 again.
No errors in log-files, “Slave_IO_Running” and “Slave_SQL_Running” always Yes, time on all servers are the same and ntp synced.
In fact - this replica is far beyond master, but why then it almost always show “Seconds_Behind_Master”: 0? And when it’s 0 “Retrieved_Gtid_Set” and “Executed_Gtid_Set” are the same.
From monitoring tool it’s look like this

Anyone has some clue?

Deniska80 · July 3, 2024, 6:03am

Ok, i stoped this strange replica, made xtrabackup from master (as always), transfer files snd start replica again. It’s all takes about 14 hours. Ok, i started replica again and this strange behavior sill there ^(
One seconds ‘show slave status’ shows
…
Seconds_Behind_Master: 51211
Slave_SQL_Running_State: Reading event from the relay log
…
another seconds it show
…
Seconds_Behind_Master: 0
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
…

Don’t have any idea, why is ths

kedarpercona · July 3, 2024, 6:21am

Hi @Deniska80,

What was the replica executing while you saw the increased lag?
Did you see while your lag was increasing did you have exec_master_log_pos moving?
Is the binlog size too large (or larger than usual)?
Check contents of binary log using mysqlbinlog command and see what the replica is trying to execute. ( mysqlbinlog --base64-output=decode-rows -vv --start-position=<Exec_Master_Log_Pos> <Relay_Master_Log_File> )

Ref: How to Read Simplified SHOW REPLICA STATUS Output

Thanks,
K

Deniska80 · July 3, 2024, 7:55am

What was the replica executing while you saw the increased >lag?
Not sure, what i understand you clearly. It executes commands from relay-bin.log, what it fetch from master. No other load on this server, it’s dedicated DB. By the way - it’s hard to catch moment, when it’s not null. I just make 10 consecutive calls of ‘show slave status’ in 2 seconds and only once i saw ‘Seconds behind master’ defferent from 0.

Did you see while your lag was increasing did you have >exec_master_log_pos moving?
Yes, it’s moving allways

Is the binlog size too large (or larger than usual)?
Bin log on master? No, usually size

see what the replica is trying to execute.
looks like it normally executes commands from master, about 500 op/s

One thing, i noted. May be it’s important. At this time this replica has “Master_Log_File: mysql-bin.016728”, while master allready has mysql-bin.017213. That’s OK, replica behind master. But at the same time replica has only two relay-bin.xxxx files (as i understand one is some kind of index, and other constantly gowing, while reach max_binlog_size). Then this files instantly removes and next two files appears. As i remeber from past, when replica try to reach master, there are a lot of relay-bin.xxxx files, coz fetching binlog is faster, than executes it.

May be that’s the reason? Low network speed?

kedarpercona · July 4, 2024, 8:28am

hmmm! 400+ binlogs behind?! Well then I think you surely need to figureout why do we have this case. May be it is low network… not sure if your replication down for long…

Deniska80 · July 4, 2024, 11:40am

[quote=“kedarpercona, post:5, topic:31349”]
400+ binlogs behind?!
my binlog just 100Mb size, so 400 files is just about 8 hours

main question for me is “why sometimes Seconds behind master is 0”. May be it’s 0 in time, while next binlog not fully fetched from master, due to network perfomance… but it’s strange.

Deniska80 · July 4, 2024, 11:52am

Ok, i iperf network perfomance between this replica and master, results are about 20…40Mbit/s. But speed of growing current relay-bin.log is far slower, it’s transfer one 100Mb file about 1 minute. That’s really strange, i think replica must fetch bin logs on full speed.

Topic		Replies	Views
Replication show unexpected seconds_behind_master Other MySQL® Questions	1	527	October 25, 2015
The value of Seconds_Behind_Master is increasing Other MySQL® Questions	2	4378	August 20, 2017
Seconds_Behind_Master keeps increasing Percona XtraBackup	2	382	April 4, 2024
seconds behind master Other MySQL® Questions	2	1631	February 9, 2011
Strange replication's stuck Percona Server for MySQL 5.7 mysql , percona	10	1400	July 6, 2023

Strange Seconds_Behind_Master behavior

Related topics