Hi
we upgraded to MySQL 8, and since then, every time we back up the slave server we see replication lag.
This did not happen when we were on version 5.7.
All tables are InnoDB.
The server is not busy at all, with only a small amount of IO.
thanks
Yuda
Hi @yehudaf1, welcome to the Percona Forum!
Could you please share the command you’re using?
#full backup
2022-12-17T04:00:02.238295-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --user=root --password=* --no-server-version-check=1 --backup=1 --target-dir=/db_backup/backup_percona/latest/full --kill-long-queries-timeout=60 --kill-long-query-type=select
#incremental backup
2022-12-17T21:10:06.946681-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --user=root --password=* --no-server-version-check=1 --backup=1 --target-dir=/db_backup/backup_percona/latest/inc_2022-Dec-17-21-10-06 --incremental-basedir=/db_backup/backup_percona/latest/full --kill-long-queries-timeout=60 --kill-long-query-type=select
Hi
Any idea why it may be happening?
During the backup, replication falls behind, and therefore our reporting application doesn't have valid data.
thanks
yehuda
Hi
I would like to bump this up; I have the same problem.
My command is:
xtrabackup --defaults-extra-file=/root/.my.cnf --backup --register-redo-log-consumer --parallel=5 --stream=xbstream
And here is a screenshot of the replication status:
Hi @julienarcin
I had a look at your command and noted the option --register-redo-log-consumer. This is not something I recall using, and then I read that it was “recently” introduced. The documentation says:
--register-redo-log-consumer
The --register-redo-log-consumer parameter is disabled by default. When enabled, this parameter lets Percona XtraBackup register as a redo log consumer at the start of the backup. The server does not remove a redo log that Percona XtraBackup (the consumer) has not yet copied. The consumer reads the redo log and manually advances the log sequence number (LSN). The server blocks the writes during the process. Based on the redo log consumption, the server determines when it can purge the log.
What stands out is “The server blocks the writes during the process.” Without going into too much detail, can you try the same backup without the --register-redo-log-consumer option?
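For example, something along these lines (illustrative only; keep your own credentials, and the output redirect is just a placeholder since --stream writes to stdout):

# same backup as before, just without --register-redo-log-consumer
xtrabackup --defaults-extra-file=/root/.my.cnf --backup --parallel=5 --stream=xbstream > /path/to/backup.xbstream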
Thanks,
K
Hi,
we have the same problem.
We are not using the option you referred to above.
xtrabackup --defaults-file=my.cnf --backup --socket=/tmp/mysqld.sock --user=mysqld --stream=xbstream --extra-lsndir=/backup/MYBACKUP/config --slave-info --target-dir=/backup/MYBACKUP/database --read-buffer-size=400M --no-server-version-check --skip-strict
@Marcelo_Altmann @kedarpercona Pls help.
thank you
Xtrabackup uses Lock Tables For Backup, which will block the SQL thread in case it is processing a DDL, or DML on non-transactional tables. I would suggest you monitor what the SQL thread is doing, or blocked on, during the backup.
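As a rough sketch of how that could be checked while the backup runs (plain MySQL statements, nothing Percona-specific; adjust connection options to your setup):

# what is the replication SQL (applier) thread doing right now?
mysql -e "SELECT NAME, PROCESSLIST_STATE, PROCESSLIST_INFO FROM performance_schema.threads WHERE NAME LIKE '%replica_sql%' OR NAME LIKE '%slave_sql%';"

# is any session stuck waiting on a metadata lock (for example, the backup lock)?
mysql -e "SELECT OBJECT_TYPE, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, OWNER_THREAD_ID FROM performance_schema.metadata_locks WHERE LOCK_STATUS = 'PENDING';"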
@Marcelo_Altmann Thank you for responding!
I will monitor the SQL thread. Could you please confirm that the options I am using are good?
“xtrabackup --defaults-file=my.cnf --backup --socket=/tmp/mysqld.sock --user=mysqld --stream=xbstream --extra-lsndir=/backup/MYBACKUP/config --slave-info --target-dir=/backup/MYBACKUP/database --read-buffer-size=400M --no-server-version-check --skip-strict”
Also, is there any way to ensure a consistent backup while not blocking DDL/DML?
@Chanakya I would probably increase the number of parallel copy threads there, via --parallel=X.
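Something along these lines, keeping the rest of your options as they are (the stream destination is omitted here, as in your original command, and 4 is only a starting point to tune):

xtrabackup --defaults-file=my.cnf --backup --socket=/tmp/mysqld.sock --user=mysqld --stream=xbstream --extra-lsndir=/backup/MYBACKUP/config --slave-info --target-dir=/backup/MYBACKUP/database --read-buffer-size=400M --no-server-version-check --skip-strict --parallel=4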
Also, is there any way to ensure a consistent backup while not blocking DDL/DML?
Just to clarify, DML on InnoDB is NOT blocked. We have some research and work in progress to reduce the time the instance remains under LTFB/LIFB and to allow DDL. At the current release, the answer to your question is no; we need this lightweight lock to ensure consistent backups.
Acknowledged on the parallel threads. Does this number have to be the same for the xbstream threads?
Thank you for the confirmation. What are LTFB/LIFB?
For xbstream you will probably be capped at the network level, but this is something you will have to experiment with to find the sweet spot.
Thank you for the confirmation. What are LTFB/LIFB?
Those are the MDL locks taken to ensure a consistent backup.
Lock Tables For Backup (LTFB) - Percona Server Only. Lighter than LIFB
Lock Instance For Backup (LIFB) - MySQL
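If you want to see them in action, both can be taken and released manually from a client session (the first statement exists only in Percona Server, the second is stock MySQL 8.0; both need the BACKUP_ADMIN privilege on 8.0):

-- Percona Server: blocks DDL and writes to non-transactional tables; InnoDB DML keeps running
LOCK TABLES FOR BACKUP;
UNLOCK TABLES;

-- MySQL 8.0: instance-wide backup lock with similar semantics
LOCK INSTANCE FOR BACKUP;
UNLOCK INSTANCE;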
Thanks much for the answers Marcelo
Do we have any further update on this issue? We are still facing it in all 8.x versions.
I wanted to weigh in here. I’ve found a few reports of this bug, and they all seem to dead-end at the same place. We’ve been chasing this for about six months within our enterprise. It came up on our largest database server. As far as we could tell, other servers weren’t being impacted, but the large one would have replication set back for hours every night when backups ran. For the longest time I rejected the notion that backups were causing it, because they never had before Percona 8.x.
Anyway, eventually I started randomizing the time our backups ran to correlate the replication delay, and was able to prove that whenever our idle standby ran its backup… that was when replication fell behind. This was counterintuitive to me because that particular host isn’t doing anything EXCEPT replication and backups… so I was surprised it was suffering when the primary database host didn’t seem to have any issues at all.
I suspect maybe this is because we run row-based replication, so the load patterns on the standby are different from the load patterns on the master, but I can’t prove that and would be curious to know whether others experiencing this problem are also using row-based replication.
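(For anyone comparing notes, the format is quick to check on the source; ROW means the replica applies row images instead of re-executing the statements:)

mysql -e "SHOW GLOBAL VARIABLES LIKE 'binlog_format';"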
In any event, we’ve found a workaround, though I’m not sure if it’s ideal. Still, I wanted to share it with others who may be in this same situation.
By adding the flags "--compress --compress-threads=2 --parallel=4" we no longer experience replication delay while the backup is running (I tried higher thread counts, but they didn’t seem to matter in a substantial way; 2/4 seems pretty comfortable for any server and gets you 90% of the performance gains, at least for our workloads. YMMV). I can’t explain why doing inline compression prevents replication from falling behind, but for our purposes this is a sufficient workaround for now. Hopefully this helps others who feel stuck in this situation.
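For reference, the invocation now looks roughly like this (the target directory is a placeholder and connection options are omitted; the compression and parallel flags are the ones that mattered for us):

xtrabackup --backup --compress --compress-threads=2 --parallel=4 --target-dir=/db_backup/latest/full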