Percona Backup: 2.0.3
Percona MongoDB Distribution: 6.0.4-3
Friends, I am having an issue with PITR backups. I am backing up to a CIFS share that is mounted at startup.
After a few hours, I get the following error from the PITR backup:
2023-05-10T22:28:23.000-0500 I [pitr] created chunk 2023-05-11T03:23:15 - 2023-05-11T03:28:15. Next chunk creation scheduled to begin at ~2023-05-10T22:33:12
2023-05-10T22:33:15.000-0500 D [pitr] remove pbmPitr/xxx/xxx/20230511/20230511032815-11856.20230511033315-11361.oplog.s2 due to upload errors
2023-05-10T22:33:15.000-0500 E [pitr] streaming oplog: unable to upload chunk {1683775695 11856}.{1683775995 11361}: read data: oplog has insufficient range, some records since the last saved ts {1683775695 11856} are missing. Run pbm backup to create a valid starting point for the PITR.
2023-05-10T22:33:45.000-0500 D start_catchup
2023-05-10T22:33:45.000-0500 D lastTS set to {1683775695 11856} 2023-05-11T03:28:15
2023-05-10T22:33:45.000-0500 I streaming started from 2023-05-11 03:28:15 +0000 UTC / 1683775695
2023-05-10T22:33:45.000-0500 E streaming oplog: unable to upload chunk {1683775695 11856}.{1683775995 11361}: read data: oplog has insufficient range, some records since the last saved ts {1683775695 11856} are missing. Run pbm backup to create a valid starting point for the PITR.
And the insufficient range error repeats until I disable pitr.
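For anyone reading the log, the `{seconds increment}` pairs can be decoded with a short Python sketch like the one below. The helper names (`parse_ts`, `to_utc`, `oplog_covers`) are hypothetical and not part of PBM. The coverage check is only my reading of the error text, namely that the oldest entry still present in the oplog must not be newer than the last saved timestamp; it is an assumption, not PBM's actual implementation.

```python
import re
from datetime import datetime, timezone

# Matches timestamps printed in the log as "{<seconds> <increment>}".
TS_RE = re.compile(r"\{(\d+) (\d+)\}")

def parse_ts(text):
    """Return every (seconds, increment) pair found in a log line."""
    return [(int(sec), int(inc)) for sec, inc in TS_RE.findall(text)]

def to_utc(ts):
    """Format the seconds part of a (seconds, increment) pair as UTC."""
    return datetime.fromtimestamp(ts[0], tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")

def oplog_covers(oldest_oplog_ts, last_saved_ts):
    """Assumed condition: the oplog still reaches back to the last saved ts.

    Tuples compare by seconds first, then by increment.
    """
    return oldest_oplog_ts <= last_saved_ts

line = "unable to upload chunk {1683775695 11856}.{1683775995 11361}"
start, end = parse_ts(line)
print(to_utc(start), "->", to_utc(end))
```

The oldest oplog timestamp to compare against would have to be read from the node itself, for example from the first document of `local.oplog.rs` in natural order.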
Interestingly, I see these files in the directory, but the last one ends with a different suffix (-11856 instead of -3):
20230511030815-3.20230511031315-3.oplog.s2
20230511031315-3.20230511031815-3.oplog.s2
20230511031815-3.20230511032315-3.oplog.s2
20230511032315-3.20230511032815-11856.oplog.s2
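The file names appear to encode the start and end of each chunk as a UTC time plus a counter, which matches the `{1683775695 11856}` timestamp in the log. Below is a minimal Python sketch for splitting a name into those parts. `parse_chunk_name` is a hypothetical helper, and the name layout is inferred only from the files listed above, not from PBM documentation.

```python
import re
from datetime import datetime, timezone

# Layout inferred from the listing: <start>-<inc>.<end>-<inc>.oplog.<ext>
CHUNK_RE = re.compile(r"^(\d{14})-(\d+)\.(\d{14})-(\d+)\.oplog(?:\.\w+)?$")

def to_epoch(stamp):
    """Convert YYYYMMDDHHMMSS (assumed to be UTC) to epoch seconds."""
    dt = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

def parse_chunk_name(name):
    """Return ((start_seconds, start_inc), (end_seconds, end_inc))."""
    m = CHUNK_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized chunk name: {name}")
    start, start_inc, end, end_inc = m.groups()
    return (to_epoch(start), int(start_inc)), (to_epoch(end), int(end_inc))

print(parse_chunk_name("20230511032315-3.20230511032815-11856.oplog.s2"))
```

Read this way, the last uploaded chunk ends at the same timestamp from which the failing chunk tries to start.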
I am able to write to the directory, both as myself and the mongod user via sudo -su.
Creating a new snapshot works without issue, but after a few hours I get these errors again.
I have tried reducing the backup window from 10 minutes to 5 minutes, as you can see above. But that didn’t help.
I thought one secondary node was having issues, so I changed the backup priority of that server to 0.1, but I still got errors when backups were running from the primary node.
I haven’t tried changing the compression to gzip or disabling compression.
I have a development, stage, and test environment with the same configuration, with no issues.
Is there a retry option available for PITR backups? Are there any other flags I can use?