Hi Percona team,
I would like to kindly ask for clarification about the correct restore procedure for PBM external snapshots + PITR replay.
Environment:
PBM: 2.12.0
PSMDB: 8.0.4-2
Topology: single-node replica sets
Storage: WiredTiger
Backup model: PBM external backup metadata + LVM snapshot + PBM PITR chunks
Our workflow:
- Run PBM external backup.
- Take LVM snapshot.
- Store PBM metadata, including last_write_ts.
- Restore snapshot X-1 to an isolated host.
- Start temporary mongod.
- Insert:
db.getSiblingDB("local")["replset.oplogTruncateAfterPoint"].insertOne({
_id: "oplogTruncateAfterPoint",
oplogTruncateAfterPoint: Timestamp(lastWriteT, lastWriteI)
})
- Restart with:
--setParameter recoverFromOplogAsStandalone=true
--setParameter takeUnstableCheckpointOnShutdown=true
- Apply PBM PITR oplogs up to target timestamp.
- Compare against reference DB restored from later snapshot X.
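To illustrate why the base endpoint matters for the replay step, here is a minimal sketch of how our scripts could check that the PITR chunks cover the range from the base timestamp to the target without gaps. The chunk shape ({start, end}) and the function names are hypothetical and are not PBM's actual metadata format or API.

```javascript
// Compare two timestamps given as {t: seconds, i: increment}.
function cmpTs(a, b) {
  return a.t !== b.t ? a.t - b.t : a.i - b.i;
}

// Hypothetical chunk shape: {start: {t, i}, end: {t, i}}.
// Returns the chunks needed to replay from `base` up to `target`,
// or throws if the chunks leave a gap inside that range.
function chunksForReplay(chunks, base, target) {
  const sorted = [...chunks].sort((a, b) => cmpTs(a.start, b.start));
  const picked = [];
  let covered = base; // everything up to `covered` is already in the data files
  for (const c of sorted) {
    if (cmpTs(c.end, covered) <= 0) continue; // chunk lies entirely before the base
    if (cmpTs(covered, target) >= 0) break;   // target already reached
    if (cmpTs(c.start, covered) > 0) {
      throw new Error(`oplog gap after ${covered.t},${covered.i}`);
    }
    picked.push(c);
    covered = c.end;
  }
  if (cmpTs(covered, target) < 0) {
    throw new Error(`chunks end at ${covered.t},${covered.i}, before the target`);
  }
  return picked;
}
```

If the base passed in is earlier than what the snapshot files really contain, already-applied entries are replayed again; if it is later, entries are skipped. That is the reason for the first question below.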
I have two questions.
1. oplogTruncateAfterPoint vs stableTimestamp
PBM source appears to set oplogTruncateAfterPoint from restoreTS, usually LastWriteTS.
MongoDB recovery can then log something like:
The oplog truncation point is equal to or earlier than the stable timestamp,
so truncating after the stable timestamp instead
So MongoDB effectively clamps the requested truncate point to stableTimestamp if last_write_ts <= stableTimestamp.
Questions:
- Is this expected and safe in PBM external snapshot restores?
- Should our scripts explicitly calculate:
effectiveRestoreTS = max(PBM last_write_ts, WiredTiger stable/recoveryTimestamp)
before inserting oplogTruncateAfterPoint?
- Or should we always use PBM metadata last_write_ts and let MongoDB clamp internally?
- What timestamp should be treated as the real base endpoint for later PITR replay: PBM last_write_ts, MongoDB endPoint, or applyThroughOpTime from the recovery log?
This is important because for external snapshots PBM does not copy the files itself, so the actual snapshot files can sometimes have a WiredTiger stable timestamp later than PBM metadata last_write_ts.
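For reference, this is a minimal sketch of the calculation we are asking about. It only restates the max() formula from the question on {t, i} pairs; the function names are ours, and whether this is the right thing to do is exactly what we would like confirmed.

```javascript
// Compare two timestamps given as {t: seconds, i: increment}.
function cmpTs(a, b) {
  return a.t !== b.t ? a.t - b.t : a.i - b.i;
}

// Hypothetical helper: pick the later of PBM's last_write_ts and the
// stable/recovery timestamp found in the restored snapshot files.
function effectiveRestoreTs(lastWriteTs, stableTs) {
  return cmpTs(lastWriteTs, stableTs) >= 0 ? lastWriteTs : stableTs;
}

// Example values (made up): the snapshot is slightly ahead of the metadata.
const lastWrite = { t: 1700000000, i: 3 };
const stable = { t: 1700000005, i: 1 };
const base = effectiveRestoreTs(lastWrite, stable);
console.log(`base for PITR replay: Timestamp(${base.t}, ${base.i})`);
```

With these example values the helper returns the stable timestamp, which matches what the recovery log message suggests MongoDB does internally.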
2. TTL monitor during PITR replay
I could not find explicit TTL handling in PBM 2.12.0 restore / oplog replay source.
MongoDB Ops Manager restore docs start temporary restore mongod with:
--setParameter ttlMonitorEnabled=false
But PBM restore / replay code does not seem to pass this flag.
Questions:
- During PBM PITR / oplog replay, is TTL monitor expected to be disabled manually?
- In a single-node replica set, the replay target becomes primary. Can TTL deletes run during PBM replay?
- If yes, is this considered safe?
- Could TTL act as an extra writer outside the intended oplog replay stream?
- Does PBM ignore or tolerate duplicate-key / missing-document errors during oplog replay in a way that could hide TTL-related divergence?
- What is the supported recommendation for single-node replica sets where replaying on a secondary is not possible?
Our concern is that PITR replay should be deterministic: base snapshot + oplog stream should be the only source of changes. TTL monitor uses current wall-clock time, not the PITR target time, so it looks like a possible source of nondeterministic deletes during recovery.
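A simplified model of the concern, assuming a TTL index expires a document once its indexed date is older than "now" minus expireAfterSeconds. The function name and the example dates are made up; this is not the server's actual TTL code.

```javascript
// Simplified TTL rule: expired when docDate + expireAfterSeconds is in the past.
function isTtlExpired(docDate, expireAfterSeconds, now) {
  return docDate.getTime() + expireAfterSeconds * 1000 < now.getTime();
}

const createdAt = new Date("2024-01-01T00:00:00Z");
const expireAfterSeconds = 3600;

// Evaluated at the PITR target time, the document must still exist.
const pitrTarget = new Date("2024-01-01T00:30:00Z");
const expiredAtTarget = isTtlExpired(createdAt, expireAfterSeconds, pitrTarget);

// Evaluated at the wall-clock time of the replay (two days later), it is expired.
const replayWallClock = new Date("2024-01-03T00:00:00Z");
const expiredAtReplay = isTtlExpired(createdAt, expireAfterSeconds, replayWallClock);

console.log({ expiredAtTarget, expiredAtReplay });
```

The same document gives different answers depending on which clock is used, so a TTL pass during replay could delete documents that the oplog stream still expects to be present at the target time.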
We run single-node replica-sets.
Backups are made using fast, low-level LVM snapshots. PBM is used here only to open / close the backupCursor so that we get a consistent backup plus metadata. At the same time, we create oplog slices every 1 minute independently (oplogOnly: true).
Physical backups are not suitable because of the high load on the PROD hosts, and because reading and copying the data over the network takes significantly longer than the low-level LVM tooling that handles the snapshots.
That’s the reason why we are dependent on:
- oplogOnly: true option (for oplogs creation independently - not reliant on physical backups)
- handling restore process manually using custom scripts on top of PBM
We run reconciliation tests after each point-in-time recovery test, comparing the recovered database (snapshot X-1 + oplogs) against the reference database (snapshot X) by collection statistics.
Unfortunately there are mismatches, even though I tried to reconstruct what to implement from the PBM source code (setting oplogTruncateAfterPoint to limit WiredTiger recovery, using the exact epoch timestamp with increment, and so on).
Currently we don't explicitly start the mongod instance with the TTL monitor disabled when applying oplogs.
However, mismatches also happen on collections which don't have TTL indexes.
This is an interesting case, as we would expect the databases to be 1:1, with any logical mismatches in data sizes being small (if any) or limited to collections which have TTL indexes.
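For context, our reconciliation is roughly equivalent to the sketch below. The input shape (a map from namespace to {count, size}) and the function name are illustrative and do not come from our real scripts.

```javascript
// Hypothetical input: { "db.coll": { count: <number>, size: <number> }, ... }
// Returns one entry per namespace/field that differs between the two databases.
function diffCollStats(recovered, reference) {
  const names = new Set([...Object.keys(recovered), ...Object.keys(reference)]);
  const mismatches = [];
  for (const ns of [...names].sort()) {
    const a = recovered[ns];
    const b = reference[ns];
    if (!a || !b) {
      mismatches.push({ ns, reason: a ? "missing in reference" : "missing in recovered" });
      continue;
    }
    for (const field of ["count", "size"]) {
      if (a[field] !== b[field]) {
        mismatches.push({
          ns, field,
          recovered: a[field],
          reference: b[field],
          delta: a[field] - b[field],
        });
      }
    }
  }
  return mismatches;
}
```

An empty result means both databases agree on document count and data size for every collection; it does not prove the documents themselves are identical.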
We are running single-node replica-sets.
Thank you very much in advance.