Hi @nir-elk
Thanks for your response. I’d like to clarify a few points regarding your suggestions.
From my understanding, we only need the application to be in read-only mode. I’m able to configure the application accordingly.
With regard to Incremental Restore, the complete shutdown is necessary because we are restoring physical files.
To move those files back, the entire cluster, whether it’s a Replica Set or a Sharded Cluster, needs to be halted; it’s not possible to restore files with the database running. You can find more details in the documentation.
My main goal is to restore incremental backups with minimal downtime and without an extra cluster.
Let’s assume you have a simple Replica Set with 3 members, and a workaround that restores your backup onto one of those 3 nodes.
In theory, that node first needs to be ejected from the Replica Set → receive the restore, which can take some time → then the other 2 nodes need to be wiped and go through an initial sync, each taking its own time → only after that will you have the Replica Set fully functional.
Not only does that seem like a longer restore process, it is also a more error-prone one.
Additionally, could you please explain why you chose to perform incremental backups as physical rather than logical backups?
It’s important to clarify that incremental backups exist only for physical backups. From the documentation:
> Incremental backups require a physical base backup as a reference.
So, you have a base backup + incr1 + incr2 + incr3..
- In an incremental restore, it will use the base physical backup + the increments up to the point you want to restore.
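To illustrate that dependency, here is a minimal Python sketch of how the restore chain is made up. It is only a model of the idea: the `Backup` class, the `restore_chain` function, and the backup names are hypothetical and are not taken from the backup tool itself.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    name: str            # hypothetical label, not a real backup name
    kind: str            # "base" or "incr"
    finished: datetime   # when the backup completed


def restore_chain(backups, target):
    """Return the base backup plus every increment taken up to `target`."""
    # Only backups that finished at or before the target can be used.
    eligible = sorted(
        (b for b in backups if b.finished <= target),
        key=lambda b: b.finished,
    )
    # The chain starts at the most recent base backup before the target.
    base_positions = [i for i, b in enumerate(eligible) if b.kind == "base"]
    if not base_positions:
        raise ValueError("no base backup at or before the target time")
    # Everything after the latest base is an increment that depends on it.
    return eligible[base_positions[-1]:]


backups = [
    Backup("base", "base", datetime(2024, 1, 1)),
    Backup("incr1", "incr", datetime(2024, 1, 2)),
    Backup("incr2", "incr", datetime(2024, 1, 3)),
    Backup("incr3", "incr", datetime(2024, 1, 4)),
]
chain = restore_chain(backups, datetime(2024, 1, 3, 12))
print([b.name for b in chain])
```

The point is that an increment is useless on its own: without the base and every increment before it, the chain cannot be restored.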
For logical backups, you can use the PITR (Point-in-Time Recovery) feature, which saves slices of the oplog every X minutes *(default: 10 min)*.
- During a logical restore, you will use a full logical backup + replay the oplog via PITR up to the time you want to restore.
Logical restore is usually slower than physical restore and adds some overhead on replication, as the logical data is restored on the PRIMARY and then replicated to the SECONDARY nodes. On the other hand, a physical backup consumes more of your backup storage, as the files are stored with all their fragmentation and index structures, while a logical backup contains only the data, without fragmentation or indexes.
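As a rough sketch of that logic in Python: pick the latest full logical backup before the target time, then collect the contiguous oplog slices that take you from that backup to the target. The `OplogSlice` class and the `pitr_plan` function are hypothetical names used only for this example; the real tool does this selection for you.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class OplogSlice:
    start: datetime
    end: datetime


def pitr_plan(full_backups, slices, target):
    """Pick the full backup and the oplog slices needed to reach `target`.

    `full_backups` is a list of completion times of full logical backups.
    """
    candidates = [t for t in full_backups if t <= target]
    if not candidates:
        raise ValueError("no full logical backup at or before the target time")
    base = max(candidates)

    needed = []
    covered = base  # point in time we can currently restore to
    for s in sorted(slices, key=lambda s: s.start):
        if covered >= target:
            break            # already reached the target
        if s.end <= covered:
            continue         # slice is older than what we already have
        if s.start > covered:
            break            # gap in the oplog, cannot replay past it
        needed.append(s)
        covered = s.end
    if covered < target:
        raise ValueError("oplog slices do not cover the target time")
    return base, needed


full = [datetime(2024, 1, 1, 12, 0)]
slices = [
    OplogSlice(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 10)),
    OplogSlice(datetime(2024, 1, 1, 12, 10), datetime(2024, 1, 1, 12, 20)),
    OplogSlice(datetime(2024, 1, 1, 12, 20), datetime(2024, 1, 1, 12, 30)),
]
base, needed = pitr_plan(full, slices, datetime(2024, 1, 1, 12, 15))
print(base, len(needed))
```

Note that a gap in the saved slices limits how far forward you can replay, which is why the sketch stops at the first gap.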
The correct option always depends on your business requirements.
Best,
Jean.