Percona Mongo PBM Physical Restore fails with replicaset

jared1 · March 21, 2025, 6:18pm

Initially we achieved a successful restore of a physical backup of Percona Mongo with PBM, using just one DB.

Now we have created a replica set with two Percona Mongo DBs and one arbiter. Backup works fine. On restore, PBM first seems to wait for the containers (Percona Mongo running in Docker) to be turned off, which is fine. But after we turn off both data-bearing nodes, the restore procedure hangs, and we get this message repeatedly in the logs:

E [pitr] init: get conf: get: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: perconamongo8:27017, Type: Unknown, Last error: dial tcp: lookup perconamongo8 on 127.0.0.11:53: no such host }, { Addr: perconamongo7:27017, Type: Unknown, Last error: dial tcp: lookup perconamongo7 on 127.0.0.11:53: no such host }, { Addr: perconamongoarb:27017, Type: RSArbiter, Average RTT: 343041 }, ] }

For a physical backup restore, what are we supposed to do with a replica set? If we keep the nodes running, PBM seems to wait for them to be turned off. If we turn them off, PBM complains that it cannot see them.

Thanks.

Ivan_Groenewold · March 24, 2025, 11:07am

Hi, for physical restore you have to stop arbiter nodes before attempting it. Keep the other nodes running.

jared1 · March 25, 2025, 5:02pm

Thanks for your reply. We are still stuck.

I am providing logs from our two Percona Mongo nodes. The arbiter has been turned off. We have tried different combinations of restoring to the primary or the secondary. According to the Docker logs of the PBM containers, PBM on the primary node seems to wait for the node to become secondary, and PBM on the secondary waits for the node to shut down.

The documentation here says to turn off the nodes for a physical restore, so we have tried that too (even though I understood from the above that we should only turn off the arbiter).

2025-03-25T16:16:45.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] waiting to became secondary
2025-03-25T16:16:46.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] waiting to became secondary
2025-03-25T16:16:47.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] waiting to became secondary
2025-03-25T16:16:48.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] waiting for the node to shutdown
2025-03-25T16:19:06.000+0000 E [pitr] init: get conf: get: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: perconamongo7:27017, Type: Unknown, Last error: dial tcp: lookup perconamongo7 on 127.0.0.11:53: no such host }, { Addr: perconamongo8:27017, Type: RSSecondary, Average RTT: 349172 }, { Addr: perconamongoarb:27017, Type: Unknown, Last error: dial tcp: lookup perconamongoarb on 127.0.0.11:53: no such host }, ] }

2025-03-25T16:13:37.000+0000 I got epoch {1742919217 4}
2025-03-25T16:13:37.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] backup: 2025-02-20T11:51:16Z
2025-03-25T16:13:37.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] recovery started
2025-03-25T16:13:37.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] port: 27282
2025-03-25T16:13:38.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] mongod binary: mongod, version: v8.0.4-1
2025-03-25T16:13:38.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] moving to state starting
2025-03-25T16:13:38.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] waiting for cluster
2025-03-25T16:13:48.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] converged to state starting
2025-03-25T16:13:48.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] starting
2025-03-25T16:13:48.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] moving to state running
2025-03-25T16:13:48.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] waiting for cluster
2025-03-25T16:14:03.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] converged to state running
2025-03-25T16:14:03.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] send to stopAgent chan
2025-03-25T16:14:03.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] stop agents heartbeats
2025-03-25T16:14:03.000+0000 I [restore/2025-03-25T16:13:36.827616408Z] stopping mongod and flushing old data
2025-03-25T16:14:03.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] shutdown server
2025-03-25T16:14:03.000+0000 D [restore/2025-03-25T16:13:36.827616408Z] waiting for the node to shutdown
2025-03-25T16:19:08.000+0000 E [pitr] init: get conf: get: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: perconamongo8:27017, Type: RSSecondary, Average RTT: 710142 }, { Addr: perconamongo7:27017, Type: Unknown, Last error: dial tcp: lookup perconamongo7 on 127.0.0.11:53: no such host }, { Addr: perconamongoarb:27017, Type: Unknown, Last error: dial tcp: lookup perconamongoarb on 127.0.0.11:53: no such host }, ] }
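
To make these topology dumps easier to read, here is a minimal Python sketch that splits one error line into its per-server entries. The regex is only my guess at the layout based on the lines above, not a documented format, and `parse_topology` and `SAMPLE` are names I made up for illustration. `SAMPLE` is copied from the first error line above.

```python
import re

# One "{ Addr: host:port, Type: X, ... }" entry, as it appears in the log lines above.
# Assumption: this layout is inferred from the pasted logs, not from a documented format.
SERVER_RE = re.compile(r"\{ Addr: (?P<addr>[^,]+), Type: (?P<type>\w+), (?P<rest>[^}]*)\}")


def parse_topology(line):
    """Return a list of (addr, type, last_error_or_None) tuples found in a log line."""
    servers = []
    for match in SERVER_RE.finditer(line):
        rest = match.group("rest").strip()
        prefix = "Last error: "
        error = rest[len(prefix):] if rest.startswith(prefix) else None
        servers.append((match.group("addr"), match.group("type"), error))
    return servers


SAMPLE = (
    "current topology: { Type: ReplicaSetNoPrimary, Servers: ["
    "{ Addr: perconamongo7:27017, Type: Unknown, Last error: dial tcp: lookup "
    "perconamongo7 on 127.0.0.11:53: no such host }, "
    "{ Addr: perconamongo8:27017, Type: RSSecondary, Average RTT: 349172 }, "
    "{ Addr: perconamongoarb:27017, Type: Unknown, Last error: dial tcp: lookup "
    "perconamongoarb on 127.0.0.11:53: no such host }, ] }"
)

for addr, server_type, error in parse_topology(SAMPLE):
    # Servers with a "Last error" are the ones the driver could not reach.
    status = "unreachable: " + error if error else "reachable"
    print(addr, server_type, status)
```

Read this way, perconamongo8 is still visible as a secondary, while the names perconamongo7 and perconamongoarb no longer resolve through Docker's embedded DNS (127.0.0.11), which is what I would expect for stopped containers.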

Further assistance would be greatly appreciated.

Ivan_Groenewold · March 26, 2025, 11:11am

Hi, a few clarifications:

There is no way of “restoring to the primary or secondary”. PBM restores ALL the members of the replica set to the same point in time.
The doc page you linked is about restoring from a logical backup. Since you are trying to restore a physical backup, you should be reading this instead.
The doc page about physical restores mentions stopping mongos routers and arbiter nodes. Don’t stop the primary or the secondary. Also check the instructions here for running in Docker.
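
As a rough illustration of that order of operations, here is a minimal Python sketch that only builds and prints the commands, without running anything. The container names and the backup name are taken from the logs in this thread. The role mapping, the `restore_plan` helper, and the assumption that `docker` and the `pbm` CLI are invoked from the same shell are hypothetical; adapt them to how PBM is actually deployed in your setup.

```python
import shlex


def restore_plan(members, backup_name):
    """Build the commands for a physical restore, in the order they should run.

    members: dict mapping container name -> role ("data", "arbiter" or "mongos").
    This helper is hypothetical and only assembles command lines.
    """
    # Stop only arbiters and mongos routers; data-bearing nodes keep running.
    to_stop = sorted(
        name for name, role in members.items() if role in ("arbiter", "mongos")
    )
    plan = [["docker", "stop", name] for name in to_stop]
    # PBM restores every data-bearing member, so the restore is issued once.
    plan.append(["pbm", "restore", backup_name])
    return plan


# Names taken from the logs in this thread; the roles are my reading of the setup.
members = {
    "perconamongo7": "data",
    "perconamongo8": "data",
    "perconamongoarb": "arbiter",
}

for command in restore_plan(members, "2025-02-20T11:51:16Z"):
    print(shlex.join(command))
```

The point of the sketch is that the data-bearing containers never appear in a `docker stop` line: PBM shuts down mongod on those nodes itself during the restore.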

If you still suspect an issue, please open a bug report here (Jira) with full instructions to reproduce so the dev team can take a look.
