Hi All,
After pbm backup is completed, I execute pbm restore. Since I can’t mount a NFS system, I copy both backup files of config server and mongod to all of three nodes. The following is my enviroment"
server 1: mongod port 27018, config server 27019, mongos 27020
server 2: mongod port 27018, config server 27019, mongos 27020
server 2: mongod port 27018, config server 27019, mongos 27020
For each server, there are two pbm-agent instances to access mongod and config server. The backup folder is “/backup”. I copied backupset of config server and mongod to “/backup/” folder of each server.
I type “pbm restore backupname”. The pbm status shows the following message:
Cluster:
========
configsvr:
- configsvr/192.168.1.40:27019: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295876
- configsvr/192.168.1:27019: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295876
- configsvr/192.168.10.40:27019: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295872
stb1:
- stb1/192.168.1.40:27018: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295872
- stb1/192.168.1.41:27018: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295871
- stb1/192.168.10.40:27018: pbm-agent FAILED status:
> ERROR with ERROR: lost agent, last heartbeat: 1663295873
PITR incremental backup:
========================
Status [OFF]
Currently running:
==================
(none)
Backups:
========
FS /backup
Snapshots:
2022-09-16T01:47:12Z 14.69GB <physical> [complete: 2022-09-16T01:47:07Z]
2022-09-16T01:18:26Z 18.73GB <physical> [complete: 2022-09-16T01:18:25Z]
2022-09-14T07:32:39Z 0.00B <physical> [ERROR: upload file `/usr/local/mongodb-configsvr/collection-56-8997624888314356377.wt`: upload file: read data: open file for reading: open /usr/local/mongodb-configsvr/collection-56-8997624888314356377.wt: permission denied.] [2022-09-14T07:32:44Z]
I check the pbm logs:
2022-09-16T02:35:49Z I [stb1/192.168.10.40:27018] [restore/2022-09-16T01:47:12Z] restore started
2022-09-16T02:35:49Z I [configsvr/192.168.10.40:27019] [restore/2022-09-16T01:47:12Z] restore started
2022-09-16T02:35:50Z I [stb1/192.168.10.40:27018] [restore/2022-09-16T01:47:12Z] moving to state running
2022-09-16T02:35:50Z I [configsvr/192.168.10.40:27019] [restore/2022-09-16T01:47:12Z] moving to state running
2022-09-16T02:36:37Z I [stb1/192.168.1.40:27018] starting PITR routine
2022-09-16T02:36:37Z I [stb1/192.168.1.40:27018] pbm-agent:
Version: 1.8.1
Platform: linux/amd64
GitCommit: e3f8f2c535b3a8b69a54a86fc9271d3e6de5ff80
GitBranch: release-1.8.1
BuildTime: 2022-07-04_08:15_UTC
GoVersion: go1.16.9
2022-09-16T02:36:37Z I [stb1/192.168.1.40:27018] node: stb1/192.168.1.40:27018
2022-09-16T02:36:37Z I [stb1/192.168.1.40:27018] listening for the commands
2022-09-16T02:36:41Z I [configsvr/192.168.1.40:27019] pbm-agent:
Version: 1.8.1
Platform: linux/amd64
GitCommit: e3f8f2c535b3a8b69a54a86fc9271d3e6de5ff80
GitBranch: release-1.8.1
BuildTime: 2022-07-04_08:15_UTC
GoVersion: go1.16.9
2022-09-16T02:36:41Z I [configsvr/192.168.1.40:27019] starting PITR routine
2022-09-16T02:36:41Z I [configsvr/192.168.1.40:27019] node: configsvr/192.168.1.40:27019
2022-09-16T02:36:41Z I [configsvr/192.168.1.40:27019] listening for the commands
2022-09-16T02:37:56Z I [stb1/192.168.1.40:27018] got command restore [name: 2022-09-16T02:37:55.934711134Z, backup name: 2022-09-16T01:47:12Z] <ts: 1663295875>
2022-09-16T02:37:56Z I [stb1/192.168.1.40:27018] got epoch {1663292827 180}
2022-09-16T02:37:56Z I [configsvr/192.168.1.40:27019] got command restore [name: 2022-09-16T02:37:55.934711134Z, backup name: 2022-09-16T01:47:12Z] <ts: 1663295875>
2022-09-16T02:37:56Z I [configsvr/192.168.1.40:27019] got epoch {1663292827 180}
2022-09-16T02:37:57Z I [stb1/192.168.1.40:27018] [restore/2022-09-16T01:47:12Z] restore started
2022-09-16T02:37:57Z I [configsvr/192.168.1.40:27019] [restore/2022-09-16T01:47:12Z] restore started
2022-09-16T02:37:58Z I [stb1/192.168.1.40:27018] [restore/2022-09-16T01:47:12Z] moving to state running
2022-09-16T02:37:58Z I [configsvr/192.168.1.40:27019] [restore/2022-09-16T01:47:12Z] moving to state running
Is the pbm status incorrect? How to check the pbm restore process status in my case?
Thanks,
Dillon