Hi, I'm trying to restore an incremental backup; all instances are running in Docker containers.
I added pbm-agent & to the mongodb entrypoint, so the agent now starts when the container starts up (rough sketch below). But…
I take an incremental backup and everything works perfectly. Then I delete one of my collections and try to run a restore.
Below is the whole restore process and its logs. After the restore attempt, the config servers suddenly lose connection; I have no idea why, and when I check my DB the collection is not restored.
I hope you can see from the logs what happens.
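For context, this is roughly how the agent is wired into the entrypoint (a simplified sketch; the URI, credentials, config path, and log file here are placeholders, not my exact values):

#!/bin/bash
# entrypoint.sh: launch pbm-agent in the background, then run mongod as the main container process.
# pbm-agent reads its connection string from the PBM_MONGODB_URI env var (placeholder credentials).
export PBM_MONGODB_URI="mongodb://pbmuser:<password>@localhost:27017/"
pbm-agent >/var/log/pbm-agent.log 2>&1 &
exec mongod --config /etc/mongod.conf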
[mongodb@mongodb_content_cfg_server1 percona]$ pbm status
Cluster:
mongo_data_rs2:
  - mongo_data_rs2/mongodb_content_rs2n1:27017 [P]: pbm-agent v2.2.1 OK
mongo_data_rs3:
  - mongo_data_rs3/mongodb_content_rs3n1:27017 [P]: pbm-agent v2.2.1 OK
mongo_data_rs1:
  - mongo_data_rs1/mongodb_content_rs1n1:27017 [P]: pbm-agent v2.2.1 OK
mongo_conf:
  - mongo_conf/mongodb_content_cfg_server1:27017 [P]: pbm-agent v2.2.1 OK
  - mongo_conf/mongodb_content_cfg_server2:27017 [S]: pbm-agent v2.2.1 OK
  - mongo_conf/mongodb_content_cfg_server3:27017 [S]: pbm-agent v2.2.1 OK
PITR incremental backup:
Status [OFF]
Currently running:
(none)
Backups:
FS /opt/backup/
Snapshots:
2023-08-28T10:30:25Z 24.09MB [restore_to_time: 2023-08-28T10:30:28Z]
2023-08-28T09:01:02Z 14.49MB <incremental, base> [restore_to_time: 2023-08-28T09:01:04Z]
[mongodb@mongodb_content_cfg_server1 percona]$ pbm list
Backup snapshots:
2023-08-28T09:01:02Z <incremental, base> [restore_to_time: 2023-08-28T09:01:04Z]
2023-08-28T10:30:25Z [restore_to_time: 2023-08-28T10:30:28Z]
PITR :
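For completeness, the two snapshots above were taken the standard way (first the base, then a regular incremental), nothing custom:

pbm backup --type=incremental --base
pbm backup --type=incremental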
[mongodb@mongodb_content_cfg_server1 percona]$ pbm restore 2023-08-28T10:30:25Z
Starting restore 2023-08-28T10:55:42.716998345Z from '2023-08-28T10:30:25Z'...Restore of the snapshot from '2023-08-28T10:30:25Z' has started.
Check restore status with: pbm describe-restore 2023-08-28T10:55:42.716998345Z -c </path/to/pbm.conf.yaml>
No other pbm command is available while the restore is running!
[mongodb@mongodb_content_cfg_server1 percona]$ pbm describe-restore 2023-08-28T10:55:42.716998345Z -c /opt/percona/storage.yaml
name: "2023-08-28T10:55:42.716998345Z"
opid: ""
backup: ""
type: physical
status: running
last_transition_time: "2023-08-28T10:56:09Z"
replsets:
- name: mongo_data_rs1
  status: running
  last_transition_time: "2023-08-28T10:55:59Z"
  nodes:
  - name: mongodb_content_rs1n1:27017
    status: running
    last_transition_time: "2023-08-28T10:55:54Z"
- name: mongo_data_rs2
  status: running
  last_transition_time: "2023-08-28T10:56:03Z"
  nodes:
  - name: mongodb_content_rs2n1:27017
    status: running
    last_transition_time: "2023-08-28T10:55:58Z"
- name: mongo_data_rs3
  status: running
  last_transition_time: "2023-08-28T10:56:03Z"
  nodes:
  - name: mongodb_content_rs3n1:27017
    status: running
    last_transition_time: "2023-08-28T10:55:58Z"
- name: mongo_conf
  status: running
  last_transition_time: "2023-08-28T10:56:04Z"
  nodes:
  - name: mongodb_content_cfg_server1:27017
    status: running
    last_transition_time: "2023-08-28T10:55:59Z"
  - name: mongodb_content_cfg_server2:27017
    status: running
    last_transition_time: "2023-08-28T10:55:58Z"
  - name: mongodb_content_cfg_server3:27017
    status: running
    last_transition_time: "2023-08-28T10:55:58Z"
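The storage.yaml I pass to describe-restore is just the filesystem store shown under Backups; a minimal sketch of it (the path matches the FS line in pbm status, the rest follows the standard PBM config layout):

storage:
  type: filesystem
  filesystem:
    path: /opt/backup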
[mongodb@mongodb_content_cfg_server1 percona]$ pbm status
Cluster:
mongo_data_rs1:
  - mongo_data_rs1/mongodb_content_rs1n1:27017 [P]: pbm-agent v2.2.1 OK
mongo_data_rs2:
  - mongo_data_rs2/mongodb_content_rs2n1:27017 [P]: pbm-agent v2.2.1 OK
mongo_conf:
  - mongo_conf/mongodb_content_cfg_server1:27017 [P]: pbm-agent FAILED status:
      ERROR with ERROR: lost agent, last heartbeat: 1693220172
  - mongo_conf/mongodb_content_cfg_server2:27017 [S]: pbm-agent FAILED status:
      ERROR with ERROR: lost agent, last heartbeat: 1693220171
  - mongo_conf/mongodb_content_cfg_server3:27017 [S]: pbm-agent FAILED status:
      ERROR with ERROR: lost agent, last heartbeat: 1693220172
mongo_data_rs3:
  - mongo_data_rs3/mongodb_content_rs3n1:27017 [P]: pbm-agent v2.2.1 OK
PITR incremental backup:
Status [OFF]
Currently running:
(none)
Backups:
FS /opt/backup/
Snapshots:
2023-08-28T10:30:25Z 24.09MB [restore_to_time: 2023-08-28T10:30:28Z]
2023-08-28T09:01:02Z 14.49MB <incremental, base> [restore_to_time: 2023-08-28T09:01:04Z]
[mongodb@mongodb_content_cfg_server1 percona]$ pbm list --restore
Restores history:
[mongodb@mongodb_content_cfg_server1 percona]$ pbm logs
2023-08-28T10:55:43Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for starting status in rs map[.pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_conf/node.mongodb_content_cfg_server1:27017:{} .pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_conf/node.mongodb_content_cfg_server2:27017:{} .pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_conf/node.mongodb_content_cfg_server3:27017:{}]
2023-08-28T10:55:44Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:55:42.716998345Z] moving to state starting
2023-08-28T10:55:44Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for starting status in rs map[.pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_data_rs1/node.mongodb_content_rs1n1:27017:{}]
2023-08-28T10:55:48Z I [mongo_data_rs3/mongodb_content_rs3n1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for cluster
2023-08-28T10:55:48Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for cluster
2023-08-28T10:55:48Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for shards map[.pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_conf/rs:{} .pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_data_rs1/rs:{} .pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_data_rs2/rs:{} .pbm.restore/2023-08-28T10:55:42.716998345Z/rs.mongo_data_rs3/rs:{}]
2023-08-28T10:55:49Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for cluster
2023-08-28T10:55:54Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:55:42.716998345Z] waiting for cluster
2023-08-28T10:56:23Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] pbm-agent:
Version: 2.2.1
Platform: linux/amd64
GitCommit: 5d2c6eeb81be87a21ccf8d3ead6cba762e6bf70f
GitBranch: release-2.2.1
BuildTime: 2023-07-25_13:42_UTC
GoVersion: go1.19
2023-08-28T10:56:23Z
Data node logs:
[mongodb@mongodb_content_rs2n1 /]$ pbm logs
2023-08-28T10:51:23Z I [mongo_conf/mongodb_content_cfg_server1:27017] got epoch {1693219863 8}
2023-08-28T10:51:23Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:51:23.296062775Z] backup: 2023-08-28T10:30:25Z
2023-08-28T10:51:23Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:51:23.296062775Z] recovery started
2023-08-28T10:51:23Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] got command restore [name: 2023-08-28T10:51:23.296062775Z, snapshot: 2023-08-28T10:30:25Z] <ts: 1693219883>
2023-08-28T10:51:23Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] got epoch {1693219863 8}
2023-08-28T10:51:23Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] got command restore [name: 2023-08-28T10:51:23.296062775Z, snapshot: 2023-08-28T10:30:25Z] <ts: 1693219883>
2023-08-28T10:51:23Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:51:23.296062775Z] backup: 2023-08-28T10:30:25Z
2023-08-28T10:51:23Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:51:23.296062775Z] recovery started
2023-08-28T10:51:23Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] got epoch {1693219863 8}
2023-08-28T10:51:24Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:51:23.296062775Z] moving to state starting
2023-08-28T10:51:24Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:51:23.296062775Z] backup: 2023-08-28T10:30:25Z
2023-08-28T10:51:24Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:51:23.296062775Z] recovery started
2023-08-28T10:51:24Z I [mongo_conf/mongodb_content_cfg_server1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for starting status in rs map[.pbm.restore/2023-08-28T10:51:23.296062775Z/rs.mongo_conf/node.mongodb_content_cfg_server1:27017:{} .pbm.restore/2023-08-28T10:51:23.296062775Z/rs.mongo_conf/node.mongodb_content_cfg_server2:27017:{} .pbm.restore/2023-08-28T10:51:23.296062775Z/rs.mongo_conf/node.mongodb_content_cfg_server3:27017:{}]
2023-08-28T10:51:24Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:51:23.296062775Z] moving to state starting
2023-08-28T10:51:24Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for starting status in rs map[.pbm.restore/2023-08-28T10:51:23.296062775Z/rs.mongo_data_rs2/node.mongodb_content_rs2n1:27017:{}]
2023-08-28T10:51:24Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:51:23.296062775Z] moving to state starting
2023-08-28T10:51:24Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for starting status in rs map[.pbm.restore/2023-08-28T10:51:23.296062775Z/rs.mongo_data_rs1/node.mongodb_content_rs1n1:27017:{}]
2023-08-28T10:51:28Z I [mongo_data_rs3/mongodb_content_rs3n1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for cluster
2023-08-28T10:51:29Z I [mongo_data_rs2/mongodb_content_rs2n1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for cluster
2023-08-28T10:51:29Z I [mongo_data_rs1/mongodb_content_rs1n1:27017] [restore/2023-08-28T10:51:23.296062775Z] waiting for cluster