Hi,
We are managing a mongodb sharded cluster (Mongodb 4.4) and have 3 data nodes, 3 config servers and 2 mongos servers. I installed percona agent on all of the servers, and had a remote gcs storage. I was able to perform backup and upload the data to gcs. But when trying to do a restore, the process would hung and pbm logs were scant. Could you help with this issue?
> pbm status
Cluster:
========
configsvr:
- configsvr/g-fraud-es-mongodb-config-primary-01-mongo-a-01:27017: pbm-agent v1.5.0 OK
- configsvr/g-fraud-es-mongodb-config-secondary-01-mongo-a-01:27017: pbm-agent v1.5.0 OK
- configsvr/g-fraud-es-mongodb-config-arbiter-01-mongo-a-01:27017: pbm-agent v1.5.0 OK
rs0:
- rs0/g-fraud-es-mongodb-primary-set01-mongo-a-01:27017: pbm-agent v1.5.0 OK
- rs0/g-fraud-es-mongodb-secondary-set01-mongo-a-01:27017: pbm-agent v1.5.0 OK
- rs0/g-fraud-es-mongodb-secondary-02-set01-mongo-a-01:27017: pbm-agent v1.5.0 OK
PITR incremental backup:
========================
Status [OFF]
Currently running:
==================
PITR restore "2021-06-16T03:03:46Z", started at 2021-06-16T06:32:53Z. Status: snapshot restore. [op id: 60c99b1422caa6a3de0c4ce0]
Backups:
========
S3 asia-southeast1 s3://https://storage.googleapis.com/risk_integration_mongodb_backup/pbm/test
Snapshots:
2021-06-16T03:03:46Z 2.08MB [complete: 2021-06-16T03:04:07]
2021-06-15T10:06:34Z 1.37MB [complete: 2021-06-15T10:06:55]
PITR chunks:
2021-06-16T03:04:07 - 2021-06-16T04:48:14 643.50KB
2021-06-15T10:06:55 - 2021-06-16T03:03:52 5.15MB
> pbm logs -i 60c99b1422caa6a3de0c4ce0
2021-06-16T06:32:52Z I [/] [pitr/2021-06-16T04:48:14Z] recovery started
2021-06-16T06:32:52Z E [/] [pitr/2021-06-16T04:48:14Z] restore: define last oplog slice: get: mongo: no documents in result
2021-06-16T06:32:53Z I [rs0/g-fraud-es-mongodb-secondary-02-set01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] Node in not suitable for restore
2021-06-16T06:32:53Z I [configsvr/g-fraud-es-mongodb-config-primary-01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] recovery started
2021-06-16T06:32:53Z I [rs0/g-fraud-es-mongodb-primary-set01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] recovery started
2021-06-16T06:32:53Z I [rs0/g-fraud-es-mongodb-secondary-set01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] Node in not suitable for restore
2021-06-16T06:32:53Z I [configsvr/g-fraud-es-mongodb-config-secondary-01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] Node in not suitable for restore
2021-06-16T06:32:53Z I [configsvr/g-fraud-es-mongodb-config-arbiter-01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] Node in not suitable for restore
2021-06-16T06:32:53Z E [/] [pitr/2021-06-16T04:48:14Z] unbale to run the restore while another backup or restore process running
2021-06-16T06:33:01Z I [configsvr/g-fraud-es-mongodb-config-primary-01-mongo-a-01:27017] [pitr/2021-06-16T04:48:14Z] mongorestore finished
The restore ended very fast, but the data was not restored, and the status did not change. After hours of waiting we did a little digging into the code and deleted the locks in the db so we can try again. Unfortuately, the result was the same. The process hung, and data was not restored.
> pbm logs -i 60c9dab8956c4f8056f6c055
2021-06-16T11:04:25Z I [configsvr/g-fraud-es-mongodb-config-arbiter-01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] node is not primary so it unsuitable to do restore
2021-06-16T11:04:25Z I [configsvr/g-fraud-es-mongodb-config-primary-01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] restore started
2021-06-16T11:04:25Z I [configsvr/g-fraud-es-mongodb-config-secondary-01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] node is not primary so it unsuitable to do restore
2021-06-16T11:04:25Z I [rs0/g-fraud-es-mongodb-secondary-set01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] node is not primary so it unsuitable to do restore
2021-06-16T11:04:25Z I [/] [restore/2021-06-16T03:03:46Z] restore started
2021-06-16T11:04:25Z E [/] [restore/2021-06-16T03:03:46Z] restore: prepare snapshot: extra/unknown replica set found in the backup: rs0,configsvr
2021-06-16T11:04:25Z E [/] [restore/2021-06-16T03:03:46Z] unbale to run the restore while another operation running
2021-06-16T11:04:25Z I [rs0/g-fraud-es-mongodb-primary-set01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] restore started
2021-06-16T11:04:25Z I [rs0/g-fraud-es-mongodb-secondary-02-set01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] node is not primary so it unsuitable to do restore
2021-06-16T11:04:33Z I [configsvr/g-fraud-es-mongodb-config-primary-01-mongo-a-01:27017] [restore/2021-06-16T03:03:46Z] mongorestore finished
Any idea how to solve the issue? Thanks!!