We have several other clusters (all simple 3-data-node PSS replica sets, no sharding) that are backing up fine. We recently started adding backups for a couple more clusters, and on those we see errors like the following when the mongodump portion finishes:
I [backup/2021-01-05T21:27:41Z] mongodump finished, waiting for the oplog
I [backup/2021-01-05T21:27:41Z] mark backup as error `check cluster for dump done: convergeCluster: lost shard repl-c-guild-c04, last beat ts: 1609882109`: <nil>
E [backup/2021-01-05T21:27:41Z] backup: check cluster for dump done: convergeCluster: lost shard repl-c-guild-c04, last beat ts: 1609882109
D [backup/2021-01-05T21:27:41Z] releasing lock
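In case it helps, here is a small Python sketch that converts the "last beat ts" from the error into UTC and compares it with the timestamp on the log line. The two values are copied from the log above; the variable names are just mine. I am assuming the "last beat ts" is a Unix timestamp in seconds, and I don't know whether a difference between the two clocks has anything to do with the error, so treat this as a sanity check only.

```python
from datetime import datetime, timezone

# Timestamp printed on the log lines above (UTC).
log_time = datetime(2021, 1, 5, 21, 27, 41, tzinfo=timezone.utc)

# "last beat ts" from the error, assumed to be Unix epoch seconds.
last_beat = datetime.fromtimestamp(1609882109, tz=timezone.utc)

# Positive means the heartbeat timestamp is ahead of the log timestamp.
skew_seconds = (last_beat - log_time).total_seconds()

print("last beat (UTC):", last_beat.isoformat())
print("difference in seconds:", skew_seconds)
```

If the arithmetic is right, the heartbeat timestamp comes out slightly ahead of the log timestamp rather than behind it, which may or may not be relevant.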
Any idea what could be going on here?
There were no mongod issues during the backup, and the oplog stayed caught up on all nodes during the dump (never more than a few seconds of delay).
This is with 1.4.0.