Trouble Restoring Backup

I’m having some real trouble restoring a backup from a database that seems to have died. I’m not 100% sure what went wrong with the MongoDB cluster - unfortunately I don’t have the logs, but I suspect it was something to do with the SSL config.

Also unfortunately, I don’t have the credentials for the management users (e.g. userAdmin, clusterAdmin, etc.) that exist in the backup… and the existing database has been deleted.

All I really need to do is extract the data from the backup - I can manually load it into a new MongoDB instance if need be. However, whenever I run the restore job, it seems to work fine, but then the replica set pods all start dying and I can’t connect to grab the data. I have been able to connect by remoting directly into the mongos pod, but that only lasts for a few minutes before the other pods start dying.
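
In case it’s useful, the sort of thing I’ve been running to grab the data is along these lines (the namespace, pod name, and credentials are placeholders for whatever your cluster uses, and it assumes mongodump is available in the mongos image):

```bash
# Take a logical dump through the mongos while it is still up.
# All names and credentials below are placeholders.
kubectl -n psmdb exec my-cluster-mongos-0 -- \
  mongodump --host localhost:27017 \
    --username root --password 'rootPass' \
    --authenticationDatabase admin \
    --gzip --archive=/tmp/data.archive

# Copy the archive off the pod before it restarts again.
kubectl -n psmdb cp my-cluster-mongos-0:/tmp/data.archive ./data.archive
```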

The cr.yaml that I’m using is exactly the same as the one used on the old database. Same number of shards, same number of replicas, etc.

I thought it might be that the restore job was overwriting the creds of the cluster management users, so I tried giving them different names by manually creating a secret with the users and passwords, but no luck there either.
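
For reference, the manual secret was created with something like the following - the secret name has to match `spec.secrets.users` in the cr.yaml, and the key names below are, as far as I can tell, the ones the Percona operator expects (all passwords here are placeholders):

```bash
# Hand-made users secret; the name must match spec.secrets.users in cr.yaml.
# All passwords are placeholders.
kubectl -n psmdb create secret generic my-cluster-secrets \
  --from-literal=MONGODB_USER_ADMIN_USER=userAdmin \
  --from-literal=MONGODB_USER_ADMIN_PASSWORD='userAdminPass' \
  --from-literal=MONGODB_CLUSTER_ADMIN_USER=clusterAdmin \
  --from-literal=MONGODB_CLUSTER_ADMIN_PASSWORD='clusterAdminPass' \
  --from-literal=MONGODB_CLUSTER_MONITOR_USER=clusterMonitor \
  --from-literal=MONGODB_CLUSTER_MONITOR_PASSWORD='clusterMonitorPass' \
  --from-literal=MONGODB_BACKUP_USER=backup \
  --from-literal=MONGODB_BACKUP_PASSWORD='backupPass'
```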

I’m kind of out of ideas at this point - is there anything else I could try?

Any help would be much appreciated!

EDIT: Forgot to mention - I do have the username and password for a root user in the backup, just not the clusterAdmin etc. management users.

EDIT 2: I’ve managed to log in to the restored cluster and update the passwords for the management users to match the ones I manually set. This initially seemed to fix the issue and I was able to connect, but the replica sets now error and restart roughly once every 2-3 minutes. I’m not really sure how to go about debugging this!
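
For anyone hitting the same thing: the password updates were done with the root user from inside the cluster, roughly like this (user names and passwords are placeholders):

```bash
# Re-sync the management users' passwords with the ones in the secret.
# Run through the mongos; all names and passwords are placeholders.
mongosh "mongodb://root:rootPass@localhost:27017/admin" --eval '
  db.changeUserPassword("userAdmin", "userAdminPass");
  db.changeUserPassword("clusterAdmin", "clusterAdminPass");
  db.changeUserPassword("clusterMonitor", "clusterMonitorPass");
  db.changeUserPassword("backup", "backupPass");
'
```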

Hi there,

Sorry to reopen this topic, but we’ve had a similar problem again. I have the credentials for all of the users this time, but it doesn’t seem to help.

One of our clusters died, and there didn’t seem to be any obvious way to bring it back online. We decided to delete the cluster, build a new one, and restore the backup. We’ve set the cluster users to be the same as the ones we extracted from the old server. However, whilst the backup appears to apply successfully, the mongos pod never progresses to a ready state and the database is completely inaccessible. All of the other pods (config servers, replica sets) restart every few minutes.

Is there any way to perform a data-only restore? Or is there some other way we can try and fix this?
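
To be clear about what I mean by “data-only”: something like replaying a mongodump archive into the fresh cluster and skipping the system collections, e.g. (all names and credentials are placeholders):

```bash
# Hypothetical data-only restore: port-forward to the new cluster's mongos
# and replay the dump, excluding system users/roles so the operator-managed
# users are left untouched. All names and credentials are placeholders.
kubectl -n psmdb port-forward svc/new-cluster-mongos 27017:27017 &

mongorestore --host localhost:27017 \
  --username root --password 'rootPass' \
  --authenticationDatabase admin \
  --gzip --archive=./data.archive \
  --nsExclude 'admin.system.*' \
  --nsExclude 'config.*'
```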

Any help much appreciated!

Do you have some logs? I mean, the mongos usually doesn’t die just for fun.

@Geo,

As @jamoser said, we need more info here. The CRs, k8s versions and specifics, and logs would all help a lot in debugging and reproducing the problem.
