While creating a regular logical backup of a 1.3 TB replica, the backup's state changes to errored after about 16 hours of running, with the following message:
check for concurrent jobs: getting pbm object: create PBM connection to mongo-rs0-2.mongo-rs0.mongo.svc.cluster.local:27017,mongo-rs0-0.mongo-rs0.mongo.svc.cluster.local:27017,mongo-rs0-1.mong-rs0.mongo.svc.cluster.local:27017: create mongo connection: create mongo client: failed to find CERTIFICATE
The backup job, however, still progresses and completes, but I’m unsure whether the backup it creates is valid, and I doubt it’s possible to restore from it without manually changing its status to success.
+1, same issue, but with a much smaller database (1 GB):
create pbm object: create PBM connection to prod-mongodb-rs0-0.prod-mongodb-rs0.mongo.svc.cluster.local:27017,prod-mongodb-rs0-1.prod-mongodb-rs0.mongo.svc.cluster.local:27017,prod-mongodb-rs0-2.prod-mongodb-rs0.mongo.svc.cluster.local:27017: create mongo connection: create mongo client: failed to find CERTIFICATE
PSMDB Operator 1.14.0
If you need any additional info, please ask.
I think it may be related to this, and as @Sergey_Pronin says, the backups are good but report a wrong failed status. I haven’t tried restoring from those failed ones, though.
Same problem here, and in my case it’s not related to this. I think it is a problem with deploying the certificates. I have two environments and upgraded the operator from 1.9.0 to 1.14.0. In development it works as expected and I get backups in S3, but in production I get this error. Comparing both environments, the only difference I’ve seen is that the certificates weren’t upgraded properly in production the way they were in development. I say this because in ArgoCD the objects of each app are different. In development I see two secret objects for certificates, but they aren’t there in production, so I’m guessing something is going wrong there.
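For anyone who wants to test the certificate theory: the error text "failed to find CERTIFICATE" suggests that whatever PEM data the backup agent was given contained no CERTIFICATE block. That reading is an assumption on my part, not something confirmed in this thread. Below is a minimal Python sketch that lists the PEM block labels found in a piece of text, so you can compare the decoded certificate data from the two environments. The helper names pem_labels and has_certificate are made up for illustration.

```python
import re
from typing import List

# Matches one PEM block: a BEGIN line, any body, and the END line
# carrying the same label (e.g. CERTIFICATE, PRIVATE KEY).
PEM_BLOCK = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]+)-----\s*.*?-----END (?P=label)-----",
    re.DOTALL,
)


def pem_labels(pem_text: str) -> List[str]:
    """Return the labels of all PEM blocks found in pem_text, in order."""
    return [m.group("label") for m in PEM_BLOCK.finditer(pem_text)]


def has_certificate(pem_text: str) -> bool:
    """True if at least one CERTIFICATE block is present (hypothetical helper)."""
    return "CERTIFICATE" in pem_labels(pem_text)


if __name__ == "__main__":
    import sys

    # Usage: pipe already base64-decoded secret data on stdin.
    data = sys.stdin.read()
    print("PEM blocks found:", pem_labels(data))
    print("has CERTIFICATE:", has_certificate(data))
```

Pipe the base64-decoded certificate data from your TLS secret into it (the secret name depends on your deployment). If the output shows only a key block or nothing at all in the failing environment, that would point at the certificates rather than at the backup itself. It only checks for the presence of a block and does not validate the certificate.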
@Semantic @aporrinali is it happening to any backup resource?
What do you mean by a resource in this case?
@Semantic I’m talking about psmdb-backup. In other words, if you try a manual backup by creating a psmdb-backup resource, does it always error out, or is there a chance that it goes through?
Correct, my backups are only done manually at the moment. Some clusters happen to work; some don’t, even though I retried them a couple of times.