While creating a regular logical backup of a 1.3 TB replica, the backup's state changes to errored after about 16 hours of running, with the following message:
check for concurrent jobs: getting pbm object: create PBM connection to mongo-rs0-2.mongo-rs0.mongo.svc.cluster.local:27017,mongo-rs0-0.mongo-rs0.mongo.svc.cluster.local:27017,mongo-rs0-1.mong-rs0.mongo.svc.cluster.local:27017: create mongo connection: create mongo client: failed to find CERTIFICATE
The backup job, however, still progresses and completes, but I’m unsure about the validity of the backup it creates (and I doubt it’s possible to restore from it without manually changing its status to success).
I think it may be related to this and, as @Sergey_Pronin says, the backups are good but report a wrong failed status. I haven’t tried restoring from those failed ones, though.
Same problem here, and in my case it’s not related to this. I think it is a problem with how the certificates are deployed. I have two environments and I upgraded the operator from 1.9.0 to 1.14.0 in both. In development it works as expected and I get backups in S3, but in production I get this error. Comparing the two environments, the only difference I’ve seen is that the certificates weren’t upgraded properly in production the way they were in development. I say this because in ArgoCD the objects of each app are different, so the result is not the same. In development I see two secret objects for certificates, but they aren’t there in production, so I’m guessing something is going wrong there.
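To test the certificate theory, here is a minimal Python sketch for checking whether PEM data from a secret actually contains a CERTIFICATE block. It assumes the "failed to find CERTIFICATE" message means the client was handed PEM data with no such block, which I have not confirmed. The helper names `pem_block_types` and `has_certificate` are made up for this example and are not part of the operator or PBM.

```python
import re
from typing import List

# Matches one PEM block and captures its type label,
# e.g. "CERTIFICATE" or "PRIVATE KEY".
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----.*?-----END \1-----",
    re.DOTALL,
)


def pem_block_types(pem_text: str) -> List[str]:
    """Return the type label of every PEM block found, in order."""
    return [match.group(1) for match in PEM_BLOCK.finditer(pem_text)]


def has_certificate(pem_text: str) -> bool:
    """True if at least one block is labelled exactly CERTIFICATE."""
    return "CERTIFICATE" in pem_block_types(pem_text)
```

Values stored in a Kubernetes secret are base64-encoded, so decode them before passing the text to `has_certificate`. Running it against the same secret in both environments should show whether production is missing the certificate block. Note that it only looks at the block labels and does not validate the certificate itself.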
@Semantic I’m talking about psmdb-backup. In other words: if you try a manual backup by creating a psmdb-backup resource, does it always error out, or is there a chance that it goes through?