PSMDB 1.14.0. PBM (backup) exit code -1

aporrinali · July 19, 2023, 7:29am

Dear Percona Team,
I am facing an issue with backups on a cluster that holds a small amount of data (less than 1 GB).
Long story short: the backup fails with exit code -1 and stays in the running state. The lock in MongoDB persists. Deleting the backup that is stuck in the running state does nothing.

Steps to Reproduce:

Spin up a fresh PSMDB cluster. Back up the fresh cluster: OK (I did it 5 times). Then run mongorestore from another cluster to restore data:

mongorestore --db invite --host test-rs0.test.svc.cluster.local:27017 --authenticationDatabase invite --username 'mongodb-user' --password 'password' --gzip --archive=/bitnami/mongodb/backups/full-dev-dump-02-03-2023.archive.gz --drop --noIndexRestore

Wait 10-15 minutes. Initiate a new backup… and this is where the behavior becomes random. Occasionally the backup succeeds, but most of the time it fails with exit code -1 and no additional useful information.

Version:

PSMDB Operator 1.14.0
PSMDB 5.0.15-13, 6.0.4-3, 6.0.5-4
PBM 2.0.3, 2.0.5, 2.2.0

Logs:

2023-07-18T08:37:03.060+0000	Mux close namespace abc.Ski
2023/07/18 08:37:03 [entrypoint] `pbm-agent` exited with code -1
2023/07/18 08:37:03 [entrypoint] restart in 5 sec

pbm status

sh-4.4$ pbm status
Cluster:
========
rs0:
  - rs0/test-rs0-0.test-rs0.test.svc.cluster.local:27017 [P]: pbm-agent v2.0.5 OK
  - rs0/test-rs0-1.test-rs0.test.svc.cluster.local:27017 [S]: pbm-agent v2.0.5 OK
  - rs0/test-rs0-2.test-rs0.test.svc.cluster.local:27017 [S]: pbm-agent v2.0.5 OK


PITR incremental backup:
========================
Status [OFF]

Currently running:
==================
(none)

Backups:
========
S3 eu-central-1 s3://backup-abc-test-com/scheduled
  Snapshots:
    2023-07-18T08:36:33Z 0.00B <logical> [ERROR: Backup stuck at `running` stage, last beat ts: 1689669420] [2023-07-18T08:36:37Z]
    2023-07-18T08:33:10Z 160.53MB <logical> [restore_to_time: 2023-07-18T08:34:04Z]
    2023-07-18T08:14:11Z 27.18KB <logical> [restore_to_time: 2023-07-18T08:14:16Z]
sh-4.4$
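
For reference, the "last beat ts" value in the stuck snapshot line looks like a Unix epoch timestamp in seconds. Below is a minimal sketch of converting it to UTC so it can be compared with the agent log. It assumes GNU date is available, and the variable names are only illustrative:

```shell
#!/bin/sh
# Error line copied from the `pbm status` output above.
line='2023-07-18T08:36:33Z 0.00B <logical> [ERROR: Backup stuck at `running` stage, last beat ts: 1689669420] [2023-07-18T08:36:37Z]'

# Pull out the epoch seconds that follow "last beat ts: ".
beat_ts=$(printf '%s\n' "$line" | sed -n 's/.*last beat ts: \([0-9][0-9]*\).*/\1/p')

# Convert to UTC. This is GNU date syntax (assumption); BSD/macOS date
# would need `date -u -r "$beat_ts"` instead.
beat_utc=$(date -u -d "@$beat_ts" +%Y-%m-%dT%H:%M:%SZ)

echo "last heartbeat: $beat_utc"
```

This gives 2023-07-18T08:37:00Z, about three seconds before the entrypoint logged that `pbm-agent` exited with code -1 (08:37:03). That suggests the heartbeat stopped because the agent process died during the dump, not because the backup was still progressing.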

test-rs0-2-backup-agent.log (5.9 KB)
.yaml renamed to .log
test-percona.yaml.log (8.0 KB)
test-backup3.yaml.log (1.6 KB)

Please help to investigate.

aporrinali · July 20, 2023, 10:06am

Another error…

Plus, sometimes I get: Error while creating backup: failed to find CERTIFICATE
