Hi folks,
I have a problem with PSMDB restore in Kubernetes when using zstd compression.
Description:
I’ve created a simple replicaset cluster of 3 nodes with daily physical and logical backups using:
- compressionType: zstd
- compressionLevel: 6
Backups work perfectly fine, whether scheduled or on-demand. However, I am unable to restore my cluster from either physical or logical backups.
- Physical backup: the cluster switches to the initializing state and then fails with a "backup not found" error.
- Logical backup: the restore fails every time with an "unexpected EOF" error.
Steps to Reproduce:
- Create a simple replicaset using the following custom resource:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: replicaset-test-cluster
  namespace: percona-mongodb
  finalizers:
    - percona.com/delete-psmdb-pods-in-order
spec:
  pause: false
  crVersion: 1.19.1
  image: percona/percona-server-mongodb:7.0.15-9-multi
  imagePullPolicy: Always
  updateStrategy: SmartUpdate
  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: disabled
    schedule: "0 2 * * *"
    setFCV: false
  secrets:
    users: replicaset-test-cluster-secrets
    encryptionKey: replicaset-test-cluster-mongodb-encryption-key
  pmm:
    enabled: true
    image: percona/pmm-client:2.44.0
    serverHost: pmm-qa.slicetest.com
    mongodParams: --environment=QA --cluster=replicaset-test-cluster
  replsets:
    - name: rs0
      size: 3
      configuration: |
        operationProfiling:
          mode: all
          slowOpThresholdMs: 100
          rateLimit: 10
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
      podDisruptionBudget:
        maxUnavailable: 1
      expose:
        enabled: false
        type: ClusterIP
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
      nonvoting:
        enabled: false
        size: 3
        affinity:
          antiAffinityTopologyKey: "kubernetes.io/hostname"
        podDisruptionBudget:
          maxUnavailable: 1
        resources:
          limits:
            cpu: "300m"
            memory: "0.5G"
          requests:
            cpu: "300m"
            memory: "0.5G"
        volumeSpec:
          persistentVolumeClaim:
            resources:
              requests:
                storage: 3Gi
      arbiter:
        enabled: false
        size: 1
        affinity:
          antiAffinityTopologyKey: "kubernetes.io/hostname"
        resources:
          limits:
            cpu: "300m"
            memory: "0.5G"
          requests:
            cpu: "300m"
            memory: "0.5G"
  sharding:
    enabled: false
    configsvrReplSet:
      size: 3
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
      podDisruptionBudget:
        maxUnavailable: 1
      expose:
        enabled: false
        type: ClusterIP
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
    mongos:
      size: 3
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      expose:
        type: ClusterIP
  users:
    - name: app-test-user
      db: admin
      passwordSecretRef:
        name: app-test-user-secret
        key: password
      roles:
        - name: readWrite
          db: test_db
  backup:
    enabled: true
    image: percona/percona-backup-mongodb:2.8.0-multi
    storages:
      s3-us-east:
        type: s3
        s3:
          bucket: mongodb-backup-qa
          credentialsSecret: replicaset-test-cluster-backup-s3
          region: us-east-1
          prefix: "replicaset-test-cluster"
    pitr:
      enabled: false
      oplogOnly: false
      compressionType: zstd
      compressionLevel: 6
    tasks:
      - name: daily-s3-us-east
        enabled: true
        schedule: "0 0 * * *"
        keep: 10
        storageName: s3-us-east
        compressionType: zstd
        compressionLevel: 6
        type: physical
      - name: daily-s3-us-east-logic
        enabled: true
        schedule: "0 1 * * *"
        keep: 10
        storageName: s3-us-east
        compressionType: zstd
        compressionLevel: 6
        type: logical
- Make some visible data changes (create a database and a collection, add some data), take a backup, then drop some of those objects so there is an anchor to verify after the restore. An example on-demand backup request is sketched below.
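A minimal sketch of such an on-demand backup object, reusing the storage and compression settings from the cluster CR above (the metadata name is just an example):

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: on-demand-zstd-test          # example name
  namespace: percona-mongodb
spec:
  clusterName: replicaset-test-cluster
  storageName: s3-us-east
  type: physical                     # or logical, to reproduce the second failure
  compressionType: zstd
  compressionLevel: 6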
- Take the created backup and start the restore:
Backup info:
NAME CLUSTER STORAGE DESTINATION TYPE STATUS COMPLETED AGE
cron-replicaset-test--20250408000000-8lmbh replicaset-test-cluster s3-us-east s3://mongodb-backup-qa/replicaset-test-cluster/2025-04-08T00:00:21Z physical ready 39h 39h
Restore config:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore-replicaset-test-cluster
spec:
  clusterName: replicaset-test-cluster
  backupName: cron-replicaset-test--20250408000000-8lmbh
  storageName: s3-us-east
  backupSource:
    type: physical
    destination: s3://mongodb-backup-qa/replicaset-test-cluster/2025-04-08T00:00:21Z
    s3:
      credentialsSecret: replicaset-test-cluster-backup-s3
      region: us-east-1
      bucket: mongodb-backup-qa
      endpointUrl: https://s3.us-east-1.amazonaws.com/
      prefix: "replicaset-test-cluster"
Version:
Operator: 1.19.1
MongoDB server: 7.0.15-9
Percona Backup for MongoDB: 2.8.0
Logs:
Operator logs:
- Physical backup: GitHub gist 13d1579be189ac54b4392e08206ff611
- Logical backup: GitHub gist "Logical backup restore with zstd compression"
Expected Result:
A working cluster with actual data from the backup.
Actual Result:
- Physical backup: the cluster stays in the initializing state and never stabilizes.
- Logical backup: data is partially lost and users are missing (the users declared in the YAML config are still created, which helps).
Additional Information:
I’ve switched compression to the default gzip, and I was able to restore both physical and logical backups on the same cluster. This suggests that the problem is with the zstd compression algorithm.
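For comparison, the working configuration only differs in the tasks' compression settings; a minimal sketch of one task, assuming everything else stays exactly as in the CR above:

backup:
  tasks:
    - name: daily-s3-us-east
      enabled: true
      schedule: "0 0 * * *"
      keep: 10
      storageName: s3-us-east
      compressionType: gzip        # default compression; both physical and logical restores work with this
      type: physical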