PITR restore behaves differently between the two restore methods

Description:

I’m testing PITR backup and restore with the Percona MongoDB Operator. There are two ways to run a restore.

  1. Referencing an existing psmdb-backup CR via backupName:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore-phy2
spec:
  clusterName: mongodb-test-psmdb-db
  backupName: cron-mongodb-test-psm-20240109065000-qskww
  pitr:
    type: date
    date: 2024-01-09 06:52:41
  2. Using an S3 backupSource in a new Kubernetes environment:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore-pitr-source
spec:
  clusterName: mongodb-test-psmdb-db
  pitr:
    type: date
    date: 2024-01-09 06:52:41
  storageName: s3-percona-mongodb-backup
  backupSource:
    destination: s3://<bucket name>/<prefix>/2024-01-09T06:50:21Z

With the first method, the backup-agent container’s log is below.

2024-01-09T08:43:19.000+0000 I got command restore [name: 2024-01-09T08:43:18.183792383Z, snapshot: 2024-01-09T06:50:21Z point-in-time: <1704783161,0>] <ts: 1704789798>
2024-01-09T08:43:19.000+0000 I got epoch {1704789785 1}
2024-01-09T08:43:19.000+0000 I [restore/2024-01-09T08:43:18.183792383Z] to time: 2024-01-09T06:52:41Z
2024-01-09T08:43:19.000+0000 I [restore/2024-01-09T08:43:18.183792383Z] backup: 2024-01-09T06:50:21Z
2024-01-09T08:43:19.000+0000 I [restore/2024-01-09T08:43:18.183792383Z] recovery started
2024-01-09T08:43:19.000+0000 D [restore/2024-01-09T08:43:18.183792383Z] port: 27743
2024-01-09T08:43:19.000+0000 D [restore/2024-01-09T08:43:18.183792383Z] hearbeats stopped
2024-01-09T08:43:19.000+0000 E [restore/2024-01-09T08:43:18.183792383Z] restore: check mongod binary: run: exec: "mongod": executable file not found in $PATH. stderr:
2024/01/09 08:44:39 [entrypoint] got terminated, shutting down
2024/01/09 08:44:39 [entrypoint] kill `pbm-agent` (85): <nil>

With the second method, the backup-agent container’s log is below.

2024-01-09T08:41:11.000+0000 I got command restore [name: 2024-01-09T08:41:10.846677691Z, snapshot: 2024-01-09T06:50:21Z point-in-time: <1704783161,0>] <ts: 1704789670>
2024-01-09T08:41:11.000+0000 I got epoch {1704789657 2}
2024-01-09T08:41:11.000+0000 I [restore/2024-01-09T08:41:10.846677691Z] to time: 2024-01-09T06:52:41Z
2024-01-09T08:41:11.000+0000 I [restore/2024-01-09T08:41:10.846677691Z] backup: 2024-01-09T06:50:21Z
2024-01-09T08:41:11.000+0000 I [restore/2024-01-09T08:41:10.846677691Z] recovery started
2024-01-09T08:41:11.000+0000 D [restore/2024-01-09T08:41:10.846677691Z] port: 27631
2024-01-09T08:41:12.000+0000 D [restore/2024-01-09T08:41:10.846677691Z] hearbeats stopped
2024-01-09T08:41:12.000+0000 E [restore/2024-01-09T08:41:10.846677691Z] restore: check mongod binary: run: exec: "mongod": executable file not found in $PATH. stderr:

The difference is that with the second method the backup-agent gets stuck: nothing more is written to stdout and no further action happens.
With the first method, the container is killed shortly after the error (the timestamps look strange):

2024-01-09T08:43:19.000+0000 E [restore/2024-01-09T08:43:18.183792383Z] restore: check mongod binary: run: exec: "mongod": executable file not found in $PATH. stderr:
2024/01/09 08:44:39 [entrypoint] got terminated, shutting down
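
Since both methods complain about the mongod binary, one extra thing that can be checked is whether a mongod executable is visible inside the backup-agent container at all. This is only a sketch (it assumes a shell is available in the container and uses the rs0-0 pod name from my cluster; adjust as needed):

kubectl exec -it mongodb-test-psmdb-db-rs0-0 -c backup-agent -- sh -c 'command -v mongod || echo "mongod not found in PATH"'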

I’m testing various other options as well.
Does anybody here have advice for me?
Thank you.

Version:

operator : 1.15.0
backup-agent : percona/percona-backup-mongodb:2.3.1
mongodb-server : percona/percona-server-mongodb:6.0.9-7

@kyeongjun.me what does kubectl get psmdb-restore -o yaml look like?
Is there any error there?

@Sergey_Pronin Thank you for your reply!! :slight_smile: I retried the PITR restore. The results are below.
A plain kubectl get psmdb-restore shows the restore stuck in the requested state:

kubectl get psmdb-restore
NAME                  CLUSTER                 STATUS      AGE
restore-pitr-source   mongodb-test-psmdb-db   requested   2m37s

The kubectl get psmdb-restore -o yaml output is below (no error shown):

kubectl get psmdb-restore restore-pitr-source -oyaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  creationTimestamp: "2024-01-17T02:18:55Z"
  generation: 1
  name: restore-pitr-source
  namespace: mongodb-test
  resourceVersion: "100038724"
  uid: f370ff7a-c0f7-4bf6-874e-0b07a5d79262
spec:
  backupSource:
    destination: s3://percona-mongodb-backup/prod/2024-01-17T01:19:19Z
  clusterName: mongodb-test-psmdb-db
  pitr:
    date: "2024-01-17 02:09:14"
    type: date
  storageName: s3-percona-mongodb-backup
status:
  pbmName: "2024-01-17T02:19:14.497636748Z"
  state: requested
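
Events on the restore object (if there were any) could also be checked with a describe; a hedged example using the namespace above:

kubectl describe psmdb-restore restore-pitr-source -n mongodb-test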

The pbm status output from the backup-agent container is below (I stopped PITR after the PITR backup was created):

kubectl exec -it mongodb-test-psmdb-db-rs0-0 -c backup-agent -- pbm status
Cluster:
========
cfg:
  - cfg/mongodb-test-psmdb-db-cfg-0.mongodb-test-psmdb-db-cfg.mongodb-test.svc.cluster.local:27017 [P]: pbm-agent v2.3.1 OK
rs0:
  - rs0/mongodb-test-psmdb-db-rs0-0.mongodb-test-psmdb-db-rs0.mongodb-test.svc.cluster.local:27017 [P]: pbm-agent v2.3.1 OK


PITR incremental backup:
========================
Status [OFF]

Currently running:
==================
(none)

Backups:
========
S3 ap-northeast-2 s3://percona-mongodb-backup/prod
  Snapshots:
    2024-01-17T01:19:19Z 1.42MB <physical> [restore_to_time: 2024-01-17T01:19:21Z]
  PITR chunks [7.25MB]:
    2024-01-17T01:19:22Z - 2024-01-17T02:09:14Z
    2024-01-17T01:18:15Z - 2024-01-17T01:19:21Z (no base snapshot)
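
For additional detail from the agents, PBM’s own log can also be tailed from the backup-agent container (a hedged example; the --tail flag is available in pbm 2.x as far as I know):

kubectl exec -it mongodb-test-psmdb-db-rs0-0 -c backup-agent -- pbm logs --tail 50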

After deleting the cluster (including the PVCs) and recreating it, I tried the PITR restore like below.

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore-pitr-source
spec:
  clusterName: mongodb-test-psmdb-db
  pitr:
    type: date
    date: 2024-01-17 02:09:14
  storageName: s3-percona-mongodb-backup
  backupSource:
    destination: s3://percona-mongodb-backup/prod/2024-01-17T01:19:19Z
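
For completeness, the manifest can be applied in the usual way (a sketch, assuming it is saved as restore-pitr-source.yaml and using the mongodb-test namespace shown above):

kubectl apply -f restore-pitr-source.yaml -n mongodb-test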

Operator logs (the INFO Waiting for restore metadata lines keep repeating):

2024-01-17T02:19:13.000+0000 D [resync] got backups list: 9
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-09T09:30:07Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T02:27:13Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T02:32:50Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T02:48:31Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T03:01:15Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T03:10:06Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T03:24:09Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-11T04:50:19Z.pbm.json
2024-01-17T02:19:13.000+0000 D [resync] bcp: 2024-01-17T01:19:19Z.pbm.json
2024-01-17T02:19:14.502Z	INFO	Restore state changed	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "38f935e8-c450-437f-98e9-ef8dcca7bad2", "previous": "", "current": "requested"}
2024-01-17T02:19:14.589Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "1d468ec5-e586-4e9d-abab-cfb9a3238823", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:19.588Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "d97ac96d-4f65-43f5-bcb5-c1a817f19389", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:24.671Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "ec974261-2a09-4b0c-b573-44ea4a39a009", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:29.755Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "8ff50ad6-4046-4131-83e6-b5232948dcd8", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:34.846Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "34810c1a-4711-4772-9656-d196e7ef6e20", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:39.929Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "21b61a56-d6fc-4bdb-96b0-404e0910ca0a", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:45.009Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "68ce5ee9-fce0-4910-b055-e3bd4677d90d", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:50.109Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "81081c6c-0247-4909-a658-f42f450885d7", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:19:55.191Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "4afda217-2e3f-4255-ac0b-2f98929bf8be", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}
2024-01-17T02:20:00.280Z	INFO	Waiting for restore metadata	{"controller": "psmdbrestore-controller", "object": {"name":"restore-pitr-source","namespace":"mongodb-test"}, "namespace": "mongodb-test", "name": "restore-pitr-source", "reconcileID": "c95bf534-be42-470b-889e-a58649da9349", "pbmName": "2024-01-17T02:19:14.497636748Z", "restore": "restore-pitr-source", "backup": ""}

Log of the backup-agent in the config-server pod:

Version:   2.3.1
Platform:  linux/amd64
GitCommit: 8c4265cfb2d9a7581b782a829246d8fcb6c7d655
GitBranch: release-2.3.1
BuildTime: 2023-11-29_13:31_UTC
GoVersion: go1.19
2024-01-17T02:14:38.000+0000 I starting PITR routine
2024-01-17T02:14:38.000+0000 I node: cfg/mongodb-test-psmdb-db-cfg-0.mongodb-test-psmdb-db-cfg.mongodb-test.svc.cluster.local:27017
2024-01-17T02:14:38.000+0000 I listening for the commands
2024-01-17T02:14:43.000+0000 W [agentCheckup] get current storage status: query mongo: mongo: no documents in result
2024-01-17T02:19:14.000+0000 I got command restore [name: 2024-01-17T02:19:14.497636748Z, snapshot: 2024-01-17T01:19:19Z point-in-time: <1705457354,0>] <ts: 1705457954>
2024-01-17T02:19:14.000+0000 I got epoch {1705457935 1}
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] to time: 2024-01-17T02:09:14Z
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] backup: 2024-01-17T01:19:19Z
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] recovery started
2024-01-17T02:19:15.000+0000 D [restore/2024-01-17T02:19:14.497636748Z] port: 27865
2024-01-17T02:19:15.000+0000 E [restore/2024-01-17T02:19:14.497636748Z] restore: check mongod binary: run: exec: "mongod": executable file not found in $PATH. stderr:
2024-01-17T02:19:15.000+0000 D [restore/2024-01-17T02:19:14.497636748Z] hearbeats stopped

Log of the backup-agent in the replica set pod:

Version:   2.3.1
Platform:  linux/amd64
GitCommit: 8c4265cfb2d9a7581b782a829246d8fcb6c7d655
GitBranch: release-2.3.1
BuildTime: 2023-11-29_13:31_UTC
GoVersion: go1.19
2024-01-17T02:14:37.000+0000 I node: rs0/mongodb-test-psmdb-db-rs0-0.mongodb-test-psmdb-db-rs0.mongodb-test.svc.cluster.local:27017
2024-01-17T02:14:37.000+0000 I listening for the commands
2024-01-17T02:14:42.000+0000 W [agentCheckup] get current storage status: query mongo: mongo: no documents in result



2024-01-17T02:19:14.000+0000 I got command restore [name: 2024-01-17T02:19:14.497636748Z, snapshot: 2024-01-17T01:19:19Z point-in-time: <1705457354,0>] <ts: 1705457954>
2024-01-17T02:19:14.000+0000 I got epoch {1705457935 1}
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] to time: 2024-01-17T02:09:14Z
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] backup: 2024-01-17T01:19:19Z
2024-01-17T02:19:14.000+0000 I [restore/2024-01-17T02:19:14.497636748Z] recovery started
2024-01-17T02:19:14.000+0000 D [restore/2024-01-17T02:19:14.497636748Z] port: 27738
2024-01-17T02:19:14.000+0000 E [restore/2024-01-17T02:19:14.497636748Z] restore: check mongod binary: run: exec: "mongod": executable file not found in $PATH. stderr:
2024-01-17T02:19:14.000+0000 D [restore/2024-01-17T02:19:14.497636748Z] hearbeats stopped

After the mongos pod terminated, the cfg and rs pods are still Running, but no further action occurs.

kubectl get pod
NAME                                                    READY   STATUS    RESTARTS   AGE
mongodb-test-operator-psmdb-operator-7cf5dcd97c-67tsm   1/1     Running   0          7d16h
mongodb-test-psmdb-db-cfg-0                             2/2     Running   0          21m
mongodb-test-psmdb-db-rs0-0                             2/2     Running   0          21m

@Sergey_Pronin Any news on this topic? By the way, does Restore backup to a new Kubernetes-based environment - Percona Operator for MongoDB work at all with Percona MongoDB Operator 1.15.0, or should we wait for 1.16.0?

Hey folks.

I’m trying to reproduce it. Will update you later today.

@Sergey_Pronin Do you have an update, or do you still need more information?

No information needed.
Do you use sharding in your cluster?

We were able to reproduce the problem, but with a slightly different error. It is also reproducible for clusters with sharding enabled (even if it is just one shard). We are digging deeper now.

Yes, I have a sharded cluster.

I tested in a sharded cluster with only one shard.

My values.yaml for psmdb-db is below.

# Default values for psmdb-cluster.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# Platform type: kubernetes, openshift
# platform: kubernetes

# Cluster DNS Suffix
# clusterServiceDNSSuffix: svc.cluster.local
# clusterServiceDNSMode: "Internal"

finalizers:
## Set this if you want that operator deletes the primary pod last
#  - delete-psmdb-pods-in-order
## Set this if you want to delete database persistent volumes on cluster deletion
#  - delete-psmdb-pvc

nameOverride: ""
fullnameOverride: ""

crVersion: 1.15.0
pause: false
unmanaged: false
allowUnsafeConfigurations: true
# ignoreAnnotations:
#   - service.beta.kubernetes.io/aws-load-balancer-backend-protocol
# ignoreLabels:
#   - rack
multiCluster:
  enabled: false
  # DNSSuffix: svc.clusterset.local
updateStrategy: SmartUpdate
upgradeOptions:
  versionServiceEndpoint: https://check.percona.com
  apply: disabled
  schedule: "0 2 * * *"
  setFCV: false

image:
  repository: percona/percona-server-mongodb
  tag: 6.0.9-7-amd64

imagePullPolicy: Always
# imagePullSecrets: []
# initImage:
#   repository: percona/percona-server-mongodb-operator
#   tag: 1.14.0
# initContainerSecurityContext: {}
# tls:
#   # 90 days in hours
#   certValidityDuration: 2160h
secrets:
  users: mongodb-prod-psmdb-db-secrets
  # encryptionKey: mongodb-prod-psmdb-db-mongodb-encryption-key
  # If you set users secret here the operator will use existing one or generate random values
  # If not set the operator generates the default secret with name <cluster_name>-secrets
  # users: my-cluster-name-secrets

pmm:
  enabled: false
  image:
    repository: percona/pmm-client
    tag: 2.39.0
  serverHost: monitoring-service

replsets:
  - name: rs0
    size: 3
    configuration: |
      systemLog:
        verbosity: 0        
      operationProfiling:
        mode: slowOp
        slowOpThresholdMs: 2000
    # terminationGracePeriodSeconds: 300
    # externalNodes:
    # - host: 34.124.76.90
    # - host: 34.124.76.91
    #   port: 27017
    #   votes: 0
    #   priority: 0
    # - host: 34.124.76.92
    # configuration: |
    #   operationProfiling:
    #     mode: slowOp
    #   systemLog:
    #     verbosity: 1
    # serviceAccountName: percona-server-mongodb-operator
    # topologySpreadConstraints:
    #   - labelSelector:
    #       matchLabels:
    #         app.kubernetes.io/name: percona-server-mongodb
    #     maxSkew: 1
    #     topologyKey: kubernetes.io/hostname
    #     whenUnsatisfiable: DoNotSchedule
    affinity:
      advanced:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - ap-northeast-2a
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - replicaset
            topologyKey: "kubernetes.io/hostname"

      # advanced:
      #   podAffinity:
      #     requiredDuringSchedulingIgnoredDuringExecution:
      #     - labelSelector:
      #         matchExpressions:
      #         - key: security
      #           operator: In
      #           values:
      #           - S1
      #       topologyKey: failure-domain.beta.kubernetes.io/zone
    # tolerations: []
    # priorityClass: ""
    annotations:
      proxy.istio.io/config: |
        proxyMetadata:
          ISTIO_META_IDLE_TIMEOUT: 0s
    labels:
      app: replicaset
    nodeSelector:
      role: mongodb-prod-replicaset
    storage:
      engine: wiredTiger
      wiredTiger:
        engineConfig:
          cacheSizeRatio: 0.8
    livenessProbe:
      initialDelaySeconds: 60
    readinessProbe:
      initialDelaySeconds: 60      
    # livenessProbe:
    #   failureThreshold: 4
    #   initialDelaySeconds: 60
    #   periodSeconds: 30
    #   timeoutSeconds: 10
    #   startupDelaySeconds: 7200
    # readinessProbe:
    #   failureThreshold: 8
    #   initialDelaySeconds: 10
    #   periodSeconds: 3
    #   successThreshold: 1
    #   timeoutSeconds: 2
    # runtimeClassName: image-rc
    # storage:
    #   engine: wiredTiger
    #   wiredTiger:
    #     engineConfig:
    #       cacheSizeRatio: 0.5
    #       directoryForIndexes: false
    #       journalCompressor: snappy
    #     collectionConfig:
    #       blockCompressor: snappy
    #     indexConfig:
    #       prefixCompression: true
    #   inMemory:
    #     engineConfig:
    #        inMemorySizeRatio: 0.5
    # sidecars:
    # - image: busybox
    #   command: ["/bin/sh"]
    #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
    #   name: rs-sidecar-1
    #   volumeMounts:
    #     - mountPath: /volume1
    #       name: sidecar-volume-claim
    #     - mountPath: /secret
    #       name: sidecar-secret
    #     - mountPath: /configmap
    #       name: sidecar-config
    # sidecarVolumes:
    # - name: sidecar-secret
    #   secret:
    #     secretName: mysecret
    # - name: sidecar-config
    #   configMap:
    #     name: myconfigmap
    # sidecarPVCs:
    # - apiVersion: v1
    #   kind: PersistentVolumeClaim
    #   metadata:
    #     name: sidecar-volume-claim
    #   spec:
    #     resources:
    #       requests:
    #         storage: 1Gi
    #     volumeMode: Filesystem
    #     accessModes:
    #       - ReadWriteOnce
    podDisruptionBudget:
      maxUnavailable: 1
    # splitHorizons:
    #   my-cluster-name-rs0-0:
    #     external: rs0-0.mycluster.xyz
    #     external-2: rs0-0.mycluster2.xyz
    #   my-cluster-name-rs0-1:
    #     external: rs0-1.mycluster.xyz
    #     external-2: rs0-1.mycluster2.xyz
    #   my-cluster-name-rs0-2:
    #     external: rs0-2.mycluster.xyz
    #     external-2: rs0-2.mycluster2.xyz
    expose:
      enabled: false
      exposeType: ClusterIP
      # loadBalancerSourceRanges:
      #   - 10.0.0.0/8
      # serviceAnnotations:
      #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      # serviceLabels: 
      #   some-label: some-key
    # schedulerName: ""
    resources:
      limits:
        memory: "16.0G"
      requests:
        cpu: "6000m"
        memory: "14.0G"
    volumeSpec:
      # emptyDir: {}
      # hostPath:
      #   path: /data
      pvc:
        # annotations:
        #   volume.beta.kubernetes.io/storage-class: example-hostpath
        # labels:
        #   rack: rack-22
        storageClassName: mongodb-expansion
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 100Gi
    # hostAliases:
    # - ip: "10.10.0.2"
    #   hostnames:
    #   - "host1"
    #   - "host2"
    nonvoting:
      enabled: true
      # podSecurityContext: {}
      # containerSecurityContext: {}
      size: 0
      configuration: |
        systemLog:
          verbosity: 0        
        operationProfiling:
          mode: slowOp
          slowOpThresholdMs: 2000

      affinity:
        advanced:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: topology.kubernetes.io/zone
                  operator: In
                  values:
                  - ap-northeast-2a
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nonvoting
              topologyKey: "kubernetes.io/hostname"
      annotations:
        proxy.istio.io/config: |
          proxyMetadata:
            ISTIO_META_IDLE_TIMEOUT: 0s
      labels:
        app: nonvoting
      nodeSelector:
        role: mongodb-prod-nonvoting
      storage:
        engine: wiredTiger
        wiredTiger:
          engineConfig:
            cacheSizeRatio: 0.8
      livenessProbe:
        initialDelaySeconds: 60
      readinessProbe:
        initialDelaySeconds: 60
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          memory: "16.0G"
        requests:
          cpu: "6000m"
          memory: "14.0G"
      volumeSpec:
        emptyDir: {}

    arbiter:
      enabled: false
      size: 0
      # serviceAccountName: percona-server-mongodb-operator
      affinity:
        advanced:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: topology.kubernetes.io/zone
                  operator: In
                  values:
                  - ap-northeast-2a
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - replicaset
              topologyKey: "kubernetes.io/hostname"

      annotations:
        proxy.istio.io/config: |
          proxyMetadata:
            ISTIO_META_IDLE_TIMEOUT: 0s
      labels:
        app: arbiter
      nodeSelector:
        role: mongodb-prod
      # tolerations: []
      # priorityClass: ""
      # annotations: {}
      # labels: {}
      # nodeSelector: {}

sharding:
  enabled: true
  balancer:
    enabled: true

  configrs:
    size: 3
    # terminationGracePeriodSeconds: 300
    # externalNodes:
    # - host: 34.124.76.90
    # - host: 34.124.76.91
    #   port: 27017
    #   votes: 0
    #   priority: 0
    # - host: 34.124.76.92
    # configuration: |
    #   operationProfiling:
    #     mode: slowOp
    #   systemLog:
    #     verbosity: 1
    # serviceAccountName: percona-server-mongodb-operator
    # topologySpreadConstraints:
    #   - labelSelector:
    #       matchLabels:
    #         app.kubernetes.io/name: percona-server-mongodb
    #     maxSkew: 1
    #     topologyKey: kubernetes.io/hostname
    #     whenUnsatisfiable: DoNotSchedule
    affinity:
      advanced:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - replicaset
                - config-server
            topologyKey: "kubernetes.io/hostname"
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - ap-northeast-2a
              
    # tolerations: []
    # priorityClass: ""
    annotations:
      proxy.istio.io/config: |
        proxyMetadata:
          ISTIO_META_IDLE_TIMEOUT: 0s
    labels:
      app: config-server
    nodeSelector:
      role: mongodb-prod
    # livenessProbe: {}
    # readinessProbe: {}
    # runtimeClassName: image-rc
    # sidecars:
    # - image: busybox
    #   command: ["/bin/sh"]
    #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
    #   name: rs-sidecar-1
    #   volumeMounts:
    #     - mountPath: /volume1
    #       name: sidecar-volume-claim
    # sidecarPVCs: []
    # sidecarVolumes: []
    podDisruptionBudget:
      maxUnavailable: 1
    expose:
      enabled: false
      exposeType: ClusterIP
      # loadBalancerSourceRanges:
      #   - 10.0.0.0/8
      # serviceAnnotations:
      #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      # serviceLabels: 
      #   some-label: some-key
    resources:
      limits:
        cpu: "2000m"
        memory: "4G"
      requests:
        cpu: "800m"
        memory: "1G"
    volumeSpec:
      # emptyDir: {}
      pvc:
        # annotations:
        #   volume.beta.kubernetes.io/storage-class: example-hostpath
        # labels:
        #   rack: rack-22
        storageClassName: mongodb-expansion
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 20Gi
    # hostAliases:
    # - ip: "10.10.0.2"
    #   hostnames:
    #   - "host1"
    #   - "host2"

  mongos:
    size: 2
    configuration: |
      operationProfiling:
        slowOpThresholdMs: 2000
      systemLog:
        verbosity: 0
      auditLog:
        destination: console
      setParameter:
        auditAuthorizationSuccess: true
    affinity:
      advanced:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - replicaset
                - mongos
            topologyKey: "kubernetes.io/hostname"
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - ap-northeast-2a
    annotations:
      proxy.istio.io/config: |
        proxyMetadata:
          ISTIO_META_IDLE_TIMEOUT: 0s
    labels:
      app: mongos
    nodeSelector:
      role: mongodb-prod-mongos
    # livenessProbe: {}
    # readinessProbe: {}
    # runtimeClassName: image-rc
    # sidecars:
    # - image: busybox
    #   command: ["/bin/sh"]
    #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
    #   name: rs-sidecar-1
    #   volumeMounts:
    #     - mountPath: /volume1
    #       name: sidecar-volume-claim
    # sidecarPVCs: []
    # sidecarVolumes: []
    podDisruptionBudget:
      maxUnavailable: 1
    resources:
      limits:
        cpu: "2000m"
        memory: "8G"
      requests:
        cpu: "1000m"
        memory: "6G"
    expose:
      exposeType: ClusterIP
      servicePerPod: true
      # loadBalancerSourceRanges:
      #   - 10.0.0.0/8
      # serviceAnnotations:
      #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      # serviceLabels: 
      #   some-label: some-key
    # auditLog:
    #   destination: file
    #   format: BSON
    #   filter: '{}'
    # hostAliases:
    # - ip: "10.10.0.2"
    #   hostnames:
    #   - "host1"
    #   - "host2"

backup:
  enabled: true
  image:
    repository: percona/percona-backup-mongodb
    tag: 2.3.1
  resources:
    limits:
      cpu: "600m"
      memory: "1G"
    requests:
      cpu: "300m"
      memory: "0.5G"
  storages:
    s3-percona-mongodb-backup:
      type: s3
      s3:
        region: ap-northeast-2
        bucket: percona-mongodb-backup
        prefix: "prod"
        credentialsSecret: mongodb-secrets-backup
  pitr:
    enabled: true
    oplogOnly: true
    oplogSpanMin: 10
    compressionType: gzip
  tasks:
  - name: daily-s3-backup-physical
    enabled: false
    schedule: "*/10 * * * *"
    keep: 3
    storageName: s3-percona-mongodb-backup
    type: physical
    compressionType: gzip

Hi @Sergey_Pronin, should I create a bug ticket or have you already done it?

I have created one: [K8SPSMDB-1036] - Percona JIRA

Any progress regarding this issue? When I checked the JIRA issue it says pending release. I have a production workload that has been impaired: the backup file is there in the S3 bucket, but we cannot restore it to a new cluster. What other options do we have?

The fix is merged and already available in the main branch.
The official release will happen this week.

Thank you!! Have a nice day :slight_smile: