Problems with backup

Description:

I have set up a PXC cluster in Kubernetes. Everything works, but when I try to test backups, all the backup pods fail. In the error log I see the following error:

/opt/percona/peer-list -on-start=/usr/bin/get-pxc-state -service=percona-cluster-pxc 
/usr/bin/backup.sh: line 73: /opt/percona/peer-list: No such file or directory  

There is no /opt/percona directory at all. Could you please suggest what I am doing wrong?
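
For reference, the missing path can be reproduced directly against the backup image itself; a minimal check (the debug pod name here is arbitrary, and --command makes ls replace the image entrypoint):

kubectl run pxc-backup-debug --rm -it --restart=Never \
  --image=perconalab/percona-xtradb-cluster-operator:main-pxc8.0-backup \
  --command -- ls -la /opt/percona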

My backup.yml:

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
  finalizers:
    - delete-s3-backup
  name: test-backup
spec:
  pxcCluster: percona-cluster
  storageName: s3-backup

@John_Doe please provide more information: versions, steps to reproduce, YAML manifests, and so on.
Right now it would be fortune telling rather than debugging.

Hi Sergey,

Thanks for the response.
The steps are simple: I created a cluster, and it has been working for a couple of weeks. But whenever I try to run a manual backup using the YAML from my original post, the backup fails, and all I see in the backup pod’s log is the error from the original post.

Also, here is my cluster.yaml:

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  annotations:
    meta.helm.sh/release-name: database
    meta.helm.sh/release-namespace: database
  creationTimestamp: "2024-06-25T10:29:40Z"
  finalizers:
  - delete-pxc-pods-in-order
  - delete-ssl
  - delete-proxysql-pvc
  - delete-pxc-pvc
  generation: 7
  labels:
    app.kubernetes.io/managed-by: Helm
  name: percona-cluster
  resourceVersion: "1318684640"
  uid: 76c99792-71cd-4fa9-8bb7-293fae20ef36
spec:
  allowUnsafeConfigurations: false
  backup:
    allowParallel: true
    backoffLimit: 6
    image: perconalab/percona-xtradb-cluster-operator:main-pxc8.0-backup
    storages:
      s3-backup:
        resources:
          limits:
            cpu: 2000m
            memory: 2G
          requests:
            cpu: 600m
            memory: 1G
        s3:
          bucket: backup-percona-cluster
          credentialsSecret: percona-backup-s3
          endpointUrl: https://s3.backup.host
          region: us-west-2
        type: s3
        verifyTLS: true
  crVersion: 1.14.0
  enableCRValidationWebhook: true
  haproxy:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    enabled: true
    gracePeriod: 30
    image: perconalab/percona-xtradb-cluster-operator:main-haproxy
    livenessProbes: {}
    podDisruptionBudget:
      maxUnavailable: 1
    readinessProbes: {}
    resources:
      limits:
        cpu: 700m
        memory: 1G
      requests:
        cpu: 600m
        memory: 1G
    sidecarResources:
      limits:
        cpu: 600m
        memory: 2G
      requests:
        cpu: 500m
        memory: 1G
    size: 3
  initContainer:
    resources:
      limits:
        cpu: 200m
        memory: 200M
      requests:
        cpu: 100m
        memory: 100M
  logcollector:
    enabled: true
    image: perconalab/percona-xtradb-cluster-operator:main-logcollector
    resources:
      limits:
        cpu: "2"
        memory: 4096Mi
      requests:
        cpu: 500m
        memory: 2048Mi
  pmm:
    enabled: false
    image: percona/pmm-client:2.41.2
    resources:
      requests:
        cpu: 300m
        memory: 150M
    serverHost: monitoring-service
  proxysql:
    enabled: false
    image: perconalab/percona-xtradb-cluster-operator:main-proxysql
    resources:
      limits:
        cpu: "2"
        memory: 4096Mi
      requests:
        cpu: 500m
        memory: 2048Mi
    size: 3
  pxc:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    autoRecovery: true
    gracePeriod: 600
    image: perconalab/percona-xtradb-cluster-operator:main-pxc8.0
    livenessProbes: {}
    podDisruptionBudget:
      maxUnavailable: 1
    readinessProbes: {}
    resources:
      limits:
        cpu: "2"
        memory: 4096Mi
      requests:
        cpu: 500m
        memory: 2048Mi
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 30Gi
  updateStrategy: SmartUpdate
  upgradeOptions:
    apply: disabled
    schedule: 0 4 * * *
    versionServiceEndpoint: https://check.percona.com

@John_Doe is it deployed with Helm?
I see you are using the perconalab Docker repo, which indicates the main branch rather than a released version. Is that correct?
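
As a quick check on your side, the images currently set in the custom resource can be listed with jsonpath (this assumes the pxc short name for PerconaXtraDBCluster is registered by the CRDs):

kubectl get pxc percona-cluster -o jsonpath='{.spec.pxc.image}{"\n"}{.spec.haproxy.image}{"\n"}{.spec.backup.image}{"\n"}'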

Yes, @Sergey_Pronin, it is deployed with Helm.

Yes, I took the YAML file from the original GitHub repository, percona/percona-xtradb-cluster-operator:

kubectl apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/main/deploy/cr.yaml

Do you suggest using a particular version instead?

Hey @John_Doe ,

I would suggest always using released versions. The latest one is Percona Operator for MySQL based on Percona XtraDB Cluster 1.15.0 (2024-08-20).

So instead of

kubectl apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/main/deploy/cr.yaml

do

kubectl apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v1.15.0/deploy/cr.yaml
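
It is also worth confirming which version the operator itself is running, since the operator and the component images need to match. The deployment name below is the one from the default bundle, and the namespace is an assumption based on your manifest:

kubectl -n database get deployment percona-xtradb-cluster-operator -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'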

Hi Sergey,

Thanks for the advice. I applied Percona cluster version 1.14.0, because we already had an operator deployed with the same version. But the error remains the same.

Is there any non-public way I could share the full configuration with you?

@John_Doe feel free to reach out at sergey.pronin@percona.com or book me through Zoom for a quick chat: Zoom Scheduler

I would be curious to learn more about your use cases.

@John_Doe you can’t use PXCO 1.14.0 with the latest product/component images (the ones with the main tag). If you use operator version 1.14.0, you need to use the images that were tested with that version. You can find the list of these images in this file: percona-xtradb-cluster-operator/deploy/cr.yaml at v1.14.0 · percona/percona-xtradb-cluster-operator · GitHub
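
For illustration, with operator 1.14.0 the image fields in your cluster spec would point at the 1.14.0 releases rather than the main tags. A minimal sketch; take the exact tags from the v1.14.0 cr.yaml linked above, the ones below just follow its naming scheme:

spec:
  crVersion: 1.14.0
  pxc:
    image: percona/percona-xtradb-cluster:8.0.36-28.1   # PXC build shipped with 1.14.0
  backup:
    image: percona/percona-xtradb-cluster-operator:1.14.0-pxc8.0-backup
  logcollector:
    image: percona/percona-xtradb-cluster-operator:1.14.0-logcollector
  haproxy:
    image: percona/haproxy:2.8.5   # verify against the v1.14.0 cr.yaml; haproxy uses its own repo in recent releases

After updating the images, re-running the backup with the same backup.yml should no longer hit the missing /opt/percona/peer-list path.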