Backups to repo1 not working

Description:

I am trying to configure a minimal backup to test backup and restore procedures. This will lay the basis for a more solid backup procedure that uses both local and offsite storage with encryption.

However, I am unable to create even a simple backup to the default cluster1-repo-host-0, and a manually triggered backup stays in the Starting status:

NAME            CLUSTER    REPO    DESTINATION   STATUS     TYPE   COMPLETED   AGE
manual-backup   cluster1   repo1                 Starting                      22m

Steps to Reproduce:

I deployed following the generic Kubernetes install guide.

Relevant parts of cr.yaml:

spec:
  backups:
    pgbackrest:
      metadata:
        labels:
      image: percona/percona-postgresql-operator:2.4.1-ppg16.3-pgbackrest2.51-1
      repos:
        - name: repo1
          schedules:
            full: "21 2 * * *"
            differential: "0 1 * * 1-6"
            incremental: "0 1 * * 1-6"
          volume:
            volumeClaimSpec:
              storageClassName: longhorn
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 4Gi

The on-demand.yaml trigger:

---
apiVersion: pgv2.percona.com/v2
kind: PerconaPGBackup
metadata:
  name: manual-backup
  namespace: postgres-operator
spec:
  pgCluster: cluster1
  repoName: repo1
  options:
  - --type=full

Version:

2.4.1

Logs:

2024-09-08T11:25:01.981Z	INFO	Waiting for backup to start	{"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"manual-backup","namespace":"postgres-operator"}, "namespace": "postgres-operator", "name": "manual-backup", "reconcileID": "53e87ecf-a574-47cc-a67d-740b43651966", "request": {"name":"manual-backup","namespace":"postgres-operator"}}
2024-09-08T11:25:04.599Z	ERROR	get latest backup	{"controller": "perconapgcluster", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGCluster", "PerconaPGCluster": {"name":"cluster1","namespace":"postgres-operator"}, "namespace": "postgres-operator", "name": "cluster1", "reconcileID": "5b1f134b-55fa-430c-8ec0-0985782f2921", "error": "no completed backups found", "errorVerbose": "no completed backups found\ngithub.com/percona/percona-postgresql-operator/percona/watcher.getLatestBackup\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/watcher/wal.go:129\ngithub.com/percona/percona-postgresql-operator/percona/watcher.WatchCommitTimestamps\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/watcher/wal.go:65\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1695

Expected Result:

A backup to the local volume repo1 completes successfully.

Actual Result:

A backup triggered manually with the on-demand.yaml listed above stays in the Starting status:

NAME            CLUSTER    REPO    DESTINATION   STATUS     TYPE   COMPLETED   AGE
manual-backup   cluster1   repo1                 Starting                      22m

Additional Information:

The existing posts on this forum do not help in my situation, and I suspect they are outdated.

Appreciate the help, thanks in advance!

Hi @Fragment2
The following part of CR works for me:

  backups:
    pgbackrest:
      image: percona/percona-postgresql-operator:2.4.1-ppg16.3-pgbackrest2.51-1
      repoHost:
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
             - weight: 1
               podAffinityTerm:
                 labelSelector:
                   matchLabels:
                     postgres-operator.crunchydata.com/data: pgbackrest
                 topologyKey: kubernetes.io/hostname
      repos:
      - name: repo1
        schedules:
          full: "0 0 * * 6"
        volume:
          volumeClaimSpec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi

You can ignore that error; it only shows that the operator can’t get the latest successful backup yet.

This is my list of pods. As you can see, I have a repo (pgbackrest) pod, which is used for local backups, and a first completed stanza backup, cluster1-backup-hc6h-dm84j. Do you have the same?

❯ kubectl get pods
NAME                                           READY   STATUS      RESTARTS   AGE
cluster1-backup-hc6h-dm84j                     0/1     Completed   0          4m57s
cluster1-backup-nqwn-mgwrm                     0/1     Completed   0          23s
cluster1-instance1-6dn4-0                      4/4     Running     0          5m41s
cluster1-instance1-jwnr-0                      4/4     Running     0          5m41s
cluster1-instance1-n4qg-0                      4/4     Running     0          5m42s
cluster1-pgbouncer-cd894fcb9-r52ps             2/2     Running     0          5m40s
cluster1-pgbouncer-cd894fcb9-spz9c             2/2     Running     0          5m40s
cluster1-pgbouncer-cd894fcb9-xmddz             2/2     Running     0          5m40s
cluster1-repo-host-0                           2/2     Running     0          5m40s
percona-postgresql-operator-589c76dd68-9p42r   1/1     Running     0          5m51s

❯ kubectl get pg-backup
NAME                         CLUSTER    REPO    DESTINATION   STATUS      TYPE   COMPLETED   AGE
cluster1-backup-hc6h-77jrj   cluster1   repo1                 Succeeded   full   2m5s        4m59s
manual-backup                cluster1   repo1                 Succeeded   full   28s         41s

❯ kubectl get pvc
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cluster1-instance1-6dn4-pgdata   Bound    pvc-e4e51005-fb5e-4089-a0e3-c7b7d6ed0e04   1Gi        RWO            standard-rwo   10m
cluster1-instance1-jwnr-pgdata   Bound    pvc-18f402f5-e180-4546-978a-960ebf4a753b   1Gi        RWO            standard-rwo   10m
cluster1-instance1-n4qg-pgdata   Bound    pvc-46ab6abf-1ea1-4b4e-b1a1-019853c104e4   1Gi        RWO            standard-rwo   10m
cluster1-repo1                   Bound    pvc-a32e33eb-6028-4de7-b861-baebde46f7d8   1Gi        RWO            standard-rwo   10m

Please make sure that the longhorn StorageClass (SC) exists in your cluster.

Thank you!!

The issue was with the quoted part; it was missing from my manifest.

NAME                                          READY   STATUS      RESTARTS   AGE
cluster1-backup-2xmx-csz45                    0/1     Completed   0          51s
cluster1-backup-7nr6-qkpzf                    0/1     Completed   0          81s

and

NAME                         CLUSTER    REPO    DESTINATION   STATUS      TYPE   COMPLETED   AGE
cluster1-backup-7nr6-vdt2l   cluster1   repo1                 Succeeded   full   48s         77s
manual-backup                cluster1   repo1                 Succeeded   full   27s         4h50m

Awesome!
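
For anyone landing here with the same symptom, here is a sketch of what the fixed manifest might look like. It assumes the "quoted part" is the repoHost section, since that is the only block present in the working CR above and absent from the original cr.yaml; the thread does not confirm this explicitly. The repo1 volume settings are taken from the original post, the schedules are trimmed to the full backup for brevity, and the empty metadata.labels block is left out.

```yaml
spec:
  backups:
    pgbackrest:
      image: percona/percona-postgresql-operator:2.4.1-ppg16.3-pgbackrest2.51-1
      # Assumption: this repoHost section (copied from the working CR in the
      # reply) is the piece that was missing from the original manifest.
      repoHost:
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 1
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      postgres-operator.crunchydata.com/data: pgbackrest
                  topologyKey: kubernetes.io/hostname
      repos:
        - name: repo1
          schedules:
            full: "21 2 * * *"
          volume:
            volumeClaimSpec:
              # The longhorn StorageClass must exist in the cluster.
              storageClassName: longhorn
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 4Gi
```

After applying a change like this, the cluster1-repo-host-0 pod and the cluster1-repo1 PVC should show up in the kubectl get pods and kubectl get pvc listings, as in the working example above.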