Backups to repo1 not working

Description:

I am trying to configure a minimal backup to test backup and restore procedures. This will form the basis for a more robust backup procedure using both local and offsite storage with encryption.

However, I cannot even create a simple backup to the default cluster1-repo-host-0, and a manually triggered backup stays in the Starting status:

NAME            CLUSTER    REPO    DESTINATION   STATUS     TYPE   COMPLETED   AGE
manual-backup   cluster1   repo1                 Starting                      22m

Steps to Reproduce:

I deployed following the generic Kubernetes install guide.

Relevant parts of cr.yaml:

spec:
  backups:
    pgbackrest:
      metadata:
        labels:
      image: percona/percona-postgresql-operator:2.4.1-ppg16.3-pgbackrest2.51-1
      repos:
        - name: repo1
          schedules:
            full: "21 2 * * *"
            differential: "0 1 * * 1-6"
            incremental: "0 1 * * 1-6"
          volume:
            volumeClaimSpec:
              storageClassName: longhorn
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 4Gi

The on-demand.yaml trigger:

---
apiVersion: pgv2.percona.com/v2
kind: PerconaPGBackup
metadata:
  name: manual-backup
  namespace: postgres-operator
spec:
  pgCluster: cluster1
  repoName: repo1
  options:
  - --type=full

Version:

2.4.1

Logs:

2024-09-08T11:25:01.981Z	INFO	Waiting for backup to start	{"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"manual-backup","namespace":"postgres-operator"}, "namespace": "postgres-operator", "name": "manual-backup", "reconcileID": "53e87ecf-a574-47cc-a67d-740b43651966", "request": {"name":"manual-backup","namespace":"postgres-operator"}}
2024-09-08T11:25:04.599Z	ERROR	get latest backup	{"controller": "perconapgcluster", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGCluster", "PerconaPGCluster": {"name":"cluster1","namespace":"postgres-operator"}, "namespace": "postgres-operator", "name": "cluster1", "reconcileID": "5b1f134b-55fa-430c-8ec0-0985782f2921", "error": "no completed backups found", "errorVerbose": "no completed backups found\ngithub.com/percona/percona-postgresql-operator/percona/watcher.getLatestBackup\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/watcher/wal.go:129\ngithub.com/percona/percona-postgresql-operator/percona/watcher.WatchCommitTimestamps\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/watcher/wal.go:65\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1695

Expected Result:

Successfully completing backup to local volume repo1.

Actual Result:

A manual trigger using the on-demand.yaml listed above stays in the Starting status:

NAME            CLUSTER    REPO    DESTINATION   STATUS     TYPE   COMPLETED   AGE
manual-backup   cluster1   repo1                 Starting                      22m

Additional Information:

The existing posts on this forum did not help in my situation, and I suspect they are outdated.

Appreciate the help, thanks in advance!

Hi @Fragment2
The following part of the CR works for me:

  backups:
    pgbackrest:
      image: percona/percona-postgresql-operator:2.4.1-ppg16.3-pgbackrest2.51-1
      repoHost:
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
             - weight: 1
               podAffinityTerm:
                 labelSelector:
                   matchLabels:
                     postgres-operator.crunchydata.com/data: pgbackrest
                 topologyKey: kubernetes.io/hostname
      repos:
      - name: repo1
        schedules:
          full: "0 0 * * 6"
        volume:
          volumeClaimSpec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi

You can ignore that error. It only shows that the operator can't get the latest successful backup.

This is my list of pods. As you can see, I have a repo (pgbackrest) pod, which will be used for local backups, and the first completed stanza backup, cluster1-backup-hc6h-dm84j. Do you have the same?

❯ kubectl get pods
NAME                                           READY   STATUS      RESTARTS   AGE
cluster1-backup-hc6h-dm84j                     0/1     Completed   0          4m57s
cluster1-backup-nqwn-mgwrm                     0/1     Completed   0          23s
cluster1-instance1-6dn4-0                      4/4     Running     0          5m41s
cluster1-instance1-jwnr-0                      4/4     Running     0          5m41s
cluster1-instance1-n4qg-0                      4/4     Running     0          5m42s
cluster1-pgbouncer-cd894fcb9-r52ps             2/2     Running     0          5m40s
cluster1-pgbouncer-cd894fcb9-spz9c             2/2     Running     0          5m40s
cluster1-pgbouncer-cd894fcb9-xmddz             2/2     Running     0          5m40s
cluster1-repo-host-0                           2/2     Running     0          5m40s
percona-postgresql-operator-589c76dd68-9p42r   1/1     Running     0          5m51s

❯ kubectl get pg-backup
NAME                         CLUSTER    REPO    DESTINATION   STATUS      TYPE   COMPLETED   AGE
cluster1-backup-hc6h-77jrj   cluster1   repo1                 Succeeded   full   2m5s        4m59s
manual-backup                cluster1   repo1                 Succeeded   full   28s         41s

❯ kubectl get pvc
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cluster1-instance1-6dn4-pgdata   Bound    pvc-e4e51005-fb5e-4089-a0e3-c7b7d6ed0e04   1Gi        RWO            standard-rwo   10m
cluster1-instance1-jwnr-pgdata   Bound    pvc-18f402f5-e180-4546-978a-960ebf4a753b   1Gi        RWO            standard-rwo   10m
cluster1-instance1-n4qg-pgdata   Bound    pvc-46ab6abf-1ea1-4b4e-b1a1-019853c104e4   1Gi        RWO            standard-rwo   10m
cluster1-repo1                   Bound    pvc-a32e33eb-6028-4de7-b861-baebde46f7d8   1Gi        RWO            standard-rwo   10m

Please make sure that the longhorn StorageClass exists in your cluster.
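
If the repo host pod or the cluster1-repo1 claim does not show up, a check along these lines may help narrow things down. This is only a sketch: the postgres-operator namespace and the resource names are taken from this thread, and check_backup_storage is a hypothetical wrapper name, not something the operator provides.

```shell
# Hypothetical helper; the namespace and resource names are assumed from this thread.
NAMESPACE="${NAMESPACE:-postgres-operator}"

check_backup_storage() {
  # The StorageClass named in cr.yaml (longhorn) should appear in this list.
  kubectl get storageclass
  # The repo volume claim should be Bound, not Pending.
  kubectl get pvc cluster1-repo1 -n "$NAMESPACE"
  # The repo host pod should be Running before a backup can start.
  kubectl get pod cluster1-repo-host-0 -n "$NAMESPACE"
}
```

Calling check_backup_storage prints the three listings in order. A Pending claim or a missing repo host pod would be consistent with a backup that stays in Starting.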

Thank you!!

The issue was with the quoted part: it was missing from my manifest.

NAME                                          READY   STATUS      RESTARTS   AGE
cluster1-backup-2xmx-csz45                    0/1     Completed   0          51s
cluster1-backup-7nr6-qkpzf                    0/1     Completed   0          81s

and

NAME                         CLUSTER    REPO    DESTINATION   STATUS      TYPE   COMPLETED   AGE
cluster1-backup-7nr6-vdt2l   cluster1   repo1                 Succeeded   full   48s         77s
manual-backup                cluster1   repo1                 Succeeded   full   27s         4h50m
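
For anyone repeating this, the sequence after editing the manifest could look roughly like the sketch below. It assumes the file is named cr.yaml as in the first post and that everything lives in the postgres-operator namespace; apply_and_check is a hypothetical function name used only for illustration.

```shell
# Hypothetical helper; the file name and namespace are assumed from this thread.
NAMESPACE="${NAMESPACE:-postgres-operator}"

apply_and_check() {
  # Apply the corrected cluster manifest.
  kubectl apply -f cr.yaml -n "$NAMESPACE"
  # The backup job pods (cluster1-backup-...) should reach Completed.
  kubectl get pods -n "$NAMESPACE"
  # The backup objects should move from Starting to Succeeded.
  kubectl get pg-backup -n "$NAMESPACE"
}
```

The two listings can take a little while to change after the manifest is applied, so they may need to be repeated.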

Awesome!