PostgreSQL backup to S3 MinIO from an Istio-enabled namespace

I deployed the same PostgresCluster that works perfectly in a non-Istio namespace into an Istio-enabled namespace.
I managed to solve all the other connection issues, but it seems that the S3 backup doesn't work.

I know this is not a Percona Postgres issue, but I thought that someone might have hit the same wall as I did.

Can anyone help?

Hey @mboncalo ,

can you please tell me more about how I can replicate your setup and what issues you are facing?

I think it is easily resolvable, but not sure how to reproduce.

Hi @Sergey_Pronin ,

Sorry for the late response; I actually had an ugly motorcycle accident after my last post.

So, let me start by describing how the Postgres backup behaves without Istio injection.
This is my sample test backup configuration for the namespace without Istio injection:

backups:
  pgbackrest:
    image: ""
    configuration:
    - secret:
        name: cluster1-pgbackrest-secrets
    global: {}
    manual:
      repoName: repo2
      options:
      - --type=full
    repos:
    - name: repo1
      schedules:
        full: "5 * * * *"
#        differential: "0 1 * * 1-6"
        # incremental: "* * 2 * *"
      volume:
        volumeClaimSpec:
          storageClassName: "default"
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
    - name: repo2
      s3:
        bucket: "b948b94bbae997066c0b3a4a6b25be3477b482b6cf23267b94ad1b92"
        endpoint: "https://....."
        region: "default"
      schedules:
        full: "20 * * * *"

No problem here: backup job pods start at the scheduled time, the jobs complete and wait for the next scheduled run, and the PerconaPGBackups match all the backups done and jobs started (sounds obvious, but not for the Istio-injected Postgres).

NAME                                CLUSTER           REPO    DESTINATION                                                     STATUS      TYPE          COMPLETED   AGE
service-1-pg-db-backup-ct88-5mv2v   service-1-pg-db   repo1                                                                   Succeeded   incremental   96m         99m
service-1-pg-db-repo1-full-br5c9    service-1-pg-db   repo1                                                                   Succeeded   full          30m         30m
service-1-pg-db-repo1-full-t88h6    service-1-pg-db   repo1                                                                   Succeeded   full          90m         90m
service-1-pg-db-repo2-full-c47kh    service-1-pg-db   repo2   s3://b948b94bbae997066c0b3a4a6b25be3477b482b6cf23267b94ad1b92   Succeeded   full          14m         15m
service-1-pg-db-repo2-full-k5k5n    service-1-pg-db   repo2   s3://b948b94bbae997066c0b3a4a6b25be3477b482b6cf23267b94ad1b92   Succeeded   full          67m         75m

Now, regarding the Istio-injected namespace: besides the Istio ServiceEntries and PeerAuthentication for all the Postgres containers and the MinIO endpoint, I have this backup config:

backups:
  pgbackrest:
    image: ""
    configuration:
    - secret:
        name: cluster1-pgbackrest-secrets
    global: {}
    repos:
    - name: repo1
      s3:
        bucket: "2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1"
        endpoint: "https://........."
        region: "default"
      schedules:
        full: "30 * * * *"

This is a new test Postgres cluster, and the reason there is only an S3 backup at this point is that if I deploy the cluster with both the S3 and the PV backup repos, the S3 backup won't work: pgBackRest says that repo1 (S3) is not ok when executing pgbackrest info. But if I deploy with the S3 backup only at the beginning, then upgrade and add repo2 with the PV backup, they will both “work”.
The issue afterwards is that when a backup job starts, it spawns a job pod with two containers, istio-proxy and pgbackrest. When pgbackrest completes, the Job never gets completed, and I am left with a backup job pod with only istio-proxy running. Because of this, no other backup job will start, even though I can see many PerconaPGBackup resources started.

NAME                                CLUSTER           REPO    DESTINATION                                                     STATUS     TYPE          COMPLETED   AGE
service-1-pg-db-backup-29wk-w8kn8   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               159m
service-1-pg-db-backup-8c7q-s5gkx   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               156m
service-1-pg-db-backup-b99r-t78d7   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               144m
service-1-pg-db-backup-dhst-f9hxx   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               163m
service-1-pg-db-backup-ng54-wnsl7   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               4h2m
service-1-pg-db-backup-pbjx-nwpsz   service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Running    incremental               49m
service-1-pg-db-repo1-full-c247b    service-1-pg-db   repo1                                                                                                        163m
service-1-pg-db-repo1-full-jjnzc    service-1-pg-db   repo1   s3://2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1   Starting                             3h43m
service-1-pg-db-repo1-full-vjndt    service-1-pg-db   repo1                                                                                                        103m
service-1-pg-db-repo1-full-vq5pd    service-1-pg-db   repo1                                                                                                        43m
service-1-pg-db-repo2-full-fftp9    service-1-pg-db   repo2                                                                                                        143m
service-1-pg-db-repo2-full-jgwfm    service-1-pg-db   repo2                                                                                                        23m
service-1-pg-db-repo2-full-kzfl4    service-1-pg-db   repo2                                                                                                        3h23m
service-1-pg-db-repo2-full-qgmgs    service-1-pg-db   repo2                                                                                                        83m

With the S3 backup it is the same issue as with the PV backup, but in this case the PerconaPGBackup jobs are stuck in the Running state and backup jobs are triggered randomly, even though they are supposed to run only at minute 30. I can see backups in the S3 bucket as well:

3.1MiB  28 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive/db/15-1/0000000100000000
3.1MiB  28 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive/db/15-1
3.1MiB  30 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive/db/15-2/0000000100000000
3.1MiB  30 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive/db/15-2
6.2MiB  60 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive/db
6.2MiB  60 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/archive
4.0MiB  1270 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F/pg_data
4.5MiB  1272 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F
240KiB  22 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093022I/pg_data
789KiB  24 objects      2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093022I
1.5KiB  4 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093420I/pg_data
552KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093420I
1.6KiB  4 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093731I/pg_data
552KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-093731I
1.6KiB  4 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-094925I/pg_data
552KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-094925I
1.7KiB  4 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-112411I/pg_data
553KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/20240729-081130F_20240729-112411I
384KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/backup.history/2024
384KiB  6 objects       2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db/backup.history
7.8MiB  1328 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup/db
7.8MiB  1328 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1/backup
14MiB   1388 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest/repo1
14MiB   1388 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1/pgbackrest
14MiB   1388 objects    2a013ccfe10dea1a2fae03b8374604f699743d655fbfb8ee010349d1
14MiB   1388 objects

Since I only have the S3 backup configured, there is no pgBackRest repo-host pod where I could run pgbackrest info and check the backup status.

Actually, I found out this is a known issue, for which Istio released a fix, but it was not implemented in Kubernetes. The fix would be to call /quitquitquit on the Envoy/pilot-agent API to terminate the istio-proxy container after the backup job is completed; I am looking for a proper solution for this at the moment.
More details here:

So the solution would be to have a custom-made pgbackrest container that calls the pilot-agent's /quitquitquit endpoint to terminate istio-proxy.
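
A minimal sketch of what that could look like, written as the backup container spec of the Job (the image name and the pgbackrest invocation are placeholders rather than the operator's actual entrypoint; 15020 is pilot-agent's default status port):

containers:
- name: pgbackrest
  image: registry.example.com/custom-pgbackrest:latest  # assumed custom image with curl installed
  command: ["/bin/sh", "-c"]
  args:
  - |
    # run the backup and remember its exit code
    pgbackrest backup --stanza=db --repo=1 --type=full
    rc=$?
    # ask the Istio sidecar to exit so the Job pod can reach Completed
    curl -fsS -X POST http://localhost:15020/quitquitquit || true
    exit $rc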

I tried to see how it behaves if I add the sidecar.istio.io/inject: “false” annotation to pgbackrest, but nothing happens; no annotation is added to the backup pod:

backups:
  pgbackrest:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
#    labels:
    image: ""
    configuration:
    - secret:
        name: cluster1-pgbackrest-secrets
...

The issue was solved by setting the namespace PeerAuthentication to PERMISSIVE and creating a ServiceEntry for the Postgres resources.
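
For anyone hitting the same wall, the two resources were roughly of this shape (namespace, host and port are illustrative; adjust them to your own MinIO endpoint):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: postgres-namespace        # namespace running the PostgresCluster
spec:
  mtls:
    mode: PERMISSIVE
---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: minio-s3
  namespace: postgres-namespace
spec:
  hosts:
  - minio.example.com                  # illustrative MinIO endpoint
  location: MESH_EXTERNAL
  ports:
  - number: 443
    name: https
    protocol: TLS
  resolution: DNS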