Manually restoring multiple times is not working

Description:

I made a manual backup by creating a PerconaPGBackup object. I then deleted a table from the database and restored it into the running cluster by creating a PerconaPGRestore object. The first restore ran successfully, but any restore after the first one never starts, and the operator logs show “Waiting for another restore to finish”.

Steps to Reproduce:

  1. Create a cluster with 1 replica and configure repo3 to back up to GCS. PMM and pgBouncer are not enabled.
  2. Insert some data into a table in Postgres.
  3. Create a PerconaPGBackup object and apply it. This backs up the data created in step 2 to the GCS bucket.
  4. Delete the data created in step 2 from Postgres.
  5. Restore the database into the current cluster using a PerconaPGRestore object. The restored data should match the data created in step 2.
  6. After the restore has completed and the Postgres instance pod is running again, delete the data that was restored in step 5.
  7. Create another PerconaPGRestore object and apply it. The operator logs now show the error.
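
For anyone trying to reproduce this, the backup and restore objects from steps 3, 5 and 7 could look roughly like the sketch below. It is not the original poster's configuration. The cluster name `cluster1`, the object names `manual-backup` and `manual-restore-2`, the `/v2` part of the apiVersion, and the `spec.pgCluster` / `spec.repoName` field names are assumptions; only the API group, the namespace, the name `manual-restore` and `repo3` come from this post.

```python
import json

# Assumptions (not from the original post): the cluster name, the backup
# object name, the second restore name, and the "/v2" API version suffix.
API_VERSION = "pgv2.percona.com/v2"  # group is from the operator logs; "/v2" is assumed
NAMESPACE = "postgres-operator"      # namespace seen in the operator logs
CLUSTER = "cluster1"                 # hypothetical cluster name
REPO = "repo3"                       # repository configured in step 1


def manifest(kind, name):
    """Build a minimal backup or restore custom resource as a dict."""
    return {
        "apiVersion": API_VERSION,
        "kind": kind,
        "metadata": {"name": name, "namespace": NAMESPACE},
        # spec.pgCluster and spec.repoName are assumed field names.
        "spec": {"pgCluster": CLUSTER, "repoName": REPO},
    }


backup = manifest("PerconaPGBackup", "manual-backup")            # step 3
first_restore = manifest("PerconaPGRestore", "manual-restore")   # step 5
second_restore = manifest("PerconaPGRestore", "manual-restore-2")  # step 7

if __name__ == "__main__":
    # Print each object as JSON; kubectl accepts JSON as well as YAML.
    for obj in (backup, first_restore, second_restore):
        print(json.dumps(obj, indent=2))
        print()
```

Each printed document can be saved to its own file and applied with `kubectl apply -f <file>` at the matching step.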

Version:

2.2.0
Docker image: percona/percona-postgresql-operator:2.2.0-ppg14-*

Logs:

time="2023-12-21T11:45:17Z" level=info msg="Waiting for another restore to finish" PerconaPGRestore=postgres-operator/manual-restore controller=perconapgrestore controllerGroup=pgv2.percona.com controllerKind=PerconaPGRestore name=manual-restore namespace=postgres-operator reconcileID=3969fd5a-9ddd-4c6d-ad36-69e0b8dcfa6b request=postgres-operator/manual-restore version=
time="2023-12-21T11:45:22Z" level=info msg="Waiting for another restore to finish" PerconaPGRestore=postgres-operator/manual-restore controller=perconapgrestore controllerGroup=pgv2.percona.com controllerKind=PerconaPGRestore name=manual-restore namespace=postgres-operator reconcileID=f3089a46-5ee8-42b0-b441-bef360b029a0 request=postgres-operator/manual-restore version=

(Repeats indefinitely)
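
To check whether a restore is stuck in this loop, the repeated message can be counted per restore object. The following is only a sketch that assumes the operator log lines keep the `key=value` layout shown above; the function names are hypothetical and the sample lines are shortened copies of the two log lines in this post.

```python
import shlex
from collections import Counter

WAITING = "Waiting for another restore to finish"


def parse_line(line):
    """Split one key=value operator log line into a dict of fields."""
    # Normalise curly quotes that forum software may substitute for straight ones.
    line = line.replace("\u201c", '"').replace("\u201d", '"')
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the line as unparseable.
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def count_waiting(lines):
    """Count 'waiting' messages per restore object (the 'request' field)."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields.get("msg") == WAITING:
            counts[fields.get("request", "unknown")] += 1
    return counts


# Shortened versions of the two log lines above.
sample = [
    'time="2023-12-21T11:45:17Z" level=info msg="Waiting for another restore to finish" '
    "request=postgres-operator/manual-restore version=",
    'time="2023-12-21T11:45:22Z" level=info msg="Waiting for another restore to finish" '
    "request=postgres-operator/manual-restore version=",
]

if __name__ == "__main__":
    for request, n in count_waiting(sample).items():
        print(f"{request}: {n} waiting message(s)")
```

A count that keeps growing for the same restore while no restore pod appears matches the behaviour described here.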

Expected Result:

A new pod is created which restores the data.

Actual Result:

Nothing happens; no restore pod is created, and the operator only keeps logging the message shown under Logs.

Additional Information:

Hi @Shaurya_Goel1,

This looks like a bug, and since you have a clear reproducible scenario, I would send all of this to a new Jira ticket:
https://perconadev.atlassian.net/jira/software/c/projects/K8SPG

@Shaurya_Goel1 I can’t reproduce this issue. Could you please provide your CRs? I need to know more about how you back up and restore your cluster.

I had the same problem. I solved it by removing the annotation postgres-operator.crunchydata.com/pgbackrest-restore from the PostgresCluster and PerconaPGCluster resources. I also set enabled to false in spec.backups.pgbackrest.restore, but I am not sure if that was necessary.

I am not sure about the cause because I can now issue new restores even without removing the annotations. I initially installed the operator on k8s 1.22 (now on 1.23), which is not supported - maybe this could have had an influence.
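
The workaround described in this reply could be scripted roughly as follows. This sketch only prints the kubectl commands and does not run them. The cluster name `cluster1` and the fully qualified resource names are assumptions, as is applying the `enabled: false` patch to the PerconaPGCluster resource (the reply does not say which resource was edited). The annotation key and the `spec.backups.pgbackrest.restore` path are taken from the reply.

```python
import json
import shlex

ANNOTATION = "postgres-operator.crunchydata.com/pgbackrest-restore"
NAMESPACE = "postgres-operator"  # namespace seen in the operator logs
CLUSTER = "cluster1"             # hypothetical cluster name

# Assumed fully qualified resource names for the two custom resources.
POSTGRES_CLUSTER = "postgresclusters.postgres-operator.crunchydata.com"
PERCONA_CLUSTER = "perconapgclusters.pgv2.percona.com"


def remove_annotation_cmd(resource):
    """kubectl removes an annotation when its key is given with a trailing '-'."""
    return ["kubectl", "annotate", resource, CLUSTER,
            ANNOTATION + "-", "-n", NAMESPACE]


def disable_restore_cmd(resource):
    """Merge-patch spec.backups.pgbackrest.restore.enabled to false."""
    patch = {"spec": {"backups": {"pgbackrest": {"restore": {"enabled": False}}}}}
    return ["kubectl", "patch", resource, CLUSTER, "-n", NAMESPACE,
            "--type", "merge", "-p", json.dumps(patch)]


commands = [
    remove_annotation_cmd(POSTGRES_CLUSTER),
    remove_annotation_cmd(PERCONA_CLUSTER),
    # Assumption: the patch targets the PerconaPGCluster resource.
    disable_restore_cmd(PERCONA_CLUSTER),
]

if __name__ == "__main__":
    # Print shell-quoted commands for review before running them by hand.
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands rather than executing them leaves room to check the resource names against `kubectl api-resources` first.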

@coaler @Shaurya_Goel1 I have reproduced it and created the Jira task K8SPG-637 to fix it. We will include the fix in 2.6.0.
