Are there cases where Percona XtraBackup Restore can break the cluster?

na50r · March 19, 2026, 10:01am

I have a PXC (PerconaXtraDBCluster) deployed with Helm Chart Version 1.14.5 and then had to upgrade to 1.19.0. The Helm Chart allows setting up a cronjob that creates a backup of the cluster. My cronjob runs every day and the backups themselves are still being saved just fine. The issue occurs when I try to restore from one of these backups into a new cluster.

I create a new fresh cluster
run the backup restore job on it.

The new cluster is created via:

kubectl get pxc my-cluster -o yaml > template.yaml

Then remove the Helm related stuff and status fields. The new cluster should be the same as the running one.

The backup restore job in question is from here: percona-xtradb-cluster-operator/deploy/backup/restore.yaml at main · percona/percona-xtradb-cluster-operator · GitHub

Before the upgrade, this worked fine but after it, running the backup restore job somehow breaks the new cluster. It has 3 running pods and after the job completes, it has 0. The job doesn’t even complete fully because it complaints that it expected 3 but sees 0 and tells me to disable a flag if I do not want this check to be made. (Check is fine, issue is why it went from 3 to 0)

Do you have any ideas why this is happening? I couldn’t find anything on GitHub issues and the forum that mentions how the backup restore job breaks the cluster. Or have I been using the backup restore job wrongly? Perhaps I was not supposed remove Helm-related labels from the YAML?

Before running the backup restore job, there were 3 PVCs and 3 PVs but afterward, there is only 1 PVC and PV. The data from the backup seems to be inside but it is a mystery why it somehow destroys the cluster and the other two PVCs/PVs.

The YAML I used:

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: replica-pxc
  namespace: database
spec:
  pxcCluster: replica-pxc
  backupName: cron-original-db-pxc

cron-original-db-pxc is a PerconaXtraDBClusterBackup

There is also another way to run the backup, where instead of restoring it using the pxc-backup object in K8s, you get it from S3. Neither of them worked for me after I ugpraded.

Only oddity I found was that the image the pxc-backup used before I upgraded the Helm Chart was:

percona/percona-xtradb-cluster-operator:1.14.1-pxc8.0-backup-pxb8.0.35

But after upgrade, it went to:

percona/percona-xtrabackup:8.4.0-5.1

Unsure if this means anything or was just a design choice to separate the operator from the backup.

matthewb · March 20, 2026, 12:57am

Hello @na50r
Per the documentation:

Percona XtraDB Cluster 8.4 is the default version for cluster deployments

What you are experiencing is expected. XtraBackup (PXB) 8.4 cannot restore a backup taken from 8.0. In order to upgrade, you first restore to 8.0, then upgrade the operator which performs a rolling restart of your cluster to 8.4. From that point forward, backups will be taken using PXB 8.4.

You need to rollback your operator version to 1.18 where the PXC and PXB versions are 8.0

na50r · March 20, 2026, 7:16am

Hello @matthewb

Your explanation makes sense but I do not think it can be that because as said, the backups are from a cronjob. The cronjob is daily, so the backups are taken after upgrade and not before. It fails to restore the PXC with the most recent backups, which should be all on the same version.

I experimented a bit and found out that data restoration seems to work properly. What essentially happens:

Create a new cluster with the same YAML as the cluster I want to restore.
Apply the pxc-restore resource on that new cluster
All pods of the cluster get terminated; 2 out 3 PVCs and PVs get deleted
A restore job starts that seems to move data from the backup to the remaining PV it attached to
A prepare job starts that is supposed to prepare the cluster
The pxc-restore fails and complaints about the new PXC having size 0 while size 3 was expected

If I then do a manual scale up via kubectl, I can get the statefulset back online and enter pods and verify if the data is there and indeed it is. I ran mysqldump on the restored cluser and the size of the file was same as the original cluster. (Not the best check but I also ran some selected queries and the result were the same as well). Also, even if statefulset are back online, the PXC resource itself will claim that it is still broken.

So data restoration seems to work fine but the final operation that is supposed to bring back the PXC online after shutting down all its pods is failing. I do not think it can be related to 8.4 / 8.0 mismatch because the backups were taken daily by a cronjob, the backup used here was also from 8.4.

The image of the backup resource is:

percona/percona-xtrabackup:8.4.0-5.1

Which is the same as the new PXC, the PXC image is:

percona/percona-xtradb-cluster:8.4.7-7.1

matthewb · March 20, 2026, 4:07pm

Can you get the logs from the restore job pod, and logs from the operator when it fails?

na50r · March 21, 2026, 10:04am

Created a new PXC with name “my-db”; ran restore via kubectl apply -f

Status of the pxc-restore resource is:

Status:
  Comments:  check safe defaults: PXC size must be at least 3. Set spec.unsafeFlags.pxcSize to true to disable this check
  State:     Failed
Events:      <none>

Last entry in the restore job pod that completed:

2026-03-21T09:54:07.805580-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!

In the PXC operator, this logs pops up when it fails:

2026-03-21T09:56:04.711Z	ERROR	Reconciler error	{"controller": "pxc-controller", "controllerGroup": "pxc.percona.com", "controllerKind": "PerconaXtraDBCluster", "PerconaXtraDBCluster": {"name":"my-db","namespace":"database"}, "namespace": "database", "name": "my-db", "reconcileID": "8b0b54db-24f4-4e95-bf11-b080a7076d4c", "error": "wrong PXC options: check safe defaults: PXC size must be at least 3. Set spec.unsafeFlags.pxcSize to true to disable this check", "errorVerbose": "PXC size must be at least 3. Set spec.unsafeFlags.pxcSize to true to disable this check\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1.(*PerconaXtraDBCluster).checkSafeDefaults\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1/pxc_types.go:1472\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1.(*PerconaXtraDBCluster).CheckNSetDefaults\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1/pxc_types.go:1130\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:267\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:216\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:461\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:296\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1693\ncheck safe defaults\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1.(*PerconaXtraDBCluster).CheckNSetDefaults\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1/pxc_types.go:1131\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:267\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:216\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:461\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:296\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1693\nwrong PXC options\ngithub.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc.(*ReconcilePerconaXtraDBCluster).Reconcile\n\t/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxc/controller.go:269\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:216\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:461\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/controller/controller.go:296\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1693"}

In the prepare job that starts after the restore job completes, we have:

Defaulted container "mysqld" out of: mysqld, pxc-init (init)
2026-03-21T09:54:33.242464Z 0 [System] [MY-015015] [Server] MySQL Server - start.
2026-03-21T09:54:33.509156Z 0 [Warning] [MY-011070] [Server] 'binlog_format' is deprecated and will be removed in a future release.
2026-03-21T09:54:33.509179Z 0 [Warning] [MY-011068] [Server] The syntax 'wsrep_slave_threads' is deprecated and will be removed in a future release. Please use wsrep_applier_threads instead.
2026-03-21T09:54:33.511005Z 0 [Warning] [MY-010097] [Server] Insecure configuration for --secure-log-path: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
2026-03-21T09:54:33.511627Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.4.7-7.1) starting as process 11
2026-03-21T09:54:33.612533Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2026-03-21T09:54:33.612589Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2026-03-21T09:54:33.612985Z 0 [Note] [MY-000000] [Galera] Loading provider /usr/lib64/galera4/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2026-03-21T09:54:33.613014Z 0 [Note] [MY-000000] [Galera] wsrep_load(): loading provider library '/usr/lib64/galera4/libgalera_smm.so'
2026-03-21T09:54:33.614151Z 0 [Note] [MY-000000] [Galera] wsrep_load(): Galera 4.24(a430f07) by Codership Oy <info@codership.com> (modified by Percona <https://percona.com/>) loaded successfully.
2026-03-21T09:54:33.614166Z 0 [Note] [MY-000000] [Galera] Resolved symbol 'wsrep_node_isolation_mode_set_v1'
2026-03-21T09:54:33.614171Z 0 [Note] [MY-000000] [Galera] Resolved symbol 'wsrep_certify_v1'
2026-03-21T09:54:33.614176Z 0 [Note] [MY-000000] [Galera] Initializing config service v2
2026-03-21T09:54:33.614502Z 0 [Note] [MY-000000] [Galera] Deinitializing config service v2
2026-03-21T09:54:33.614525Z 0 [Note] [MY-000000] [Galera] CRC-32C: using 64-bit x86 acceleration.
2026-03-21T09:54:33.614699Z 0 [Note] [MY-000000] [Galera] not using SSL compression
2026-03-21T09:54:33.615335Z 0 [Warning] [MY-000000] [Galera] Could not open state file for reading: '/var/lib/mysql//grastate.dat'
2026-03-21T09:54:33.615347Z 0 [Warning] [MY-000000] [Galera] No persistent state found. Bootstraping with default state
2026-03-21T09:54:33.615397Z 0 [Note] [MY-000000] [Galera] Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2026-03-21T09:54:33.615650Z 0 [Note] [MY-000000] [Galera] Generated new GCache ID: f687c19c-250b-11f1-86f6-53b91ed01bf6
2026-03-21T09:54:33.615664Z 0 [Note] [MY-000000] [Galera] GCache DEBUG: opened preamble:
Version: 0
UUID: 00000000-0000-0000-0000-000000000000
Seqno: -1 - -1
Offset: -1
Synced: 0
EncVersion: 0
Encrypted: 0
MasterKeyConst UUID: f687c19c-250b-11f1-86f6-53b91ed01bf6
MasterKey UUID: 00000000-0000-0000-0000-000000000000
MasterKey ID: 0
2026-03-21T09:54:33.615669Z 0 [Note] [MY-000000] [Galera] Skipped GCache ring buffer recovery: could not determine history UUID.
2026-03-21T09:54:33.616843Z 0 [Note] [MY-000000] [Galera] Passing config to GCS: allocator.disk_pages_encryption = no; allocator.encryption_cache_page_size = 32K; allocator.encryption_cache_size = 16777216; base_dir = /var/lib/mysql/; base_host = 10.30.1.88; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = no; debug = no; evs.auto_evict = 0; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 4; evs.version = 1; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.encryption = no; gcache.encryption_cache_page_size = 32K; gcache.encryption_cache_size = 16777216; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.check_appl_proto = 1; gcs.fc_auto_evict_threshold = 0.75; gcs.fc_auto_evict_window = 0; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.fc_single_primary = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.wait_restored_prim_timeout = PT0S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 11; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto; socket.ssl = YES; socket.ssl_ca = ca.pem; socket.ssl_cert = server-cert.pem; socket.ssl_cipher = ; socket.ssl_key = server-key.pem; socket.ssl_reload = 1;
2026-03-21T09:54:33.632433Z 0 [Note] [MY-000000] [WSREP] Starting replication
2026-03-21T09:54:33.632477Z 0 [Note] [MY-000000] [Galera] Connecting with bootstrap option: 1
2026-03-21T09:54:33.632489Z 0 [Note] [MY-000000] [Galera] Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2026-03-21T09:54:33.632532Z 0 [Note] [MY-000000] [Galera] protonet asio version 0
2026-03-21T09:54:33.633450Z 0 [Note] [MY-000000] [Galera] Using CRC-32C for message checksums.
2026-03-21T09:54:33.633465Z 0 [Note] [MY-000000] [Galera] backend: asio
2026-03-21T09:54:33.633651Z 0 [Note] [MY-000000] [Galera] gcomm thread scheduling priority set to other:0
2026-03-21T09:54:33.634099Z 0 [Note] [MY-000000] [Galera] Fail to access the file (/var/lib/mysql//gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2026-03-21T09:54:33.634176Z 0 [Note] [MY-000000] [Galera] Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2026-03-21T09:54:33.634435Z 0 [Note] [MY-000000] [Galera] GMCast version 0
2026-03-21T09:54:33.634584Z 0 [Note] [MY-000000] [Galera] (f68a98be-ae39, 'ssl://0.0.0.0:4567') listening at ssl://0.0.0.0:4567
2026-03-21T09:54:33.634602Z 0 [Note] [MY-000000] [Galera] (f68a98be-ae39, 'ssl://0.0.0.0:4567') multicast: , ttl: 1
2026-03-21T09:54:33.635038Z 0 [Note] [MY-000000] [Galera] EVS version 1
2026-03-21T09:54:33.635150Z 0 [Note] [MY-000000] [Galera] gcomm: bootstrapping new group 'noname'
2026-03-21T09:54:33.635209Z 0 [Note] [MY-000000] [Galera] start_prim is enabled, turn off pc_recovery
2026-03-21T09:54:33.635506Z 0 [Note] [MY-000000] [Galera] EVS version upgrade 0 -> 1
2026-03-21T09:54:33.635537Z 0 [Note] [MY-000000] [Galera] PC protocol upgrade 0 -> 1
2026-03-21T09:54:33.635571Z 0 [Note] [MY-000000] [Galera] Node f68a98be-ae39 state primary
2026-03-21T09:54:33.635598Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,f68a98be-ae39,1)
memb {
	f68a98be-ae39,0
	}
joined {
	}
left {
	}
partitioned {
	}
)
2026-03-21T09:54:33.635623Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2026-03-21T09:54:33.636800Z 0 [Note] [MY-000000] [Galera] gcomm: connected
2026-03-21T09:54:33.636848Z 0 [Note] [MY-000000] [Galera] Changing maximum packet size to 64500, resulting msg size: 32636
2026-03-21T09:54:33.636995Z 0 [Note] [MY-000000] [Galera] Shifting CLOSED -> OPEN (TO: 0)
2026-03-21T09:54:33.637006Z 0 [Note] [MY-000000] [Galera] Opened channel 'noname'
2026-03-21T09:54:33.637300Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2026-03-21T09:54:33.637482Z 0 [Note] [MY-000000] [Galera] Starting new group from scratch: f68b172e-250b-11f1-9f1f-73954993d619
2026-03-21T09:54:33.637544Z 0 [Note] [MY-000000] [Galera] STATE_EXCHANGE: sent state UUID: f68b189b-250b-11f1-bf2a-3e3b036eb0b1
2026-03-21T09:54:33.637555Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: f68b189b-250b-11f1-bf2a-3e3b036eb0b1
2026-03-21T09:54:33.637566Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: f68b189b-250b-11f1-bf2a-3e3b036eb0b1 from 0 (prepare-job-my-db-my-db-g5xl5)
2026-03-21T09:54:33.637580Z 0 [Note] [MY-000000] [Galera] Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 0,
	members    = 1/1 (primary/total),
	act_id     = 0,
	last_appl. = 0,
	protocols  = 5/11/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = f68b172e-250b-11f1-9f1f-73954993d619
2026-03-21T09:54:33.637615Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2026-03-21T09:54:33.637623Z 0 [Note] [MY-000000] [Galera] Restored state OPEN -> JOINED (1)
2026-03-21T09:54:33.637652Z 0 [Note] [MY-000000] [Galera] Member 0.0 (prepare-job-my-db-my-db-g5xl5) synced with group.
2026-03-21T09:54:33.637660Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 1)
2026-03-21T09:54:33.637649Z 1 [Note] [MY-000000] [WSREP] Starting rollbacker thread 1
2026-03-21T09:54:33.637947Z 2 [Note] [MY-000000] [WSREP] Starting applier thread 2
2026-03-21T09:54:33.638102Z 2 [Note] [MY-000000] [Galera] ####### processing CC 1, local, ordered
2026-03-21T09:54:33.638134Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from -1 upto current CC event 1 upto:-1
2026-03-21T09:54:33.638160Z 2 [Note] [MY-000000] [Galera] Drain monitors from -1 up to -1
2026-03-21T09:54:33.638177Z 2 [Note] [MY-000000] [Galera] Process first view: f68b172e-250b-11f1-9f1f-73954993d619 my uuid: f68a98be-250b-11f1-ae39-2a62d236dfb0
2026-03-21T09:54:33.638238Z 2 [Note] [MY-000000] [Galera] Server prepare-job-my-db-my-db-g5xl5 connected to cluster at position f68b172e-250b-11f1-9f1f-73954993d619:1 with ID f68a98be-250b-11f1-ae39-2a62d236dfb0
2026-03-21T09:54:33.638252Z 2 [Note] [MY-000000] [WSREP] Server status change disconnected -> connected
2026-03-21T09:54:33.638274Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:33.638328Z 2 [Note] [MY-000000] [Galera] ####### My UUID: f68a98be-250b-11f1-ae39-2a62d236dfb0
2026-03-21T09:54:33.638345Z 2 [Note] [MY-000000] [Galera] Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 11), state transfer needed: no
2026-03-21T09:54:33.638526Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2026-03-21T09:54:33.638636Z 2 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2026-03-21T09:54:33.638652Z 2 [Note] [MY-000000] [Galera] REPL Protocols: 11 (6)
2026-03-21T09:54:33.638664Z 2 [Note] [MY-000000] [Galera] ####### Adjusting cert position: -1 -> 1
2026-03-21T09:54:33.638696Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2026-03-21T09:54:33.639650Z 2 [Note] [MY-000000] [Galera] GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> f68b172e-250b-11f1-9f1f-73954993d619:0
2026-03-21T09:54:33.640439Z 2 [Note] [MY-000000] [Galera] ================================================
View:
  id: f68b172e-250b-11f1-9f1f-73954993d619:1
  status: primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
	0: f68a98be-250b-11f1-ae39-2a62d236dfb0, prepare-job-my-db-wisest
=================================================
2026-03-21T09:54:33.640460Z 2 [Note] [MY-000000] [WSREP] Server status change connected -> joiner
2026-03-21T09:54:33.640469Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:33.640492Z 2 [Note] [MY-000000] [WSREP] Server status change joiner -> initializing
2026-03-21T09:54:33.640500Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:33.641835Z 0 [Warning] [MY-010075] [Server] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: f68bbe7a-250b-11f1-98ed-92e6bd9047fa.
2026-03-21T09:54:33.645742Z 3 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2026-03-21T09:54:34.009193Z 3 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2026-03-21T09:54:34.130326Z 3 [Note] [MY-000000] [WSREP] wsrep_init_schema_and_SR (nil)
2026-03-21T09:54:34.143594Z 3 [System] [MY-000000] [WSREP] PXC upgrade completed successfully
2026-03-21T09:54:34.174696Z 0 [Note] [MY-000000] [WSREP] Before binlog recovery (wsrep position: e3b35cfb-1098-11f1-91b6-ce49836b4d55:932)
2026-03-21T09:54:34.174742Z 0 [Note] [MY-000000] [WSREP] After binlog recovery (wsrep position: e3b35cfb-1098-11f1-91b6-ce49836b4d55:932)
2026-03-21T09:54:34.174755Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2026-03-21T09:54:34.188364Z 0 [System] [MY-010232] [Server] XA crash recovery finished.
2026-03-21T09:54:34.343139Z 0 [Note] [MY-000000] [WSREP] Initialized wsrep sidno 6
2026-03-21T09:54:34.343159Z 0 [Note] [MY-000000] [Galera] Server initialized
2026-03-21T09:54:34.343166Z 0 [Note] [MY-000000] [WSREP] Server status change initializing -> initialized
2026-03-21T09:54:34.343180Z 0 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:34.343265Z 2 [Note] [MY-000000] [Galera] Bootstrapping a new cluster, setting initial position to 00000000-0000-0000-0000-000000000000:-1
2026-03-21T09:54:34.345240Z 10 [Note] [MY-000000] [WSREP] Starting applier thread 10
2026-03-21T09:54:34.346216Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: /var/lib/mysql/mysqlx.sock
2026-03-21T09:54:34.346355Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.4.7-7.1'  socket: '/tmp/mysql.sock'  port: 0  Percona XtraDB Cluster (GPL), Release rel7, Revision a1001c9, WSREP version 26.1.4.3.
2026-03-21T09:54:34.348266Z 9 [Note] [MY-000000] [WSREP] Recovered cluster id e3b35cfb-1098-11f1-91b6-ce49836b4d55
2026-03-21T09:54:34.349663Z 2 [Note] [MY-000000] [WSREP] Server status change initialized -> joined
2026-03-21T09:54:34.349688Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:34.349709Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2026-03-21T09:54:34.355874Z 2 [Note] [MY-000000] [Galera] Recording CC from group: 1
2026-03-21T09:54:34.355901Z 2 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 1
2026-03-21T09:54:34.355915Z 2 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 1
2026-03-21T09:54:34.355947Z 2 [Note] [MY-000000] [Galera] Server prepare-job-my-db-my-db-g5xl5 synced with group
2026-03-21T09:54:34.355972Z 2 [Note] [MY-000000] [WSREP] Server status change joined -> synced
2026-03-21T09:54:34.355981Z 2 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections
2026-03-21T09:54:34.355991Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.

Based on the operator logs, it seems to be fail because the it did not restore the 3 pods it deleted before starting the restore job. The data restoration seems to have worked and there will be a PV which has the data from the backup. The prepare pod also completed and did not fail, although the logs mention failures.

The question is why the cluster did not come back online despite both restore and prepare pods completing. pxc-restor and operator are mainly complaining about the cluster being dead to my understanding.

Yunus · March 24, 2026, 2:39am

Was this PXC size reduced to 1 before restoring?

If you are trying to restore on single node, you might want to use that unsafeFlag option suggested.

Could you share describe output of the new pxc cluster that you have ran restore on?

na50r · March 26, 2026, 9:47am

UPDATE:

Thanks to @Yunus , I found the root cause.

When I tried to scale down the cluster, I realized I did not have this unsafe flag:

~ kubectl explain perconaxtradbcluster.spec.unsafeFlags
GROUP:      pxc.percona.com
KIND:       PerconaXtraDBCluster
VERSION:    v1

error: field "unsafeFlags" does not exist

And it turned out that the CRDs did not upgrade properly with the Helm Release.

After upgrading the CRDs, the Restore job ran without any issues.

So cause was just an incomplete upgrade on my side. So if anyone has similar errors popping up, make sure that the CRD version matches the Helm Release.

Topic		Replies	Views
Not able to restore backup in new Percona mysql clustrer Percona XtraDB Cluster 8.x community , mysql , percona	0	157	August 3, 2024
"Observed a panic in reconciler" on the backup restore request [perconaxtradbclusterrestore.pxc.percona.com] Percona Operator for MySQL mysql , percona , kubernetes	7	636	January 13, 2025
Not able to create PerconaXtraDBClusterRestore resource on kubernetes Percona XtraDB Cluster 5.x	5	933	March 30, 2023
Unable to Restore PXC Percona Operator for MySQL mysql	5	1711	July 10, 2023
Percona XtraDB Cluster on Kubernetes mysql operator (Point-In-Time Recovery) not happening Percona Operator for MySQL community , troubleshooting , mysql , percona	13	2865	April 18, 2022

Are there cases where Percona XtraBackup Restore can break the cluster?

Related topics