Setup 3 node XtraDB Cluster in kubernetes using Disk Snapshots

lreyes · February 21, 2021, 11:26pm

We’re currently planning to move our MySQL databases to Percona XtraDB Cluster in Kubernetes for high availability. However, the database size is around 6TB and it takes a very long time to set up a 3 node cluster. Adding a single node to the cluster takes around 3-4 hours for SST to complete. We currently have MySQL disk snapshots of our database, and I’m looking for a way to attach the snapshots to the 3 nodes and point the XtraDB mysqld config at the InnoDB data files in the snapshots during initial setup, so that syncing the 3 nodes is significantly faster.

We also plan to add more nodes to the cluster later, and would like them to be usable within a few minutes instead of waiting 3-4 hours for SST to complete.

Please let me know if this is possible, and if you can point me to any documentation on how to do this. I have not been able to find a good solution yet after an extensive search.

Thanks in advance.

vadimtk · February 22, 2021, 12:37am

We have never tested this approach, and it is not an official or recommended one, but it should work.
You just need to make sure that, for the initial setup, all volumes are created from identical snapshots.

lreyes · February 23, 2021, 3:43am

Thank you for your reply, vadimtk. I tried the approach, but I’m getting a complete cluster crash, and the only fix I found is to delete the disks of the other nodes and perform SST. Is there an alternative recommended way to restore disk snapshots from a VM instance, or are there configuration settings that would reduce the time SST takes?

Thanks in advance.

vadimtk · February 23, 2021, 11:54am

For a 6TB database, the only recommendation I have is to use a 40Gb network; that is the only thing I can see helping with the transfer time.
But I am curious, what storage do you use for snapshots? Is it some kind of SAN?
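
As a rough sanity check on the network suggestion, the arithmetic can be sketched in a few lines of Python. This assumes decimal units (1 TB = 10^12 bytes) and a link running at full line rate, which is an upper bound; a real SST is also limited by disk throughput on the donor and joiner. The function names are illustrative only.

```python
def transfer_hours(size_tb, link_gbit_per_s):
    """Best-case hours to move size_tb terabytes over a link at line rate."""
    bits = size_tb * 1e12 * 8            # decimal terabytes to bits
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


def implied_gbit_per_s(size_tb, hours):
    """Effective throughput implied by an observed transfer time."""
    bits = size_tb * 1e12 * 8
    return bits / (hours * 3600) / 1e9


if __name__ == "__main__":
    # Best-case transfer time for the 6TB dataset on two link speeds.
    for link in (10, 40):
        print(f"{link} Gbit/s link: {transfer_hours(6, link):.2f} h")
    # Throughput implied by the 3-4 hour SST reported in the thread.
    for hours in (3, 4):
        print(f"{hours} h SST: {implied_gbit_per_s(6, hours):.1f} Gbit/s")
```

Under those assumptions, 6TB needs about 20 minutes at 40 Gbit/s, while the reported 3-4 hours corresponds to roughly 3.3-4.4 Gbit/s of effective throughput. It may be worth confirming whether the network or the disks are the bottleneck before upgrading the link.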

lreyes · February 23, 2021, 10:45pm

We’re using GCP snapshots because our MySQL servers currently run on VM instances.

I think I have resolved the issue by doing the following steps:

1. Take a new snapshot from the VM instance.
2. Verify that the owner of the files is 1001:1001.
3. If using GTID replication, optionally remove auto.cnf so that a new server UUID is generated.
4. Create a PV and PVC from the snapshot. The PVC should be named datadir-cluster1-pxc-0.
5. Start the cluster with a single node so that PXC can initialize using the existing MySQL data files.
6. Create the PXC users manually if needed.
7. Verify there are no errors.
8. Take a snapshot of the disk attached to the PVC datadir-cluster1-pxc-0.
9. Create two disks from the snapshot for the two nodes that will be created.
10. Create a PV and PVC for each of the two disks, with PVC names datadir-cluster1-pxc-1 and datadir-cluster1-pxc-2.
11. Resize the cluster to 3 nodes.
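
The PV and PVC steps above could be scripted along these lines. This is only a sketch: it assumes the GCE Persistent Disk CSI driver (`pd.csi.storage.gke.io`), an ext4 filesystem, and hypothetical project, zone, disk, namespace, and storage class names. None of these come from the thread except the `datadir-cluster1-pxc-N` claim names. The storage class and size would need to match what the PXC custom resource requests.

```python
import json


def pv_and_pvc(index, disk_name, size, project, zone,
               cluster="cluster1", namespace="default",
               storage_class="snapshot-restore"):
    """Build PV and PVC manifests that bind an existing disk to PXC pod `index`."""
    claim_name = f"datadir-{cluster}-pxc-{index}"
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": f"pv-{claim_name}"},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            # Retain, so deleting the claim does not delete the restored disk.
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class,
            # Reserve this volume for the claim the StatefulSet pod will use.
            "claimRef": {"namespace": namespace, "name": claim_name},
            "csi": {
                "driver": "pd.csi.storage.gke.io",  # assumption: GKE PD CSI driver
                "volumeHandle": f"projects/{project}/zones/{zone}/disks/{disk_name}",
                "fsType": "ext4",  # assumption: filesystem on the snapshot
            },
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "volumeName": pv["metadata"]["name"],
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    items = []
    # Hypothetical disks created from the snapshot of datadir-cluster1-pxc-0.
    for index, disk in ((1, "pxc-data-1"), (2, "pxc-data-2")):
        items.extend(pv_and_pvc(index, disk, "6500Gi", "my-project", "us-central1-a"))
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The printed JSON could then be piped to `kubectl apply -f -` before resizing the cluster.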

After the above steps I saw that the new nodes got initialized one by one and performed IST instead of full SST.
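
A likely reason the new nodes could join via IST is that the copied datadir already carried the cluster's Galera state. Galera records that state in grastate.dat inside the datadir, and in general a joiner can use IST only when its state UUID matches the cluster's and the donor still holds the missing write-sets in its gcache. Below is a minimal sketch for inspecting that file on a restored volume. The sample contents and UUID are made up, and the helper names are hypothetical.

```python
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text):
    """Parse the key/value lines of a grastate.dat file into a dict of strings."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def same_cluster(a, b):
    """True when both states carry the same, non-empty cluster state UUID."""
    uuid = a.get("uuid")
    return uuid is not None and uuid != ZERO_UUID and uuid == b.get("uuid")


# Made-up example of what a snapshot of a running node might contain.
SAMPLE = """\
# GALERA saved state
version: 2.1
uuid:    6f7c2b9e-75a1-11eb-9f3a-0242ac110002
seqno:   -1
safe_to_bootstrap: 0
"""

if __name__ == "__main__":
    donor = parse_grastate(SAMPLE)
    joiner = parse_grastate(SAMPLE)  # identical because it came from the snapshot
    print(donor["uuid"], donor["seqno"], same_cluster(donor, joiner))
```

A seqno of -1 is normal for a node that was running or was not shut down cleanly; as far as I know the actual position is then recovered from InnoDB at startup, so a matching UUID is the main thing to look for.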

vadimtk · February 24, 2021, 1:02am

@lreyes
Right, this would work for the initial setup, but if a node crashes later and SST is performed automatically, the old method with xtrabackup will be used.
It is possible to modify the SST scripts to use snapshots instead of xtrabackup, but we do not have this yet.
This might be a good feature request.

lreyes · February 24, 2021, 1:45pm

I realize that, and I agree that this would be a good feature request. I will add it to Jira when I have the chance.

Thanks vadimtk!

Kannan_DR · February 25, 2021, 5:27pm

@lreyes @vadimtk What does it take to change the SST scripts to use snapshots? Is it possible for us to do that, or should we wait for this to be added as a feature? Curious to see if we can contribute to this in any way.

vadimtk · February 25, 2021, 5:43pm

We do not have immediate plans to implement snapshot SST, so contributions are appreciated!
