Set up a 3-node XtraDB Cluster in Kubernetes using disk snapshots

We’re planning to move our MySQL databases to Percona XtraDB Cluster in Kubernetes for high availability. However, the database is around 6TB and it takes a very long time to set up a 3-node cluster: adding a single node takes around 3-4 hours for SST to complete. We have disk snapshots of our MySQL database, and I’m looking for a way to attach those snapshots to the 3 nodes and point the XtraDB mysqld config at the InnoDB data files in the snapshots during initial setup, so that syncing the 3 nodes is significantly faster.

We would also like to be able to add more nodes to the cluster later and have them usable within a few minutes, instead of waiting 3-4 hours for SST to complete.

Please let me know if this is possible, and if you can point me to any documentation on how to do it. I have not been able to find a good solution after an extensive search.

Thanks in advance.

We have never tested this as an official, recommended approach, but it should work.
You just need to make sure that, for the initial setup, all volumes are created from identical snapshots.

Thank you for your reply, vadimtk. I tried the approach, but I’m getting a complete cluster crash, and the only fix I have found is to delete the disks of the other nodes and let them perform SST. Is there a recommended alternative way to restore disk snapshots from a VM instance, or are there configuration settings that would reduce the time SST takes?

Thanks in advance.

For a 6TB data set, the only recommendation I have is to use a 40Gb network; that is the only thing I can see that would help with the transfer time.
But I am curious: what storage do you use for snapshots? Is it some kind of SAN?
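
To put the network suggestion in numbers, here is a rough back-of-the-envelope sketch. It only models the raw link rate; the `efficiency` parameter is a hypothetical fudge factor, and disk throughput, xtrabackup overhead, and compression are all ignored, so real SST times will be longer.

```python
def transfer_hours(size_tb: float, link_gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to copy size_tb (decimal terabytes) over a network link.

    efficiency is a hypothetical factor in (0, 1] for protocol and disk overhead.
    """
    size_bits = size_tb * 1e12 * 8
    seconds = size_bits / (link_gbit_per_s * 1e9 * efficiency)
    return seconds / 3600


def implied_gbit_per_s(size_tb: float, hours: float) -> float:
    """Effective throughput implied by an observed transfer duration."""
    return size_tb * 1e12 * 8 / (hours * 3600) / 1e9


if __name__ == "__main__":
    for link in (10, 25, 40):
        minutes = transfer_hours(6, link) * 60
        print(f"{link} Gbit/s link: {minutes:.0f} min for 6 TB at line rate")
    print(f"6 TB in 3.5 h implies about {implied_gbit_per_s(6, 3.5):.1f} Gbit/s effective")
```

At line rate, a 40 Gbit/s link would move 6 TB in about 20 minutes, while the observed 3-4 hours corresponds to an effective rate of roughly 3-4 Gbit/s. That suggests the current bottleneck may not be the link alone.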

We’re using GCP snapshots because our MySQL servers currently run on VM instances.

I think I have resolved the issue by doing the following steps:

  • Take a new snapshot from the VM instance.
  • Verify that the owner of the files is 1001:1001.
  • If using GTID replication, optionally remove auto.cnf so that MySQL generates a new server UUID on startup.
  • Create a PV and PVC from the snapshot. The PVC should be named datadir-cluster1-pxc-0.
  • Start the cluster with a single node so that PXC can initialize from the existing MySQL data files.
  • Create the PXC users manually if needed.
  • Verify there are no errors.
  • Take a snapshot of the disk attached to the PVC datadir-cluster1-pxc-0.
  • Create two disks from the snapshot for the two nodes that will be created.
  • Create a PV and PVC from each of the two disks, with PVC names datadir-cluster1-pxc-1 and datadir-cluster1-pxc-2.
  • Resize the cluster to start with 3 nodes.
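
The PV and PVC steps above could be scripted roughly as follows. This is only a sketch, not the exact manifests used in this thread: it assumes the disks are exposed through the GKE Persistent Disk CSI driver, and the project, zone, disk names, namespace, size, and filesystem type are all hypothetical placeholders. Only the PVC naming pattern (datadir-cluster1-pxc-N) comes from the steps above.

```python
import json


def pv_and_pvc(index, disk_name, size, project, zone,
               cluster="cluster1", namespace="pxc"):
    """Return (PersistentVolume, PersistentVolumeClaim) dicts for one PXC pod."""
    claim_name = f"datadir-{cluster}-pxc-{index}"  # naming pattern from the steps above
    pv_name = f"pv-{claim_name}"                   # hypothetical PV naming scheme
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",  # keep the disk if the claim is deleted
            "storageClassName": "",                     # static binding, no dynamic provisioning
            "claimRef": {"namespace": namespace, "name": claim_name},
            "csi": {
                "driver": "pd.csi.storage.gke.io",      # assumption: GKE PD CSI driver
                "volumeHandle": f"projects/{project}/zones/{zone}/disks/{disk_name}",
                "fsType": "ext4",                       # assumption: match the source VM's filesystem
            },
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": "",
            "volumeName": pv_name,                      # bind to the pre-created PV
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    items = []
    for i in (1, 2):
        # Disk names, project, zone, and size are placeholders for illustration.
        items.extend(pv_and_pvc(i, f"pxc-data-{i}", "7Ti", "my-project", "us-central1-a"))
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The printed JSON can be reviewed and then piped to `kubectl apply -f -`. The claims have to exist before the cluster is resized, so that the new pods pick up the pre-populated volumes instead of having empty ones provisioned for them.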

After the above steps, the new nodes were initialized one by one and performed IST instead of a full SST.
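
For anyone repeating this: a joiner can only use IST if its saved Galera state belongs to the same cluster (same state UUID) and the writesets it is missing are still in the donor's gcache. As a pre-flight sanity check, the sketch below reads the grastate.dat file from each restored data directory and confirms the UUIDs match. The mount paths are hypothetical, and passing this check does not guarantee IST, since the gcache condition is not checked here.

```python
from pathlib import Path

ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def read_grastate(path):
    """Parse Galera's grastate.dat into a dict of its 'key: value' lines."""
    state = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# GALERA saved state' header
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def same_cluster_state(paths):
    """True if every grastate.dat carries the same, non-zero cluster UUID."""
    uuids = {read_grastate(p).get("uuid") for p in paths}
    return len(uuids) == 1 and uuids != {None} and uuids != {ZERO_UUID}


if __name__ == "__main__":
    # Hypothetical mount points of the restored disks.
    disks = ["/mnt/pxc-1/grastate.dat", "/mnt/pxc-2/grastate.dat"]
    for d in disks:
        print(d, read_grastate(d))
    print("same cluster UUID:", same_cluster_state(disks))
```

A seqno of -1 is expected when the snapshot was taken from a running node; in that case the position is recovered from InnoDB at startup rather than read from this file.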

@lreyes
Right, this would work for the initial setup, but if a node crashes and SST is triggered automatically, the old xtrabackup-based method will still be used.
It is possible to modify the SST scripts to use snapshots instead of xtrabackup, but we do not have this yet.
It might be a good feature request.

I realize that, and I agree this would be a good feature request. I will add it to Jira when I have the chance.

Thanks vadimtk!

@lreyes @vadimtk what would it take to change the SST scripts to use snapshots? Can we do that ourselves, or should we wait for it to be added as a feature? Curious to see if we can contribute to this in any way.

We do not have immediate plans to implement snapshot SST, so contributions are appreciated!
