Percona XtraDB Cluster Operator cross-site replication failover

Hi everyone,

I deployed two PXC clusters on separate Kubernetes clusters, one as DC and one as DR, and set up cross-site replication with the PXC operator as described in the documentation [1]. I then imported a database from a dump file into an instance of the source PXC (DC). How can I verify that the data was replicated to all instances in both DC and DR?

As I’m new to PXC, I would like to know the proper procedure for failing over to the async replica PXC in DR. Also, once DC is back up, the data needs to be synced back from DR, and the applications need to switch their connections back to DC since it’s the primary site. I couldn’t find any references for that. Do I need to change the isSource and sourcesList of replicationChannels manually to sync the updated data from DR to DC? Is there an automated way to do that?

[1] Multi-cluster and multi-region deployment - Percona Operator for MySQL based on Percona XtraDB Cluster

To fail over, simply point your applications (or reconfigure your proxy/router) to the IP/hostname of the cluster in DR.

Yes, you would need to perform this step manually. Replication is uni-directional. You need to stop the replication flowing DC->DR and reverse it so that any changes written to DR are brought back to DC. Once that sync has finished, you can flip your proxy/router back to DC and then reconfigure replication again to go DC->DR.
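For reference, the role swap in the two custom resources can be sketched roughly like this. It is only a Python sketch that builds the `replicationChannels` entries as plain dicts. The channel name `dr_to_dc` and the host `dr-pxc.example.com` are placeholders I made up, and the field names (`isSource`, `sourcesList`, `host`, `port`, `weight`) are as I remember them from the operator's CR reference, so please check them against your operator version.

```python
import json
from typing import Any, Dict, List, Optional


def channel_spec(name: str, is_source: bool,
                 source_hosts: Optional[List[str]] = None,
                 port: int = 3306) -> Dict[str, Any]:
    """Build one entry for spec.pxc.replicationChannels (assumed field names)."""
    spec: Dict[str, Any] = {"name": name, "isSource": is_source}
    if not is_source:
        # A replica-side channel has to list the hosts it pulls from.
        if not source_hosts:
            raise ValueError("a replica channel needs at least one source host")
        spec["sourcesList"] = [
            {"host": host, "port": port, "weight": 100} for host in source_hosts
        ]
    return spec


def reversed_channels(name: str, dr_hosts: List[str]) -> Dict[str, Dict[str, Any]]:
    """Channel entries for each site once replication flows DR -> DC."""
    return {
        "dc": channel_spec(name, is_source=False, source_hosts=dr_hosts),
        "dr": channel_spec(name, is_source=True),
    }


if __name__ == "__main__":
    # Hypothetical channel name and DR endpoint, for illustration only.
    print(json.dumps(reversed_channels("dr_to_dc", ["dr-pxc.example.com"]), indent=2))
```

The `dc` entry would go into the DC cluster's CR and the `dr` entry into the DR cluster's CR. To go back to DC->DR later, you would swap the roles again in the same way.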

Many have attempted to automate this and all have ended up with more complexity, or more headaches.

Thanks @matthewb. I just configured the DR site as the source and the DC site as the replica, but it is not working. Although I can log in to the DR master with the replication user, the slave status shows the error below.

error connecting to master 'replication@10.89.0.203:3306' - retry-time: 60 retries: 2 message: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection.

Is switching sources the only thing to do, or is any further action required? Do I need to back up the databases from DR and restore them to DC before enabling replication?

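You need to configure SSL replication. The replication user authenticates with caching_sha2_password, and that plugin refuses to exchange the password over an unencrypted connection.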
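If the channel is managed by the operator, the replica-side entry might end up looking something like the output of the sketch below. Treat the `configuration` keys (`ssl`, `sslSkipVerify`, `ca`) and the CA path `/etc/mysql/ssl/ca.crt` as assumptions on my part: I believe newer operator releases expose them, but verify against the CR reference for the version you run. The channel name and host are placeholders.

```python
from typing import Any, Dict


def with_ssl(channel: Dict[str, Any],
             ca_path: str = "/etc/mysql/ssl/ca.crt",  # assumed path, adjust to your setup
             skip_verify: bool = False) -> Dict[str, Any]:
    """Return a copy of a replica-side channel entry with SSL turned on.

    The 'configuration' keys used here are assumptions about the operator's
    CR schema, not something confirmed in this thread.
    """
    if channel.get("isSource"):
        raise ValueError("SSL settings belong on the replica-side channel")
    updated = dict(channel)  # shallow copy so the caller's dict is untouched
    updated["configuration"] = {
        "ssl": True,
        "sslSkipVerify": skip_verify,
        "ca": ca_path,
    }
    return updated


# Hypothetical replica-side channel on DC, pulling from DR.
dc_channel = {
    "name": "dr_to_dc",
    "isSource": False,
    "sourcesList": [{"host": "dr-pxc.example.com", "port": 3306, "weight": 100}],
}
dc_channel_ssl = with_ssl(dc_channel)
```

With certificate verification left on, the replica needs a CA that can validate the certificate presented by the source, so both sites have to trust a common CA.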

You shouldn’t need to restore anything, unless you’ve actually lost data.
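One way to check that, assuming GTID-based replication is in use: take `SELECT @@GLOBAL.gtid_executed` on each site and see whether everything DC has executed is also contained in DR's set. The server can do this itself with `GTID_SUBSET()`. The Python below is just a small offline sketch of the same comparison, and the UUID in the usage example is made up.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def parse_gtid_set(text: str) -> Dict[str, List[Interval]]:
    """Parse 'uuid:1-5:7,uuid2:1-3' into {uuid: [(1, 5), (7, 7)], ...}.

    Does not handle tagged GTIDs (uuid:tag:1-5).
    """
    result: Dict[str, List[Interval]] = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            ranges.append((int(start), int(end or start)))
    return result


def merge(ranges: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def is_subset(inner: str, outer: str) -> bool:
    """True if every transaction in `inner` is also present in `outer`."""
    outer_sets = {uuid: merge(r) for uuid, r in parse_gtid_set(outer).items()}
    for uuid, ranges in parse_gtid_set(inner).items():
        for start, end in merge(ranges):
            if not any(lo <= start and end <= hi
                       for lo, hi in outer_sets.get(uuid, [])):
                return False
    return True


# Made-up values: DC stopped at transaction 10, DR carried on to 15.
dc_executed = "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-10"
dr_executed = "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-15"
dc_can_catch_up = is_subset(dc_executed, dr_executed)
```

If DC's set is not a subset of DR's, DC holds transactions that DR never received, and that is the case where you would have to look at reconciling or restoring data. The check also says nothing about whether DR still has the binary logs DC needs to catch up.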