XtraDB Clusters with Automatic Asynchronous Replication Connection Failover

Jimmy_Chen · January 30, 2025, 6:43pm

Hello. I’m testing a Percona XtraDB Cluster 8.0.39 setup with two 3-node clusters in different regions, provisioned on AlmaLinux 9 VMs. I also configured asynchronous replication channels with failover for cross-site multi-master replication. However, I could not find any specific information on best practices for setting this up when each cluster has multiple nodes, and it’s hard to formulate a specific Google search for what I’m looking for.

For example, I have 2 XtraDB clusters, A and B. Each cluster has 3 nodes, A1, A2, A3, B1, B2, B3. I have configured the replication channels on A1 to replicate from B1, B2 and B3. Similarly, I have configured B1 to replicate from A1, A2, A3. Now this configuration appears to be local to A1 and B1 only. So in theory, if A1 and/or B1 goes down, then there may be issues in keeping the data up-to-date. Am I supposed to configure A2, A3, B2, B3 in the same exact way so that all 3 nodes for each cluster are all attempting to replicate? Would this cause issues with writes or contention?

matthewb · January 30, 2025, 11:34pm

Hello @Jimmy_Chen ,

This will probably cause conflicts, and/or duplicate data. You should have only 1 replication channel from cluster A to cluster B, and 1 channel from B to A.

PXC does not use async replication’s GTID, meaning a commit on A1 is going to replicate as A1, and not as ‘A’. This is going to cause issues when B receives this transaction, commits it locally, and then sends it back to A1. Since the GTID will not be from ‘A’, it will execute again and cause a huge loop.

You need to configure the server_id of A1/2/3 to be the same, and configure B1/2/3 to have the same server_id, but different from A. This way, when A1 gets a replicated transaction from “B”, it will see the server_id of the trx matches itself and it will ignore it.

Jimmy_Chen · January 31, 2025, 1:37am

Thank you for the reply @matthewb. Just so I fully understand: GTID needs to be turned off? I should configure async replication between the A and B clusters, with server-id unique per cluster but identical between the nodes? This would mean I cannot configure multiple replication channels, even on the same node, such as B1 → A1, B2 → A1, B3 → A1? And if that’s the case, I also cannot configure async replication failover?

matthewb · January 31, 2025, 3:16am

No. I didn’t say that. GTID should be on. I was simply pointing out that PXC does not utilize async’s GTID methodology. The way you configure source/source async replication is easier because of GTID. If you are familiar with that, I was letting you know that’s not the case with PXC.

Correct. The server_id should be unique per cluster and identical across the nodes within a cluster.

You would never configure multiple channels such as B1 → A1, B2 → A1, B3 → A1 in the first place. As I said earlier, you should have MAX 2 channels. One channel is A1 → B1, and another channel is B1 → A1. That’s it. No other channels anywhere else in the clusters.

Assuming you set up those two channels, a write on A3 would Galera-replicate to A1, be written to A1’s binlog, replicate to B1, be applied on B1, be written to B1’s binlog, and replicate back to A1. Since the trx coming back from B1 to A1 carries A1’s server_id, A1 ignores it.
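
To make that round trip concrete, here is a toy Python model of the flow. It is only a sketch under stated assumptions: the `Node` class, the `round_trip` helper, and the numeric server_id values are hypothetical and do not use any MySQL or PXC API. It models just one rule, namely that an async replica skips events whose originating server_id equals its own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one PXC node (not a real MySQL API)."""
    name: str
    server_id: int
    applied: list = field(default_factory=list)
    binlog: list = field(default_factory=list)

    def apply_from_channel(self, event):
        """Model the async applier: skip events that carry our own server_id."""
        origin_id, payload = event
        if origin_id == self.server_id:
            return False  # ignored, so the loop is broken here
        self.applied.append(payload)
        self.binlog.append(event)  # the origin server_id is preserved in the binlog
        return True


def round_trip(a1_id, a3_id, b1_id):
    """Return True if a write made on A3 is applied a second time on A1."""
    if b1_id in (a1_id, a3_id):
        raise ValueError("cluster B must use a server_id different from cluster A")
    a1, b1 = Node("A1", a1_id), Node("B1", b1_id)
    event = (a3_id, "INSERT ...")        # write originates on A3
    a1.applied.append(event[1])          # Galera replication A3 -> A1
    a1.binlog.append(event)              # A1 writes it to its binlog
    b1.apply_from_channel(a1.binlog[-1])         # async channel A1 -> B1
    return a1.apply_from_channel(b1.binlog[-1])  # async channel B1 -> A1


# Same server_id on every node of cluster A: the returning trx is ignored.
print(round_trip(a1_id=1, a3_id=1, b1_id=2))
# Distinct server_ids inside cluster A: A1 applies the trx a second time.
print(round_trip(a1_id=1, a3_id=3, b1_id=2))
```

In this model the first call reports that nothing was re-applied, while the second shows the duplicate apply that per-node server_ids would allow.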

You can configure async replication failover, yes. But this does not mean you have additional channels.
https://dev.mysql.com/doc/refman/8.0/en/replication-asynchronous-connection-failover-replica.html

Even when configuring async replica failover, you still only have 1 channel. That channel is managed/moved by MySQL automatically.
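
As a rough illustration of "one channel, several candidate sources", the Python sketch below only builds the SQL statements that would be run on the replica node (for example A1). The channel name, hostnames, port, weights, and replication user are all hypothetical placeholders. The sketch assumes GTID_MODE=ON, which the MySQL documentation linked above lists as a requirement together with SOURCE_AUTO_POSITION=1, and it omits the password and any TLS options.

```python
def failover_statements(channel, sources, user="repl"):
    """Build SQL for a single channel with several failover candidates.

    sources: list of (host, port, weight) tuples; the first entry is the
    initial source. All values passed in here are illustrative placeholders.
    """
    first_host, first_port, _ = sources[0]
    stmts = [
        "CHANGE REPLICATION SOURCE TO "
        f"SOURCE_HOST='{first_host}', SOURCE_PORT={first_port}, "
        f"SOURCE_USER='{user}', SOURCE_AUTO_POSITION=1, "
        "SOURCE_CONNECTION_AUTO_FAILOVER=1 "
        f"FOR CHANNEL '{channel}';"
    ]
    # Register each candidate source for the SAME channel; this adds no channels.
    for host, port, weight in sources:
        stmts.append(
            "SELECT asynchronous_connection_failover_add_source("
            f"'{channel}', '{host}', {port}, '', {weight});"
        )
    return stmts


# Hypothetical hostnames for the three nodes of cluster B, as seen from A1.
for stmt in failover_statements(
    "b_to_a",
    [("b1.example.com", 3306, 80),
     ("b2.example.com", 3306, 60),
     ("b3.example.com", 3306, 40)],
):
    print(stmt)
```

The output contains exactly one `CHANGE REPLICATION SOURCE TO` statement; the three `asynchronous_connection_failover_add_source` calls only populate the list of sources that MySQL may move that single channel to.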
