Cross-site two-way (or more than two) replication

Description:

I want to try out the Percona Operator for my use case. I’m looking for multi-master (within the same cluster) plus two-way (or more than two-way) cross-site replication, i.e. all pods in the Asia, US and EU regions should accept reads/writes and then replicate to the other two sites. Is this possible? I have checked the doc Multi-cluster and multi-region deployment - Percona Operator for MySQL based on Percona XtraDB Cluster, and it only describes one-way replication from one cluster to another.

Hey @jithin_devaws,
In general, this is a bad design decision. Your application will experience extremely high lag/latency because every write must be synchronously replicated to the other nodes. Typically, applications have a single write point and local replicas for reads.

Apps in dc1 will face latency only if they do not receive an ACK for the write operation from the local DC (dc1). And, if I’m correct, master (dc1) => slave (dc2) replication can be done asynchronously, so the apps get an immediate ACK from the local DC (dc1) even if replication to the slave breaks or is delayed. So I don’t think that will be a problem.

Maybe I misunderstood what you said, but above you said you wanted PXC pods (i.e. nodes) in each DC: Asia, US, and EU. That’s 3 nodes in geographically dispersed regions. Any write to any of these 3 nodes will require synchronous replication. This means the app in the US will not receive its ACK until the write has been replicated to the EU and Asia.

Perhaps a diagram of what you are envisioning would be helpful? I like https://excalidraw.com/ for quick, easy diagrams.

@matthewb
Thank you for the quick diagram tool :slight_smile: Here is a high-level view of the DB setup I’m trying to achieve.

Let’s replace the US and EU region-based DCs with DCs within the same region, because ours are private DCs located 300-400 miles from one another. They are:

asia-dc1 (k8s) - db-cluster-1 (3 master replicas)
asia-dc2 (k8s) - db-cluster-2 (3 master replicas)
asia-dc3 (k8s) - db-cluster-3 (3 master replicas)

In every k8s cluster, we have stateless apps deployed which connect locally to the DB endpoint db-cluster-ip-svc; this SVC load balances read/write requests to any of the healthy master nodes. E.g., in dc-1, the apps connect via the svc to any of asia-dc1-master-1, asia-dc1-master-2, or asia-dc1-master-3. Health is decided by the k8s probes. Similarly, apps hosted in the other clusters connect only to their local DB endpoint db-cluster-ip-svc. I believe replication within the local DC is synchronous, which should not cause any serious latency if I’m correct.
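To make the local part concrete, something roughly like the Service below is what I have in mind (the name db-cluster-ip-svc and the labels are placeholders from my diagram, not anything the operator creates by itself):

```yaml
# Hypothetical sketch of the local in-cluster DB endpoint described above.
# Service name and labels are assumptions for illustration only; the idea
# is that apps in dc1 reach any healthy local PXC pod through one endpoint.
apiVersion: v1
kind: Service
metadata:
  name: db-cluster-ip-svc
  namespace: databases
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/instance: db-cluster-1
    app.kubernetes.io/component: pxc
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
```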

Now for the cross-site replication, we can expose these master pods via a svc called replication-svc (NodePort/LB). As you can see from the diagram, there are two dotted lines originating from each cluster to the other two clusters. E.g., from db-cluster-1 two dotted blue lines connect to the replication-svc of db-cluster-2 and db-cluster-3 for replicating data from those two clusters. Similarly, every other DC is connected to the other two DCs. If I understood this correctly, this cross-site replication is something we can configure as asynchronous, so it won’t block application writes to the local DC and the data is then replicated to the other DCs asynchronously.
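If I’m reading the cross-site replication doc correctly, this part would be driven by the replicationChannels section of the PerconaXtraDBCluster CR on each cluster; a rough sketch of what I mean (channel names and the source host are placeholders):

```yaml
# Rough sketch based on my reading of the cross-site replication doc;
# channel names and the source host are placeholders.
# db-cluster-1 side: expose the PXC pods and act as a replication source.
spec:
  pxc:
    expose:
      enabled: true        # the replication-svc in my diagram
      type: LoadBalancer   # or NodePort
    replicationChannels:
      - name: dc1_to_dc2
        isSource: true
---
# db-cluster-2 side: consume the same channel asynchronously as a replica.
spec:
  pxc:
    replicationChannels:
      - name: dc1_to_dc2
        isSource: false
        sourcesList:
          - host: <replication endpoint of db-cluster-1>  # placeholder
            port: 3306
            weight: 100
```

From what I understand of that page, these channels are regular asynchronous MySQL replication, so local writes are acknowledged by the local cluster and the other DCs catch up afterwards.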

I may be wrong here. But if you feel this is not the right way to implement HA for a database across multiple DCs, please share any useful docs/diagrams.

Hello @jithin_devaws,
Thanks for the drawing. What you have detailed still amounts to ‘circular replication’, which is a huge “don’t do that”. You can find many blogs about how MySQL circular replication causes more harm than good. Nobody at Percona would ever recommend a circular-replication topology.

Instead, to achieve multi-DC HA, you would have 1 PXC node in each DC (3 nodes total), all joined to the same cluster. The apps in each DC would connect to a local load balancer (i.e. ProxySQL), and that ProxySQL would send all writes to DC1. Reads would go to the local PXC node.
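In operator terms that is roughly the following sketch (not a complete cr.yaml; it assumes a single or stretched Kubernetes cluster spanning the three DCs, and the values are only illustrative):

```yaml
# Sketch of the relevant PerconaXtraDBCluster settings for this topology.
# Assumes one Kubernetes cluster spanning the three DCs; per-DC pod
# placement (affinity / topology spread) is not shown.
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: cluster1
spec:
  pxc:
    size: 3          # one PXC pod per DC
  haproxy:
    enabled: false
  proxysql:
    enabled: true    # local ProxySQL routes all writes to a single writer node
    size: 3
```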

If DC1 (or just the pxc1 pod in DC1) goes down/offline, the proxysql pods in DC1, DC2, and DC3 detect this. A new leader is elected from the surviving 2 PXC nodes, and writes are now sent to this node.

When K8S in DC1 restores pxc1-pod, it will join the cluster and sync any missing data before accepting connections.

Even if you did not have K8S, this would still be the recommendation for DC-HA with PXC.

Based on the design you shared, in a normal scenario with 3 DCs the apps hosted in DC3 will have to connect to the DC1 node via ProxySQL for all write operations (assuming the DC1 node is the writer). In my opinion, this design adds latency to every single write operation from DC2 and DC3 due to the geographic separation.
Anyway, we already have 3-way circular replication implemented with MariaDB Galera, but that is a Helm-based approach with no operator. It is leading to many operational challenges, especially around crash recovery and unintentional node failures in k8s.