I’m looking for best practices, lessons learned, benchmarks, etc. on connecting two (or more) clusters using async replication over a WAN.
The recent trend at my company is to use Cassandra without really evaluating the data requirements, and I’m trying to demonstrate that MySQL is a viable alternative in at least some (if not most) cases. The #1 reason people want to use Cassandra over MySQL is that it “supports multi-region,” both for availability (surviving the failure of an entire region) and for data locality (users in Europe get their data from EU servers while US users get their data from US servers). The typical Cassandra setup for this uses async replication between regions, so that’s what I want to use for my MySQL topology:
US cluster: Node1 <-> Node2 <-> Node3
.| <-- Node3 and Node6 using two-way async replication
EU cluster: Node4 <-> Node5 <-> Node6
wsrep_sync_wait to enforce that. But with async replication between regions, there is potential for transaction conflicts when the same record is updated in multiple regions. Cassandra handles this using “last update wins.” Any thoughts on how best to handle these conflicts under two-way async replication with MySQL (MySQL replication, Tungsten, etc.) would be great.
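To make the “last update wins” behavior concrete, here is a minimal Python sketch of the rule as I understand it. It is only an illustration, not how Cassandra or any MySQL replication tool actually implements conflict resolution. The `Version` class, its fields, and the `last_update_wins` function are hypothetical names I made up, and the sketch assumes every row carries a write timestamp and that clocks in both regions are reasonably well synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One region's copy of a row (hypothetical structure for illustration)."""
    value: str
    updated_at: float  # write timestamp in seconds; assumes synchronized clocks
    region: str        # originating region, used only to break timestamp ties


def last_update_wins(local: Version, incoming: Version) -> Version:
    """Return the copy that should survive when two regions wrote the same row."""
    # Comparing (timestamp, region) tuples makes the outcome deterministic,
    # so both regions pick the same winner even if the timestamps are equal.
    if (incoming.updated_at, incoming.region) > (local.updated_at, local.region):
        return incoming
    return local


# The same record updated in both regions a few seconds apart.
us = Version("alice@us.example", 1700000000.0, "us")
eu = Version("alice@eu.example", 1700000005.0, "eu")

# Each side applies the same rule to the change arriving from the other side.
winner_seen_in_us = last_update_wins(us, eu)
winner_seen_in_eu = last_update_wins(eu, us)
```

Both regions end up keeping the EU write because it has the later timestamp, and the US write is silently discarded. That silent loss is the trade-off I would need to accept (or work around) with any equivalent approach on MySQL.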