We are setting up a Percona XtraDB Cluster with three nodes, each running on an AWS EC2 instance. To handle query routing, we plan to use ProxySQL, also deployed on an EC2 instance.
However, in this setup ProxySQL becomes a single point of failure. To ensure high availability, we considered deploying Keepalived on two ProxySQL nodes. But since Keepalived's VRRP uses multicast by default, and multicast isn't supported in AWS VPCs, we're unsure whether this is a viable solution.
Could you advise whether Keepalived can be used in this setup? If not, what would be the best approach to achieve high availability for ProxySQL in an AWS environment?
Hey @Faisal_Hassan,
just for my understanding, are the two ProxySQL nodes in the same network (VPC)?
If yes, you can set up a Pacemaker cluster with a virtual IP and ProxySQL as resources and add those two to a resource group. In that setup you configure your clients to use the cluster IP of the Pacemaker cluster for communication. If one node goes offline, the other node takes over the cluster IP and your cluster stays available.
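A minimal sketch of what that could look like with pcs, assuming the two-node cluster is already authenticated and started; the IP, netmask, and resource names are placeholders. Note that inside a VPC the virtual IP usually also has to be reassigned to the active instance as a secondary private IP through the EC2 API (the resource-agents package ships an `ocf:heartbeat:awsvip` agent for that).

```
# Virtual IP the clients will connect to (placeholder address).
sudo pcs resource create proxysql_vip ocf:heartbeat:IPaddr2 \
    ip=10.0.0.100 cidr_netmask=24 op monitor interval=10s

# Manage the ProxySQL service through its systemd unit.
sudo pcs resource create proxysql systemd:proxysql \
    op monitor interval=10s

# Keep the VIP and ProxySQL together on the same node.
sudo pcs resource group add proxysql_group proxysql_vip proxysql
```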
Hey @Jostrus,
Firstly, thanks for your response! And yes, the two ProxySQL nodes are in the same VPC.
I’ll try setting up Pacemaker as you suggested. Do you have any reference or guide for the setup?
What about Keepalived? Will it work in this setup?
I prefer HAProxy, but that's just my personal opinion. Keepalived will work as well with that setup. I also use Keepalived and even ipvsadm for different use cases, but I just like HAProxy a bit more.
If you want to use Keepalived as a load balancer instead of HAProxy, you have to use the systemd Pacemaker resource type: `pcs resource create keepalived systemd:keepalived`
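On the multicast concern from the original question: Keepalived can also run VRRP over unicast, which sidesteps the AWS multicast limitation. Here's a minimal keepalived.conf sketch under that assumption (all addresses are placeholders); keep in mind the VPC still won't route the VIP unless it is also moved between instances as a secondary private IP via the EC2 API, e.g. from a notify script.

```
# Unicast VRRP sketch; run with state BACKUP / lower priority on the peer.
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    unicast_src_ip 10.0.0.11   # this node's private IP
    unicast_peer {
        10.0.0.12              # the other ProxySQL node
    }
    virtual_ipaddress {
        10.0.0.100/24          # shared VIP the clients use
    }
}
```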
I don’t know how much you know about pacemaker/corosync cluster so if you need further clarification or in case you are interested in deeper insights about pacemaker just hit me up.
First of all, thanks for the references and for sharing your opinion! I appreciate the insights.
Actually, I’m new to this setup and would love to learn more about Pacemaker/Corosync.
Could you share deeper insights on how they work in a high-availability setup? Any best practices or gotchas to watch out for?
Hey @Faisal_Hassan,
best practice for a Pacemaker/Corosync setup is to keep quorum (i.e. an odd number of nodes). It can also be configured as a two-node setup, but that isn't recommended: there you have to set up STONITH ("shoot the other node in the head", i.e. fencing) and set the no-quorum policy to ignore.
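A hedged sketch of those two-node settings with pcs; the fence_aws parameters are illustrative placeholders and need real instance IDs plus credentials or IAM permissions.

```
# Keep resources running when a two-node cluster loses quorum.
sudo pcs property set no-quorum-policy=ignore

# Fencing is essential in a two-node cluster; on EC2 the fence_aws
# agent can stop the lost node (region and host map are placeholders).
sudo pcs stonith create aws_fence fence_aws \
    region=eu-west-1 \
    pcmk_host_map="proxysql1:i-0123456789abcdef0;proxysql2:i-0123456789abcdef1" \
    op monitor interval=60s
```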
As always, you can also set up a witness node in case you don't want a third full node that can take over the whole traffic.
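That witness can be a lightweight corosync-qnetd quorum device rather than a full cluster member; a sketch, assuming a placeholder host named qnetd-host:

```
# On the witness host: set up and start the qnetd daemon.
sudo pcs qdevice setup model net --enable --start

# On one cluster node: register the witness as a quorum device.
sudo pcs quorum device add model net host=qnetd-host algorithm=ffsplit
```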
The last thing I can think of is a dedicated NIC for heartbeat messages, but whether you need one depends on how much traffic is already going over your main NIC. It's not required, and you can still have a pretty solid setup without it, but it's definitely best practice.
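With Corosync 3 (knet) the dedicated heartbeat NIC shows up as a second link in the nodelist; a corosync.conf fragment as a sketch, with placeholder addresses:

```
# corosync.conf fragment: ring0 on the main NIC, ring1 on a
# dedicated heartbeat NIC (all addresses are placeholders).
nodelist {
    node {
        name: proxysql1
        nodeid: 1
        ring0_addr: 10.0.0.11
        ring1_addr: 192.168.10.11
    }
    node {
        name: proxysql2
        nodeid: 2
        ring0_addr: 10.0.0.12
        ring1_addr: 192.168.10.12
    }
}
```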
Sorry to reopen an old thread, but we are looking at exactly the same scenario: a three-node XtraDB cluster in AWS, traffic routed through either ProxySQL or HAProxy (is there a Percona-preferred option?), and we don't want the proxy to be a single point of failure. Is there a benefit over an ELB or EC2 scaling groups?
For clarity, we expect that traffic may dictate a fourth read-only node on the XtraDB cluster (MySQL) in the short term, and the software has been designed with that intention.
EC2 scaling groups aren't really HA/failover in this sense. I think you'd be better off with two EC2 instances running ProxySQL (recommended) and an ELB in front to balance traffic to ProxySQL.
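For TCP MySQL traffic that would typically mean a Network Load Balancer; a hedged AWS CLI sketch, with all IDs, names, and ARNs as placeholders, forwarding port 3306 to ProxySQL's default client port 6033:

```
# Target group pointing at ProxySQL's client port (placeholder VPC ID).
aws elbv2 create-target-group --name proxysql-tg --protocol TCP \
    --port 6033 --vpc-id vpc-0123456789abcdef0 --target-type instance

# Register the two ProxySQL instances (placeholder instance IDs).
aws elbv2 register-targets --target-group-arn <tg-arn> \
    --targets Id=i-0123456789abcdef1 Id=i-0123456789abcdef2

# Internal NLB in the cluster's subnets (placeholder subnet IDs).
aws elbv2 create-load-balancer --name proxysql-nlb --type network \
    --scheme internal --subnets subnet-0aaa subnet-0bbb

# Listener forwarding client connections to the target group.
aws elbv2 create-listener --load-balancer-arn <nlb-arn> \
    --protocol TCP --port 3306 \
    --default-actions Type=forward,TargetGroupArn=<tg-arn>
```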
Thanks, that was our view and plan. Presumably the EC2 instances for ProxySQL don't need to be big? I'm struggling to find anything on sizing recommendations; memory-optimised again, presumably.