We are setting up a Percona XtraDB Cluster with three nodes, each running on an AWS EC2 instance. To handle query routing, we plan to use ProxySQL, also deployed on an EC2 instance.
However, in this setup, ProxySQL becomes a single point of failure. To ensure high availability, we considered deploying Keepalived on two ProxySQL nodes. But since Keepalived relies on VRRP (multicast), which isn’t supported in AWS, we’re unsure if this is a viable solution.
Could you advise whether Keepalived can be used in this setup? If not, what would be the best approach to achieve high availability for ProxySQL in an AWS environment?
Hey @Faisal_Hassan,
just for my understanding: are the two ProxySQL nodes in the same network (VPC)?
If so, you can set up a Pacemaker cluster with a virtual IP and ProxySQL as resources and put those two in a resource group. In that setup you configure your clients to connect to the cluster IP of the Pacemaker cluster. If one node goes offline, the other node takes over the cluster IP and the service stays available.
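A rough sketch of what that could look like with pcs, assuming an already running two-node Pacemaker cluster, a ProxySQL systemd unit called proxysql, and both nodes in the same subnet. The IP address, netmask, resource names and group name are made up for illustration. One AWS-specific assumption on top of the description above: in a VPC the address usually also has to be assigned to the instance itself, so the sketch adds the awsvip resource agent (it needs the AWS CLI and suitable IAM permissions on the nodes). The script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the pcs commands by default.
# Set APPLY=1 to actually run them on a cluster node with pcs installed.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

VIP="10.0.1.50"      # hypothetical unused private IP in the nodes' subnet
GROUP="proxysql-ha"  # hypothetical resource group name

configure_proxysql_group() {
  # AWS side: attach the address as a secondary private IP of the active instance.
  run pcs resource create proxysql-awsvip ocf:heartbeat:awsvip secondary_private_ip="$VIP"
  # OS side: bring the address up on the node's network interface.
  run pcs resource create proxysql-vip ocf:heartbeat:IPaddr2 ip="$VIP" cidr_netmask=24 op monitor interval=30s
  # Manage ProxySQL through its systemd unit (unit name is an assumption).
  run pcs resource create proxysql systemd:proxysql op monitor interval=30s
  # Group members start in the listed order and always run on the same node.
  run pcs resource group add "$GROUP" proxysql-awsvip proxysql-vip proxysql
}

configure_proxysql_group
```

If the two nodes are in different subnets or availability zones, a secondary private IP cannot move between them. In that case the aws-vpc-move-ip agent, which repoints a route table entry instead, would be the thing to look at.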
Hey @Jostrus,
Firstly, thanks for your response! And yes, the two ProxySQL nodes are in the same VPC.
I’ll try setting up Pacemaker as you suggested. Do you have any reference or guide for the setup?
What about Keepalived? Will it work in this setup?
I prefer HAProxy, but that’s just my personal opinion. Keepalived will work with that setup as well. I also use Keepalived and even ipvsadm for different use cases, but I just like HAProxy a bit more.
In case you want to use Keepalived as a load balancer instead of HAProxy, you have to use the systemd Pacemaker resource: pcs resource create haproxy systemd:keepalived
I don’t know how familiar you are with Pacemaker/Corosync clusters, so if you need further clarification or are interested in deeper insights about Pacemaker, just hit me up.
First of all, thanks for the references and for sharing your opinion! I appreciate the insights.
Actually, I’m new to this setup and would love to learn more about Pacemaker/Corosync.
Could you share deeper insights on how they work in a high-availability setup? Any best practices or gotchas to watch out for?
Hey @Faisal_Hassan,
best practice for a Pacemaker/Corosync setup is a cluster that can form a real quorum. It can also be configured for a two-node setup, but that isn’t recommended. In that case you have to take care of STONITH (shoot the other node in the head) and set the no-quorum policy to ignore.
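For the two-node case, the settings could be sketched like this. The node names, instance IDs and region are hypothetical, and the fence_aws agent is my assumption for fencing EC2 instances; it is assumed here to pick up credentials from the instance profile (otherwise access_key and secret_key would have to be passed as well). As written the script only prints the commands; set APPLY=1 to run them.

```shell
#!/bin/sh
# Sketch only: prints the pcs commands by default; APPLY=1 executes them.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

REGION="eu-west-1"  # hypothetical AWS region
# Hypothetical mapping of cluster node names to EC2 instance IDs.
HOST_MAP="proxysql1:i-0123456789abcdef0;proxysql2:i-0fedcba9876543210"

configure_two_node() {
  # Fence device so a failed node can be powered off before its resources move.
  # Note: the printed line drops the quotes around pcmk_host_map; re-add them if you paste it.
  run pcs stonith create aws-fence fence_aws region="$REGION" pcmk_host_map="$HOST_MAP"
  # Keep fencing enabled; without a working fence device resources will not start.
  run pcs property set stonith-enabled=true
  # Two nodes cannot outvote each other, so keep resources running without quorum.
  run pcs property set no-quorum-policy=ignore
}

configure_two_node
```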
As always, you can also set up a witness node if you don’t want a third full node that could take over the whole traffic.
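One way to get such a witness is a Corosync quorum device. A minimal sketch, assuming corosync-qnetd and pcs are installed on the witness host, corosync-qdevice is installed on both cluster nodes, and the witness has already been authenticated with pcs. The host name witness1 and the ROLE variable are made up for this example. It prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the pcs commands by default; APPLY=1 executes them.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

WITNESS="witness1"  # hypothetical host name of the witness machine

# Run on the witness host: configure and start the quorum device daemon.
setup_witness() {
  run pcs qdevice setup model net --enable --start
}

# Run on one cluster node: add the witness as a tie-breaking quorum device.
add_quorum_device() {
  run pcs quorum device add model net host="$WITNESS" algorithm=ffsplit
}

# ROLE is a helper variable for this sketch: "witness" or "node" (default).
case "${ROLE:-node}" in
  witness) setup_witness ;;
  *)       add_quorum_device ;;
esac
```

The witness only votes; it never runs ProxySQL or carries client traffic, so a small instance is usually enough.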
The last thing I can think of is a dedicated NIC for heartbeat messages, but whether you need it depends on how much traffic your main NIC is already sending and receiving. It’s not required, as you can still have a pretty solid setup without it, but it’s definitely best practice.
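If you go for the dedicated heartbeat NIC, the second address can be given to Corosync when the cluster is created. A sketch assuming a recent pcs (0.10 or later), both nodes already authenticated with pcs, and a second network interface on each instance. The cluster name, node names and all addresses are hypothetical. It prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the pcs command by default; APPLY=1 executes it.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

CLUSTER="proxysql_cluster"  # hypothetical cluster name

setup_cluster_two_links() {
  # Each node lists two addresses: the first on the main network,
  # the second on the dedicated heartbeat network.
  run pcs cluster setup "$CLUSTER" \
    proxysql1 addr=10.0.1.11 addr=10.0.2.11 \
    proxysql2 addr=10.0.1.12 addr=10.0.2.12
}

setup_cluster_two_links
```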