What we want to establish is a redundant setup spread over our two availability zones, with automatic failover when a server or an availability zone fails.
The network zone hosting the HAProxy servers is stretched over both availability zones (network and hosting).
Requirements:
• no manual interference needed for failover
• database stays writable during failover
We need help configuring the setup and validating the design; any config example in this regard would be helpful.
I am attaching the high-level design of what we want.
We are planning on having two HAProxy servers, one in each datacenter, and in each DC a set of two Patroni/PostgreSQL servers, where Patroni, together with etcd, ensures the HA part.
In particular, we would like guidance on how to configure replication among the intra-site and inter-site Postgres nodes.
@Ishan_Chawla
You can set up a Standby Cluster in the other DC [DC2], which will replicate asynchronously from the Primary Cluster [DC1]; however, for failover you have to rely on a manual approach, as automatic promotion is not possible across DCs.
https://patroni.readthedocs.io/en/latest/ha_multi_dc.html#asynchronous-replication
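For reference, a minimal sketch of what that could look like in the DC2 Patroni configuration, with hypothetical addresses (the `host` should be a stable entry point reaching the DC1 primary, e.g. a VIP or HAProxy, not a single node):

```yaml
# Sketch only: the address is an assumption, not your actual value.
bootstrap:
  dcs:
    standby_cluster:
      host: 10.0.1.100          # stable endpoint for the DC1 primary (hypothetical)
      port: 5432
      primary_slot_name: standbyclust
```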
Well, if you use a single (primary) cluster with its nodes allocated across the different DCs/networks, then you can achieve such automatic failover.
https://patroni.readthedocs.io/en/latest/ha_multi_dc.html#synchronous-replication
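A minimal sketch of the single stretched cluster, assuming hypothetical names and addresses. One caveat: etcd needs a quorum, so with members split over only two DCs, losing the majority DC also takes down the DCS; a tie-breaker etcd member in a third location avoids that:

```yaml
# Sketch only: scope, names and addresses are assumptions.
scope: pgcluster
name: dc1-node1

etcd3:
  # One etcd member per DC plus a tie-breaker in a third site (assumed).
  hosts: 10.0.1.5:2379,10.0.2.5:2379,10.0.3.5:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    synchronous_mode: true   # a commit waits for a synchronous standby
    # synchronous_mode_strict: true would block writes when no synchronous
    # standby is available, trading availability for zero data loss.
```

With `synchronous_mode` enabled, Patroni will only promote a node that was a synchronous standby, so an automatic cross-DC failover should not lose committed transactions.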
You have to remove the **standby_cluster section** from the standby configuration (/etc/patroni/cluster2-0.yml) in order to perform the manual failover:
-standby_cluster:
-  host: xxx
-  port: 5432
-  primary_slot_name: standbyclust
The point is that performing such a failover, especially from one network/zone to another, can lead to performance issues, network instability, or replication lag/stale data problems. The standby should be considered for a disaster recovery scenario.
Are you considering using the standby [DC2] for some traffic, or is it just for the DR solution?
The other layers (HAProxy/VIP) should work fine as long as they are able to recognize DC2 once the standby is promoted to primary leader.
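For that part, a common approach (similar to the HAProxy template shipped with Patroni) is to health-check the Patroni REST API: `/primary` answers 200 only on the current writable leader. A sketch with hypothetical addresses, assuming the REST API listens on its default port 8008:

```
# Sketch only: addresses are assumptions. Only the node that currently
# passes the /primary check (the writable leader) receives traffic.
listen postgres_primary
    bind *:5000
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server dc1-pg1 10.0.1.11:5432 check port 8008
    server dc1-pg2 10.0.1.12:5432 check port 8008
    server dc2-pg1 10.0.2.11:5432 check port 8008
    server dc2-pg2 10.0.2.12:5432 check port 8008
```

In the standby-cluster design, the DC2 nodes fail this check until the standby is promoted, at which point HAProxy starts routing to them automatically.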
Note: if your application lives in a different network/DC, latency problems could arise when connecting to DC2, so take this into consideration as well. If the application is deployed at the same DR site, this would not be a blocker.