We have created a 3-node PostgreSQL High Availability and failover cluster using Patroni and etcd. The Patroni configuration for this cluster is as follows:
namespace: ${NAMESPACE}
scope: ${SCOPE}
name: ${NODE_NAME}

restapi:
  listen: 0.0.0.0:8008
  connect_address: ${NODE_IP}:8008

etcd3:
  host: ${NODE_IP}:2379

bootstrap:
  # This section will be written into Etcd:/<namespace>/<scope>/config after initializing the new cluster
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    slots:
      percona_cluster_1:
        type: physical
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        max_connections: 200
        wal_level: replica
        hot_standby: on
        wal_keep_segments: 10
        max_wal_senders: 5
        max_replication_slots: 10
        wal_log_hints: on
        logging_collector: 'on'

  # Some desired options for 'initdb'
  initdb: # Note: It needs to be a list (some options need values, others are switches)
    - encoding: UTF8
    - data-checksums

  pg_hba: # Add following lines to pg_hba.conf after running 'initdb'
    - host replication replicator 127.0.0.1/32 trust
    - host replication replicator 0.0.0.0/0 md5
    - host all all 0.0.0.0/0 md5
    - host all all ::0/0 md5

  # Some additional users which need to be created after initializing the new cluster
  users:
    admin:
      password: pwd123
      options:
        - createrole
        - createdb
    percona:
      password: pwd123
      options:
        - createrole
        - createdb

postgresql:
  cluster_name: cluster_1
  listen: 0.0.0.0:5432
  connect_address: ${NODE_IP}:5432
  data_dir: ${DATA_DIR}
  bin_dir: ${PG_BIN_DIR}
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: pwd987
    superuser:
      username: postgres
      password: pwd789
  parameters:
    unix_socket_directories: /var/run/postgresql/
  create_replica_methods:
    - basebackup
  basebackup:
    checkpoint: 'fast'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
The cluster is running with this configuration. The physical replication slot named "percona_cluster_1", created through this Patroni configuration, has retained about 3 TB of WAL, and the slot is orphaned/inactive.
How can we resolve this issue and prevent it from occurring in the future?
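For reference, this is roughly how we observe the slot's state and the amount of WAL it is pinning (a sketch run on the current leader, using the slot name from the configuration above; pg_replication_slots, pg_current_wal_lsn(), and pg_wal_lsn_diff() are standard PostgreSQL catalog objects):

-- Show whether the slot is active and how much WAL it is retaining
SELECT slot_name,
       slot_type,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_name = 'percona_cluster_1';

In our case this query reports active = false while retained_wal keeps growing.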