Configuration of High Availability not working | Patroni

Hi Team,
I tried deploying high availability (HA) for Percona PostgreSQL 14.15 as per the wiki below.

In the "Configure Patroni" phase, the patroni service failed to start on all three nodes (1, 2, 3):

 sudo systemctl restart patroni
root@node1:/etc/systemd/system# sudo systemctl status patroni
× patroni.service - PostgreSQL high-availability manager
     Loaded: loaded (/lib/systemd/system/patroni.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2025-01-20 18:30:33 IST; 7s ago
    Process: 3228190 ExecStart=/usr/bin/patroni ${PATRONI_CONFIG_LOCATION} (code=exited, status=1/FAILURE)
   Main PID: 3228190 (code=exited, status=1/FAILURE)
        CPU: 326ms

 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Scheduled restart job, restart counter is at 5.
 Jan 20 18:30:33 node2 systemd[1]: Stopped PostgreSQL high-availability manager.
 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Start request repeated too quickly.
 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Failed with result 'exit-code'.
 Jan 20 18:30:33 node2 systemd[1]: Failed to start PostgreSQL high-availability manager.

And this is the output of “sudo journalctl -fu patroni”:

sudo journalctl -fu patroni
 Jan 20 18:30:32 node2 patroni[3228190]:     raise ScannerError(None, None,
 Jan 20 18:30:32 node2 patroni[3228190]: yaml.scanner.ScannerError: mapping values are not allowed here
 Jan 20 18:30:32 node2 patroni[3228190]:   in "/etc/patroni/patroni.yml", line 2, column 10
 Jan 20 18:30:32 node2 systemd[1]: patroni.service: Main process exited, code=exited, status=1/FAILURE
 Jan 20 18:30:32 node2 systemd[1]: patroni.service: Failed with result 'exit-code'.
 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Scheduled restart job, restart counter is at 5.
 Jan 20 18:30:33 node2 systemd[1]: Stopped PostgreSQL high-availability manager.
 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Start request repeated too quickly.
 Jan 20 18:30:33 node2 systemd[1]: patroni.service: Failed with result 'exit-code'.
 Jan 20 18:30:33 node2 systemd[1]: Failed to start PostgreSQL high-availability manager

This seems to be a YAML indentation issue:

 yamllint  /etc/patroni/patroni.yml
/etc/patroni/patroni.yml
  2:10      error    syntax error: mapping values are not allowed here (syntax)
  14:81     error    line too long (101 > 80 characters)  (line-length)
  38:81     error    line too long (88 > 80 characters)  (line-length)
  48:81     error    line too long (116 > 80 characters)  (line-length)
  50:81     error    line too long (88 > 80 characters)  (line-length)
  61:25     error    trailing spaces  (trailing-spaces)
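
The `2:10` that yamllint reports is a line and column position in the file. A minimal sketch for printing that line with a caret under the reported column is shown here; `show_position` is a hypothetical helper name, not part of yamllint or Patroni, and the caret will be misaligned if the line contains tabs.

```shell
#!/bin/sh
# show_position FILE LINE COL
# Print line LINE of FILE, then a caret (^) under column COL.
# Hypothetical helper for locating a position reported by a linter.
show_position() {
    awk -v n="$2" -v c="$3" '
        NR == n {
            print                      # the reported line itself
            printf "%" c "s\n", "^"    # caret right-aligned to column c
        }
    ' "$1"
}

# Example, using the path and position from the yamllint output:
# show_position /etc/patroni/patroni.yml 2 10
```

Seeing the exact character at the reported column is often quicker than guessing which key or colon the parser objected to.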

I changed the /etc/patroni/patroni.yml file as follows:

---
echo "
namespace:percona_lab
scope:cluster_1
name:node1
restapi:{listen:0.0.0.0:8008,connect_address:127.0.0.1:8008}
etcd3:{host:127.0.0.1:2379}
bootstrap:{dcs:{ttl:30,loop_wait:10,retry_timeout:10,maximum_lag_on_failover:1048576,
  postgresql:{use_pg_rewind:true,use_slots:true,parameters:{wal_level:replica,hot_standby:'on',
  wal_keep_segments:10,max_wal_senders:5,max_replication_slots:10,wal_log_hints:'on',
  logging_collector:'on',max_wal_size:'10GB',archive_mode:'on',archive_timeout:600s,
  archive_command:'cp -f %p /home/postgres/archived/%f'}}},
  initdb:[{encoding:UTF8},{data-checksums}],
  pg_hba:[{host:replication,replicator:'127.0.0.1/32',method:trust},
          {host:replication,replicator:'0.0.0.0/0',method:md5},
          {host:all,all:'0.0.0.0/0',method:md5},
          {host:all,all:'::0/0',method:md5}],
  users:{admin:{password:qaz123,options:[createrole,createdb]},
         percona:{password:qaz123,options:[createrole,createdb]}}}
postgresql:{cluster_name:cluster_1,listen:0.0.0.0:5432,connect_address:127.0.0.1:5432,
  data_dir:${DATA_DIR},bin_dir:${PG_BIN_DIR},pgpass:/tmp/pgpass0,
  authentication:{replication:{username:replicator,password:replPasswd},
                  superuser:{username:postgres,password:qaz123}},
  parameters:{unix_socket_directories:'/var/run/postgresql/'},
  create_replica_methods:[basebackup],
  basebackup:{checkpoint:'fast'}}
tags:{nofailover:false,noloadbalance:false,clonefrom:false,nosync:false}
" | sudo tee /etc/patroni/patroni.yml

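One detail worth checking in a file written this way: in YAML, a colon only separates a key from its value when it is followed by a space or a line break, so `name:node1` is read as a plain string rather than as the key `name` with the value `node1`. The sketch below is a rough text heuristic, not a YAML parser, and `find_unspaced_keys` is a hypothetical helper name. It only looks at keys that start a line and does not inspect entries inside `{...}`.

```shell
#!/bin/sh
# find_unspaced_keys FILE
# Print (with line numbers) every line that starts with a key and a colon
# that is immediately followed by a non-space character, e.g. "name:node1".
# Heuristic only: it does not parse YAML and ignores keys inside braces.
find_unspaced_keys() {
    grep -nE '^[[:space:]]*[A-Za-z_][A-Za-z0-9_-]*:[^[:space:]]' "$1"
}

# Example:
# find_unspaced_keys /etc/patroni/patroni.yml
```

Note also that inside a double-quoted `echo`, the shell expands `${DATA_DIR}` and `${PG_BIN_DIR}` before the text reaches the file, so both variables need to be set in the current shell.
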
That resolved the YAML indentation error. However, I now get the error below when starting the service after re-enabling it.

Kindly help us start this service.

This is deployed on three Ubuntu machines.

Also, the guide gives no proper guidance for the case where patroni.service is not created under the /etc/systemd/system directory.

We need clear guidelines for starting the patroni service.

Hi,
Let’s see if we can simplify the landscape:

  • Is this a PRIMARY or a REPLICA? It makes a difference.
  • Did you validate the configuration file, with an invocation similar to the following?
su - postgres -c "/usr/local/bin/patroni --validate-config /etc/patroni/postgres.yml"
  • Start patroni from the command line and NOT systemd. This will give you useful messages in the console. By default, Patroni's log messages are not always the most intuitive. Remember to use CTRL-C when you want to shut down the process.
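
The last two suggestions can be combined into one small routine: validate first, and start Patroni in the foreground only if validation is clean. This is a sketch under assumptions: `validate_then_run` is a hypothetical helper name, `PATRONI_BIN` defaults to whatever `patroni` is on the PATH, and a validation run is treated as clean only when it exits with status 0 and prints nothing.

```shell
#!/bin/sh
# Assumption: adjust PATRONI_BIN to where patroni is installed on your system.
PATRONI_BIN="${PATRONI_BIN:-patroni}"

# validate_then_run CONFIG
# Run "patroni --validate-config CONFIG"; if it reports anything or fails,
# show the report and stop. Otherwise start Patroni in the foreground.
validate_then_run() {
    config="$1"
    status=0
    output=$("$PATRONI_BIN" --validate-config "$config" 2>&1) || status=$?
    if [ "$status" -ne 0 ] || [ -n "$output" ]; then
        printf '%s\n' "$output" >&2
        echo "validation reported problems; not starting Patroni" >&2
        return 1
    fi
    # Foreground run: messages go to the console, stop with CTRL-C.
    "$PATRONI_BIN" "$config"
}

# Example (run as the postgres user):
# validate_then_run /etc/patroni/patroni.yml
```

Keeping systemd out of the loop at this stage means a failed start does not trigger repeated automatic restarts while the configuration is still being debugged.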

Hope this helps.

Hi Robert Bernier,
Thanks for your reply. This is the primary node (node1). We also got the same error traces on all the other nodes (node2, node3).

There is no postgres.yml file present.

The error below shows a YAML indentation/mapping issue in the patroni.yml file (the same error appears when starting patroni from the command line).

Hence we modified patroni.yml as below (to fix the YAML indentation/mapping issue):

---
echo "
namespace:percona_lab
scope:cluster_1
name:node1
restapi:{listen:0.0.0.0:8008,connect_address:127.0.0.1:8008}
etcd3:{host:127.0.0.1:2379}
bootstrap:{dcs:{ttl:30,loop_wait:10,retry_timeout:10,maximum_lag_on_failover:1048576,
  postgresql:{use_pg_rewind:true,use_slots:true,parameters:{wal_level:replica,hot_standby:'on',
  wal_keep_segments:10,max_wal_senders:5,max_replication_slots:10,wal_log_hints:'on',
  logging_collector:'on',max_wal_size:'10GB',archive_mode:'on',archive_timeout:600s,
  archive_command:'cp -f %p /home/postgres/archived/%f'}}},
  initdb:[{encoding:UTF8},{data-checksums}],
  pg_hba:[{host:replication,replicator:'127.0.0.1/32',method:trust},
          {host:replication,replicator:'0.0.0.0/0',method:md5},
          {host:all,all:'0.0.0.0/0',method:md5},
          {host:all,all:'::0/0',method:md5}],
  users:{admin:{password:qaz123,options:[createrole,createdb]},
         percona:{password:qaz123,options:[createrole,createdb]}}}
postgresql:{cluster_name:cluster_1,listen:0.0.0.0:5432,connect_address:127.0.0.1:5432,
  data_dir:${DATA_DIR},bin_dir:${PG_BIN_DIR},pgpass:/tmp/pgpass0,
  authentication:{replication:{username:replicator,password:replPasswd},
                  superuser:{username:postgres,password:qaz123}},
  parameters:{unix_socket_directories:'/var/run/postgresql/'},
  create_replica_methods:[basebackup],
  basebackup:{checkpoint:'fast'}}
tags:{nofailover:false,noloadbalance:false,clonefrom:false,nosync:false}
" | sudo tee /etc/patroni/patroni.yml

The YAML indentation/mapping issue has now been resolved. When I validate the configuration file using the command below,

su - postgres -c "/bin/patroni --validate-config /etc/patroni/patroni.yml"

we now get the traces below.

Also, starting the patroni service from the command line fails with the same error traces.

Additional traces when validating postgres.yml:

root@machineName:/etc/patroni# su - postgres -c "/bin/patroni --validate-config /etc/patroni/postgres.yml"
name  is not defined.
scope  is not defined.
restapi  is not defined.
consul  is not defined.
etcd  is not defined.
etcd3  is not defined.
exhibitor  is not defined.
kubernetes  is not defined.
raft  is not defined.
zookeeper  is not defined.
postgresql  is not defined.

Kindly let me know what needs to be done to resolve this.

Hi,
You seem to have a bit of a mess on your hands :no_mouth:

First things first:

  • I understand that your patroni configuration file is located at
    /etc/patroni/postgres.yml
  • Focus all debugging on the patroni validation command. Forget everything else you’ve written.
  • Consider starting over with a single node, i.e. the PRIMARY. Do not include any REPLICA until after you have successfully started patroni.
  • Archive your patroni configuration file and start over with a clean configuration file, i.e. the patroni example file that should be part of your installation.
  • Run the validation command against the new patroni configuration file. Edit the configuration file only after you’ve confirmed it validates correctly. A good validation is when nothing is returned.
  • Once you have a validated configuration file, “slowly” build out the parameters, i.e. basic stuff like postgres paths.
  • Cycle between validating and filling out the configuration file. At about this point, test the single-node patroni cluster by starting it from the command line and reading the output. Do not use systemd until after you’ve confirmed everything works.
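
The "archive your configuration file" step can be sketched as a timestamped copy, so that each attempt can be compared or restored later. `archive_config` is a hypothetical helper name, and the `.bak` suffix and timestamp format are arbitrary choices.

```shell
#!/bin/sh
# archive_config FILE
# Copy FILE to FILE.<timestamp>.bak, preserving mode and timestamps,
# and print the path of the backup on success.
archive_config() {
    src="$1"
    backup="${src}.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$src" "$backup" && printf '%s\n' "$backup"
}

# Example (needs write access to the directory, e.g. run as root):
# archive_config /etc/patroni/patroni.yml
```

Taking a fresh archive before each round of edits keeps the validate-and-edit cycle reversible.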

Good Luck, May the Force Be With You. :zap: