How is logical replication supposed to work with Patroni?

Hello!
I followed this guide to setup Patroni: Deploying on RHEL or derivatives - Percona Distribution for PostgreSQL
And it works great, but I wanted to start using logical replication, so I found this post: How Patroni Addresses the Problem of the Logical Replication Slot Failover in a PostgreSQL Cluster
But I can’t seem to understand how is this supposed to work. In my patroni.yml I have this:

initdb:
- encoding: UTF8
- data-checksums
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    archive_command: cp %p /opt/pgbackrest/%f #ignore this thing, attempts were made
    archive_mode: 'on'
    hot_standby: true
    hot_standby_feedback: 'on'
    logging_collector: 'on'
    max_logical_replication_workers: 10
    max_replication_slots: 10
    max_wal_senders: 10
    max_wal_size: 10GB
    wal_keep_segments: 15
    wal_level: logical
    wal_log_hints: true
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
slots:
  repslot:
    database: thedb
    plugin: pgoutput
    type: logical
ttl: 30

According to the post, I need to create a new subscription (on the leader? not sure), but not sure which address should I specify either, currently I added (192.168.0.2 being the replica’s address):

\c bigdb
DROP PUBLICATION bigdbpub;
CREATE PUBLICATION bigdbpub FOR ALL TABLES;
CREATE SUBSCRIPTION bigdbsub CONNECTION 'host=192.168.0.2 port=5432 password=qwerty user=replicator dbname=bigdb' PUBLICATION bigdbpub WITH (copy_data = true, create_slot=false, enabled=true, slot_name=repslot);

After this I believe I did some switchover, but currently I have this as a result on my replica:

bigdb=# select * from pg_replication_slots;
                slot_name                |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn
 | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase | conflicting 
-----------------------------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+------------
-+---------------------+------------+---------------+-----------+-------------
 repslot                             | pgoutput | logical   |  16391 | bigdb     | f         | t      |     547246 |      |    740953175 | 42/376DEE68
 | 42/386E7F00         | reserved   |               | f         | f
 pg_16480_sync_16394_7358104072011291410 | pgoutput | logical   |  16391 | bigdb     | f         | t      |     560886 |      |    740953250 | 42/376F1240
 |                     | reserved   |               | f         | f
 pg_16480_sync_16397_7358104072011291410 | pgoutput | logical   |  16391 | bigdb     | f         | t      |     560887 |      |    740953250 | 42/376F1240
 |                     | reserved   |               | f         | f
(3 rows)

(what even are those 2 new replication slots?)

And I have this on my master:

bigdb=# select * from pg_replication_slots;
        slot_name        |  plugin  | slot_type | datoid | database | temporary | active | active_pid |   xmin    | catalog_xmin | restart_lsn | confirme
d_flush_lsn | wal_status | safe_wal_size | two_phase | conflicting 
-------------------------+----------+-----------+--------+----------+-----------+--------+------------+-----------+--------------+-------------+---------
------------+------------+---------------+-----------+-------------
 repslot             | pgoutput | logical   |  16391 | bigdb     | f         | f      |            |           |    728472769 | 41/14EFC720 | 41/14FB4
B08         | reserved   |               | f         | f
 node2 |          | physical  |        |          | f         | t      |     559802 | 740819645 |    740819645 | 42/34F45DE8 |         
            | reserved   |               | f         | 
(2 rows)

(why is the physical replication slot still here?)

The replication does seem to work, but I don’t think it is working correctly since I get these messages from time to time:

2024-04-18 17:54:19.780 EEST [553403] LOG:  background worker "logical replication worker" (PID 571107) exited with exit code 1
2024-04-18 17:54:19.780 EEST [571108] ERROR:  duplicate key value violates unique constraint "bigdb_serial_nr_key"
2024-04-18 17:54:19.780 EEST [571108] DETAIL:  Key (serial_nr)=(123456789) already exists.
2024-04-18 17:54:19.780 EEST [571108] CONTEXT:  COPY bigdb, line 1

My question is: how should logical replication be set up?
The biggest confusion is when I specify “192.168.0.2” in the subscription. Wouldn’t the subscription fail when the “leader” has changed? Should the “physical” replication slot be there (I believe it is creating the duplicate key error, does not happen with physical replication only)?

Hey queried1!
Welcome to Percona Community Forum!

Publication should be created on your “source” server, it is basically a definition of what you want to replicate using logical replication.
Subscription should be created on your target server, for each created publication you need one corresponding subscription on your logical replica. The target server could be your leader, but in a different database if you want to replicate it locally, or some remote remote server that will be your logical replica.

To use patroni cluster as a source for your logical replication, you would need to use the haproxy or add all the patroni hosts in your subscription connection string, so after the failover subscription knows where to connect and from where it should replicate the changes.
i.e.

CREATE SUBSCRIPTION asd_sub CONNECTION 'user=replicator password=qwe123 dbname=postgres host=192.168.56.11,192.168.56.12,192.168.56.13 port=5432 target_session_attrs=read-write'
PUBLICATION asd
WITH (
copy_data = true,
create_slot = false,
enabled = true,
connect = true,
slot_name = 'repslot'
);

And the replication slot created by patroni, that you added here:

slots:
  repslot:
    database: thedb
    plugin: pgoutput
    type: logical

will be replicated to all replicas in your patroni cluster, so after the failover it will be already there and the subscription will be able to connect immediately.

About the replication slots that you are seeing: pg_16480_sync_16394_7358104072011291410, etc.
Those are temporary slots responsible for the initial copy of data for replicated objects. Each replicated table will have its own slots, but there will be running 2 at a time always, once first one is done it will close itself and new one will be spawned, to perform initial copy.

Physical replication slot is there because of your patroni configuration “use_slots: true”.
And about your error, it looks like your table on “target server”, your logical replica, was not empty when you created subscription, you should only create empty objects and let the initial copy do the work. There are however some methods to avoid the initial copy, you can find multiple blog posts about it. i.e. Replication Between PostgreSQL Versions Using Logical Replication

Hope it helps. If you have any questions do not hesitate to reach out!
Cheers,
Mateusz