Asynchronous replication erroring out

Description:

We run a read-only replica on k8s (OKD/Openshift) cluster with async replication (Multi-cluster and multi-region deployment - Percona Operator for MySQL or docs.percona.com/percona-operator-for-mysql/pxc/backups-move-from-external-db.html)

All works well until i believe there is temporary network blip and replicating pod (the one with label percona.com/replicationPod=true) can’t talk to its peers. The network partitioning changes wsrep_local_state_comment from “Synced” to something else making wsrep unready. The async replication is not wsrep aware and tries to write into unready wsrep and gets errored out.

The only way out is to run start replica for channel XYZ manually.
Is there a way we can prevent this from happenning? Or maybe operator enhancement like checking periodically if replication has not errored out.

Steps to Reproduce:

  • Spin pxc cluster
  • Setup async replication from external source
  • Break comm between pod with label percona.com/replicationPod=true other pxc pods atleast on port 4567 effectively creating network partition (I am not sure how to do)
  • While that replication pod state is not Synced, write on external source so thats its copied to that pod via async repl.
  • See it erroring out and not recovering without manual intervention.

Version:

I have seen this in atleast v1.13.0 till latest (v1.16.1)

Logs:

Full logs - pxc operator log - Pastebin.com

prod-pxc-0 logs 02-08 17:44:25 {"log":"2025-02-08T17:44:23.885776Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT3.64276S), skipping check\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:32 {"log":"2025-02-08T17:44:25.742123Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection to peer 80860039-b723 with addr ssl://10.50.7.245:4567 timed out, no messages seen in PT3S, socket stats: rtt: 7468 rttvar: 10959 rto: 208000 lost: 0 last_data_recv: 182 cwnd: 10 last_queued_since: 1338267463 last_delivered_since: 4707667117 send_queue_length: 3 send_queue_bytes: 216 segment: 0 messages: 3 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:32 {"log":"2025-02-08T17:44:26.400995Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection to peer c657a5f2-be89 with addr ssl://10.50.8.36:4567 timed out, no messages seen in PT3S, socket stats: rtt: 9113 rttvar: 12785 rto: 210000 lost: 0 last_data_recv: 1397 cwnd: 10 last_queued_since: 28177666 last_delivered_since: 4918820083 send_queue_length: 4 send_queue_bytes: 428 segment: 0 messages: 4 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:34 {"log":"2025-02-08T17:44:28.038487Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://10.50.7.245:4567 ssl://10.50.8.36:4567 \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:36 {"log":"2025-02-08T17:44:32.426952Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') reconnecting to 80860039-b723 (ssl://10.50.7.245:4567), attempt 0\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:41 {"log":"2025-02-08T17:44:35.788457Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') reconnecting to c657a5f2-be89 (ssl://10.50.8.36:4567), attempt 0\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:44:57 {"log":"2025-02-08T17:44:42.763698Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT17.2336S), skipping check\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:46:22 {"log":"2025-02-08T17:46:10.888927Z 0 [Note] [MY-000000] [Galera] Failed to establish connection: End of file\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:01 {"log":"2025-02-08T17:46:22.899995Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection to peer 00000000-0000 with addr ssl://10.50.7.245:4567 timed out, no messages seen in PT3S, socket stats: rtt: 59752 rttvar: 118571 rto: 535000 lost: 0 last_data_recv: 35256 cwnd: 10 last_queued_since: 6520995083366014 last_delivered_since: 6520995083366014 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:01 {"log":"2025-02-08T17:46:43.056845Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection to peer 00000000-0000 with addr ssl://10.50.8.36:4567 timed out, no messages seen in PT3S, socket stats: rtt: 140149 rttvar: 57147 rto: 415000 lost: 0 last_data_recv: 38143 cwnd: 10 last_queued_since: 6521006060605922 last_delivered_since: 6521006060605922 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:01.748598Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT3M20.8192S), skipping check\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:01.860390Z 0 [Note] [MY-000000] [Galera] Failed to establish connection: Operation aborted.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:01.973680Z 0 [Note] [MY-000000] [Galera] Failed to establish connection: Operation aborted.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.049924Z 0 [Note] [MY-000000] [Galera] Failed to establish connection: End of file\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.114521Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection established to c657a5f2-be89 ssl://10.50.8.36:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.124313Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') connection established to 80860039-b723 ssl://10.50.7.245:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.354293Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.442080Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 inactive (evs.inactive_timeout) \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.442518Z 0 [Note] [MY-000000] [Galera] declaring node with index 2 suspected, timeout PT5S (evs.suspect_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.449341Z 0 [Note] [MY-000000] [Galera] declaring node with index 2 inactive (evs.inactive_timeout) \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:02 {"log":"2025-02-08T17:48:02.450147Z 0 [Note] [MY-000000] [Galera] evs::proto(824d48c1-9250, OPERATIONAL, view_id(REG,80860039-b723,23)) detected inactive node: 80860039-b723\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:02.451813Z 0 [Note] [MY-000000] [Galera] evs::proto(824d48c1-9250, OPERATIONAL, view_id(REG,80860039-b723,23)) detected inactive node: c657a5f2-be89\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.457564Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(NON_PRIM,80860039-b723,23)\nmemb {\n\t824d48c1-9250,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t80860039-b723,0\n\tc657a5f2-be89,0\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.461645Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(NON_PRIM,824d48c1-9250,24)\nmemb {\n\t824d48c1-9250,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t80860039-b723,0\n\tc657a5f2-be89,0\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.461841Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474156Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474264Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474282Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> OPEN (TO: 4616706)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474656Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474747Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.474766Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.475453Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from 4616706 upto current CC event 4616706 upto:4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.475977Z 2 [Note] [MY-000000] [Galera] Drain monitors from 4616706 up to 4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.478204Z 2 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616706\n  status: non-primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 0\n  members(1):\n\t0: 824d48c1-e4dc-11ef-9250-d23a5995ec9e, prod-pxc-0\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.478546Z 2 [Note] [MY-000000] [Galera] Non-primary view\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.478660Z 2 [Note] [MY-000000] [WSREP] Server status change synced -> connected\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.479344Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.479683Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.479959Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from 4616706 upto current CC event 4616706 upto:4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.479975Z 2 [Note] [MY-000000] [Galera] Drain monitors from 4616706 up to 4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.479997Z 2 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616706\n  status: non-primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 0\n  members(1):\n\t0: 824d48c1-e4dc-11ef-9250-d23a5995ec9e, prod-pxc-0\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.480008Z 2 [Note] [MY-000000] [Galera] Non-primary view\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.480018Z 2 [Note] [MY-000000] [WSREP] Server status change connected -> connected\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:03 {"log":"2025-02-08T17:48:03.480027Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.482269Z 0 [Note] [MY-000000] [Galera] declaring 80860039-b723 at ssl://10.50.7.245:4567 stable\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.482436Z 0 [Note] [MY-000000] [Galera] declaring c657a5f2-be89 at ssl://10.50.8.36:4567 stable\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.485068Z 0 [Note] [MY-000000] [Galera] Node 80860039-b723 state primary\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.489522Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(PRIM,80860039-b723,25)\nmemb {\n\t80860039-b723,0\n\t824d48c1-9250,0\n\tc657a5f2-be89,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.490740Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.501133Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.501296Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: Waiting for state UUID.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.503605Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: d905e79e-e644-11ef-a6ec-d297982e38f5\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.506091Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: d905e79e-e644-11ef-a6ec-d297982e38f5 from 0 (prod-pxc-1)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.508607Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: d905e79e-e644-11ef-a6ec-d297982e38f5 from 1 (prod-pxc-0)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.511670Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: d905e79e-e644-11ef-a6ec-d297982e38f5 from 2 (prod-pxc-2)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.514757Z 0 [Note] [MY-000000] [Galera] Quorum results:\n\tversion    = 6,\n\tcomponent  = PRIMARY,\n\tconf_id    = 22,\n\tmembers    = 2/3 (primary/total),\n\tact_id     = 4616707,\n\tlast_appl. = 4616692,\n\tprotocols  = 4/11/4 (gcs/repl/appl),\n\tvote policy= 0,\n\tgroup UUID = 58d61cd4-d899-11ef-815b-cabbdc4fe69e\n","file":"/var/lib/mysql/mysqld-error.log"}
----------- errored async replication
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.515880Z 42 [ERROR] [MY-010584] [Repl] Replica SQL for channel 'prod_external': Worker 1 failed executing transaction '3d5f89c2-1c37-11ee-81d5-0050568ff6c2:155380565' at source log binlog.000044, end_log_pos 876499626; Error 'WSREP has not yet prepared node for application use' on query. Default database: 'db18'. Query: 'BEGIN', Error_code: MY-001047\n","file":"/var/lib/mysql/mysqld-error.log"}
----------- errored async replication end
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.516490Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [173, 173]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.518959Z 0 [Note] [MY-000000] [Galera] Shifting OPEN -> PRIMARY (TO: 4616708)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.519137Z 2 [Note] [MY-000000] [Galera] ####### processing CC 4616708, local, ordered\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.519224Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from 4616706 upto current CC event 4616708 upto:4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.519251Z 2 [Note] [MY-000000] [Galera] Drain monitors from 4616706 up to 4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.519271Z 2 [Note] [MY-000000] [Galera] ####### My UUID: 824d48c1-e4dc-11ef-9250-d23a5995ec9e\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.519291Z 2 [Note] [MY-000000] [Galera] Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 11), state transfer needed: yes\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.522457Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.525350Z 2 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527007Z 2 [Note] [MY-000000] [Galera] State transfer required: \n\tGroup state: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616708\n\tLocal state: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616706\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527061Z 2 [Note] [MY-000000] [WSREP] Server status change connected -> joiner\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527076Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527181Z 2 [Note] [MY-000000] [WSREP] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527228Z 2 [Note] [MY-000000] [Galera] Check if state gap can be serviced using IST\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527262Z 2 [Note] [MY-000000] [Galera] ####### IST uuid:58d61cd4-d899-11ef-815b-cabbdc4fe69e f: 4616707, l: 4616708, STRv: 3\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527867Z 2 [Note] [MY-000000] [Galera] IST receiver addr using ssl://10.50.35.96:4568\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.527966Z 2 [Note] [MY-000000] [Galera] IST receiver using ssl\n","file":"/var/lib/mysql/mysqld-error.log"}
----------- errored async replication cont.
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.525874Z 41 [ERROR] [MY-010586] [Repl] Error running query, replica SQL thread aborted. Fix the problem, and restart the replica SQL thread with \"START REPLICA\". We stopped at log 'binlog.000044' position 876499468\n","file":"/var/lib/mysql/mysqld-error.log"}
----------- errored async replication cont. end
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.531486Z 2 [Note] [MY-000000] [Galera] Prepared IST receiver for 4616707-4616708, listening at: ssl://10.50.35.96:4568\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.536798Z 0 [Note] [MY-000000] [Galera] Member 1.0 (prod-pxc-0) requested state transfer from 'prod-pxc-1,'. Selected 0.0 (prod-pxc-1)(SYNCED) as donor.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.536855Z 0 [Note] [MY-000000] [Galera] Shifting PRIMARY -> JOINER (TO: 4616708)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:04 {"log":"2025-02-08T17:48:04.537135Z 2 [Note] [MY-000000] [Galera] Requesting state transfer: success, donor: 0\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.537215Z 2 [Note] [MY-000000] [Galera] Receiving IST: 2 writesets, seqnos 4616707-4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.570377Z 0 [Note] [MY-000000] [Galera] 0.0 (prod-pxc-1): State transfer to 1.0 (prod-pxc-0) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.571148Z 0 [Note] [MY-000000] [Galera] ####### IST applying starts with 4616707\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.573050Z 0 [Note] [MY-000000] [Galera] ####### IST current seqno initialized to 4616693\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.575002Z 0 [Note] [MY-000000] [Galera] Receiving IST... 0.0% (0/16 events) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.575488Z 0 [Note] [MY-000000] [Galera] IST preload starting at 4616693\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.577034Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.577349Z 0 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:4616692, protocol version: 6\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.577407Z 0 [Note] [MY-000000] [Galera] Member 0.0 (prod-pxc-1) synced with group.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.578701Z 0 [Note] [MY-000000] [Galera] ####### Passing IST CC 4616707, must_apply: 1, preload: true\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.578783Z 0 [Note] [MY-000000] [Galera] REPL Protocols: 11 (6)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.578801Z 0 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 4616706 -> 4616707\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.578969Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.579328Z 0 [Note] [MY-000000] [Galera] Recording CC from ist: 4616707\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.579370Z 0 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from ist: 4616693\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.579384Z 0 [Note] [MY-000000] [Galera] Min available from gcache for CC from ist: 4512326\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.579558Z 0 [Note] [MY-000000] [Galera] ####### Passing IST CC 4616708, must_apply: 1, preload: true\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.582579Z 11 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616707\n  status: primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: -1\n  members(2):\n\t0: 80860039-d899-11ef-b723-87e32c8c2f58, prod-pxc-1\n\t1: c657a5f2-d89a-11ef-be89-166795724667, prod-pxc-2\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.587345Z 11 [Note] [MY-000000] [WSREP] Server status change joiner -> joined\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.587829Z 11 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.589214Z 11 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617184Z 0 [Note] [MY-000000] [Galera] REPL Protocols: 11 (6)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617343Z 0 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 4616707 -> 4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617517Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617734Z 0 [Note] [MY-000000] [Galera] Recording CC from ist: 4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617947Z 0 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from ist: 4616693\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.617982Z 0 [Note] [MY-000000] [Galera] Min available from gcache for CC from ist: 4512326\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.618342Z 2 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616708\n  status: primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 1\n  members(3):\n\t0: 80860039-d899-11ef-b723-87e32c8c2f58, prod-pxc-1\n\t1: 824d48c1-e4dc-11ef-9250-d23a5995ec9e, prod-pxc-0\n\t2: c657a5f2-d89a-11ef-be89-166795724667, prod-pxc-2\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.618685Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.619444Z 0 [Note] [MY-000000] [Galera] Receiving IST... 100.0% (16/16 events) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.630696Z 2 [Note] [MY-000000] [Galera] Draining apply monitors after IST up to 4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.633461Z 2 [Note] [MY-000000] [Galera] IST received: 58d61cd4-d899-11ef-815b-cabbdc4fe69e:4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.636396Z 2 [Note] [MY-000000] [Galera] Recording CC from sst: 4616708\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.636474Z 2 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from sst: 4616693\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.636502Z 2 [Note] [MY-000000] [Galera] Min available from gcache for CC from sst: 4512326\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.640244Z 0 [Note] [MY-000000] [Galera] 1.0 (prod-pxc-0): State transfer from 0.0 (prod-pxc-1) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.640356Z 0 [Note] [MY-000000] [Galera] SST leaving flow control\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.640382Z 0 [Note] [MY-000000] [Galera] Shifting JOINER -> JOINED (TO: 4616708)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.640510Z 0 [Note] [MY-000000] [Galera] Processing event queue:... -nan% (0/0 events) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.645372Z 0 [Note] [MY-000000] [Galera] Member 1.0 (prod-pxc-0) synced with group.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.645462Z 0 [Note] [MY-000000] [Galera] Processing event queue:... 100.0% (1/1 events) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.645496Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 4616708)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.645596Z 2 [Note] [MY-000000] [Galera] Server prod-pxc-0 synced with group\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.647388Z 2 [Note] [MY-000000] [WSREP] Server status change joined -> synced\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.647430Z 2 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:04.647448Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-0 logs 02-08 17:48:05 {"log":"2025-02-08T17:48:05.460095Z 0 [Note] [MY-000000] [Galera] (824d48c1-9250, 'ssl://0.0.0.0:4567') turning message relay requesting off\n","file":"/var/lib/mysql/mysqld-error.log"}

Expected Result:

Maybe operator periodically runs replication check and starts it if sql error code is MY-001047 or whatever appropriate.

Actual Result:

Replication is errored out with

Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '3d5f89c2-1c37-11ee-81d5-0050568ff6c2:155380565' at source log binlog.000044, end_log_pos 876499626. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.

Additional Information:

Expected result may look like feature request but I am open to suggestion on other ways to fix.

Hi @ashutosh,
Checkout this parameter and see if this does what you need: MySQL wsrep Options — Galera Cluster Documentation

1 Like

Hi @matthewb , I enabled restart replica and let it run for two weeks for testing.

mysql> show variables like 'wsrep_restart%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| wsrep_restart_replica | ON    |
| wsrep_restart_slave   | ON    |
+-----------------------+-------+
2 rows in set (0.01 sec)

Unfortunately, today, we ended up on same error and as i see it, it didn’t start the replica back

prod-pxc-1 logs 02-25 14:49:32 {"log":"2025-02-25T14:49:32.818501Z 0 [Warning] [MY-011071] [Server] Component percona_telemetry reported: 'Problem during telemetry file write: Permission denied'\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:42.736340Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection to peer 130c6a08-a789 with addr ssl://10.50.14.153:4567 timed out, no messages seen in PT3S, socket stats: rtt: 92795 rttvar: 68233 rto: 400000 lost: 0 last_data_recv: 3295 cwnd: 3 last_queued_since: 202774811 last_delivered_since: 3293913727 send_queue_length: 0 send_queue_bytes: 0 segment: 0 messages: 0 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:42.736819Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://10.50.14.153:4567 \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:46.808777Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') reconnecting to 130c6a08-a789 (ssl://10.50.14.153:4567), attempt 0\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:46.809838Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT4.48362S), skipping check\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:46.851558Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to 130c6a08-a789 ssl://10.50.14.153:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:46.878612Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to 130c6a08-a789 ssl://10.50.14.153:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:47.310394Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:47.310546Z 0 [Note] [MY-000000] [Galera] evs::proto(87e990f5-971d, OPERATIONAL, view_id(REG,130c6a08-a789,11)) suspecting node: 130c6a08-a789\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:47.310617Z 0 [Note] [MY-000000] [Galera] evs::proto(87e990f5-971d, OPERATIONAL, view_id(REG,130c6a08-a789,11)) suspected node without join message, declaring inactive\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:48.620798Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to a8f5ff06-87ad ssl://10.50.48.17:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:49 {"log":"2025-02-25T14:51:48.674628Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to a8f5ff06-87ad ssl://10.50.48.17:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:51.811532Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting off\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.512427Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(NON_PRIM,130c6a08-a789,11)\nmemb {\n\t87e990f5-971d,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t130c6a08-a789,0\n\ta8f5ff06-87ad,0\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.512663Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(NON_PRIM,87e990f5-971d,12)\nmemb {\n\t87e990f5-971d,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t130c6a08-a789,0\n\ta8f5ff06-87ad,0\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.512935Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513141Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513155Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513164Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> OPEN (TO: 64101)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513210Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513234Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513242Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513540Z 1 [Note] [MY-000000] [Galera] Maybe drain monitors from 64101 upto current CC event 64101 upto:64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513742Z 1 [Note] [MY-000000] [Galera] Drain monitors from 64101 up to 64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513843Z 1 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4:64101\n  status: non-primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 0\n  members(1):\n\t0: 87e990f5-f2be-11ef-971d-0b935e518ea7, prod-pxc-1\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513916Z 1 [Note] [MY-000000] [Galera] Non-primary view\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.513955Z 1 [Note] [MY-000000] [WSREP] Server status change synced -> connected\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514162Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514272Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514309Z 1 [Note] [MY-000000] [Galera] Maybe drain monitors from 64101 upto current CC event 64101 upto:64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514321Z 1 [Note] [MY-000000] [Galera] Drain monitors from 64101 up to 64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514340Z 1 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4:64101\n  status: non-primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 0\n  members(1):\n\t0: 87e990f5-f2be-11ef-971d-0b935e518ea7, prod-pxc-1\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514365Z 1 [Note] [MY-000000] [Galera] Non-primary view\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514373Z 1 [Note] [MY-000000] [WSREP] Server status change connected -> connected\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:53 {"log":"2025-02-25T14:51:52.514382Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
>>>>>>>>>> errored async replication
prod-pxc-1 logs 02-25 14:51:55 {"log":"2025-02-25T14:51:54.739967Z 42 [ERROR] [MY-010584] [Repl] Replica SQL for channel 'prod_external': Worker 1 failed executing transaction '803ecab2-5eed-11ef-80d9-2e1a74bb9946:1043850' at source log mysql-bin.000104, end_log_pos 2406919; Error 'WSREP has not yet prepared node for application use' on query. Default database: 'db7'. Query: 'BEGIN', Error_code: MY-001047\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:55 {"log":"2025-02-25T14:51:54.741353Z 41 [ERROR] [MY-010586] [Repl] Error running query, replica SQL thread aborted. Fix the problem, and restart the replica SQL thread with \"START REPLICA\". We stopped at log 'mysql-bin.000104' position 2406762\n","file":"/var/lib/mysql/mysqld-error.log"}
>>>>>>>>>> errored async replication end
prod-pxc-1 logs 02-25 14:51:56 {"log":"2025-02-25T14:51:55.031443Z 0 [Note] [MY-000000] [Galera] declaring 130c6a08-a789 at ssl://10.50.14.153:4567 stable\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:56 {"log":"2025-02-25T14:51:55.031530Z 0 [Note] [MY-000000] [Galera] declaring a8f5ff06-87ad at ssl://10.50.48.17:4567 stable\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:56 {"log":"2025-02-25T14:51:55.148616Z 0 [Note] [MY-000000] [Galera] re-bootstrapping prim from partitioned components\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:56 {"log":"2025-02-25T14:51:55.619421Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview (view_id(PRIM,130c6a08-a789,13)\nmemb {\n\t130c6a08-a789,0\n\t87e990f5-971d,0\n\ta8f5ff06-87ad,0\n\t}\njoined {\n\t}\nleft {\n\t}\npartitioned {\n\t}\n)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:56 {"log":"2025-02-25T14:51:55.619520Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:56.676409Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:56.676684Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: Waiting for state UUID.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:56.722190Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: 0e824bb6-f388-11ef-a7a9-220b21c91c2c\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:57.681873Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 0e824bb6-f388-11ef-a7a9-220b21c91c2c from 1 (prod-pxc-1)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.006781Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 0e824bb6-f388-11ef-a7a9-220b21c91c2c from 2 (prod-pxc-0)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.006837Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 0e824bb6-f388-11ef-a7a9-220b21c91c2c from 0 (prod-pxc-2)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.006870Z 0 [Note] [MY-000000] [Galera] Quorum: No node with complete state:\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"\tVersion      : 6\n\tFlags        : 0x3\n\tProtocols    : 4 / 11 / 4\n\tState        : NON-PRIMARY\n\tDesync count : 0\n\tPrim state   : SYNCED\n\tPrim UUID    : 883bba1b-f2be-11ef-9c33-cb3ff2a7a046\n\tPrim  seqno  : 9\n\tFirst seqno  : 3585\n\tLast  seqno  : 64101\n\tCommit cut   : 64006\n\tLast vote    : -1.0\n\tVote policy  : 0\n\tPrim JOINED  : 3\n\tState UUID   : 0e824bb6-f388-11ef-a7a9-220b21c91c2c\n\tGroup UUID   : a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4\n\tName         : 'prod-pxc-2'\n\tIncoming addr: 'prod-pxc-2.prod-pxc.prod-v17.svc.cluster.local:3306'\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"\tVersion      : 6\n\tFlags        : 0x2\n\tProtocols    : 4 / 11 / 4\n\tState        : NON-PRIMARY\n\tDesync count : 0\n\tPrim state   : SYNCED\n\tPrim UUID    : 883bba1b-f2be-11ef-9c33-cb3ff2a7a046\n\tPrim  seqno  : 9\n\tFirst seqno  : 3585\n\tLast  seqno  : 64101\n\tCommit cut   : 64006\n\tLast vote    : -1.0\n\tVote policy  : 0\n\tPrim JOINED  : 3\n\tState UUID   : 0e824bb6-f388-11ef-a7a9-220b21c91c2c\n\tGroup UUID   : a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4\n\tName         : 'prod-pxc-1'\n\tIncoming addr: 'prod-pxc-1.prod-pxc.prod-v17.svc.cluster.local:3306'\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"\tVersion      : 6\n\tFlags        : 0x2\n\tProtocols    : 4 / 11 / 4\n\tState        : NON-PRIMARY\n\tDesync count : 0\n\tPrim state   : SYNCED\n\tPrim UUID    : 883bba1b-f2be-11ef-9c33-cb3ff2a7a046\n\tPrim  seqno  : 9\n\tFirst seqno  : 1\n\tLast  seqno  : 64101\n\tCommit cut   : 64006\n\tLast vote    : -1.0\n\tVote policy  : 0\n\tPrim JOINED  : 3\n\tState UUID   : 0e824bb6-f388-11ef-a7a9-220b21c91c2c\n\tGroup UUID   : a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4\n\tName         : 'prod-pxc-0'\n\tIncoming addr: 'prod-pxc-0.prod-pxc.prod-v17.svc.cluster.local:3306'\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.006904Z 0 [Note] [MY-000000] [Galera] Full re-merge of primary 883bba1b-f2be-11ef-9c33-cb3ff2a7a046 found: 3 of 3.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.006935Z 0 [Note] [MY-000000] [Galera] Quorum results:\n\tversion    = 6,\n\tcomponent  = PRIMARY,\n\tconf_id    = 9,\n\tmembers    = 3/3 (primary/total),\n\tact_id     = 64101,\n\tlast_appl. = 64006,\n\tprotocols  = 4/11/4 (gcs/repl/appl),\n\tvote policy= 0,\n\tgroup UUID = a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007030Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [173, 173]\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007072Z 0 [Note] [MY-000000] [Galera] Restored state OPEN -> SYNCED (64102)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007226Z 1 [Note] [MY-000000] [Galera] ####### processing CC 64102, local, ordered\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007305Z 1 [Note] [MY-000000] [Galera] Maybe drain monitors from 64101 upto current CC event 64102 upto:64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007327Z 1 [Note] [MY-000000] [Galera] Drain monitors from 64101 up to 64101\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007342Z 1 [Note] [MY-000000] [Galera] ####### My UUID: 87e990f5-f2be-11ef-971d-0b935e518ea7\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007354Z 1 [Note] [MY-000000] [Galera] Skipping cert index reset\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007366Z 1 [Note] [MY-000000] [Galera] REPL Protocols: 11 (6)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007380Z 1 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 64101 -> 64102\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:58 {"log":"2025-02-25T14:51:58.007474Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:58.774263Z 1 [Note] [MY-000000] [Galera] Server prod-pxc-1 synced with group\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:58.875330Z 1 [Note] [MY-000000] [WSREP] Server status change connected -> synced\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:58.875385Z 1 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:58.875403Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:58.972348Z 1 [Note] [MY-000000] [Galera] ================================================\nView:\n  id: a8f6d309-ea2d-11ef-8c7a-ca1e9f56d5d4:64102\n  status: primary\n  protocol_version: 4\n  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO\n  final: no\n  own_index: 1\n  members(3):\n\t0: 130c6a08-ea2e-11ef-a789-7382c085f799, prod-pxc-2\n\t1: 87e990f5-f2be-11ef-971d-0b935e518ea7, prod-pxc-1\n\t2: a8f5ff06-ea2d-11ef-87ad-6651859379dd, prod-pxc-0\n=================================================\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:59.011367Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:59.290210Z 1 [Note] [MY-000000] [Galera] Recording CC from group: 64102\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:51:59 {"log":"2025-02-25T14:51:59.416093Z 1 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 64007\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:52:00 {"log":"2025-02-25T14:51:59.416174Z 1 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 3585\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:29.206613Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection to peer 130c6a08-a789 with addr ssl://10.50.14.153:4567 timed out, no messages seen in PT3S, socket stats: rtt: 49865 rttvar: 26961 rto: 250000 lost: 0 last_data_recv: 3130 cwnd: 4 last_queued_since: 121227748 last_delivered_since: 3129538087 send_queue_length: 0 send_queue_bytes: 0 segment: 0 messages: 0 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:29.207014Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://10.50.14.153:4567 \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:31.534005Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') reconnecting to 130c6a08-a789 (ssl://10.50.14.153:4567), attempt 0\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:31.534477Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT2.82809S), skipping check\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:31.615892Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to 130c6a08-a789 ssl://10.50.14.153:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:31.668941Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to 130c6a08-a789 ssl://10.50.14.153:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:35.027794Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting off\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:40.529880Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection to peer a8f5ff06-87ad with addr ssl://10.50.48.17:4567 timed out, no messages seen in PT3S, socket stats: rtt: 59766 rttvar: 53637 rto: 380000 lost: 0 last_data_recv: 3428 cwnd: 4 last_queued_since: 149143617 last_delivered_since: 3427982998 send_queue_length: 0 send_queue_bytes: 0 segment: 0 messages: 0 (gmcast.peer_timeout)\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:42 {"log":"2025-02-25T14:53:40.822783Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://10.50.48.17:4567 \n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:43 {"log":"2025-02-25T14:53:41.973886Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') connection established to a8f5ff06-87ad ssl://10.50.48.17:4567\n","file":"/var/lib/mysql/mysqld-error.log"}
prod-pxc-1 logs 02-25 14:53:55 {"log":"2025-02-25T14:53:45.323058Z 0 [Note] [MY-000000] [Galera] (87e990f5-971d, 'ssl://0.0.0.0:4567') turning message relay requesting off\n","file":"/var/lib/mysql/mysqld-error.log"}
mysql> pager grep error
PAGER set to 'grep error'
mysql> show replica status\G
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '803ecab2-5eed-11ef-80d9-2e1a74bb9946:1043850' at source log mysql-bin.000104, end_log_pos 2406919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '803ecab2-5eed-11ef-80d9-2e1a74bb9946:1043850' at source log mysql-bin.000104, end_log_pos 2406919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
1 row in set (0.00 sec)

Replication tried to auto-restart, but the WSREP components were not in primary state.

declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)

network timeouts.

[WSREP] Server status change connected → connected

The server connected back, but it hasn’t sync’d it’s state, thus replication cannot resume.

I might consider filing a bug report on this https://jira.percona.com as the process which should restart replication should pay attention to the state of wsrep, and wait for it to be in PRIMARY before attempting to restart replication.

Thanks for clarification. When I looked at parameter, I assumed it will restart replica only if cluster state is “Synced”, which is clearly not the case.

Thank you for raising jira, much appreciated. I assume you mean this is more of PXC bugfix rather than pxc-operator?

Also, do you have any other suggestion that I could try for my initial problem?
I could do a sidecar which would check replication peridocally, but unfortunately I couldn’t find the way to figure out which pod has label percona.com/replicationPod (since its added later and not at pod creation) and restart replica there.

Just to clarify, I did not create the report. You will need to fill out a new bug report on this, since you have the repeatable test case.