We have recently seen this type of error on different nodes across several of our Percona clusters. These are the logs from the most recent occurrence. Essentially, wsrep_ready is set to OFF and the node is then ejected from the cluster. Is this a known issue or bug in the software? Is anyone else encountering it?
2024-10-14T13:49:48.289577Z 23 [ERROR] [MY-000000] [Galera] Failed to apply write set: gtid: bf0fd310-ddc1-11ee-b08a-56f7c4b63281:672824661 server_id: d6636fa0-7543-11ef-98e6-333d82fcb296 client_id: 18446744073709551615 trx_id: 601552680 flags: 20 (rollback | pa_unsafe)
2024-10-14T13:49:48.291474Z 23 [Note] [MY-000000] [Galera] Closing send monitor...
2024-10-14T13:49:48.291489Z 23 [Note] [MY-000000] [Galera] Closed send monitor.
2024-10-14T13:49:48.292032Z 23 [Note] [MY-000000] [Galera] gcomm: terminating thread
2024-10-14T13:49:48.292529Z 23 [Note] [MY-000000] [Galera] gcomm: joining thread
2024-10-14T13:49:48.293518Z 23 [Note] [MY-000000] [Galera] gcomm: closing backend
2024-10-14T13:49:48.797958Z 23 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,013128a8-9978,126)
memb {
  d6636fa0-98e6,2
  }
joined {
  }
left {
  }
partitioned {
  013128a8-9978,0
  1b4b870c-bc73,2
  317889e9-b66a,1
  8aa87f00-8ba4,1
  }
)
2024-10-14T13:49:48.798030Z 23 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2024-10-14T13:49:48.798041Z 23 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2024-10-14T13:49:48.802080Z 23 [Note] [MY-000000] [Galera] gcomm: closed
2024-10-14T13:49:48.802191Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2024-10-14T13:49:48.802259Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [64, 64]
2024-10-14T13:49:48.802265Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.
2024-10-14T13:49:48.802269Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> OPEN (TO: 672824661)
2024-10-14T13:49:48.802285Z 0 [Note] [MY-000000] [Galera] New SELF-LEAVE.
2024-10-14T13:49:48.802363Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [0, 0]
2024-10-14T13:49:48.802376Z 0 [Note] [MY-000000] [Galera] Received SELF-LEAVE. Closing connection.
2024-10-14T13:49:48.802395Z 0 [Note] [MY-000000] [Galera] Shifting OPEN -> CLOSED (TO: 672824661)
2024-10-14T13:49:48.802436Z 0 [Note] [MY-000000] [Galera] RECV thread exiting 0: Success
2024-10-14T13:49:48.802464Z 13 [Note] [MY-000000] [Galera] ================================================
View:
  id: bf0fd310-ddc1-11ee-b08a-56f7c4b63281:672824661
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
    0: d6636fa0-7543-11ef-98e6-333d82fcb296, node1.db3.cluster
=================================================
2024-10-14T13:49:48.802491Z 13 [Note] [MY-000000] [Galera] Non-primary view
2024-10-14T13:49:48.802498Z 13 [Note] [MY-000000] [WSREP] Server status change synced -> connected
2024-10-14T13:49:48.802815Z 23 [Note] [MY-000000] [Galera] recv_thread() joined.
2024-10-14T13:49:48.802834Z 23 [Note] [MY-000000] [Galera] Closing replication queue.
2024-10-14T13:49:48.802841Z 23 [Note] [MY-000000] [Galera] Closing slave action queue.
2024-10-14T13:49:48.803461Z 13 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-10-14T13:49:48.804351Z 13 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-10-14T13:49:48.804394Z 13 [Note] [MY-000000] [Galera] ================================================
View:
  id: bf0fd310-ddc1-11ee-b08a-56f7c4b63281:672824661
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: yes
  own_index: -1
  members(0):
=================================================
2024-10-14T13:49:48.804401Z 13 [Note] [MY-000000] [Galera] Non-primary view
2024-10-14T13:49:48.804407Z 13 [Note] [MY-000000] [WSREP] Server status change connected -> disconnected
2024-10-14T13:49:48.804411Z 13 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-10-14T13:49:48.804417Z 13 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-10-14T13:49:48.804427Z 13 [Note] [MY-000000] [Galera] Waiting 600 seconds for 16 receivers to finish
2024-10-14T13:49:48.812469Z 12 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812499Z 15 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812465Z 19 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812504Z 10 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812609Z 19 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 19
2024-10-14T13:49:48.812507Z 12 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 12
2024-10-14T13:49:48.812508Z 22 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812471Z 18 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812648Z 1 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812655Z 18 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 18
2024-10-14T13:49:48.812544Z 20 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812668Z 1 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 1
2024-10-14T13:49:48.812550Z 16 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812571Z 21 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812693Z 16 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 16
2024-10-14T13:49:48.812701Z 21 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 21
2024-10-14T13:49:48.812593Z 24 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812608Z 14 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812730Z 24 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 24
2024-10-14T13:49:48.812518Z 11 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812786Z 11 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 11
2024-10-14T13:49:48.812634Z 22 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 22
2024-10-14T13:49:48.812537Z 17 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812679Z 20 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 20
2024-10-14T13:49:48.812889Z 17 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 17
2024-10-14T13:49:48.812591Z 15 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 15
2024-10-14T13:49:48.812740Z 14 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 14
2024-10-14T13:49:48.812925Z 23 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 6
2024-10-14T13:49:48.812622Z 10 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 10
2024-10-14T13:49:48.812958Z 23 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 6 thd: 23
2024-10-14T13:49:48.815867Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2024-10-14T13:49:48.815902Z 13 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 6
2024-10-14T13:49:48.815915Z 13 [Note] [MY-000000] [Galera] Slave thread exit. Return code: 0
2024-10-14T13:49:48.815923Z 13 [Note] [MY-000000] [WSREP] Applier thread exiting ret: 0 thd: 13
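
In case it helps anyone compare occurrences across nodes, here is a rough sketch of how the failing write set could be pulled out of an error log like the one above. The regular expression is only modelled on the single "Failed to apply write set" line shown here, and the function name `find_failed_write_sets` is made up for illustration; other Galera versions may format the line differently.

```python
import re

# Pattern modelled on the one "Failed to apply write set" line in the log above.
# Assumption: other versions may print the fields in a different order or format.
FAILED_WS = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thd>\d+)\s+\[ERROR\].*Failed to apply write set: "
    r"gtid: (?P<uuid>[0-9a-f-]+):(?P<seqno>\d+) "
    r"server_id: (?P<server_id>[0-9a-f-]+) "
    r".*?trx_id: (?P<trx_id>\d+) "
    r"flags: (?P<flags>\d+) \((?P<flag_names>[^)]*)\)"
)


def find_failed_write_sets(lines):
    """Return one dict per 'Failed to apply write set' line found in `lines`."""
    results = []
    for line in lines:
        match = FAILED_WS.search(line)
        if match is None:
            continue
        entry = match.groupdict()
        entry["seqno"] = int(entry["seqno"])
        # Flags are printed as "name | name"; split them into a list.
        entry["flag_names"] = [f.strip() for f in entry["flag_names"].split("|")]
        results.append(entry)
    return results


if __name__ == "__main__":
    sample = (
        "2024-10-14T13:49:48.289577Z 23 [ERROR] [MY-000000] [Galera] "
        "Failed to apply write set: gtid: "
        "bf0fd310-ddc1-11ee-b08a-56f7c4b63281:672824661 server_id: "
        "d6636fa0-7543-11ef-98e6-333d82fcb296 client_id: 18446744073709551615 "
        "trx_id: 601552680 flags: 20 (rollback | pa_unsafe)"
    )
    for entry in find_failed_write_sets([sample]):
        print(entry["ts"], entry["seqno"], entry["server_id"], entry["flag_names"])
```

The extracted seqno can then be matched against the "TO:" value in the state-shift lines, and the server_id against the member UUIDs in the View blocks.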