Hi again
the update did not change anything for me. The error happened again.
I am able to reproduce the problem somewhat reliably now (doesn’t occur 100% of the time, but about every other time). It seems to happen if the server is rebooted under load.
My setup to reproduce is to run sysbench towards data node 1 and then, while sysbench is still running, reboot the machine.
sysbench --tables=100 --table_size=100000 --time=600 --mysql-user=root --mysql-db=sysbench --mysql-ignore-errors=all oltp_write_only run
This reproduction gives more log. While the production system log ends at
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
the test system does manage to produce the backtrace. Here it is:
2025-02-05T09:38:32.499845Z 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.40-31.1).
2025-02-05T09:38:32.499883Z 0 [Note] [MY-000000] [WSREP] Received shutdown signal. Will sleep for 10 secs before initiating shutdown. pxc_maint_mode switched to SHUTDOWN
2025-02-05T09:38:42.507635Z 0 [Note] [MY-000000] [WSREP] Shutdown replication
2025-02-05T09:38:42.508228Z 0 [Note] [MY-000000] [WSREP] Server status change synced -> disconnecting
2025-02-05T09:38:42.508388Z 0 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2025-02-05T09:38:42.508922Z 0 [Note] [MY-000000] [Galera] Closing send monitor...
2025-02-05T09:38:42.509037Z 0 [Note] [MY-000000] [Galera] Closed send monitor.
2025-02-05T09:38:42.509226Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread
2025-02-05T09:38:42.509248Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread
2025-02-05T09:38:42.509873Z 0 [Note] [MY-000000] [Galera] gcomm: closing backend
2025-02-05T09:38:42.511036Z 553 [ERROR] [MY-000000] [Galera] FSM: no such a transition APPLYING -> COMMITTED
2025-02-05T09:38:42.511080Z 553 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2025-02-05T09:38:42Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=924bf8212e79f806f39a524ce1be6bbbbee2b634
Server Version: 8.0.40-31.1 Percona XtraDB Cluster (GPL), Release rel31, Revision 4b32153, WSREP version 26.1.4.3, wsrep_26.1.4.3
Thread pointer: 0x7f2a441c85d0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2025-02-05T09:38:42.511987Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,989bf1a3-8ada,139)
memb {
989bf1a3-8ada,0
}
joined {
}
left {
}
partitioned {
9a535999-ab28,0
d99e331e-be18,0
}
)
2025-02-05T09:38:42.512211Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2025-02-05T09:38:42.512224Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2025-02-05T09:38:42.512553Z 0 [Note] [MY-000000] [Galera] gcomm: closed
2025-02-05T09:38:42.512624Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2025-02-05T09:38:42.512836Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2025-02-05T09:38:42.512847Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.
2025-02-05T09:38:42.512854Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> OPEN (TO: 7867188)
2025-02-05T09:38:42.512864Z 0 [Note] [MY-000000] [Galera] New SELF-LEAVE.
2025-02-05T09:38:42.512876Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [0, 0]
2025-02-05T09:38:42.512881Z 0 [Note] [MY-000000] [Galera] Received SELF-LEAVE. Closing connection.
2025-02-05T09:38:42.512887Z 0 [Note] [MY-000000] [Galera] Shifting OPEN -> CLOSED (TO: 7867188)
2025-02-05T09:38:42.512897Z 0 [Note] [MY-000000] [Galera] RECV thread exiting 0: Success
2025-02-05T09:38:42.513248Z 0 [Note] [MY-000000] [Galera] recv_thread() joined.
2025-02-05T09:38:42.513271Z 0 [Note] [MY-000000] [Galera] Closing send queue.
2025-02-05T09:38:42.513277Z 0 [Note] [MY-000000] [Galera] Closing receive queue.
2025-02-05T09:38:42.513400Z 11 [Note] [MY-000000] [Galera] Maybe drain monitors from 7867187 upto current CC event 7867188 upto:7867188
2025-02-05T09:38:42.513412Z 11 [Note] [MY-000000] [Galera] Drain monitors from 7867187 up to 7867188
stack_bottom = 7f2a31ffabc0 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x56213043fa9e]
/usr/sbin/mysqld(print_fatal_signal(int)+0x29a) [0x56212f526a5a]
/usr/sbin/mysqld(handle_fatal_signal+0xd7) [0x56212f526c27]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x7f2b34a5b050]
/lib/x86_64-linux-gnu/libc.so.6(+0x8aebc) [0x7f2b34aa9ebc]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x12) [0x7f2b34a5afb2]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f2b34a45472]
/usr/lib/galera4/libgalera_smm.so(+0x54fc2) [0x7f2b28054fc2]
/usr/lib/galera4/libgalera_smm.so(+0x71ffd) [0x7f2b28071ffd]
/usr/lib/galera4/libgalera_smm.so(+0x57215) [0x7f2b28057215]
/usr/sbin/mysqld(wsrep::wsrep_provider_v26::commit_order_leave(wsrep::ws_handle const&, wsrep::ws_meta const&, wsrep::mutable_buffer const&)+0x6d) [0x562130d760ad]
/usr/sbin/mysqld(wsrep::transaction::ordered_commit()+0xb9) [0x562130d6e439]
/usr/sbin/mysqld(trx_commit_complete_for_mysql(trx_t*)+0x4e) [0x56213066fcfe]
/usr/sbin/mysqld(+0x233a583) [0x5621304a6583]
/usr/sbin/mysqld(ha_commit_low(THD*, bool, bool)+0x496) [0x56212f0a0306]
/usr/sbin/mysqld(trx_coordinator::commit_in_engines(THD*, bool, bool)+0x2d) [0x56212f4d304d]
/usr/sbin/mysqld(wsrep_thd_binlog_commit(THD*, bool)+0x4c) [0x56213009695c]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool, bool)+0x5b9) [0x56212f0a1d39]
/usr/sbin/mysqld(trans_commit(THD*, bool)+0x4b) [0x56212f4d622b]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x2903) [0x56212f3a7423]
/usr/sbin/mysqld(Prepared_statement::execute(THD*, String*, bool)+0x64d) [0x56212f3d9ead]
/usr/sbin/mysqld(Prepared_statement::execute_loop(THD*, String*, bool)+0x125) [0x56212f3de775]
/usr/sbin/mysqld(mysqld_stmt_execute(THD*, Prepared_statement*, bool, unsigned long, PS_PARAM*)+0x1c4) [0x56212f3dee64]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x22a6) [0x56212f3ae046]
/usr/sbin/mysqld(do_command(THD*)+0x297) [0x56212f3afc57]
/usr/sbin/mysqld(+0x13ab080) [0x56212f517080]
/usr/sbin/mysqld(+0x2735f40) [0x5621308a1f40]
/lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7f2b34aa81c4]
/lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7f2b34b2885c]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f2a443e56e8): COMMIT
Connection ID (thread ID): 553
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
Any idea on what to do next?
Thanks
Dominik