Percona XtraDB Cluster crashes after updating to version 8.0.35-27.1

Hello.

We have a Percona XtraDB Cluster with 2 nodes and an arbitrator that has been running successfully for years. Starting a few weeks ago, and apparently coinciding with the update to version 8.0.35-27.1 (from version 8.0.34), the database randomly crashes about once or twice a day.

The database load is very reasonable, queries run fast and without any problem, and we have not made any significant change in the application code recently.

This is the error that is printed in mysql.log:

2024-02-14T04:55:52.828659Z 0 [ERROR] [MY-012872] [InnoDB] [FATAL] Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
2024-02-14T04:55:52.828815Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: srv0srv.cc:2107:ib::fatal triggered thread 139954514572864
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2024-02-14T04:55:52Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=82e0381828780a878c13947ae9217abad3e30d93
Server Version: 8.0.35-27.1 Percona XtraDB Cluster (GPL), Release rel27, Revision 84d9464, WSREP version 26.1.4.3, wsrep_26.1.4.3

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x5637c36c2ce1]
/usr/sbin/mysqld(print_fatal_signal(int)+0x39f) [0x5637c26edb2f]
/usr/sbin/mysqld(my_server_abort()+0x7e) [0x5637c26edcee]
/usr/sbin/mysqld(my_abort()+0xe) [0x5637c36bca5e]
/usr/sbin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x33a) [0x5637c392919a]
/usr/sbin/mysqld(ib::fatal::~fatal()+0xc8) [0x5637c392bbd8]
/usr/sbin/mysqld(srv_error_monitor_thread()+0x7c2) [0x5637c38c19a2]
/usr/sbin/mysqld(void Detached_thread::operator()<void (*)()>(void (*&&)())+0xca) [0x5637c37ea2ca]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f531ca48253]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f531c6d0ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7f531c762850]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
2024-02-14T04:55:52.846105Z 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
Log of wsrep recovery (--wsrep-recover):
 INFO: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery_verbose.FagVnH' --pid-file='/var/lib/mysql/jon-recover.pid'
 INFO: WSREP: Recovered position 75be42dc-c0e9-11ee-90c2-325ccd2c2b0f:523887463

Any ideas? Is there any known bug that could be causing this?

Thanks!
Oscar

Hello @ofrias,
https://perconadev.atlassian.net/browse/PXC-4367
If you can provide a good coredump to this open ticket, that would really help out our engineers.

Thanks a lot for the link to the related issue. I have added these lines to our server configuration and will send the coredump as soon as we get it:

core_file
innodb_buffer_pool_in_core_file=OFF

Be sure to also enable core dumps at the operating-system level, usually in /etc/security/limits.conf
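
As a rough sketch, the entries in /etc/security/limits.conf might look like the following. The account name mysql is an assumption here; use whichever user actually runs mysqld on your system.

```
# /etc/security/limits.conf
# Assumption: mysqld runs as the "mysql" user.
# <domain>  <type>  <item>  <value>
mysql       soft    core    unlimited
mysql       hard    core    unlimited
```

Note that limits.conf is applied through PAM to login sessions. If mysqld is started by systemd, the core size limit comes from the unit's LimitCORE= setting instead, so it may need to be raised there.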

This article should be helpful:

Quick workaround:
Set log_replica_updates=OFF in the configuration file of every server, then restart the servers.
Of course, this is only an option if the PXC cluster is not a replica in an async replication chain and/or binlogging of replicated events is not necessary.
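
For reference, the workaround could look like this in the configuration file. This is only a minimal sketch: the [mysqld] section header reflects a typical layout and is an assumption, so place the line wherever your other server options live.

```
# Server configuration file on each PXC node
# Assumption: server options are kept under the [mysqld] section.
[mysqld]
log_replica_updates=OFF
```

The variable is not dynamic, which is why a restart of each node is needed for the change to take effect.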

The fix will be provided in 8.0.36.