Cluster 8.0.32 fails after binlog rotate

Hi,

We have a cluster setup that had been working without problems.

Yesterday we updated

from percona-xtradb-cluster-server-8.0.31-23.1.el7.x86_64

to percona-xtradb-cluster-server-8.0.32-24.1.el7.x86_64

Afterwards we ran into trouble:

2023-04-18T19:58:37.747113Z 180333 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened., while rotating the binlog. Aborting the server.
2023-04-18T19:58:37Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=dc7878f0a8487d1a97285fe8a3604caea70667a4
Server Version: 8.0.32-24.1 Percona XtraDB Cluster (GPL), Release rel24, Revision 793b5d9, WSREP version 26.1.4.3, wsrep_26.1.4.3

Thread pointer: 0x7fa1880d0360
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa3f8058b30 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x22014ed]
/usr/sbin/mysqld(print_fatal_signal(int)+0x39f) [0x123d6af]
/usr/sbin/mysqld(my_server_abort()+0x7e) [0x123d86e]
/usr/sbin/mysqld(my_abort()+0xa) [0x21fb4aa]
/usr/sbin/mysqld() [0x1dfc779]
/usr/sbin/mysqld(MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*)+0x692) [0x1e0ef52]
/usr/sbin/mysqld(MYSQL_BIN_LOG::rotate(bool, bool*)+0x6c) [0x1e0fcfc]
/usr/sbin/mysqld(MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool)+0x2bd) [0x1e1982d]
/usr/sbin/mysqld(MYSQL_BIN_LOG::commit(THD*, bool)+0x127e) [0x1e1bf4e]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool, bool)+0x4b1) [0xd8dc81]
/usr/sbin/mysqld(trans_commit(THD*, bool)+0x6b) [0x11e8ddb]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x3b93) [0x10b0bb3]
/usr/sbin/mysqld(dispatch_sql_command(THD*, Parser_state*, bool)+0x5c0) [0x10b3970]
/usr/sbin/mysqld() [0x10b3f9b]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x3757) [0x10b8597]
/usr/sbin/mysqld(do_command(THD*)+0x200) [0x10b8c30]
/usr/sbin/mysqld() [0x122c5b8]
/usr/sbin/mysqld() [0x26da055]
/lib64/libpthread.so.0(+0x7ea5) [0x7fa925435ea5]
/lib64/libc.so.6(clone+0x6d) [0x7fa9237f0b0d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fa1881f5220): COMMIT
Connection ID (thread ID): 180333
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.

/etc/my.cnf

# Template my.cnf for PXC
# Edit to your requirements.
[client]
socket=/var/lib/mysql/mysql.sock

[mysqld]
enforce_gtid_consistency = ON
gtid_mode       = ON
server-id       = 175021
datadir         = /var/lib/mysql
socket          = /var/lib/mysql/mysql.sock
log-error       = /var/log/mysqld.log
pid-file        = /var/run/mysqld/mysqld.pid

sql_mode        = NO_ENGINE_SUBSTITUTION
# disable json
mysqlx          = 0

max_connections = 3000

# Binary log expiration period is 604800 seconds, which equals 7 days
skip-log-bin  # <--- WORKAROUND
binlog_expire_logs_seconds      = 604800
max_binlog_size                 = 100M

event_scheduler         = off
key_buffer_size         = 16M
max_allowed_packet      = 16M
thread_stack            = 256K
thread_cache_size       = 8

join_buffer_size        = 32M
sort_buffer_size        = 256k

innodb_flush_log_at_trx_commit  = 2
innodb_file_per_table           = 1
innodb_buffer_pool_size         = 24G

######## wsrep ###############
# Path to Galera library
wsrep_provider=/usr/lib64/galera4/libgalera_smm.so

# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://172.23.175.21,172.23.175.22,172.23.175.23

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# Slave thread to use
#wsrep_slave_threads=8
wsrep_applier_threads=8

wsrep_log_conflicts

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node IP address
wsrep_node_address=172.23.175.21
# Cluster name
wsrep_cluster_name=db-cluster

#If wsrep_node_name is not specified,  then system hostname will be used
wsrep_node_name=db1

#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=PERMISSIVE

wsrep_sst_method=xtrabackup-v2

# disable encryption
pxc_encrypt_cluster_traffic = OFF

replicate_do_db=<REDACTED>
replicate_ignore_db=mysql

# used for pmm2
log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_replica_statements=ON
slow_query_log_always_write_time=1
slow_query_log_use_global_control=all
innodb_monitor_enable=all
userstat=1
innodb_monitor_enable=all
performance_schema=ON

Please note that skip-log-bin is set as a workaround.

Are there any changes between 8.0.31 and 8.0.32 that could lead to this error?

The cluster is running on CentOS 7.9.2009.

Any advice on solving the problem?

With kind regards,
Bernd Brodda.

Hi @b.brodda
Could you please tell us a bit more about how you performed the upgrade of the PXC cluster?

Hi @Evgeniy_Patlan

We updated as usual, the same way we have done it for the last few years:

member 1:

systemctl stop mysql
yum update
systemctl start mysql

Waiting for sync
update member 2

Waiting for sync
update member 3

with best regards,
Bernd.
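
The "waiting for sync" step in the procedure above can be scripted rather than watched by hand. A minimal sketch, assuming the node exposes the standard Galera status variable wsrep_local_state_comment and that the mysql client can log in without prompting (for example via ~/.my.cnf). MYSQL_CMD, SLEEP_SECS, node_state and wait_for_sync are names made up for this illustration, not part of PXC:

```shell
#!/bin/sh
# Sketch only: block until the local PXC node reports "Synced".
# MYSQL_CMD is a hypothetical override so the client invocation can be swapped out.
MYSQL_CMD="${MYSQL_CMD:-mysql -N -B -e}"

node_state() {
    # With -N -B the client prints "wsrep_local_state_comment<TAB>Synced";
    # keep only the value column.
    $MYSQL_CMD "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'" | cut -f2
}

wait_for_sync() {
    # $1 = maximum number of polls (default 60); SLEEP_SECS = pause between polls.
    tries="${1:-60}"
    while [ "$tries" -gt 0 ]; do
        [ "$(node_state)" = "Synced" ] && return 0
        sleep "${SLEEP_SECS:-5}"
        tries=$((tries - 1))
    done
    return 1
}
```

Between members, something like `wait_for_sync 120 || echo "node did not reach Synced"` could then gate the next `yum update`.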

Hit the same issue:

2023-04-25T08:29:44.933996Z 286956 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened., while rotating the binlog. Aborting the server.
2023-04-25T08:29:44Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=dc7878f0a8487d1a97285fe8a3604caea70667a4
Server Version: 8.0.32-24.1 Percona XtraDB Cluster (GPL), Release rel24, Revision 793b5d9, WSREP version 26.1.4.3, wsrep_26.1.4.3

Thread pointer: 0x7f6454034970
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f6476ee2b30 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x22014ed]
/usr/sbin/mysqld(print_fatal_signal(int)+0x39f) [0x123d6af]
/usr/sbin/mysqld(my_server_abort()+0x7e) [0x123d86e]
/usr/sbin/mysqld(my_abort()+0xa) [0x21fb4aa]
/usr/sbin/mysqld() [0x1dfc779]
/usr/sbin/mysqld(MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*)+0x692) [0x1e0ef52]
/usr/sbin/mysqld(MYSQL_BIN_LOG::rotate(bool, bool*)+0x6c) [0x1e0fcfc]
/usr/sbin/mysqld(MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool)+0x2bd) [0x1e1982d]
/usr/sbin/mysqld(MYSQL_BIN_LOG::commit(THD*, bool)+0x127e) [0x1e1bf4e]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool, bool)+0x4b1) [0xd8dc81]
/usr/sbin/mysqld(trans_commit(THD*, bool)+0x6b) [0x11e8ddb]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x3b93) [0x10b0bb3]
/usr/sbin/mysqld(dispatch_sql_command(THD*, Parser_state*, bool)+0x5c0) [0x10b3970]
/usr/sbin/mysqld() [0x10b3f9b]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x3757) [0x10b8597]
/usr/sbin/mysqld(do_command(THD*)+0x200) [0x10b8c30]
/usr/sbin/mysqld() [0x122c5b8]
/usr/sbin/mysqld() [0x26da055]
/lib64/libpthread.so.0(+0x7ea5) [0x7f65ce1d9ea5]
/lib64/libc.so.6(clone+0x6d) [0x7f65cc594b0d]

The upgrade worked like a charm in my test:

2023-04-26T07:58:20.901443Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2023-04-26T07:58:20.901463Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.31'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server - GPL.
2023-04-26T07:59:01.623435Z 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.31).
2023-04-26T07:59:02.884757Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.31)  MySQL Community Server - GPL.
2023-04-26T07:59:58.287628Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.32) starting as process 52167
2023-04-26T07:59:58.403550Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2023-04-26T07:59:58.529592Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2023-04-26T08:00:00.197005Z 4 [System] [MY-013381] [Server] Server upgrade from '80031' to '80032' started.
2023-04-26T08:00:04.457725Z 4 [System] [MY-013381] [Server] Server upgrade from '80031' to '80032' completed.
2023-04-26T08:00:04.609384Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2023-04-26T08:00:04.609444Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.

No issues were noticed. GTID and binary logs were both enabled.
Can you post the procedure that you ran for the upgrade?

I tried the procedure below but could not reproduce the issue:

systemctl stop mysqld
yum update mysql-community-server-8.0.32
systemctl start mysqld

The upgrade was OK. I upgraded as the documentation says (node by node, waiting for each node to sync, etc.).
Percona was working fine for a couple of days. The issue happened when the binlog rotated (that is what mysql.log says).

Same here.

The update went through without any problems or warnings.

The crash occurs when the binlog rotates, in our setup after about 30 minutes.

After a bootstrap the cluster works again, but only until the next rotation.

With

skip-log-bin

set, everything is running (except the replica, due to the missing binlogs).
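
To tell this particular abort apart from other crashes, one could count how often error code MY-011072 with the "while rotating the binlog" message appears in the error log. A rough sketch, assuming the log path from the my.cnf posted above (/var/log/mysqld.log) as the default; count_rotate_aborts is a hypothetical helper name:

```shell
# Sketch only: count binlog-rotation aborts (MY-011072) in a MySQL error log.
count_rotate_aborts() {
    # $1 = error log path; defaults to the log-error value from the posted my.cnf.
    logfile="${1:-/var/log/mysqld.log}"
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c 'MY-011072.*while rotating the binlog' "$logfile"
}
```

Running `count_rotate_aborts` on each node after a crash would show whether that node hit the rotation abort at all.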

That's true, I was able to reproduce the error and filed a bug. I had missed that this was about XtraDB Cluster.

https://jira.percona.com/browse/PXC-4211

We will see if there is a workaround. Thanks for reporting it.

Hi,

we updated to

percona-xtradb-cluster-server-8.0.32-24.2.el7.x86_64

and enabled the binlog and gtid_mode=on

For now it is running without any problems.

Thank you.

regards,
Bernd.

Yes, the fix was released. Great to hear that worked for you.
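
For anyone checking which nodes still need the fixed package: in this thread, 8.0.32-24.1 is the build that crashed and 8.0.32-24.2 is the one reported as working. A small sketch that flags only that one build; is_affected_build is a hypothetical helper, and builds not mentioned in this thread are simply not covered by it:

```shell
# Sketch only: flag the single build this thread reports as crashing on binlog rotation.
is_affected_build() {
    # $1 = full RPM package name, e.g. the output of `rpm -q percona-xtradb-cluster-server`.
    case "$1" in
        percona-xtradb-cluster-server-8.0.32-24.1.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On an RPM-based host such as the CentOS 7 nodes above, it could be used as `is_affected_build "$(rpm -q percona-xtradb-cluster-server)" && echo "affected build installed"`.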