Cluster re-sync causes heavy locking

Hello,

We have a 3-node Percona XtraDB Cluster running version 5.7.x with a 300 GB dataset.
We had a crash this afternoon, but that is not the issue.

We restarted the whole cluster starting from a single node (bootstrap mode), and that works.

Then, when the second node joins (xtrabackup-v2 method), everything goes fine for the first 200 GB. After that, the running node stops serving queries, and INSERT/UPDATE statements against certain InnoDB tables pile up in this state:

wsrep: initiating pre-commit for write set (2493327) 

Is that expected behaviour? How can we prevent it and keep the MySQL service available during the resync?

Config:

wsrep_provider=/usr/lib/galera3/libgalera_smm.so
wsrep_cluster_address=gcomm://
wsrep_slave_threads=16
wsrep_log_conflicts
wsrep_node_address=xxxx
wsrep_sst_donor=xxxx
wsrep_cluster_name=xxxx
#If wsrep_node_name is not specified,  then system hostname will be used
#wsrep_node_name=
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth="sstuser:xxxx"
wsrep_replicate_myisam=ON

Thanks,

Gaëtan


Hi @Gaetan_Allart welcome to the Percona Forums!
In order to troubleshoot this issue we’ll need to see the error logs from all three members of the PXC cluster. Can you provide the entries from slightly before through to slightly after the SST?
You’re right to assume this shouldn’t happen. In your scenario, node1 is brought up in bootstrap mode, and node2 should IST or SST from node1. All the while node1 should be serving queries. So I am interested to see the log events in order to diagnose this further. Thanks!
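Alongside the logs, it may help to capture a few wsrep status values on the donor (node1) while the joiner is receiving the SST. The queries below are only a sketch of what I would look at, assuming the standard Galera status variables and the `information_schema.processlist` table available in 5.7; the 80-character truncation of the query text is an arbitrary choice for readability.

```sql
-- Run on the donor (node1) while the SST to node2 is in progress.

-- Node state; a donor normally reports 'Donor/Desynced' during an SST.
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

-- Fraction of time replication was paused by flow control
-- since the last FLUSH STATUS (0 = never, 1 = always).
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';

-- Current lengths of the receive and send queues.
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';
SHOW GLOBAL STATUS LIKE 'wsrep_local_send_queue';

-- Sessions currently sitting in the pre-commit state reported above,
-- longest-waiting first.
SELECT id, time, state, LEFT(info, 80) AS query_text
FROM information_schema.processlist
WHERE state LIKE 'wsrep: initiating pre-commit%'
ORDER BY time DESC;
```

If you can run these once while the SST is going fine and once after the writes start piling up, the difference between the two snapshots would be useful to post together with the logs.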


I have the same issue. I have a large table in the database (about 200 GB) and run a 3-node cluster. When any new node joins the cluster, the donor node gets stuck at “wsrep: initiating pre-commit for write set” for any write query (read queries run properly). The other nodes (not the donor) serve write queries fine.
