ALTER TABLE locks up cluster

I have a 5-node PXC 5.5.39 cluster on CentOS 6, in a single-writer/multiple-reader configuration, with wsrep_sync_wait=1 so that reads on the other nodes are causally consistent (read-your-writes).
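In other words, every read on the reader nodes is expected to wait until the node has applied all write-sets it has already received before executing, roughly like this (placeholder query, just to illustrate the intended semantics):

-- wsrep_sync_wait=1 enforces a causality check before read statements
SET SESSION wsrep_sync_wait = 1;
-- this SELECT first waits until the node has applied everything it has
-- received from the cluster, then runs against up-to-date data
SELECT * FROM mydb.mytable WHERE id = 42;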

Every time I run an ALTER TABLE, the entire cluster “locks up”: it refuses new connections (even though plenty are still available on all nodes) and drops existing connections until the operation finishes. It doesn’t matter whether the operation is long or short, or which database or table I run it against; it always happens, even when I add an index that takes only 0.5 s.
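For example, even a trivial index addition like this (table and column names are placeholders) reproduces it:

-- completes in about 0.5 s on the table itself, yet the whole
-- cluster stops accepting connections while it runs
ALTER TABLE mydb.mytable ADD INDEX idx_created_at (created_at);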

Before I enabled wsrep_sync_wait, the cluster behaved normally: even during long ALTERs it kept accepting new connections, and queries against the table being altered simply waited in a locked state until the ALTER finished and then resumed.
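For reference, this is the only setting that changed between the two behaviours; it can also be flipped at runtime for testing (a sketch, assuming the variable is dynamic on this build):

SET GLOBAL wsrep_sync_wait = 0;  -- previous behaviour: ALTERs don't block other connections
SET GLOBAL wsrep_sync_wait = 1;  -- current setup: cluster-wide lock-up during any ALTER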

Some of the relevant wsrep and InnoDB settings:


[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
default-storage-engine=InnoDB

datadir=/var/lib/mysql
tmpdir=/var/tmp/

socket=/var/lib/mysql/mysql.sock
user=mysql
skip-name-resolve
event_scheduler=OFF

# Use the new (4.1+) password hash format; old_passwords=0 disables the
# legacy format used by mysql 3.x clients (mysqlclient10 compatibility package).
old_passwords=0

# connections
max_connections=700
connect_timeout=10
wait_timeout=300
interactive_timeout=300
lock_wait_timeout=60

# table caches
table_open_cache=2048
table_definition_cache=2048
tmp_table_size=128M
max_heap_table_size=128M
thread_cache_size=256
max_allowed_packet=32M
net_buffer_length=8K

# operating buffers
key_buffer_size=128M
sort_buffer_size=4M
myisam_sort_buffer_size=4M
join_buffer_size=8M
read_buffer_size=4M
read_rnd_buffer_size=8M
query_cache_type=0

server-id=2
wsrep_cluster_name=xxx
wsrep_node_address=10.0.0.1
wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.3,10.0.0.5,10.0.0.6,10.0.0.7
wsrep_provider_options='gcache.size=512M'
wsrep_node_name=server2
wsrep_provider='/usr/lib64/libgalera_smm.so'
wsrep_slave_threads=1
wsrep_auto_increment_control=1
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=xxx:xxx
wsrep_retry_autocommit=2
wsrep_forced_binlog_format=ROW
wsrep_replicate_myisam=0
wsrep_log_conflicts=1
wsrep_sync_wait=1

log-bin=/var/lib/mysql/binary-log
max_binlog_size=256M
sync_binlog=0
expire_logs_days=5
sync_master_info=0
log_bin_trust_function_creators=1
binlog_format=ROW
log_slave_updates=1


innodb_data_home_dir=/var/lib/mysql/
innodb_data_file_path=ibdata1:10M:autoextend
innodb_autoextend_increment=10
innodb_log_group_home_dir=/var/lib/mysql/
innodb_thread_concurrency=0
innodb_checksums
innodb_flush_log_at_trx_commit=2
innodb_fast_shutdown=0
innodb_mirrored_log_groups=1
innodb_log_files_in_group=3
innodb_log_file_size=128M
innodb_log_buffer_size=8M
innodb_buffer_pool_instances=8
innodb_buffer_pool_size=8G
innodb_additional_mem_pool_size=64M
innodb_lock_wait_timeout=60
innodb_flush_method=O_DIRECT
innodb_file_per_table
innodb_stats_on_metadata=0
innodb_file_format=barracuda
innodb_stats_update_need_lock=0
innodb_purge_threads=1

default_week_format = 1

Does anyone have any idea what could be causing this?

Thanks.