
Cluster stall during flush logs or auto binlog rotation

NxwJfm (Entrant, Beginner)
Hi.

I've been running Percona XtraDB Cluster (currently 5.6.24-72.2 on Debian 7.0), and for a while now I have been investigating big (60-second), regular stalls on the cluster. These stalls happen multiple times a day. At first I thought one or more rogue scripts doing huge single-statement updates were causing them, so that was all I investigated. But yesterday I noticed a trend: the slow queries I was investigating in the binlog files always sat at the end of a file. And lo and behold, seemingly all my stalls happen exactly when a binlog file reaches the 100 MB size it is configured at, or when logrotate runs flush-logs at 6:25 in the morning.
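For context, the Debian packaging ships a logrotate job for MySQL whose postrotate step calls mysqladmin flush-logs, which closes and reopens every log, including the binlog. The sketch below is illustrative only; the exact filenames, paths, and credentials file on a Percona install may differ from what is shown here:

```
# /etc/logrotate.d/mysql-server (illustrative sketch; paths are assumptions)
/var/log/mysql/mysql.log /var/log/mysql/error.log {
        daily
        rotate 7
        missingok
        compress
        postrotate
                # flush-logs reopens all server logs, which also rotates the binlog
                test -x /usr/bin/mysqladmin && \
                  /usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf flush-logs
        endscript
}
```

If the stall coincides with this job, the binlog rotation triggered by flush-logs is the common factor with the size-based rotation described above.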

I've changed the binlog file size to 1 GB as a test, and indeed I now get about a tenth as many stalls as before. I have also tried setting sync_binlog to 1, thinking it might be a flush issue; while that seemed to help a lot yesterday evening when the binlog file reached 1 GB (no stall happened, or it was short enough not to show up in the slow log, whose threshold is 5 s), I still got my usual 60 s stall this morning at 6:25.

I'm now a bit at a loss for ideas, so I'm asking you what you think could be happening. By moving to a big 1 GB file I have at least made the stalls happen far less often than before, but it's still not a good thing when one does. Since I'm not sure whether it's something I did wrong or a bug (and, if it's a bug, whether it's already been fixed or not; it's difficult for me to upgrade without a good reason), I'd rather not open a pointless bug report :)
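One way to confirm that rotation itself (rather than the 6:25 logrotate job specifically) is the trigger would be to force a rotation by hand during a quiet period and watch whether a stall follows. These are standard MySQL statements, shown here as a sketch of the experiment rather than something I have run:

```
-- Force a binlog rotation by hand (same effect as hitting max_binlog_size)
FLUSH BINARY LOGS;

-- List binlogs and their sizes to confirm the rotation happened
SHOW BINARY LOGS;

-- Check the slow-log threshold currently in effect
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
```

If a manual FLUSH BINARY LOGS reproduces the stall, that would point at the rotation path itself rather than at logrotate or the daily schedule.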

Here is some (hopefully) relevant stuff from the my.cnf file:
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 2
max_binlog_size = 1G
sync_binlog = 1
binlog_row_image = minimal
log-slave-updates
gtid_mode = ON
gtid_deployment_step = ON
enforce_gtid_consistency = ON

wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://(removed)
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_node_address=(removed)
wsrep_sst_method=xtrabackup-v2
wsrep_cluster_name=nexway_xtra_cluster
wsrep_sst_auth=(removed)
wsrep_auto_increment_control = OFF
wsrep_notify_cmd = /usr/local/bin/galeranotify.py
wsrep_slave_threads = 16
wsrep_provider_options="gcache.size = 5G; gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0"

Thanks in advance for your help.

Comments

  • dcampano (Entrant, Beginner)
    You never received a response to this, but I am seeing the exact same issue on Percona 5.7.10-3. Did you ever find anything that fixed this?

Copyright ©2005 - 2020 Percona LLC. All rights reserved.