Percona Cluster node goes down.

alecv · April 3, 2014, 1:58am

Hi there,

I have a PXC setup in an Amazon VPC. All nodes are in the same region, but one of the three nodes is in a different availability zone. At some point, one node fails without any meaningful output in the logs:


2014-04-03 01:30:38 8514 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.68236S), skipping check
2014-04-03 01:30:40 8514 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.53889S), skipping check
140403 03:30:39 mysqld_safe Number of processes running now: 0
140403 03:30:39 mysqld_safe WSREP: not restarting wsrep node automatically
140403 03:30:39 mysqld_safe mysqld from pid file /var/lib/mysql/ip-10-1-7-180.pid ended

The node that fails is the one in the other availability zone.

This is my.cnf file:


[mysqld]
datadir=/var/lib/mysql
user=mysql
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_address=gcomm://10.1.7.180,10.1.8.159,10.1.8.16
binlog_format=ROW
default_storage_engine=InnoDB
innodb_locks_unsafe_for_binlog=1
innodb_buffer_pool_size = 5632M
innodb_log_buffer_size = 4M
max_connect_errors = 10000
key_buffer_size = 2048M
max_allowed_packet = 50M
table_open_cache = 1024
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 80M
myisam_sort_buffer_size = 64M
thread_cache_size = 32
query_cache_size = 32M
innodb_thread_concurrency = 8
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_autoinc_lock_mode=2
wsrep_node_address=10.1.7.180
wsrep_sst_method=xtrabackup
wsrep_cluster_name=my_centos_cluster
wsrep_sst_auth="sstuser:s3cret"
max_connections = 4000
[mysql]
prompt=\\u@\\h [\\d]>\\_

My question is: how can I investigate the root cause of the failure? Also, what happens if an UPDATE query arrives at a node that is in the “Joining: receiving State Transfer” state?

Thank you in advance.

przemek · April 23, 2014, 3:46am

A message like “140403 03:30:39 mysqld_safe Number of processes running now: 0” with nothing logged by mysqld before it means that your mysqld process was killed, most likely by the OOM killer. Check the system log.
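To make "check the system log" concrete, here is a rough sketch of a scan for OOM-killer entries that name mysqld. The message patterns are the ones the Linux kernel commonly writes, but the exact wording varies between kernel versions. The helper names `find_oom_events` and `scan_file` are made up for this example, and the log path you pass in (often /var/log/messages on CentOS) is an assumption about your system.

```python
import re

# Kernel OOM-killer messages as they commonly appear in syslog.
# Wording differs between kernel versions; treat these as a starting point.
OOM_PATTERNS = [
    re.compile(r"invoked oom-killer"),
    re.compile(r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
    re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
]


def find_oom_events(lines, process_name="mysqld"):
    """Return the log lines that look like OOM-killer activity.

    Lines naming a victim process are kept only if the victim is
    process_name. "invoked oom-killer" lines are always kept, because
    they mark the start of an OOM event whoever the victim turns out to be.
    """
    hits = []
    for line in lines:
        for pattern in OOM_PATTERNS:
            match = pattern.search(line)
            if not match:
                continue
            victim = match.groupdict().get("name")
            if victim is None or victim == process_name:
                hits.append(line.rstrip("\n"))
            break  # one pattern per line is enough
    return hits


def scan_file(path, process_name="mysqld"):
    """Scan one log file (hypothetical helper; the path depends on your distro)."""
    with open(path, errors="replace") as handle:
        return find_oom_events(handle, process_name)
```

If `scan_file` returns entries with timestamps close to the mysqld_safe message, that supports the OOM explanation. An empty result does not rule it out, since the log may have been rotated or the kernel may word its messages differently.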
As for the second question: a joining node will refuse to accept connections until it has synchronized with the cluster.
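If the system log does confirm an OOM kill, the my.cnf from the question is worth a second look. Below is a back-of-the-envelope sketch of the theoretical memory ceiling those settings allow, using the common rule of thumb "global buffers plus max_connections times per-session buffers". This is an upper bound, not a prediction: session buffers are allocated on demand, and the estimate ignores thread stacks, temporary tables and Galera's own caches. The function name `memory_ceiling_mb` is made up for this example, and the instance's actual RAM is not given in the thread.

```python
# Values copied from the my.cnf in the question, in megabytes.
GLOBAL_BUFFERS_MB = {
    "innodb_buffer_pool_size": 5632,
    "innodb_log_buffer_size": 4,
    "key_buffer_size": 2048,
    "query_cache_size": 32,
}

# Buffers that each session may allocate, up to the configured size.
PER_CONNECTION_BUFFERS_MB = {
    "sort_buffer_size": 2,
    "read_buffer_size": 2,
    "read_rnd_buffer_size": 80,
}

MAX_CONNECTIONS = 4000


def memory_ceiling_mb(connections):
    """Rough worst case: every connection fills every per-session buffer."""
    global_mb = sum(GLOBAL_BUFFERS_MB.values())
    per_connection_mb = sum(PER_CONNECTION_BUFFERS_MB.values())
    return global_mb + connections * per_connection_mb


if __name__ == "__main__":
    for connections in (0, 100, MAX_CONNECTIONS):
        print(connections, "connections:", memory_ceiling_mb(connections), "MB")
```

The global buffers alone come to 7716 MB, and each connection can add up to 84 MB, most of it from read_rnd_buffer_size = 80M. Comparing these numbers with the RAM actually available on the node should show whether the configuration leaves any headroom.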
