Hi,
So let’s start with my setup:
- 3-node PXC cluster (up-to-date CentOS 7 and Percona-XtraDB-Cluster-57 5.7.19-29.22.1.el7)
mysql-test-01 10.0.0.51
mysql-test-02 10.0.0.52
mysql-test-03 10.0.0.53
- I have followed all the guidelines for CentOS (setenforce 0, …)
- I bootstrapped the cluster from the node “mysql-test-01” 10.0.0.51 using “systemctl start mysql@bootstrap.service”
- The other nodes joined the cluster without problems
- The cluster was working
I did some tests: I gracefully shut down node “10.0.0.53” and the cluster kept working. I restarted it with “systemctl start mysql”; it rejoined the cluster and recovered.
I did the same thing (graceful shutdown) with node “10.0.0.51”. Everything was OK.
The problem occurred when I “brutally” shut down node “10.0.0.51”. I then started the virtual machine again and used “systemctl start mysql.service” to start the node. It should start and recover automatically, but it fails to start. I am pretty sure this is related to “systemd”, but I am confused because I didn’t touch the default configuration that comes with the package. I was hoping for a working systemd configuration out of the box.
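For context on what the node has to do after a crash: as far as I understand, Galera keeps its last known position in a grastate.dat file in the datadir, and after an unclean shutdown the seqno stored there is normally -1, which is why startup runs a position recovery step. Here is a minimal sketch for reading that file, assuming it lives at /var/lib/mysql/grastate.dat (derived from my datadir setting); grastate_field is a hypothetical helper name, not something shipped with PXC:

```shell
#!/bin/sh
# Hypothetical helper: print one field of a Galera state file.
#   $1 = path to grastate.dat, $2 = field name (e.g. seqno)
grastate_field() {
    awk -F': *' -v key="$2" '$1 == key { print $2 }' "$1"
}

# Assumed location, based on datadir=/var/lib/mysql.
GRASTATE=/var/lib/mysql/grastate.dat

if [ -r "$GRASTATE" ]; then
    echo "seqno:             $(grastate_field "$GRASTATE" seqno)"
    echo "safe_to_bootstrap: $(grastate_field "$GRASTATE" safe_to_bootstrap)"
else
    echo "cannot read $GRASTATE" >&2
fi
```

If a field is absent from the file, the corresponding line is simply printed empty.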
More information:
Configuration of node “10.0.0.51”:
---------- /etc/my.cnf ----------
#
# The Percona XtraDB Cluster 5.7 configuration file.
#
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
# Please make any edits and changes to the appropriate sectional files
# included below.
#
!includedir /etc/my.cnf.d/
!includedir /etc/percona-xtradb-cluster.conf.d/
----------------------------------------
---------- /etc/percona-xtradb-cluster.conf.d/wsrep.cnf ----------
[mysqld]
# Path to Galera library
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://10.0.0.51,10.0.0.52,10.0.0.53
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# Slave thread to use
wsrep_slave_threads= 8
wsrep_log_conflicts
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=10.0.0.51
# Cluster name
wsrep_cluster_name=pxc-cluster-test
#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=mysql-test-01
#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=ENFORCING
# SST method
wsrep_sst_method=xtrabackup-v2
#Authentication for SST method
wsrep_sst_auth="sstuser:password"
----------------------------------------
---------- /etc/percona-xtradb-cluster.conf.d/mysqld.cnf ----------
# MYSQL-TEST-01
# Template my.cnf for PXC
# Edit to your requirements.
[client]
socket=/var/lib/mysql/mysql.sock
[mysqld]
server-id=51
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log-bin
log_slave_updates
expire_logs_days=7
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Custom config
max-allowed-packet=64M
#
# InnoDB settings
#
innodb_file_per_table
innodb_buffer_pool_size=1024M
innodb_log_file_size=256M
innodb_flush_log_at_trx_commit=1
innodb_flush_method=O_DIRECT
--------------------------------------------------
And finally, an extract from the systemd journal (journalctl):
----------------------------------------
[root@mysql-test-01 ~]# systemctl start mysql.service
Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.
[root@mysql-test-01 ~]# journalctl -u mysql.service
....
Oct 02 17:21:39 mysql-test-01 systemd[1]: Starting Percona XtraDB Cluster...
Oct 02 17:21:40 mysql-test-01 mysqld_safe[2761]: 2017-10-02T15:21:40.081340Z mysqld_safe Logging to '/var/log/mysqld.log'.
Oct 02 17:21:40 mysql-test-01 mysqld_safe[2761]: 2017-10-02T15:21:40.085011Z mysqld_safe Logging to '/var/log/mysqld.log'.
Oct 02 17:21:40 mysql-test-01 mysqld_safe[2761]: 2017-10-02T15:21:40.119964Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Oct 02 17:21:40 mysql-test-01 mysqld_safe[2761]: 2017-10-02T15:21:40.140406Z mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.IG3oHj' --pid-file='/var/lib/mysql/mysql-test-01-recover.pid'
Oct 02 17:21:43 mysql-test-01 mysql-systemd[2762]: /usr/bin/mysql-systemd: line 140: kill: (2761) - No such process
Oct 02 17:21:43 mysql-test-01 mysql-systemd[2762]: ERROR! mysqld_safe with PID 2761 has already exited: FAILURE
Oct 02 17:21:43 mysql-test-01 systemd[1]: mysql.service: control process exited, code=exited status=1
Oct 02 17:21:43 mysql-test-01 mysql-systemd[3332]: WARNING: mysql pid file /var/lib/mysql/mysqld.pid empty or not readable
Oct 02 17:21:43 mysql-test-01 mysql-systemd[3332]: ERROR! mysql already dead
Oct 02 17:21:43 mysql-test-01 systemd[1]: mysql.service: control process exited, code=exited status=2
----------------------------------------
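The journal only says that mysqld_safe exited; the actual reason is probably in the error log (log-error=/var/log/mysqld.log in my configuration). Here is a small sketch to pull the most recent error lines from it, assuming the usual MySQL 5.7 "[ERROR]" tag in the log lines; last_errors is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the last N [ERROR] lines of a MySQL error log.
#   $1 = log file, $2 = number of lines to show (default 20)
last_errors() {
    grep -F '[ERROR]' "$1" | tail -n "${2:-20}"
}

# Path taken from the log-error setting in mysqld.cnf.
ERRLOG=/var/log/mysqld.log

if [ -r "$ERRLOG" ]; then
    last_errors "$ERRLOG" 20
else
    echo "cannot read $ERRLOG" >&2
fi
```

The same helper can be pointed at any other log file by passing its path as the first argument.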
I have read a lot about systemd problems with MySQL and PXC, but I can’t find a clue about mine. I am stuck because I simulated a failure and now I can’t restart this node. It’s a dev environment, but I won’t go any further until I find a solution.
Any ideas?
Regards,