ERROR! MySQL (Percona XtraDB Cluster) is not running, but PID file exists

I am trying to start up node two of a three-node Percona cluster. I am by no means a MySQL database administrator, but maybe someone can identify what happened based on this error:


Thread pointer: 0x7f483c000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong…
stack_bottom = 7f48523b8988 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7ee025]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6c0db4]
/lib64/libpthread.so.0(+0xf710)[0x7f4865b44710]
/lib64/libc.so.6(gsignal+0x35)[0x7f48641a0625]
/lib64/libc.so.6(abort+0x175)[0x7f48641a1e05]
/usr/sbin/mysqld[0x8f7239]
/usr/sbin/mysqld[0x8f85b9]
/usr/sbin/mysqld[0x8fc28b]
/usr/sbin/mysqld[0x8fc53b]
/usr/sbin/mysqld[0x82eaf3]
/usr/sbin/mysqld[0x831fe4]
/usr/sbin/mysqld[0x801c74]
/usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEPK14Relay_log_info+0x1e4)[0x766574]
/usr/sbin/mysqld(_ZN21Update_rows_log_event11do_exec_rowEPK14Relay_log_info+0xa5)[0x766a95]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x267)[0x76d1c7]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x6a5)[0x67d0b5]
/usr/lib64/libgalera_smm.so(+0x1a3699)[0x7f48613ce699]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x273)[0x7f48613cfc13]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x45)[0x7f48613d04e5]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x2dc)[0x7f48613aa86c]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x63)[0x7f48613aaf13]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x93)[0x7f48613ca1e3]
/usr/lib64/libgalera_smm.so(galera_recv+0x23)[0x7f48613dd5d3]
/usr/sbin/mysqld[0x67df11]
/usr/sbin/mysqld(start_wsrep_THD+0x2ee)[0x5215de]
/lib64/libpthread.so.0(+0x79d1)[0x7f4865b3c9d1]
/lib64/libc.so.6(clone+0x6d)[0x7f48642568fd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
151116 01:03:18 mysqld_safe Number of processes running now: 0
151116 01:03:18 mysqld_safe WSREP: not restarting wsrep node automatically
151116 01:03:18 mysqld_safe mysqld from pid file /var/lib/mysql/web.web.com .pid ended


Looking at the database service on node 2, no mysqld process is running:
[root@db subsys]# ps ax | grep mysql
25218 pts/0 S+ 0:00 grep mysql


When I try to start the service back up, I get this error message:

[root@db2 mysql]# sudo /etc/init.d/mysql start
ERROR! MySQL (Percona XtraDB Cluster) is not running, but PID file exists


I could not find a specific answer online on how to fix this issue (only results about locked PIDs). Does anyone know what I should do to fix it?

Thanks!

Please provide the output of "SHOW STATUS LIKE 'wsrep%'" from the bootstrapped node, and attach a copy of my.cnf and the full error log from node2 (use pastebin or gist).

Variable_name               Value
wsrep_local_state_uuid      a8e8a277-6f03-11e2-0800-5896d9f10d3c
wsrep_protocol_version      4
wsrep_last_committed        93828567
wsrep_replicated            3305034
wsrep_replicated_bytes      7215828937
wsrep_received              4690946
wsrep_received_bytes        10461552800
wsrep_local_commits         3300961
wsrep_local_cert_failures   3027
wsrep_local_replays         0
wsrep_local_send_queue      0
wsrep_local_send_queue_avg  0.000000
wsrep_local_recv_queue      0
wsrep_local_recv_queue_avg  0.000000
wsrep_flow_control_paused   0.000000
wsrep_flow_control_sent     0
wsrep_flow_control_recv     0
wsrep_cert_deps_distance    87.147799
wsrep_apply_oooe            0.000000
wsrep_apply_oool            0.000000
wsrep_apply_window          0.000000
wsrep_commit_oooe           0.000000
wsrep_commit_oool           0.000000
wsrep_commit_window         0.000000
wsrep_local_state           4
wsrep_local_state_comment   Synced
wsrep_cert_index_size       1293
wsrep_causal_reads          0
wsrep_incoming_addresses    111.111.111.11:1111,111.111.111.111:111
wsrep_cluster_conf_id       15
wsrep_cluster_size          2
wsrep_cluster_state_uuid    a8e8a277-6f03-11e2-0800-5896d9f10d3c
wsrep_cluster_status        Primary
wsrep_connected             ON
wsrep_local_bf_aborts       181
wsrep_local_index           0
wsrep_provider_name         Galera
wsrep_provider_vendor       Codership Oy <info@codership.com>
wsrep_provider_version      2.11(r318911d)
wsrep_ready                 ON
wsrep_thread_count          9
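
For anyone reading a status dump like the one above, here is a minimal sketch of pulling out the few values that usually matter first. The function name `wsrep_summary` is made up for illustration, and it assumes the input is tab-separated "name, value" lines, one per status variable. It is not an official Percona tool, and the "healthy" rule it applies (Primary, Synced, ready ON) is a simplification.

```shell
# Hypothetical helper: summarize a few wsrep status values.
# Reads tab-separated "name<TAB>value" lines on stdin.
wsrep_summary() {
    awk -F'\t' '
        $1 == "wsrep_cluster_size"        { size = $2 }
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_local_state_comment" { state = $2 }
        $1 == "wsrep_ready"               { ready = $2 }
        END {
            printf "cluster_size=%s cluster_status=%s local_state=%s ready=%s\n",
                   size, status, state, ready
            # Exit non-zero unless this node reports Primary, Synced and ready.
            exit !(status == "Primary" && state == "Synced" && ready == "ON")
        }'
}
```

It could be fed from the client with something like `mysql -N -B -e "SHOW STATUS LIKE 'wsrep%'" | wsrep_summary`, where `-N` drops the header row and `-B` gives tab-separated output. With the values posted above it would report a cluster size of 2, which matches node 2 being out of a three-node cluster.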
[root@db02 etc]# cat my.cnf

# The following options will be passed to all MySQL clients

[client]
#password = ################
port = 3306
socket = /var/lib/mysql/mysql.sock

# The MySQL server

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock

# Set up SSL

ssl-ca = /etc/mysql/certs/ca-cert.pem
ssl-cert = /etc/mysql/certs/server-cert.pem
ssl-key = /etc/mysql/certs/server-key.pem
ssl-cipher = AES256-GCM-SHA384:AES256-SHA256:AES256-SHA

skip-external-locking

log-error = /var/log/mysql/mysqld.log
slow_query_log_file = /var/log/mysql/mysql-slow.log

key_buffer_size = 32M
max_allowed_packet = 16M
table_open_cache = 2048
thread_cache_size = 50
query_cache_type = 0
query_cache_size = 0
table_definition_cache = 4096

tmp_table_size = 32M
max_heap_table_size = 32M
long_query_time = 2
slow_query_log = 1
wait_timeout = 30
interactive_timeout = 300
max_connections = 450
open_files_limit = 65535
innodb_stats_on_metadata = 0

# Percona Settings

wsrep_cluster_address = gcomm:(deleted this part)
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_slave_threads = 8
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth =
wsrep_cluster_name = the_web
wsrep_node_name = web2
wsrep_provider_options = "pc.weight=2;socket.ssl_cert=/etc/mysql/certs/server-cert.pem;socket.ssl_key=/etc/mysql/certs/server-key.pem;socket.ssl_ca=/etc/mysql/certs/ca-cert.pem;"
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
datadir = /var/lib/mysql
binlog_format=ROW
log_slave_updates

# Replication Master Server (default)
# binary logging is required for replication
#log-bin=mysql-bin

# required unique id between 1 and 2^32 - 1
# defaults to 1 if master-host is not set
# but will not function as a master if omitted

server-id = 2

innodb_buffer_pool_size = 10G
innodb_log_file_size = 256M

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

# Remove the next comment character if you are not familiar with SQL

#safe-updates

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

[sst]
encrypt=3
tkey=/etc/mysql/certs/server-key.pem
tcert=/etc/mysql/certs/server-cert.pem

The issue was solved by checking that no process was still using the PID file and then renaming it:

[root@db2 mysql]# cd /var/lib/mysql

[root@db2 mysql]# ps -ef | grep 6511

[root@db2 mysql]# fuser -u web2.web.com.pid

[root@db2 mysql]# mv web2.web.com.pid web2.web.com.pid.old

[root@db2 mysql]# sudo /etc/init.d/mysql start
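
The manual checks above can be sketched as a small shell function, in case it helps someone hitting the same error. The name `pidfile_state` is made up for illustration. It assumes the PID file holds a single numeric PID and that it is run as a user allowed to signal that process (root, in this thread).

```shell
# Hypothetical helper: report whether a PID file is stale.
# Prints "running", "stale", or "missing".
pidfile_state() {
    pidfile=$1
    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi
    pid=$(cat "$pidfile")
    # Treat an empty or non-numeric PID as stale.
    case $pid in
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the PID exists
    # and can be signalled by the current user.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

Only if `pidfile_state /var/lib/mysql/web2.web.com.pid` prints "stale" would it make sense to move the file aside and start the service again. Note that a "running" result only means some process has that PID, not necessarily mysqld, so the `ps` check above is still worth doing.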