All of nodes down in cluster

road2me · August 19, 2016, 7:17am

Hello,

Today I had an issue with my cluster. All of the nodes shut down at the same time. It is a 3-node cluster. In the logs I spotted that wsrep initiated the shutdown:


2016-08-19 10:01:14 1830 [Warning] WSREP: certification interval for trx source: b9e7c19f-611b-11e6-a681-f74f157327be version: 3 local: 1 state: CERTIFYING flags: 1 conn_id: 71297 trx_id: 32216209837 seqnos (l: 253333669, g: 6904092724, s: 6904073643, d: -1, ts: 1451339341459664) exceeds the limit of 16384
2016-08-19 10:01:14 1830 [Warning] WSREP: certification interval for trx source: b9e7c19f-611b-11e6-a681-f74f157327be version: 3 local: 1 state: CERTIFYING flags: 65 conn_id: 145554 trx_id: -1 seqnos (l: 253333670, g: 6904092725, s: 6904074034, d: -1, ts: 1451344454512438) exceeds the limit of 16384
2016-08-19 10:01:14 1830 [ERROR] WSREP: Certification failed for TO isolated action: source: b9e7c19f-611b-11e6-a681-f74f157327be version: 3 local: 1 state: CERTIFYING flags: 65 conn_id: 145554 trx_id: -1 seqnos (l: 253333670, g: 6904092725, s: 6904074034, d: -1, ts: 1451344454512438)
2016-08-19 10:01:14 1830 [Note] WSREP: Closing send monitor...
2016-08-19 10:01:14 1830 [Note] WSREP: Closed send monitor.
2016-08-19 10:01:14 1830 [Note] WSREP: gcomm: terminating thread
2016-08-19 10:01:14 1830 [Note] WSREP: gcomm: joining thread
2016-08-19 10:01:14 1830 [Note] WSREP: gcomm: closing backend
2016-08-19 10:01:15 1830 [Note] WSREP: view(view_id(NON_PRIM,9e529808,214) memb {
b9e7c19f,0
} joined {
} left {
} partitioned {
9e529808,0
})
2016-08-19 10:01:15 1830 [Note] WSREP: view((empty))
2016-08-19 10:01:15 1830 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2016-08-19 10:01:15 1830 [Note] WSREP: gcomm: closed
2016-08-19 10:01:15 1830 [Note] WSREP: Flow-control interval: [512, 512]
2016-08-19 10:01:15 1830 [Note] WSREP: Received NON-PRIMARY.
2016-08-19 10:01:15 1830 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 6904092891)
2016-08-19 10:01:15 1830 [Note] WSREP: Received self-leave message.
2016-08-19 10:01:15 1830 [Note] WSREP: Flow-control interval: [0, 0]
2016-08-19 10:01:15 1830 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2016-08-19 10:01:15 1830 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 6904092891)
2016-08-19 10:01:15 1830 [Note] WSREP: RECV thread exiting 0: Success
2016-08-19 10:01:15 1830 [Note] WSREP: recv_thread() joined.
2016-08-19 10:01:15 1830 [Note] WSREP: Closing replication queue.
2016-08-19 10:01:15 1830 [Note] WSREP: Closing slave action queue.
2016-08-19 10:01:15 1830 [Note] WSREP: /usr/sbin/mysqld: Terminated.

The same thing happened on the other nodes. MySQL tried to start again, but I think it couldn’t because a node needed to be bootstrapped first.

What could be the reason for this problem? The version currently in use is 5.6.30-25.16-1.xenial.

Best regards,

dcherniv · February 14, 2017, 12:17am

Just had the same problem. I attempted to restart one of the nodes in the cluster. It stopped without issues, but when I started it again the entire cluster died. The following is in the logs on the remaining two nodes:
Node1:

2017-02-14 04:19:55 511 [ERROR] WSREP: Certification failed for TO isolated action: source: 51b4cd25-ee0c-11e6-bb59-cb3899784a82 version: 3 local: 1 state: CERTIFYING flags: 65 conn_id: 154891 trx_id: -1 seqnos (l: 304378409, g: 437781286, s: 437781074, d: -1, ts: 546068701680385)
2017-02-14 04:19:55 511 [Note] WSREP: /usr/sbin/mysqld: Terminated.

Node3:

2017-02-14 04:19:55 5961 [ERROR] WSREP: Certification failed for TO isolated action: source: 51b4cd25-ee0c-11e6-bb59-cb3899784a82 version: 3 local: 0 state: CERTIFYING flags: 65 conn_id: 154891 trx_id: -1 seqnos (l: 63366038, g: 437781286, s: 437781074, d: -1, ts: 546068701680385)
2017-02-14 04:19:55 5961 [Note] WSREP: /usr/sbin/mysqld: Terminated.

I should say I have MyISAM replication enabled, and also 2 MyISAM tables in addition to the mysql system tables.
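
As a quick sanity check on the logs quoted in this thread, here is a minimal Python sketch that pulls the seqnos out of the WSREP lines and computes the gap between g and s. It assumes that the "certification interval" in the warning is the difference between the global seqno (g) and the last-seen seqno (s), and that 16384 is the limit as printed in the warnings. The helper name certification_gap and the sample labels are made up for illustration.

```python
import re

# Assumption: the "certification interval" is g - s, and 16384 is the
# limit quoted in the warnings from the 2016 logs.
LIMIT = 16384

# Matches the "seqnos (l: ..., g: ..., s: ..., d: ..." part of a WSREP line.
SEQNO_RE = re.compile(r"seqnos \(l: (-?\d+), g: (-?\d+), s: (-?\d+), d: (-?\d+)")


def certification_gap(log_line):
    """Return g - s for a WSREP log line, or None if it has no seqnos."""
    match = SEQNO_RE.search(log_line)
    if match is None:
        return None
    _local, global_seqno, last_seen, _depends = (int(x) for x in match.groups())
    return global_seqno - last_seen


# seqnos fragments copied from the log lines posted in this thread
samples = {
    "2016 warning (conn_id 71297)": "seqnos (l: 253333669, g: 6904092724, s: 6904073643, d: -1, ts: 1451339341459664)",
    "2016 TOI error (conn_id 145554)": "seqnos (l: 253333670, g: 6904092725, s: 6904074034, d: -1, ts: 1451344454512438)",
    "2017 TOI error (conn_id 154891)": "seqnos (l: 304378409, g: 437781286, s: 437781074, d: -1, ts: 546068701680385)",
}

for name, line in samples.items():
    gap = certification_gap(line)
    print(f"{name}: g - s = {gap}, over limit: {gap > LIMIT}")
```

Under that assumption, the two 2016 entries give gaps of 19081 and 18691, both above 16384, which matches the warnings. The 2017 entry gives a gap of only 212, so the limit alone would not seem to explain that failure.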
