Slave SQL thread stopped automatically

ravi · June 9, 2014, 6:02am

Hi,
We have replication between two Percona XtraDB Cluster setups.

Each cluster has 3 nodes.

The slave SQL thread stops occasionally and we have not found the cause.

We found the messages below in the error log.

140609 3:37:14 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers: tcp://10.174.10.163:4567
140609 3:37:15 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) reconnecting to 8d7cae5c-7207-11e3-a546-db0fa85eeb02 (tcp://10.174.10.163:4567), attempt 0
140609 3:37:15 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) cleaning up duplicate 0x2ab328435790 after established 0x2ab328051590
140609 3:37:15 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
140609 5:34:03 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers: tcp://10.174.10.162:4567 tcp://10.174.10.163:4567
140609 5:34:04 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) reconnecting to df4de7f2-7205-11e3-8865-c6bdf5daf743 (tcp://10.174.10.162:4567), attempt 0
140609 5:34:04 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) cleaning up established 0x3491f510 which is duplicate of 0x348b73d0
140609 5:34:04 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
140609 11:16:25 [Note] Slave SQL thread exiting, replication stopped in log ‘ff-clusterdb03-lhr.000045’ at position 181715488
140609 11:50:47 [Note] WSREP: ready state reached
140609 11:50:47 [Note] Slave SQL thread initialized, starting replication in lf.000045’ at position 181715488, relay log ‘/u02/mysql/binlogs/ff–relay-bin.000108’ position: 170203363

przemek · June 9, 2014, 9:46am

Is this the full error log, or did you truncate some parts?
Is 10.174.10.163 the remote master for this node?

ravi · June 9, 2014, 7:56pm

Yes, this is the full error log; the same messages are repeated many times in it.
10.174.10.163/162/161 are in the cluster group.
We see the same errors on all nodes.

przemek · June 11, 2014, 8:29am

Can you outline exactly what your replication topology looks like?
Also, the “turning message relay requesting on, nonlive peers” message means there was either some networking problem between nodes or a node crashed or was killed.
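
One way to check whether the slave stops line up with those peer-loss messages is to pull both kinds of event out of the error log with their timestamps. Below is a rough Python sketch, assuming the log line format shown in the first post. The function name `parse_events` and the event labels are made up for illustration, and the sample lines are copied from the log excerpt with plain ASCII quotes.

```python
import re
from datetime import datetime

# Timestamp prefix as seen in the excerpt, e.g. "140609 3:37:14 [Note] ...".
LINE_RE = re.compile(r"^(\d{6})\s+(\d{1,2}:\d{2}:\d{2})\s+\[(\w+)\]\s+(.*)$")
# Peer list that follows "nonlive peers:" in the WSREP relay message.
PEER_RE = re.compile(r"nonlive peers:\s*(.*)$")


def parse_events(lines):
    """Return (timestamp, kind, detail) tuples for the two message types of interest.

    The kinds "nonlive_peers" and "slave_sql_stopped" are hypothetical labels.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not start with a timestamp
        date, time, _level, msg = m.groups()
        ts = datetime.strptime(f"{date} {time}", "%y%m%d %H:%M:%S")
        peers = PEER_RE.search(msg)
        if peers:
            events.append((ts, "nonlive_peers", peers.group(1).split()))
        elif "Slave SQL thread exiting" in msg:
            events.append((ts, "slave_sql_stopped", msg))
    return events


sample = [
    "140609 5:34:03 [Note] WSREP: (afc8dae2-7204-11e3-8876-275ef829891c, "
    "'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: "
    "tcp://10.174.10.162:4567 tcp://10.174.10.163:4567",
    "140609 11:16:25 [Note] Slave SQL thread exiting, replication stopped in log "
    "'ff-clusterdb03-lhr.000045' at position 181715488",
]

for ts, kind, detail in parse_events(sample):
    print(ts, kind, detail)
```

Running this over the full error log from each node and sorting the events by time should show whether the SQL thread exits follow shortly after a peer was reported nonlive, or happen independently of it.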

ravi · June 11, 2014, 10:43pm

Thanks Mike for your prompt response.

Our replication setup is shown below.

On the C node the slave stops frequently.
Please help us resolve it.

[Attachment: photoid=16885]
