I have a cluster of 9 nodes connected over a corporate WAN (fiber). Two days ago there were timeouts between one node and the IP of the cluster address, and after some time that node's MySQL service stopped. I already have wsrep_dirty_reads turned on, but the node still stopped the MySQL service. The logs follow:
2018-05-23T06:24:19.616148Z 0 [Note] WSREP: (eaab4c3a, 'tcp://0.0.0.0:4567') turning message relay requesting off
2018-05-23T06:24:47.121864Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)
2018-05-23T06:24:47.121938Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2018-05-23T06:24:47.122148Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1514: Failed to open channel 'pbx-cluster' at 'gcomm://172.20.1.43': -110 (Connection timed out)
2018-05-23T06:24:47.122178Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2018-05-23T06:24:47.122187Z 0 [ERROR] WSREP: Provider/Node (gcomm://172.20.1.43) failed to establish connection with cluster (reason: 7)
2018-05-23T06:24:47.122191Z 0 [ERROR] Aborting
2018-05-23T06:24:47.122196Z 0 [Note] Giving 0 client threads a chance to die gracefully
2018-05-23T06:24:47.122201Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2018-05-23T06:24:47.122205Z 0 [Note] WSREP: Service disconnected.
2018-05-23T06:24:47.122209Z 0 [Note] WSREP: Waiting to close threads…
2018-05-23T06:24:52.122368Z 0 [Note] WSREP: Some threads may fail to exit.
2018-05-23T06:24:52.122424Z 0 [Note] Binlog end
2018-05-23T06:24:52.132323Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
My question is: how can I avoid the automatic stop of the MySQL service, given that I regularly get timeouts over the WAN?
Any help will be much appreciated.