I set up a cluster of 3 database nodes using Percona XtraDB Cluster 5.6. Everything works, and all nodes replicate. I configured SSL for state transfer. The cluster recently changed its topology because we had to migrate 2 nodes to new machines. It was then completely restarted with configuration files that contain only the 3 current cluster nodes: node3, node8 and nodeA (don’t mind the naming). Earlier we also had node1, but that machine has left the cluster and no longer runs any MySQL instance.
Now the problem: every couple of seconds, the log files of node3 and node8 show an error like this:
2015-04-20 11:06:24 22546 [ERROR] WSREP: handshake with remote endpoint ssl://<PUBLIC.node1.IP>:57953 failed: 1: 'End of file.' ( )
<PUBLIC.node1.IP> is the address of the old node1, which is no longer part of the cluster in any way. I looked through the my.cnf config files of all cluster nodes, and all traces of node1 are gone from them. Why are node3 and node8 still trying to contact the outdated node? Note that this happens after a complete bootstrap of the cluster starting from nodeA, which does not show these strange error messages in its log.
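To see exactly which remote addresses the failed handshakes refer to, and how often, a small script can tally them from the error log. The following is only a minimal sketch: the function name `count_failed_handshakes` and the regular expression are my own assumptions, derived from the single log line quoted above, and it does not handle IPv6 addresses.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this post (an assumption:
# other Galera versions may word the message differently).
HANDSHAKE_RE = re.compile(
    r"\[ERROR\] WSREP: handshake with remote endpoint "
    r"(?P<scheme>\w+)://(?P<host>[^:\s]+):(?P<port>\d+) failed"
)


def count_failed_handshakes(lines):
    """Count failed WSREP handshakes per remote host (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


# Sample input: the log line from this post, plus an unrelated line.
sample_log = [
    "2015-04-20 11:06:24 22546 [ERROR] WSREP: handshake with remote endpoint "
    "ssl://<PUBLIC.node1.IP>:57953 failed: 1: 'End of file.' ( )",
    "2015-04-20 11:06:25 22546 [Note] some unrelated message",
]

failed = count_failed_handshakes(sample_log)
for host, n in failed.most_common():
    print(host, n)
```

On a real node, the lines would come from the MySQL error log file instead of `sample_log`; the path of that file depends on the installation.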
I found a description of a very similar error here: https://mariadb.com/kb/en/mariadb/st…-in-error-log/
There it was suggested that this may be caused by a broken network configuration in which one node can ping the other but not the reverse. That explanation does not fit my case, since all nodes can ping each other fine. It also does not explain why my cluster nodes try to contact a node that is no longer part of the cluster.
Here is the relevant part of the configuration file (from node8):
### PERCONA CLUSTER STUFF
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://<PUBLIC.nodeA.IP>,<PUBLIC.node3.IP>,<PUBLIC.node8.IP>
#wsrep_cluster_address=gcomm://
wsrep_node_address=<PUBLIC.node8.IP>
wsrep_slave_threads=8
wsrep_sst_method=xtrabackup-v2
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_cluster_name=betdata_cluster
wsrep_provider_options=gcache.size=3G;socket.ssl_cert=/etc/mysql/cert.pem;socket.ssl_key=/etc/mysql/key.pem
wsrep_sst_donor=node3,nodeA
wsrep_node_name=node8
wsrep_sst_auth=sstuser:blahblubb
wsrep_sst_receive_address=<PUBLIC.node8.IP>:14444
query_cache_size=0
query_cache_type=0
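Checking by eye that no config file still lists node1 is error-prone, so the member list can also be extracted programmatically and compared against the expected nodes. This is a rough sketch under my own assumptions: `cluster_members` is a hypothetical helper, it reads the file as plain text rather than with a full my.cnf parser, and the host names in the example are placeholders, not the real addresses.

```python
def cluster_members(cnf_text):
    """Return the hosts listed in the active wsrep_cluster_address line.

    Hypothetical helper: commented-out lines are ignored, and if the option
    appears more than once, the last occurrence is used.
    """
    members = []
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # MySQL option names accept dashes and underscores interchangeably.
        if key.strip().replace("-", "_") != "wsrep_cluster_address":
            continue
        addr = value.strip()
        if addr.startswith("gcomm://"):
            addr = addr[len("gcomm://"):]
        addr = addr.split("?", 1)[0]  # drop any ?option=value suffix
        members = [a.strip() for a in addr.split(",") if a.strip()]
    return members


# Placeholder host names standing in for the real public IPs.
example_cnf = """
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://nodeA.example,node3.example,node8.example
#wsrep_cluster_address=gcomm://
wsrep_node_name=node8
"""

expected = {"nodeA.example", "node3.example", "node8.example"}
found = cluster_members(example_cnf)
unexpected = set(found) - expected
print("members:", found)
print("unexpected:", sorted(unexpected))
```

Running this against the my.cnf of each node would flag any address outside the expected set. It only inspects the configuration files, not any state the nodes may keep elsewhere.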
Note: this is a cross-post from http://dba.stackexchange.com/questio…ndpoint-is-not