I am running a 3-node cluster on Ubuntu 12.04.3 and had a hard time setting it up. Everything was working fine until yesterday, when one of the nodes (node1) rebooted on its own because of a system error. I have just now recovered from it, but in a roundabout way.
This is what I tried:
First, I tried starting node1 with /etc/init.d/mysql start. I thought it would do IST, because only a few transactions happened while node1 was down, but it started SST instead. It removed everything in the datadir and then failed with “cannot perform SST: operation not permitted”. I was confused, because I had made this node (node1) the primary and bootstrapped it earlier.
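For anyone comparing notes: as far as I understand it, a node can only ask for IST if it still knows its last committed position, and Galera keeps that in grastate.dat inside the datadir. Below is a minimal sketch for reading that file. It assumes the usual "key: value" layout; the sample contents and the helper names (parse_grastate, has_saved_position) are made up for illustration. A usable position is only one precondition, since the donor also has to still hold the missing writesets.

```python
# Sketch only: reads the text of a grastate.dat file and reports whether it
# holds a usable saved position. SAMPLE below is invented, not from my nodes.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"

SAMPLE = """# GALERA saved state
version: 2.1
uuid:    6b1e0c9a-1111-2222-3333-444455556666
seqno:   -1
cert_index:
"""


def parse_grastate(text):
    """Parse 'key: value' lines into a dict, skipping comments and blanks."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def has_saved_position(state):
    """True if the file has a non-zero uuid and a non-negative seqno."""
    uuid = state.get("uuid", ZERO_UUID)
    try:
        seqno = int(state.get("seqno", "-1"))
    except ValueError:
        return False
    return uuid != ZERO_UUID and seqno >= 0


if __name__ == "__main__":
    state = parse_grastate(SAMPLE)
    print(state["uuid"], state["seqno"], has_saved_position(state))
```

On a real node you would pass in the contents of grastate.dat from the datadir (often /var/lib/mysql on Ubuntu, but that path is an assumption about the install). A seqno of -1 means the file alone does not give a position, which is what I would expect after an unclean reboot.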
Then only 2 nodes were left with data, so I decided to make node3 the primary and tried to start node1 again. SST started, paused for a few seconds, and then node1 threw this error:
“WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)”.
I didn’t understand why this error was coming up, because when I ran telnet to the other nodes it connected, and then after 2 seconds the connection was closed by the foreign host. (Is this normal, or something to worry about?)
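Instead of running telnet by hand against each node, the same reachability test can be scripted. This is a rough sketch that assumes the cluster uses the default Percona XtraDB Cluster ports (3306, 4444, 4567, 4568) and that the peers resolve as node2 and node3; both are assumptions, so adjust them to your my.cnf. It only proves that a TCP connection can be opened, so it cannot by itself explain the "failed to reach primary view" timeout.

```python
import socket

# Assumed default ports; change these if they are overridden in my.cnf.
GALERA_PORTS = {
    3306: "MySQL client",
    4444: "SST",
    4567: "group communication (gcomm)",
    4568: "IST",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_peer(host, ports=GALERA_PORTS, timeout=3.0):
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Hypothetical peer host names; replace with real names or IPs.
    for peer in ("node2", "node3"):
        for port, ok in check_peer(peer).items():
            status = "open" if ok else "closed/unreachable"
            print(f"{peer}:{port} ({GALERA_PORTS[port]}): {status}")
```

As far as I know the SST and IST ports are normally only listening while a transfer is in progress, so seeing them closed on an idle node is not necessarily a problem. The gcomm port (4567) is the one I would expect to be open on every running node.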
After that, I tried many more times to start mysql and got the same error every time. I even tried deleting some files (ib_logfile*, galera.cache…), with no luck.
Finally, I realized that Percona also supports rsync, so I manually ran rsync from node3's datadir to the datadirs of node1 and node2. (I stopped node2 as well, because I wanted the data on all nodes to be identical.) Then I started node1 and node2, and everything came up correctly.
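For reference, the manual copy can be written down as a small dry-run script. This is only a sketch of the kind of command I mean, not a recommended procedure: the datadir path, the host names, and the build_rsync_cmd helper are assumptions, and it presumes mysqld is stopped on the target nodes (and ideally on the source too, so the files are not changing during the copy). It prints the commands instead of running them.

```python
import shlex


def build_rsync_cmd(src_dir, dest_host, dest_dir):
    """Build an rsync argv that mirrors src_dir into dest_host:dest_dir.

    -a keeps permissions, ownership and timestamps; --delete removes files on
    the target that do not exist on the source. The trailing slashes make
    rsync copy the directory contents rather than the directory itself.
    """
    src = src_dir.rstrip("/") + "/"
    dest = f"{dest_host}:{dest_dir.rstrip('/')}/"
    return ["rsync", "-a", "--delete", src, dest]


if __name__ == "__main__":
    # Hypothetical values: run on node3, pushing to the two stopped nodes.
    datadir = "/var/lib/mysql"
    for target in ("node1", "node2"):
        cmd = build_rsync_cmd(datadir, target, datadir)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing first makes it easy to double-check the source and target before running anything, since --delete is destructive on the receiving side.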
This was a test setup, so time was not a problem. On production servers I could not afford this much downtime, so I need a quick and easy way to get all nodes back up.
So I am asking all Percona users and developers: is there a standard, fully documented procedure for this anywhere? For example: during a failover, when should the gcomm values be changed and what should they be set to? Can we make all nodes primary, and if so, how many? Which files need to be deleted or modified, and on which node?
Is this kind of Q&A summarized and available anywhere? If not, can we create one (maybe as a thread in this forum)?