Hi
I have a cluster with three nodes, and I can't reach them whenever a power failure hits them. I want the cluster to start automatically after each startup. How can I do that? I can successfully start the cluster manually. In the log file on each node I can see that establishing the connection fails. I even tried changing line 270 of the /usr/bin/mysql-systemd script to set the seqno value to a number that will never occur, but nothing changes. After the 5 seconds set by the service file, my cluster fails to start, and the error in the logs says the node can't find the primary view of the cluster, even though the wsrep_recover position is set and grstate.dat exists.
Do you have any idea?
Hello @puria_zandy,
What kind of cluster? PXC? GR? Source/Replica? Is this with Everest or using our K8S Operator?
Right, you have an all-down cluster which must be bootstrapped. Ensure pc.recovery=1 is set in wsrep_provider_options in your my.cnf. This is enabled by default, but you might have it turned off for some reason.
You might want to increase pc.wait_prim_timeout.
Please don't modify the mysql-systemd script. You might need to increase the 5s timeout to something much larger, like 30 or 60s, to give pc.recovery enough time to work.
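To confirm which values a running node actually uses, the provider options string can be split into key/value pairs. Below is a minimal Python sketch, assuming you paste in the value returned by SHOW VARIABLES LIKE 'wsrep_provider_options'. The helper name parse_provider_options and the sample string are hypothetical and only for illustration.

```python
def parse_provider_options(raw):
    """Split a wsrep_provider_options string into a dict of option -> value."""
    options = {}
    for item in raw.split(";"):
        item = item.strip()
        # Skip empty fragments and anything that is not a key=value pair.
        if not item or "=" not in item:
            continue
        key, value = item.split("=", 1)
        options[key.strip()] = value.strip()
    return options


# Sample value for illustration only, not taken from the cluster in this thread.
sample = "pc.recovery = true; pc.wait_prim_timeout = PT30S; gcache.size = 128M"
opts = parse_provider_options(sample)
print(opts.get("pc.recovery"), opts.get("pc.wait_prim_timeout"))
```

Running the same check on all three nodes makes it easy to spot a node where pc.recovery or pc.wait_prim_timeout differs from the others.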
Hi, thanks for your response. I followed your advice and checked my service, and I found that my gvwstate.dat is missing after each power failure. I don't know why. I also have another cluster on an older version with no problem. To clarify: the power failure happened on my three-node cluster. I checked the pc.recovery value and it was 1, as I expected. I restored the mysql-systemd script as you told me, and I checked pc.wait_prim_timeout, whose value was 30. I still get "connection refused" in my error log. I also set wsrep_options_debug to true to get more information.
I also tested bootstrapping the cluster manually, and no error was generated. But when the nodes try to communicate with each other on their own, they can't, so finding the primary view of the cluster fails and they can't signal the donor node to set up the cluster, and so on. Below is the MySQL log. My installation is on Ubuntu with no firewall rules; the firewall chain is open for the three nodes. I also removed AppArmor.
=====================LOG(gvwstate)===============
2024-11-11T12:52:23.636525Z 0 [Note] [MY-000000] [Galera] /mnt/jenkins/workspace/pxc80-autobuild-RELEASE/test/percona-xtradb-cluster-8.0.37-29/percona-xtradb-cluster-galera/gcomm/src/view.cpp:read_file():422: Fail to access the file (/var/lib/mysql//gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2024-11-11T12:52:23.636682Z 0 [Note] [MY-000000] [Galera] /mnt/jenkins/workspace/pxc80-autobuild-RELEASE/test/percona-xtradb-cluster-8.0.37-29/percona-xtradb-cluster-galera/gcomm/src/pc.cpp:PC():280: Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown.
This is my lab: I have three nodes in a virtual environment, there is no firewall between them, no iptables rules are set, and AppArmor has been removed. I also ran MySQL on an earlier version with the same config and nothing extra. The only thing I changed was the mysql script I mentioned, which is not related to this problem; that line only checks the grstate file. I would be glad if you could help me find out what is going on in this scenario.
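For anyone comparing nodes after a crash, the contents of grstate.dat can be read programmatically. The following is a minimal Python sketch, assuming the usual "key: value" layout of that file. The helper name parse_grstate and the sample contents (including the all-zero uuid) are hypothetical and not taken from the nodes in this thread.

```python
def parse_grstate(text):
    """Parse the 'key: value' lines of a grstate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and the leading comment line.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


# Illustrative contents only; read the real file from the data directory instead.
sample = """# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
"""
state = parse_grstate(sample)
# A seqno of -1 means the position was not saved at shutdown.
position_saved = state["seqno"] != "-1"
print(state["seqno"], state["safe_to_bootstrap"], position_saved)
```

When seqno is -1 on every node, as is typical after a power loss, the file alone does not say which node is the most advanced, so the recovered position from each node's log has to be compared instead.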
No answer? If anyone sees this post and has a similar problem, please write about it so we can find the cause. The Percona docs are good but don't cover everything, which is why this problem is not solved.
Hi @puria_zandy,
Looks like this post was marked Solved. Please ensure you have updated to the very latest versions of our operator and PXC, as numerous bugs and issues have been fixed. 8.0.37 is over 9 months old. After you upgrade, if you still have issues, please open a new thread with full logs, details, etc. so that our volunteers can best assist you.