Cluster won't start after rebooting all servers

We set up a cluster on 3 nodes and tested everything.

The next day, we rebooted all 3 nodes and now the cluster won’t start.

We have tried starting one node with --wsrep-cluster-address="gcomm://" but it does not work.

Logs
==> /var/log/mysqld.log <==
130726 10:31:15 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130726 10:31:15 mysqld_safe WSREP: Running position recovery with --log_error= --pid-file=/var/lib/mysql/node1-recover.pid
130726 10:31:38 mysqld_safe WSREP: Failed to recover position:

==> //var/lib/mysql/node1.err <==
130726 10:31:15 InnoDB: Initializing buffer pool, size = 128.0G
130726 10:31:24 InnoDB: Completed initialization of buffer pool
130726 10:31:24 InnoDB: highest supported file format is Barracuda.
130726 10:31:30 InnoDB: Waiting for the background threads to start
130726 10:31:31 Percona XtraDB (http://www.percona.com) 5.5.31-rel30.3 started; log sequence number 1598630
130726 10:31:31 [Note] WSREP: Recovered position: 90fd09f4-f40e-11e2-8efc-f65252f54aba:13
130726 10:31:31 InnoDB: Starting shutdown...
130726 10:31:38 InnoDB: Shutdown completed; log sequence number 1598630
130726 10:31:38 [Note] /usr/sbin/mysqld: Shutdown complete

[root@node1 mysql]# rpm -qa | grep Percona
Percona-Server-shared-51-5.1.70-rel14.8.580.rhel6.x86_64
Percona-XtraDB-Cluster-shared-5.5.31-23.7.5.438.rhel6.x86_64
Percona-XtraDB-Cluster-galera-2.6-1.152.rhel6.x86_64
Percona-XtraDB-Cluster-client-5.5.31-23.7.5.438.rhel6.x86_64
Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64

Did you shut down the nodes gracefully with an init script or just kill the process?
Have you tried bootstrapping all 3 nodes? It’s possible the one you’re trying to bootstrap with gcomm:// has bad data. Try another one.
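
To decide which node to try, the usual Galera advice is to bootstrap from the node with the highest recovered seqno. Here is a minimal sketch, assuming each node's error log contains a "WSREP: Recovered position: <uuid>:<seqno>" line like the one in node1.err above. The helper name recovered_seqno is made up for illustration, and the log path will differ per node.

```shell
#!/bin/sh
# Print the seqno from the last "WSREP: Recovered position: <uuid>:<seqno>"
# line in a MySQL error log. The helper name is hypothetical.
recovered_seqno() {
    sed -n 's/.*WSREP: Recovered position: [0-9a-fA-F-]*:\(-\{0,1\}[0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Example usage on each node (path assumed from the logs in this thread):
#   mysqld_safe --wsrep-recover
#   recovered_seqno /var/lib/mysql/node1.err
```

Run it on all 3 nodes and bootstrap the one that prints the largest number. If the UUIDs differ between nodes, the comparison is not meaningful.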

The servers were rebooted using the reboot command, so I believe the OS shut the cluster down gracefully using the init script.

We have tried starting each of the 3 nodes with gcomm://, one by one, but none of them start.

This seems to be a big problem, as we now fear it could occur in production and we won’t be able to start the cluster. I wonder if Percona cluster is even fit for production use?

I can confirm that SELinux was causing the problem.
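
For anyone who wants to check whether SELinux is involved before disabling it, here is a rough sketch. It assumes auditd is running and writes to /var/log/audit/audit.log (the RHEL 6 default); the helper name mysqld_avc_denials is made up for illustration. The thread does not say which denial was actually hit, so this only counts denials that mention mysqld.

```shell
#!/bin/sh
# Count SELinux AVC denial lines that mention mysqld in an audit log.
# The helper name is hypothetical; pass the audit log path as $1.
mysqld_avc_denials() {
    grep 'avc:.*denied' "$1" | grep -c 'mysqld'
}

# Example usage as root (paths are assumptions):
#   getenforce                                   # Enforcing / Permissive / Disabled
#   mysqld_avc_denials /var/log/audit/audit.log
#   setenforce 0                                 # permissive until next reboot, for testing
```

A non-zero count after a failed start suggests SELinux is blocking mysqld. Switching to permissive mode with setenforce 0 is a quick way to confirm without editing /etc/selinux/config.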

SELinux strikes again! :)

Yes, after disabling SELinux it works. The error I was getting was "140213 12:15:32 mysqld_safe WSREP: Failed to recover position: "