Percona XtraDB Cluster 5.7: one of the nodes fails to replicate with the other nodes

Hi, I am new to Percona XtraDB Cluster. I tried to resolve this by reading about similar issues on the internet, but since I could not find a solution I decided to post here. I have 4 nodes, all with Percona XtraDB Cluster installed the same way, via yum. I have already set up the my.cnf files on all 4 as per the template. Three servers sync and replicate well; just one server does not replicate and has the following issues. All nodes have exactly the same setup and run CentOS 7.

  1. When I start with the following command, “systemctl start mysql@bootstrap.service”, the server starts, but I don't see replication: tables and databases created on the other nodes do not show up here, whereas the other nodes show the updates and creations.

  2. When I start with “systemctl start mysql.service”, it fails to start; the output from journalctl -xe follows. In the same way, node 2 needs to be started with “systemctl start mysql@bootstrap.service”: without bootstrap it does not start, but with “systemctl start mysql@bootstrap.service” it starts and also replicates without any issue. Node 3 and node 4, however, start with a simple “systemctl start mysql.service”. So the other question is: what is the right way to start all of the nodes?

Output of journalctl -xe:

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- 

-- Unit mysql@bootstrap.service has finished shutting down.

Nov 27 02:29:26 db1 systemd[1]: Starting Percona XtraDB Cluster...

-- Subject: Unit mysql.service has begun start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- 

-- Unit mysql.service has begun starting up.

Nov 27 02:29:26 db1 mysql-systemd[28157]: State transfer in progress, setting sleep higher

Nov 27 02:29:26 db1 mysqld_safe[28156]: 2016-11-27T01:29:26.954448Z mysqld_safe Logging to '/var/log/mysqld.log'.

Nov 27 02:29:26 db1 mysqld_safe[28156]: 2016-11-27T01:29:26.968303Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

Nov 27 02:29:26 db1 mysqld_safe[28156]: 2016-11-27T01:29:26.973058Z mysqld_safe Skipping wsrep-recover for f3cb74db-b43e-11e6-935f-03da5c8c8e5e:0 pair

Nov 27 02:29:26 db1 mysqld_safe[28156]: 2016-11-27T01:29:26.973894Z mysqld_safe Assigning f3cb74db-b43e-11e6-935f-03da5c8c8e5e:0 to wsrep_start_position

Nov 27 02:29:28 db1 mysqld_safe[28156]: /usr/bin/mysqld_safe: line 191: 28553 Aborted nohup /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin

Nov 27 02:29:28 db1 mysqld_safe[28156]: 2016-11-27T01:29:28.234777Z mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Nov 27 02:29:36 db1 mysql-systemd[28157]: /usr/bin/mysql-systemd: line 137: kill: (28156) - No such process

Nov 27 02:29:36 db1 mysql-systemd[28157]: ERROR! mysqld_safe with PID 28156 has already exited: FAILURE

Nov 27 02:29:36 db1 systemd[1]: mysql.service: control process exited, code=exited status=1

Nov 27 02:29:36 db1 mysql-systemd[28863]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable

Nov 27 02:29:36 db1 mysql-systemd[28863]: ERROR! mysql already dead

Nov 27 02:29:36 db1 systemd[1]: mysql.service: control process exited, code=exited status=2

Nov 27 02:29:36 db1 mysql-systemd[28911]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable

Nov 27 02:29:36 db1 mysql-systemd[28911]: WARNING: mysql may be already dead

Nov 27 02:29:36 db1 systemd[1]: Failed to start Percona XtraDB Cluster.

-- Subject: Unit mysql.service has failed

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- 

-- Unit mysql.service has failed.

-- 

-- The result is failed.

Nov 27 02:29:36 db1 systemd[1]: Unit mysql.service entered failed state.

Nov 27 02:29:36 db1 systemd[1]: mysql.service failed.

Nov 27 02:29:37 db1 systemd[1]: mysql.service holdoff time over, scheduling restart.

Nov 27 02:29:37 db1 systemd[1]: Starting Percona XtraDB Cluster...

-- Subject: Unit mysql.service has begun start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- 

-- Unit mysql.service has begun starting up.

Nov 27 02:29:37 db1 mysql-systemd[28984]: State transfer in progress, setting sleep higher

Nov 27 02:29:37 db1 mysqld_safe[28983]: 2016-11-27T01:29:37.334782Z mysqld_safe Logging to '/var/log/mysqld.log'.

Nov 27 02:29:37 db1 mysqld_safe[28983]: 2016-11-27T01:29:37.348358Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

Nov 27 02:29:37 db1 mysqld_safe[28983]: 2016-11-27T01:29:37.355503Z mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.Mlq3PN' --pi

Nov 27 02:29:43 db1 mysqld_safe[28983]: 2016-11-27T01:29:43.904736Z mysqld_safe WSREP: Recovered position f3cb74db-b43e-11e6-935f-03da5c8c8e5e:0

Nov 27 02:29:45 db1 mysqld_safe[28983]: /usr/bin/mysqld_safe: line 191: 29426 Aborted nohup /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin

Nov 27 02:29:45 db1 mysqld_safe[28983]: 2016-11-27T01:29:45.025760Z mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

  1. The first node that you boot dictates the state of the cluster. This also means that only the first node should be started in bootstrapped mode.

  2. All other nodes should be started as the normal mysql service; they will then join the cluster and inherit the state from an existing cluster node.
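
As a rough sketch of that ordering (not an official procedure), the helper below only prints which systemctl command belongs on which node; it does not run anything. `start_plan` is a hypothetical name, db1 is the host from the log above, and db2 to db4 are assumed host names for the other nodes.

```shell
# Print the start command for each node, in the order given.
# Only the first node listed is bootstrapped; the rest join normally.
start_plan() {
  first=1
  for node in "$@"; do
    if [ "$first" -eq 1 ]; then
      echo "$node: systemctl start mysql@bootstrap.service"
      first=0
    else
      echo "$node: systemctl start mysql.service"
    fi
  done
}

# db2, db3 and db4 are assumed host names.
start_plan db1 db2 db3 db4
```

Run each printed command on its own host, and wait for one node to finish joining before starting the next.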


“1. When I start with the following command, “systemctl start mysql@bootstrap.service”, the server starts, but I don't see replication: tables and databases created on the other nodes do not show up here, whereas the other nodes show the updates and creations.”

So you tried starting the first node and it booted up as expected. The second part is confusing. This node will not get updates from the other nodes; in fact, the other nodes are not part of its cluster, because you just bootstrapped this node. Can you re-validate this aspect?
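
One way to re-validate this is to look at the wsrep status on the node in question. The sketch below assumes you can run the mysql client locally with working credentials. `check_cluster` is a hypothetical helper: it reads status lines on stdin and compares `wsrep_cluster_size` with the number of nodes you expect.

```shell
# Read "name value" status lines on stdin; $1 is the expected node count.
# Prints OK only if the cluster size matches and the component is Primary.
check_cluster() {
  awk -v want="$1" '
    $1 == "wsrep_cluster_size"   { size = $2 }
    $1 == "wsrep_cluster_status" { status = $2 }
    END {
      if (size == want && status == "Primary") print "OK"
      else print "PROBLEM size=" size " status=" status
    }'
}

# Intended use on a node (assumes local client access), for a 4-node cluster:
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster%'" | check_cluster 4
```

A node that was bootstrapped on its own would be expected to report a cluster size of 1 here. Comparing `wsrep_cluster_state_uuid` between nodes is another way to see whether they belong to the same cluster.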

Once node 1 is up, just boot the other nodes (2, 3, 4) one after another using the normal command, and they should be able to join the cluster. Make sure the configuration parameters are set correctly.
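
To compare the configuration between nodes, something like the following may help. It is only a sketch: `wsrep_settings` is a hypothetical helper, and the /etc/my.cnf path in the usage line is an assumption, so point it at whichever file holds your wsrep options. It prints the cluster name, cluster address, node address and SST method in a normalized form so the output from each node can be compared side by side.

```shell
# Print the wsrep options that matter for joining, from a my.cnf-style file.
# Leading whitespace and spaces around "=" are removed for easy comparison.
wsrep_settings() {
  grep -E '^[[:space:]]*wsrep[_-](cluster[_-]name|cluster[_-]address|node[_-]address|sst[_-]method)[[:space:]]*=' "$1" \
    | sed -E 's/^[[:space:]]+//; s/[[:space:]]*=[[:space:]]*/=/'
}

# Intended use on each node (the path is an assumption):
#   wsrep_settings /etc/my.cnf
```

The cluster name, cluster address list and SST method would normally be the same on every node, while the node address should differ per node.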