Unable to bootstrap the 1st node

Cris · February 19, 2024, 7:53am

Hi Percona community,
I have 3 nodes with Percona XtraDB Cluster installed on all of them. I am trying to configure the cluster, and when I run systemctl start mysql@bootstrap.service I get the following error:
Job for mysql@bootstrap.service failed because a timeout was exceeded.
See "systemctl status mysql@bootstrap.service" and "journalctl -xe" for details.

When I run systemctl status mysql@bootstrap.service this is what I get:

[mysqluser@prod-mysql-node01 ~]$ systemctl status mysql@bootstrap.service
● mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap
Loaded: loaded (/usr/lib/systemd/system/mysql@.service; disabled; vendor preset: disabled)
Active: failed (Result: timeout) since Mon 2024-02-19 10:40:36 EAT; 29s ago
Process: 116552 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=3)
Process: 116410 ExecStart=/usr/sbin/mysqld $EXTRA_ARGS $_WSREP_START_POSITION (code=killed, signal=KILL)
Process: 116304 ExecStartPre=/bin/sh -c VAR=bash /usr/bin/mysql-systemd galera-recovery; [ $? -eq 0 ] && systemctl set-en>
Process: 116302 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 116260 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 116410 (code=killed, signal=KILL)
Status: "Server startup in progress"

Feb 19 10:39:06 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: start operation timed out. Terminating.
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: State 'stop-sigterm' timed out. Killing.
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: Killing process 116410 (mysqld) with signal>
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: Main process exited, code=killed, status=9/>
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke mysql-systemd[116552]: /usr/bin/mysql-systemd: line 233: kill: (116410) - No suc>
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke mysql-systemd[116552]: WARNING: mysql already dead
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke mysql-systemd[116552]: ERROR! Stale PID file: /var/run/mysqld/mysqld.pid
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: Control process exited, code=exited status=3
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: mysql@bootstrap.service: Failed with result 'timeout'.
Feb 19 10:40:36 prod-mysql-node01.ipsl.co.ke systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/my>

This is my log file:
[root@prod-mysql-node01 log]# tail -f mysqld.log
2024-02-19T07:37:37.496153Z 0 [Note] [MY-000000] [Galera] Server initialized
2024-02-19T07:37:37.496163Z 0 [Note] [MY-000000] [WSREP] Server status change initializing -> initialized
2024-02-19T07:37:37.496180Z 0 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-02-19T07:37:37.496236Z 2 [Note] [MY-000000] [Galera] Bootstrapping a new cluster, setting initial position to 00000000-0000-0000-0000-000000000000:-1
2024-02-19T07:37:37.497612Z 8 [Warning] [MY-013185] [Server] Currently unknown variable 'clone_valid_donor_list' was read from the persisted config file.
2024-02-19T07:37:37.497671Z 8 [Note] [MY-000000] [Galera] pause
2024-02-19T07:37:37.500525Z 7 [Note] [MY-000000] [WSREP] Cluster table is empty, not recovering transactions
2024-02-19T07:37:37.500570Z 2 [Note] [MY-000000] [WSREP] Server status change initialized -> joined
2024-02-19T07:37:37.500582Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-02-19T07:37:37.500596Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.

What could I be missing? I am running it on Rocky Linux 8.9.

Ivan_Groenewold · February 19, 2024, 1:37pm

Hi, have you changed the location of the pidfile? It seems systemd is timing out waiting for the service to come up.

Cris · February 20, 2024, 5:47am

Hi Ivan_Groenewold,
I have not changed it. My default location is /var/run/mysqld/mysqld.pid.

matthewb · February 20, 2024, 5:01pm

Please remove the unknown parameter (clone_valid_donor_list, per the warning in your log) from your my.cnf and also from $datadir/mysqld-auto.cnf.
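
A minimal sketch of how to locate the leftover setting before editing anything, assuming the variable name from the warning in the log. The helper name find_variable and the example paths are hypothetical; substitute your real my.cnf location and data directory.

```shell
#!/usr/bin/env bash
# Sketch: list every config file that still mentions a given variable.
# find_variable NAME FILE... prints the matching files, one per line.
find_variable() {
    local name="$1"; shift
    local f
    for f in "$@"; do
        # -q: report only through the exit status; -s: ignore missing files
        if grep -qs -- "$name" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (paths are assumptions, adjust to your installation):
# find_variable clone_valid_donor_list /etc/my.cnf /var/lib/mysql/mysqld-auto.cnf
```

Note that mysqld-auto.cnf is a JSON file written by the server for persisted variables, so take a copy before editing it by hand.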

Cris · February 21, 2024, 7:53am

Hi matthewb,
After I removed the parameter, the bootstrap did start. However, when I try to join the 2nd node by starting mysql, I get the following error:

[root@prod-mysql-node03 log]# tail -f mysqld.log
2024-02-21T06:19:23.502248Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:19:25.001472Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:19:26.504612Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:19:28.003667Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:19:29.502814Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:19:31.004831Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:24:38.571926Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T06:24:40.571613Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T07:11:32.950841Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error
2024-02-21T07:11:34.943237Z 0 [Warning] [MY-000000] [Galera] Handshake failed: tlsv1 alert decrypt error

What could be the issue here?

matthewb · February 21, 2024, 4:07pm

You need to copy the SSL certificates from node1 over to node2 before starting node2. Or, more simply, disable pxc_encrypt_cluster_traffic on both nodes.
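
A rough sketch for checking whether two nodes really hold the same certificates: run it on each node and compare the output. It assumes the certificates are the auto-generated ca.pem, server-cert.pem and server-key.pem in the data directory; if yours are configured elsewhere, the file names and the example path are only placeholders. The function name cert_fingerprints is hypothetical.

```shell
#!/usr/bin/env bash
# cert_fingerprints DIR prints the SHA-256 checksum of each certificate file
# found in DIR, so the output can be compared between nodes.
# Assumption: the files carry the names MySQL auto-generates in the datadir.
cert_fingerprints() {
    local dir="$1" f
    for f in ca.pem server-cert.pem server-key.pem; do
        if [ -f "$dir/$f" ]; then
            # Print "<checksum>  <file name>" without the directory part
            (cd "$dir" && sha256sum "$f")
        else
            echo "missing  $f"
        fi
    done
}

# Example (hypothetical data directory):
# cert_fingerprints /var/lib/mysql
```

If the output differs between nodes, the TLS handshake between them is likely to fail. The other route mentioned above is to set pxc_encrypt_cluster_traffic=OFF under [mysqld] on every node.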

Cris · February 22, 2024, 11:17am

Hi matthewb,
I am now getting this error when I try to bootstrap:
[root@prod-mysql-node01 log]# tail -f mysqld.log
ERROR! WSREP: Failed to recover position:
Log of wsrep recovery (--wsrep-recover):
INFO: WSREP: Running position recovery with --log_error='/data01/mysql_data/mysql/wsrep_recovery_verbose.E1jd1l' --pid-file='/data01/mysql_data/mysql/prod-mysql-node01.ipsl.co.ke-recover.pid'
ERROR! WSREP: Failed to recover position:
Log of wsrep recovery (--wsrep-recover):
INFO: WSREP: Running position recovery with --log_error='/data01/mysql_data/mysql/wsrep_recovery_verbose.xKt2xs' --pid-file='/data01/mysql_data/mysql/prod-mysql-node01.ipsl.co.ke-recover.pid'
ERROR! WSREP: Failed to recover position:
Log of wsrep recovery (--wsrep-recover):
INFO: WSREP: Running position recovery with --log_error='/data01/mysql_data/mysql/wsrep_recovery_verbose.BufNcm' --pid-file='/data01/mysql_data/mysql/prod-mysql-node01.ipsl.co.ke-recover.pid'
ERROR! WSREP: Failed to recover position:

matthewb · February 22, 2024, 4:03pm

Please attach more logs. Those last repeated lines of the same content don’t really help much. Is this on node1 after you disabled the cluster traffic encryption? Please provide a more detailed list of actions/steps/commands taken.

Cris · February 23, 2024, 7:58am

Hi matthewb,
This is after I removed the parameter. I am now trying to bootstrap the 1st node using the bootstrap command: systemctl start mysql@bootstrap.service

And it's not giving any other logs apart from the ones I shared initially.

matthewb · February 26, 2024, 5:35pm

@Cris Please stop mysql. Ensure no mysqld instances are running (verify with ps -Af | grep mysqld). Then zero out the mysql error log (ex: echo >/path/to/file.log). Then try the bootstrap again, and attach the entire log file.
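
The steps above could be scripted roughly like this. It is only a sketch: the function names are made up for illustration, and the unit names and log path are assumptions to adjust for your installation. Nothing runs until clean_bootstrap is called.

```shell
#!/usr/bin/env bash
# Sketch of the clean-restart steps: stop, verify, empty the log, bootstrap.

# Succeeds if any process named exactly "mysqld" is still running.
mysqld_running() {
    pgrep -x mysqld >/dev/null
}

# Empties a log file in place without removing it (keeps owner and mode).
truncate_log() {
    : > "$1"
}

# Runs the whole sequence for the error log given as the first argument.
clean_bootstrap() {
    local log="$1"
    # Assumed unit names; ignore errors if one of them is not active
    systemctl stop mysql@bootstrap.service mysql.service || true
    if mysqld_running; then
        echo "mysqld is still running; stop it before continuing" >&2
        return 1
    fi
    truncate_log "$log"
    systemctl start mysql@bootstrap.service
}

# Example (hypothetical log path):
# clean_bootstrap /var/log/mysqld.log
```

With the log emptied first, everything in the file afterwards belongs to that single bootstrap attempt, which makes it much easier to read.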

Cris · February 27, 2024, 10:49am

Hi matthewb,
I did another clean upgrade from MySQL 8.0.32 to Percona MySQL 8.0.32 and ran the bootstrap command, which started well and is currently active. Now, on trying to join the 2nd node, I am getting the error shown in the attached file named node 2. It's not that descriptive, so it is kind of difficult to troubleshoot.

See if you can assist. I have also attached part of node 1 logfile

matthewb · February 27, 2024, 3:03pm

I see where node2 failed, but the reason is in node1's log, for which you provided a picture with unaligned timestamps. Please find the same timestamps in node1's log and look for errors.

Cris · February 28, 2024, 5:40am

Hi matthewb,
Attached are the two logfiles in a text file from where the run started on the 2 nodes
Node 2 logfile.txt (15.1 KB)
Node 1 logfile.txt (12.1 KB)

matthewb · February 29, 2024, 10:06pm

Both logs show ‘operation canceled’. Make sure ports 4444, 4567, and 4568 are open between both hosts. Make sure SELinux/apparmor is disabled.
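
A small sketch for probing those ports from the other host, using bash's built-in /dev/tcp redirection so no extra tools are needed. The function names and the example host are hypothetical.

```shell
#!/usr/bin/env bash
# port_open HOST PORT succeeds if a TCP connection can be established
# within 3 seconds.
port_open() {
    local host="$1" port="$2"
    # In the child shell, $0 is the host and $1 is the port
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# check_pxc_ports HOST prints one line per port named in the reply above.
check_pxc_ports() {
    local host="$1" port
    for port in 4444 4567 4568; do
        if port_open "$host" "$port"; then
            echo "$port open"
        else
            echo "$port closed or filtered"
        fi
    done
}

# Example (hypothetical host name), run from node2:
# check_pxc_ports prod-mysql-node01
```

One caveat, as far as I know: 4567 is the group communication port and listens permanently, while 4444 (SST) and 4568 (IST) are normally only opened while a state transfer is in progress. A "closed" result for those two on an idle node therefore does not by itself prove that a firewall is blocking them.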

Cris · March 1, 2024, 6:25am

Hi matthewb,
Only port 4567 is in use by mysql, as attached. SELinux is disabled on both servers.

matthewb · March 1, 2024, 4:32pm

What about the firewall? Is that disabled? (iptables/ufw, etc.)

Cris · March 5, 2024, 1:09pm

They are all disabled on all the nodes.

matthewb · March 5, 2024, 4:01pm

Is node1 in PRIMARY state? Able to read/write data?
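
One way to check this from the shell is to read the wsrep_cluster_status status variable. The helper below is a sketch (the name is_primary is made up) that parses the tab-separated output of the mysql client; it assumes the client can log in with your usual credentials.

```shell
#!/usr/bin/env bash
# is_primary reads the output of
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'"
# on stdin and succeeds only if the reported value is Primary.
is_primary() {
    awk -F'\t' '
        $1 == "wsrep_cluster_status" { found = ($2 == "Primary") }
        END { exit !found }
    '
}

# Example:
# mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'" | is_primary \
#     && echo "node is in the primary component"
```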

Cris · March 6, 2024, 5:31am

Hi matthewb,
Node 1 is in PRIMARY state and can read and write data.

matthewb · March 6, 2024, 4:21pm

So node1 is online, bootstrapped, and in PRIMARY state. But when you try to start node2, it will not join. 9/10 times this is network related. Read over your my.cnf again and check everything lines up. Provide both here if you wish.
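
To make that comparison less error-prone, here is a sketch that pulls only the cluster-related settings out of a config file in a normalized, sorted form, so the output from the two nodes can be diffed. The function name wsrep_settings and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# wsrep_settings FILE... prints the uncommented wsrep_* and pxc_* settings
# with whitespace around the first "=" removed, sorted for easy comparison.
wsrep_settings() {
    grep -hE '^[[:space:]]*(wsrep|pxc)[_-]' "$@" \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]*=[[:space:]]*/=/' \
        | sort
}

# Example (hypothetical path); run on each node and compare the results:
# wsrep_settings /etc/my.cnf
```

Settings such as wsrep_node_name and wsrep_node_address are expected to differ per node; wsrep_cluster_name and wsrep_cluster_address, among others, should line up.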
