MySQL won't start on the 2nd and 3rd nodes

I’ve been following the configuration steps for XtraDB Cluster 8 on three fresh Ubuntu 20.04 instances.

I bootstrapped the first node per the instructions, and verified that the values of wsrep_local_state_uuid, wsrep_local_state, wsrep_local_state_comment, wsrep_cluster_size, wsrep_cluster_status, wsrep_connected, and wsrep_ready match the example in the instructions (apart from the UUID, which of course differs).
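For reference, those checks can be done in one go from the shell on the bootstrapped node (the root user and prompt-for-password flags are placeholders for your own credentials):

```shell
# Run on the bootstrapped first node; credentials are placeholders.
# wsrep_local_state also matches wsrep_local_state_uuid/_comment.
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%';" \
  | grep -E 'wsrep_local_state|wsrep_cluster_size|wsrep_cluster_status|wsrep_connected|wsrep_ready'
```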

When I try to add a second and third node with sudo systemctl start mysql, the mysql service fails to start. Here’s the output of sudo systemctl status mysql:

● mysql.service - Percona XtraDB Cluster
     Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2020-11-04 02:18:51 UTC; 16min ago
    Process: 48056 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
    Process: 48107 ExecStartPre=/usr/bin/mysql-systemd check-grastate (code=exited, status=0/SUCCESS)
    Process: 48136 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 48138 ExecStartPre=/bin/sh -c VAR=`bash /usr/bin/mysql-systemd galera-recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/S>
    Process: 48186 ExecStart=/usr/sbin/mysqld $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
    Process: 48190 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
   Main PID: 48186 (code=exited, status=1/FAILURE)
     Status: "Server startup in progress"

Nov 04 02:18:18 sa-db-2 systemd[1]: Starting Percona XtraDB Cluster...
Nov 04 02:18:51 sa-db-2 systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE
Nov 04 02:18:51 sa-db-2 mysql-systemd[48190]:  WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Nov 04 02:18:51 sa-db-2 mysql-systemd[48190]:  WARNING: mysql may be already dead
Nov 04 02:18:51 sa-db-2 systemd[1]: mysql.service: Failed with result 'exit-code'.
Nov 04 02:18:51 sa-db-2 systemd[1]: Failed to start Percona XtraDB Cluster.

The failure seems to start with $_WSREP_START_POSITION, but I have no idea what that refers to.
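From what I can tell from the ExecStartPre lines in the status output, _WSREP_START_POSITION is the recovered Galera position (cluster UUID plus last committed seqno) that the unit hands to mysqld. A sketch of how to inspect it yourself, assuming the default Ubuntu datadir /var/lib/mysql:

```shell
# grastate.dat holds the saved Galera state (cluster uuid and seqno);
# the unit's "galera-recovery" ExecStartPre derives the start position from it.
sudo cat /var/lib/mysql/grastate.dat

# Join uuid and seqno into the uuid:seqno form that mysqld is started with
sudo awk '/^uuid:/ {u=$2} /^seqno:/ {s=$2} END {print u ":" s}' /var/lib/mysql/grastate.dat
```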
Here’s my /etc/mysql/mysql.conf.d/mysqld.cnf file:

# Template my.cnf for PXC
# Edit to your requirements.


# Binary log expiration period is 604800 seconds, which equals 7 days

######## wsrep ###############
# Path to Galera library

# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node

# In order for Galera to work correctly binlog format should be ROW

# Slave thread to use


# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera

# Node IP address
# Cluster name

#If wsrep_node_name is not specified,  then system hostname will be used

#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER

# SST method

# TLS/SSL Cert
# https://www.percona.com/doc/percona-xtradb-cluster/LATEST/configure.html#configure


I was a little confused by the SSL certificate part of the instructions, which just says “Set up the traffic encryption settings” and gives a code example. It looks like MySQL config to me, so I put it at the bottom of the mysqld.cnf file.
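For context, the encryption block I pasted looks roughly like the one on the linked Percona configure page; the file names are the defaults that example assumes, and yours may differ:

```ini
[mysqld]
wsrep_provider_options="socket.ssl_key=server-key.pem;socket.ssl_cert=server-cert.pem;socket.ssl_ca=ca.pem"

[sst]
encrypt=4
ssl-key=server-key.pem
ssl-ca=ca.pem
ssl-cert=server-cert.pem
```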

If anyone has a suggestion on how to get past this, I would be most grateful. I’m keen to migrate away from our current setup of a single MySQL instance plus one replica.


Somehow I missed that the post immediately before this one is apparently the same issue; from the comments there, I think the solution is that I did not copy the SSL certificates from the bootstrapped first node. I’ll post here again if that solves my issue.


Solved it. Turns out I had two and a half problems: first, I needed to copy the SSL files from the first node onto the second and third nodes, and also chown them to the mysql user (that was the half problem). Second, it turns out I had copy-pasted one of the IP addresses incorrectly in the wsrep_cluster_address setting. That was easy to verify by attempting to connect to the bootstrap node with mysql -u user -p -h192.168.1.1: once I got a MySQL error message rather than a timeout, I knew the node could reach the bootstrap node.
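In case it helps anyone, a sketch of the fix. This assumes the certs live in the default datadir /var/lib/mysql (where PXC generates them on the bootstrap node); the IPs and the remote user are placeholders for your own:

```shell
# On the bootstrapped first node: copy the generated certs over to node 2
# (192.168.1.2 and "user" are placeholders)
scp /var/lib/mysql/ca.pem /var/lib/mysql/server-key.pem /var/lib/mysql/server-cert.pem \
    user@192.168.1.2:/tmp/

# On node 2: move them into the datadir and fix ownership
sudo mv /tmp/ca.pem /tmp/server-key.pem /tmp/server-cert.pem /var/lib/mysql/
sudo chown mysql:mysql /var/lib/mysql/{ca.pem,server-key.pem,server-cert.pem}

# Sanity-check connectivity to the bootstrap node: an auth error is fine,
# a timeout suggests a wrong address in wsrep_cluster_address
mysql -u user -p -h 192.168.1.1
```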

I now have the cluster functioning correctly. I’m leaving this post up on the chance someone else finds this useful.
