Can't restart one node

miracula
Hi everyone, I'm nervous as can be. This cluster has been working fine for a year, ever since I somehow reset it to position zero and lost everything (hence the nervousness).

One server (in a cluster of 3) is showing disk errors, so I took one node down for needed maintenance ahead of the disk swaps.

Now it won't come back.

I swear nothing has changed, but as this system is incredibly reliable 'til it's not, I can't guarantee it.

The lines that are confusing me in the error log are:
Connecting to MySQL server host: localhost, user: (null), password: not set, port: 0, socket: /var/lib/mysql/mysql.sock
-innobackupex-backup: Using server version 5.6.28-76.1-56-log
-innobackupex-backup: innobackupex version 2.3.4 based on MySQL server 5.6.24 Linux (x86_64) (revision id: e80c779)
-innobackupex-backup: xtrabackup: uses posix_fadvise().
-innobackupex-backup: xtrabackup: cd to /var/lib/mysql
-innobackupex-backup: xtrabackup: open files limit requested 0, set to 5000
-innobackupex-backup: xtrabackup: using the following InnoDB configuration:
-innobackupex-backup: xtrabackup: innodb_data_home_dir = ./
-innobackupex-backup: xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
-innobackupex-backup: xtrabackup: innodb_log_group_home_dir = ./
-innobackupex-backup: xtrabackup: innodb_log_files_in_group = 2
-innobackupex-backup: xtrabackup: innodb_log_file_size = 20971520
-innobackupex-backup: xtrabackup: using O_DIRECT
-innobackupex-backup: 160727 07:01:40 >> log scanned up to (70923364024)
-innobackupex-backup: xtrabackup: Generating a list of tablespaces
-innobackupex-backup: 2016-07-27 07:01:40 7f6534131740 InnoDB: Operating system error number 13 in a file operation.
-innobackupex-backup: InnoDB: The error means mysqld does not have the access rights to
-innobackupex-backup: InnoDB: the directory.
-wsrep-sst-donor: innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log
-wsrep-sst-donor: Cleanup after exit with status:22

If I read this correctly, it's trying to log in with no username or password, even though I have them clearly listed in /etc/my.cnf:
wsrep_sst_auth = x:x

It also says it can't access anything in /var/lib/mysql, but that directory is owned by mysql:mysql. I even changed it to 777 to no avail.
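For reference, "Operating system error number 13" in the log above is the standard errno value for a permission failure. A minimal Python sketch to decode it, assuming Python 3 on a Linux box (the helper name `describe_os_error` is made up for illustration):

```python
import errno
import os


def describe_os_error(number: int) -> str:
    """Return the symbolic errno name and the OS message for an error number."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# On Linux, error 13 maps to EACCES (permission denied).
print(describe_os_error(13))
```

Worth keeping in mind: changing the mode of the /var/lib/mysql directory itself does not change the owner or mode of the files inside it.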

There is no /var/lib/mysql/innobackup.backup.log, so something is definitely weird. The double slash also seems odd, since there is no trailing slash in /etc/my.cnf.
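On the double slash: POSIX path resolution treats repeated slashes in the middle of a path as a single slash, so the doubled slash alone should not be why the log file can't be found. A small Python sketch of that normalization, using only the path from the log line above:

```python
import posixpath

# Path exactly as it appears in the wsrep-sst-donor message.
logged = "/var/lib/mysql//innobackup.backup.log"

# normpath collapses the repeated slash; both spellings name the same file.
normalized = posixpath.normpath(logged)
print(normalized)
```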

Please point me in the right direction. My pants are down, 1 of 3 in the cluster is down, and one of the remaining 2 has a wonky disk. I don't want another disaster like last year!


  • miracula
    So, I figured it out. (Thanks all for your great ideas and help :p)

    It turns out my backup guy had added a script on the server that was supposed to be the donor, and it left backup files in the /var/lib/mysql directory. They were owned by the backup process's user, not mysql, so the donor could not read those files and errored out.

    I took the cluster down to one node, started up the misbehaving box successfully, then re-added the box with the bad backup files.
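For anyone hitting the same thing, a check along these lines could have spotted the stray files before the SST failed. This is only a sketch under assumptions: it assumes Python 3 on the donor, that the data directory is /var/lib/mysql, and that the server runs as the `mysql` user; the function names `find_foreign_files` and `report` are hypothetical.

```python
import os
import pwd


def find_foreign_files(datadir, expected_uid):
    """Yield (path, owner) for every entry under datadir not owned by expected_uid."""
    for root, dirs, files in os.walk(datadir):
        for name in dirs + files:
            path = os.path.join(root, name)
            uid = os.lstat(path).st_uid  # lstat: do not follow symlinks
            if uid != expected_uid:
                try:
                    owner = pwd.getpwuid(uid).pw_name
                except KeyError:
                    owner = str(uid)  # uid with no passwd entry
                yield path, owner


def report(datadir="/var/lib/mysql", owner="mysql"):
    """Print every entry under datadir that is not owned by the given user."""
    expected_uid = pwd.getpwnam(owner).pw_uid  # raises KeyError if the user is missing
    for path, actual in find_foreign_files(datadir, expected_uid):
        print(f"{actual}\t{path}")


# Usage (run as root on the donor so every subdirectory is readable):
#   report()
```

It only checks ownership; a file owned by mysql but with restrictive mode bits would need a separate check.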
