We have 6 databases in Amazon’s US East region. Our backup script connects over SSH to the 6 database servers and executes the following command:
sudo innobackupex --defaults-extra-file=/home/dbcontrol/.my.cnf.us-prod.backup --socket=/tmp/mysql.sock --slave-info --safe-slave-backup --safe-slave-backup-timeout=1200 --use-memory=4G --stream=xbstream --parallel=6 --compress --compress-threads=6 --encrypt=AES256 --encrypt-key=blah --encrypt-threads=3 /home/dbcontrol/backup
The xbstream output is streamed to the central server we use for backups and is finally copied to an S3 bucket for long-term storage.
When I run the backups serially I don't see the problem, but when we run all six in parallel I get the error:
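For context, here is a simplified sketch of how a launcher like ours fans the six backups out in parallel. The hostnames, user, and local paths are placeholders, and the innobackupex flags are abbreviated with "..." (the full command is above); the RUN="echo" guard makes it a dry run so the sketch can be executed safely.

```shell
#!/bin/sh
# Sketch of the parallel fan-out. Hostnames, user, and paths are
# placeholders; the innobackupex flags are abbreviated with "...".
RUN="echo"   # dry-run guard: set RUN="" to actually execute
HOSTS="db1 db2 db3 db4 db5 db6"

for host in $HOSTS; do
    # Each ssh session runs innobackupex remotely; the encrypted
    # xbstream output comes back over the SSH channel to a local file.
    $RUN ssh dbcontrol@"$host" \
        "sudo innobackupex --stream=xbstream ... /home/dbcontrol/backup" \
        > "/tmp/$host.xbstream" &
done
wait   # all six streams must finish before anything moves to S3
```

The key point for this question is the six concurrent ssh sessions, each carrying a full xbstream payload, which is the configuration that triggers the error below.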
select() error: Bad address
This line is written to the log file millions of times, generating log files well over 16 GB in size. In fact, it doesn't stop writing it until I kill the process.
Since it doesn't happen when I run the backups serially (at least so far), I thought maybe I was hitting some kind of open file handle limit. But I've checked all of the servers, and everything is configured per the documentation's recommendations, so I'm at a loss. I'm not even sure whether the error is coming from XtraBackup or from SSH.
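For anyone who wants to verify the same things I checked, these are the standard places to look at file-descriptor limits on Linux (the PID in the last check is a placeholder for whatever your xtrabackup or ssh process is):

```shell
#!/bin/sh
# Quick checks for the open-file-handle theory (Linux).

# Soft limit on open files for this shell and its children:
ulimit -n

# System-wide ceiling on open file handles:
cat /proc/sys/fs/file-max

# For an already-running backup process, inspect its actual limits
# (<pid> is a placeholder for the xtrabackup or ssh PID):
#   grep 'open files' /proc/<pid>/limits
```

Note that the per-process limit that matters is the one inherited by the backup process itself, which is why checking /proc/<pid>/limits on the live process is more reliable than checking ulimit in a fresh shell.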
Has anyone else run into this problem or have any knowledge of it?