intermittent xtrabackup errors

I’m having an intermittant problem with xtraBackup.

The basic environment is a RHEL5 system, running MySql 5.1.6

This server has two instances of mysql on it. I’m trying to do a daily snapshot from the first instance to the second instance.

This is all being done from inside a script. The error acts as if a tarfile is being corrupted. This script is also used to do a snapshot of the second instance to a second server, which is why you see the command being piped through ssh. The IP address of the server is 192.168.101.8

$DESTDIR is a temporary directory created just before the script runs.

Thanks in advance.

JBB

======

Here is the relevant data:

Line from inside script:

/usr/bin/innobackupex --user=$DBUSER
–defaults-file=$SRCCNF
–databases=$t
–password=$DBPSWD
–socket=$srcdir/mysql.sock
–stream=tar $TMPDIR/ --slave-info 2>>$OUTFILE | ssh mysql@${DESTIP}
“tar xfi - -C $DESTDIR” 2>&1 >>$OUTFILE

command from a ps ax:

/usr/bin/innobackupex --user=root --defaults-file=/etc/my1.cnf --databases=/tmp/tmp31254/db-31254.txt --password=******* --socket=/var/lib/mysql1/mysql.sock --stream=tar /tmp/tmp31254/ --slave-info

Command from a ps ax | grep xtrabackup

xtrabackup_51 --defaults-file=/etc/my1.cnf --defaults-group=mysqld --backup --suspend-at-end --target-dir=/var/lib/mysql1tmp --stream=tar

ps showing all references to tar:

ps ax | grep tar
31254 pts/2 S+ 0:00 /bin/bash /home/jbayer/syncdb.sh morning start y
31410 pts/2 S+ 0:00 perl /usr/bin/innobackupex --user=root --defaults-file=/etc/my1.cnf --databases=/tmp/tmp31254/db-31254.txt --password=******* --socket=/var/lib/mysql1/mysql.sock --stream=tar /tmp/tmp31254/ --slave-info
31411 pts/2 R+ 0:17 ssh mysql@192.168.101.8 tar xfi - -C /var/lib/mysql2
31417 ? Ss 0:02 tar xfi - -C /var/lib/mysql2
31475 pts/2 Sl+ 0:02 xtrabackup_51 --defaults-file=/etc/my1.cnf --defaults-group=mysqld --backup --suspend-at-end --target-dir=/var/lib/mysql1tmp --stream=tar
31786 pts/2 S+ 0:00 sh -c cd /var/lib/mysql1; tar chf - summary/day072412_users.MYD
31787 pts/2 S+ 0:00 tar chf - summary/day072412_users.MYD

pstree showing processes:
|-screen-±bash—bash—syncdb.sh-±perl-±sh—tar
| | | |-sh—mysql
| | | -xtrabackup_51---6*[{xtrabackup_51}] | | -ssh
| -bash---bash-+-less | -pstree

Errors:

Start of transfer

Copying primary to secondary on master
Starting: Mon Sep 10 02:00:01 EDT 2012
Stopping MySQL: [ OK ]

InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona Inc 2009-2012. All Rights Reserved.

This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.

120910 02:00:06 innobackupex: Starting mysql with options: --defaults-file=‘/etc/my1.cnf’ --password=xxxxxxxx --user=‘root’ --socket=‘/var/lib/mysql1/mysql.sock’ --unbuffered –
120910 02:00:06 innobackupex: Connected to database with mysql child process (pid=13043)
120910 02:00:12 innobackupex: Connection to database server closed
IMPORTANT: Please check that the backup run completes successfully.
At the end of a successful backup run innobackupex
prints “completed OK!”.

innobackupex: Using mysql Ver 14.14 Distrib 5.5.24, for Linux (x86_64) using readline 5.1
innobackupex: Using mysql server version Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

innobackupex: Created backup directory /tmp/tmp31254
120910 02:00:12 innobackupex: Starting mysql with options: --defaults-file=‘/etc/my1.cnf’ --password=xxxxxxxx --user=‘root’ --socket=‘/var/lib/mysql1/mysql.sock’ --unbuffered –
120910 02:00:12 innobackupex: Connected to database with mysql child process (pid=13101)
120910 02:00:14 innobackupex: Connection to database server closed

120910 02:00:14 innobackupex: Starting ibbackup with command: xtrabackup_51 --defaults-file=“/etc/my1.cnf” --defaults-group=“mysqld” --backup --suspend-at-end --target-dir=/var/lib/mysql1tmp --stream=tar
innobackupex: Waiting for ibbackup (pid=13118) to suspend
innobackupex: Suspend file ‘/var/lib/mysql1tmp/xtrabackup_suspended’

xtrabackup_51 version 2.0.2 for MySQL server 5.1.59 unknown-linux-gnu (x86_64) (revision id: 461)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql1
xtrabackup: Target instance is assumed as followings.
xtrabackup: innodb_data_home_dir = ./
xtrabackup: innodb_data_file_path = ibdata1:1G;ibdata2:10M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 419430400

120910 02:00:16 innobackupex: Continuing after ibbackup has suspended
120910 02:00:16 innobackupex: Starting mysql with options: --defaults-file=‘/etc/my1.cnf’ --password=xxxxxxxx --user=‘root’ --socket=‘/var/lib/mysql1/mysql.sock’ --unbuffered –
120910 02:00:16 innobackupex: Connected to database with mysql child process (pid=13127)
120910 02:00:18 innobackupex: Starting to lock all tables…

log scanned up to (995 3812856701)
[01] Streaming ./ibdata1
log scanned up to (995 3812878329)
log scanned up to (995 3812879301)
120910 02:00:32 innobackupex: All tables locked and flushed to disk

120910 02:00:34 innobackupex: Starting to backup .frm, .MRG, .MYD, .MYI,
innobackupex: .TRG, .TRN, .ARM, .ARZ, .CSM, .CSV and .opt files in
innobackupex: subdirectories of ‘/var/lib/mysql1’
innobackupex: Backing up files ‘/var/lib/mysql1/summary2012/*.{frm,MYD,MYI,MRG,TRG,TRN,ARM, ARZ,CSM,CSV,opt,par}’ (311 files)

log scanned up to (995 3812879301)
log scanned up to (995 3812879301)
log scanned up to (995 3812879301)
log scanned up to (995 3812879301)
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers

Apparently innobackupex/xtrabackup do not produce a valid tar stream. This might be because something other than archive contents (e.g. error messages) is written to stdout. But we don’t have any similar bug reports.

Can you provide the tar output with the ‘v’ switch added to “tar xfi - -C $DESTDIR”?
Does it also fail without the --slave-info option?

If it happens again, yes.

I just found the following which might have been causing the problem:

Apparently the script (innodb…) has been storing files in /var/lib/mysql1tmp which weren’t getting deleted.

I’ve made the following changes to the script:

  1. Switched from tar to xbstream
  2. Fixed the script to cleanup the proper directories
  3. Verified ownership/permissions

If it happens again, I’ll post again. But, it’s been failing continously for the past 2 hours while I’ve been debugging, when I finally made these changes, it’s now working.

Thanks

JBB