I’ve got issue with making consistent mysql backups in sharded environment.
Description of setup:
Application I’m maintaining is build across three databases (InnoDB) named: datamaster, datashard1 and datashard2.
There are foreign keys to keep data consistent across one database but sometimes data splits between datashards or datamaster and then some sort of distributed transactions done on application level are coming into game
Now the tricky part for me is how to make backups consistent if data are spread across different databases and more important different machines.
For now I’m using complicated script which is doing following:
 switch app to read-only mode (app will finish current transactions and doesn’t allow to do any writes)
 flush tables with read-lock on all databases (to really ensure there are no writes)
 at this point databases are in consistent state, so i can safely start backups on every machine
 start mysqldump with --single-transaction option
 wait for first data portion of each backup
 switch app to read-write mode again
[*] continue with backup at the end check status of all backups and exit with proper message and code
App is only 5-30 seconds in read-only state.
This should give me a 3 dumps each at the same point (since were started while app was in readonly mode with --single-transaction). However databases are somehow big (~50GB) and making dumps is time consuming. Beside that restore is even more painful and time consuming
I’m looking for the ideas how it can be done to make consistent backup in other way. Databases must be at least in read-only state but no longer than 1 minute. Ideally it would be to do on line.
As far as i understand xtrabackup doesn’t give you backup of data at the point when backup started but rather - due constant reading log - at the point it finished. Is that right ?
One of my ideas was to read current txn number in mysql and use some tool like xtrabackup to do backup to that txn and no further. I’ve looked into innobackupex script and xtrabackup options but i haven’t found such features.
Also LVM doesn’t seems to be good solution since i have to shutdown database, do snapshot, start database (while having snapshot gives huge performance impact), do backup and remove snapshot
I hope I’ve wrote my minds clear )
My main goal is to achieve consistent and fast to restore backups.
Any ideas how it can be done other way ?