I have a pretty puzzling issue with InnoDB on LiquidWeb’s Storm on Demand service, and it revolves around backups. LiquidWeb said they can’t do anything to change their backup system, so I thought maybe there were some MySQL settings that could help with the issue.
From what I can understand, Storm uses a Xen-based virtualization infrastructure and takes daily point-in-time snapshot backups. How exactly it does this, they won’t disclose. All I know is that at exactly the time the backup starts, the MySQL database becomes completely unresponsive for around 2 minutes. Even simple selects go on hold. If I run “show processlist”, a ton of processes are in “statistics”, “sorting results” or similar states, even though these are all very simple index-based queries and my entire dataset easily fits in the buffer pool.
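To pin down what the threads are actually waiting on, something like the following could be run while the backup is in progress. This is only a sketch: it assumes MySQL 5.1 or later (where information_schema.PROCESSLIST is available) and an account with the PROCESS privilege.

```sql
-- Count active connections by state to see where queries pile up
-- during the stall (idle connections are filtered out).
SELECT state, COUNT(*) AS threads
FROM information_schema.PROCESSLIST
WHERE command <> 'Sleep'
GROUP BY state
ORDER BY threads DESC;

-- InnoDB's own view: pending reads, writes and log flushes are reported
-- in the FILE I/O and LOG sections of this output.
SHOW ENGINE INNODB STATUS;
```

If the pending I/O counters climb during those 2 minutes, that would at least be consistent with the disk being frozen.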
So, here’s what I think is happening. The DB does around 10-15 updates, deletes or inserts per second. To take a point-in-time backup, the Xen system freezes the state of the hard disks for a brief period, during which nothing can be written to disk. Since the database is unable to write anything to disk, it can’t commit its inserts, updates or deletes, and everything is placed on hold. BTW, I have this setting:
innodb_flush_log_at_trx_commit = 2
So, does this sound about right to you guys? If so, do you know of any way to stop the database from becoming unresponsive?
P.S. I just changed innodb_flush_log_at_trx_commit to 0 as my data is not mission critical. We’ll see if that makes any difference.
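For anyone who wants to try the same thing, here is a minimal sketch of checking and changing the setting at runtime. It assumes an account with the SUPER privilege; whether it helps with the snapshot stall is exactly what I’m trying to find out.

```sql
-- Show the current value (it was 2 before the change).
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Change it without restarting the server. With 0, the log is written
-- and flushed to disk roughly once per second instead of at each commit,
-- so up to about a second of transactions can be lost in a crash.
SET GLOBAL innodb_flush_log_at_trx_commit = 0;

-- SET GLOBAL does not survive a restart. To keep the value, the same
-- line also has to go in the [mysqld] section of my.cnf.
```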