How reliable/usable are incremental backups based off partial backups?
Where I work, our current setup is to take a full backup of all our databases once per day, and run incremental backups a few times per hour. To let us have individual backups of each database, so we can restore just one or two if necessary, we use partial backups. Specifically, the database name is passed to xtrabackup using the --databases option.
The problem is that, while searching for something else, I just ran across this note in Percona’s documentation: “It is not recommended to run incremental backups after running a partial backup.”
The current scheme bases each incremental off the previous one, and the first incremental of the day off the day’s full backup. Everything uses partials. I’m concerned that this means none of our incremental backups are usable. We have, fortunately, not yet encountered a situation where we’ve had to test this solution for real, so I don’t have real-world experience with how well our setup does or doesn’t work. I think what I’m wondering comes down to two questions, given that we want the ability to restore single databases to a point in time:
- Should we use partial backups at all? If so, how reliable are they, and what dangers are there?
- Should we switch to not using partials at all? In that case, could we still restore just one database, without touching any other databases in our MySQL instance?
We use Xtrabackup to guard against rogue queries, a failing server, and other mishaps. We have server-level backups and other precautions for more serious disasters. Xtrabackup isn’t our sole protection, so restoring specific databases to specific times without impacting the entire server is the main goal. If a bad query messes up db1 and we have to restore it to how it was an hour ago, that shouldn’t mean we also have to revert db2 and db3 and lose everything in those databases from the last hour.
Thanks for any input. Oh, we use MySQL 8.0, have only InnoDB tables, and have each table in its own file, in case those factors matter.
This is going to be difficult, in general, FYI. So you might want to practice this and have a runbook created. The reason for this is because of how you actually doing point-in-time recovery in MySQL. First, you have perform a restore of the database files, then you have to replay the binary logs from the point of backup, up to the point of failure.
You cannot filter the binary log playback, so you must replay everything that happens between time A and time B, for all tables in all databases. This is what makes the partial backups an impractical thing to do.
You will never restore directly back to your primary server. You will always first restore to a new server, apply the incrementals, then replay binlogs to your desired point. From there, you can use transportable tablespaces to export specific tables, then copy/import them to the primary server (or do a logic dump/import of specific tables).
If I were you, I would implement one of these two solutions:
- Create a replica and use the delayed replica feature. You can configure a replica to intentionally be 1hr, 8hrs, etc behind in replication. Should the need arise to restore a table, go to the replica, break replication, dump the table, and import to primary. This gives you a “live” backup where you don’t have to restore, replay, etc.
- Do a daily full, and do incrementals of the entire server. You can then use xtrabackup --export to extract out which tables you need to restore (after merging the incrementals). After export, import them just like transportable tablespaces process.
Thank you for the reply. It seems we’ll have to review our backup strategy more carefully. Most of our database restoration experience in the company is with SQL Server, so I think we were all thinking that incremental backups would act similarly in MySQL. I’ve read the process, and it seemed more involved, but I don’t think it clicked that I was reading about restoring everything and not just a single database.
Yea, this is a common source of confusion because of the way MySQL+InnoDB+binlogs work together. Even though the individual tablespaces store the data+indexes for that specific table, there is still other data in global locations that is shared among all tables in all databases. Filtered backups are indeed a thing, but I’ve never seen it implemented as part of an ongoing backup solution.