Hope everyone is doing well. We have a ~6.5 TB MySQL InnoDB Cluster (3 nodes - single primary topology) on version 8.0.36 (Percona Server (GPL)). The MySQL Shell backup currently acquires a lock and consequently causes replication lag.
Business Problem: The MySQL Router on the App nodes don’t have a means to avoid the node with the replication lag. So, the App ends up in a situation where the lag prevents getting latest information (when MySQL Router directs traffic to the one node with the lag).
Ask: Is there a way to get MySQL Shell backup to avoid pausing GR (apart from setting consistent to false). Or, getting MySQL Router to avoid the Cluster node while lag > a certain threshold (am aware innovation release - version 9 introduced Routing Guidelines).
Hi @jojojthomas,
Yes, stop taking logical backups which are single-threaded. Instead, take physical backups which are multi-threaded using Percona XtraBackup
If you are insistent on logical backups (which are the slowest to backup, and slowest to restore), then you should use GitHub - mydumper/mydumper: Official MyDumper Project which is a multi-threaded logical backup tool.
You should not send “backup commands” through proxies, or routers, and should send these connections directly to a member.
With 6.5TB of data, if you are in the cloud, you should opt for storage snapshots. This would require locking 1 member for around 20s while the snapshot occurs. Seems like a decent trade-off to backup 6.5TB in less than 1 minute.
Hi Matthew,
Thank you for your response and suggestions.
FYI - the MySQL Shell Utilities to get logical backups are multi-threaded. References; Backup and Restore Using MySQL Shell, 11.5 Instance Dump Utility, Schema Dump Utility, and Table Dump Utility.
We are on Oracle’s OCI Cloud. So, we can use Block Volume (BV) Cloning and Backup for the physical, cold backups (i.e., storage snapshots). BV clone for the 8TB disk partition took ~45 seconds; BV backup took ~7 1/2 minutes. Both of these operations don’t have any impact on the MySQL instance. The cluster member can be locked for the brief duration when the BV clone / backup is happening. The clone is more expensive as it uses BV, while the backup will use Object Storage and so is cheaper.
With logical backup, the multi-threaded MySQL Shell (dump-instance) compresses the 6.5 TB to ~150 GB. So, still, also looking for recommendation on the original ask.
Many thanks again for the inputs.
@jojojthomas You might consider switching to a more advanced proxy. ProxySQL has been around longer than mysqlrouter, supports far more routing methodologies, and is more performant overall. mysqlrouter is a layer 4 (ie: port map) router, where ProxySQL is a layer 7 (ie: understands app protocol) router. ProxySQL “reads” the SQL, and makes determinations on where to send the query. One feature of ProxySQL is its native Group Replication awareness. It has the ability to monitor lag between GR members, and can route queries to members whose lag is less than some threshold.