Hello!
ProxySQL keeps sending SELECT traffic to a PXC node in Donor/Desynced state during backups — how to exclude it?
Description:
I’m running Percona XtraDB Cluster Operator v1.17.0 (3 nodes) on Kubernetes with ProxySQL + pxc-backup.
During our scheduled backup/dump job, one of the nodes frequently goes into:
SHOW GLOBAL STATUS LIKE ‘wsrep_local_state_comment’; – Donor/Desynced
SHOW GLOBAL STATUS LIKE ‘wsrep_cluster_status’; – Primary
SHOW GLOBAL STATUS LIKE ‘wsrep_ready’; – ON
While in Donor/Desynced, the node still accepts SELECT queries, and ProxySQL continues to route read traffic to it. This sometimes causes clients to read stale data (the node is temporarily behind / applying queue grows).
Steps to Reproduce:
–
Version:
v1.17.0
Logs:
–
Expected Result:
ProxySQL should not route reads to nodes in Donor/Desynced; only Synced nodes should receive SELECT traffic, preventing stale reads during backups.
Actual Result:
ProxySQL still keeps this node in the reader pool and continues routing SELECT queries to it. As a result, clients sometimes read stale/outdated data from that node until it returns to Synced state.
Thank you!
Hello @Siamion_Davydau ,
Can you show us what are settings made in mysql_servers table ?
and while node is in donor state, what is the status shown in the ProxySQL tables ?
Regards,
Jyoti
Hi @Siamion_Davydau welcome back,
Can you try using (setting) pxc_maint_mode in your backup script?
Refer: https://www.percona.com/blog/proxysql-assisted-pxc-maintenance-mode/
Thanks,
K
Hello!
Settings while the node is Donor/Desynced:
Hi!
I assumed something similar is built into the cluster operator, isn’t it?
Hello Again,
I would also like to know the settings in mysql_galera_hostgroups table.
What I found out is that :
ProxySQL’s built-in Galera checks may not always handle all edge cases especially during SST with xtrabackup, where the donor node may still be operational but not fully synchronized. Community reports and bug discussions highlight that ProxySQL sometimes leaves donor nodes ONLINE in the reader hostgroup, leading to stale reads.
Maybe you can use the external health check scripts such Percona’s pxc_scheduler_handler which is also recommended in production environment.
I believe you must have followed below settings :
-
-- Writer hostgroup (e.g., 100)
INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight, max_connections, comment)
VALUES (100, 'pxc-node1', 3306, 1000, 2000, 'Preferred writer');
-- Reader hostgroup (e.g., 101)
INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight, max_connections, comment)
VALUES (101, 'pxc-node2', 3306, 1000, 2000, 'reader1'),
(101, 'pxc-node3', 3306, 1000, 2000, 'reader2');
INSERT INTO mysql_galera_hostgroups (
writer_hostgroup, backup_writer_hostgroup, reader_hostgroup, offline_hostgroup,
active, max_writers, writer_is_also_reader, max_transactions_behind
) VALUES (100, 102, 101, 9101, 1, 1, 0, 10);
• writer_hostgroup: Hostgroup for write traffic.
• reader_hostgroup: Hostgroup for SELECT traffic.
• offline_hostgroup: Hostgroup for nodes that are not healthy.
• active: Enables monitoring and automatic movement between hostgroups.
• writer_is_also_reader: Set to 0 to prevent the writer from being used as a reader.
• max_transactions_behind: Threshold for replication lag before shunning a node.
-
Adjust the monitoring intervals for faster detection of state changes: UPDATE global_variables SET variable_value='2000' WHERE variable_name='mysql-monitor_galera_healthcheck_interval';
UPDATE global_variables SET variable_value='800' WHERE variable_name='mysql-monitor_galera_healthcheck_timeout';
UPDATE global_variables SET variable_value='3' WHERE variable_name='mysql-monitor_galera_healthcheck_max_timeout_count';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;