wsrep_local_recv_queue high with SELECT on huge tables (wsrep in pre-commit stage

Hi,
I have a Percona MySQL cluster with 3 nodes and this configuration
Server version: 5.6.34-79.1-56 Percona XtraDB Cluster (GPL)
Release rel79.1
Revision 7c38350, 
WSREP version 26.19,


I have more than one DB that has small and huge tables. With small tables there aren’t problems.
With huge tables I have a very strange problem when I do full scan SELECT on these tables.
I.E., if I execute on node1 and node2  the "SELECT COUNT(*) FROM HugeTable WHERE time > ‘2020-01-01’ (i
have no index for time field) it take times but at the end return total rows.

If I execute the same query on the node3, the cluster hangs and other queries gone in “wsrep in pre-commit
stage”.

During the execution of the query on node3 I see the wsrep_queue variabiles and this is the result:

mysql> show status like '%queue%';
+----------------------------+------------+
| Variable_name | Value |
+----------------------------+------------+
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 63 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0.046914 |
| wsrep_local_recv_queue | 48 |
| wsrep_local_recv_queue_max | 18108 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 119.696628 |
+----------------------------+------------+


So my question is: why the local_recv_queue in that node growth on SELECT statement ? I do the same check in the other nodes and this does not happen. What could be the reason and what checks should I do?

These are the variables of second node:

+----------------------------+----------+ 
| Variable_name | Value | 
+----------------------------+----------+ 
| wsrep_local_send_queue | 0 | 
| wsrep_local_send_queue_max | 73 | 
| wsrep_local_send_queue_min | 0 | 
| wsrep_local_send_queue_avg | 0.375133 | 
| wsrep_local_recv_queue | 0 | 
| wsrep_local_recv_queue_max | 1959 | 
| wsrep_local_recv_queue_min | 0 | 
| wsrep_local_recv_queue_avg | 0.264582 | 
+----------------------------+----------+


Thank you, Salvo.

Hi Salvo

Do you see flow control kicking in while you are running the query against the HugeTable ? you can check that by running:
show status like ‘wsrep_flow_control%’;

Asking that because when flow control message is sent, the whole cluster stop taking write traffic. It will only serves reads.

“wsrep in pre-commit stage” means that galera is running the certification process, and that’s related to writes.