Flow Control, wsrep_cluster_size = 0?

Hi

We have a 3-node Percona XtraDB Cluster 8.0.36-28.1 running with ProxySQL in front.

In PMM we see a lot of flow control messages sent each day.

I have found some anomalies:

Node A
show status like 'wsrep_cluster%';

Variable_name     Value
wsrep_cluster_weight    3
wsrep_cluster_capabilities    
wsrep_cluster_conf_id   18446744073709551615
wsrep_cluster_size      0
wsrep_cluster_state_uuid      
wsrep_cluster_status    Primary

Node B
show status like 'wsrep_cluster%';

Variable_name     Value
wsrep_cluster_weight    3
wsrep_cluster_capabilities    
wsrep_cluster_conf_id   22
wsrep_cluster_size      3
wsrep_cluster_state_uuid      00000000-0000-0000-0000-000000000000
wsrep_cluster_status    Primary

Node C
show status like 'wsrep_cluster%';

Variable_name     Value
wsrep_cluster_weight    3
wsrep_cluster_capabilities    
wsrep_cluster_conf_id   22
wsrep_cluster_size      3
wsrep_cluster_state_uuid      00000000-0000-0000-0000-000000000000
wsrep_cluster_status    Primary

So wsrep_cluster_size = 0 on node A looks wrong.

I would expect wsrep_cluster_conf_id to be 22 on all nodes.

And I would expect wsrep_cluster_state_uuid to be the same on all nodes.
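
A rough sketch of how these values could be compared across the nodes in one go. The host names passed to collect_status are hypothetical, and it assumes the mysql client can log in to each node without prompting. Note that 18446744073709551615 is 2^64 - 1, i.e. -1 stored in an unsigned 64-bit field, which suggests node A is reporting an unset value rather than a real configuration ID.

```shell
# Reads "node value" lines on stdin and prints OK if every node reports
# the same value, MISMATCH otherwise. Values are compared as strings so
# that very large counters and UUIDs are handled the same way.
all_equal() {
    awk '
        NR == 1 { first = $2 }
        (($2 "") != (first "")) { differ = 1 }
        END { if (NR == 0 || differ) print "MISMATCH"; else print "OK" }
    '
}

# Collects one status variable from each host given as an argument.
# $1 is the variable name; the remaining arguments are host names
# (hypothetical here, e.g. node-a node-b node-c).
collect_status() {
    var=$1
    shift
    for host in "$@"; do
        value=$(mysql -h "$host" -Ns -e "SHOW STATUS LIKE '$var'" | awk '{ print $2 }')
        printf '%s %s\n' "$host" "$value"
    done
}
```

Usage would be something like: collect_status wsrep_cluster_conf_id node-a node-b node-c | all_equal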

Flow control occurs several times a day. Even a simple statement like

ALTER TABLE foo ADD INDEX idx_bar (bar);

triggers flow control. The table foo has 4 million rows.

Any hint would be much appreciated.

Hi Hans_Schou!

ALTER statements (or any other DDL) will stop all server activity (even for tables the DDL is not running on) until the DDL is complete and all nodes are back in sync.

Aside from DDLs, you should check “fc_limit” and “recv_queue” to see whether flow control is happening.

Do you have PMM (Percona Monitoring and Management) to keep track of the frequency and timing of the flow control?

Regards

That is strange because I have a testing cluster where I don’t have this problem when running the same ‘ADD INDEX’.

Aside from DDLs you should check “fc_limit” and “recv_queue” to check if flow control is happening

I’m not quite sure what “fc_limit” is. Is that ‘wsrep_flow_control_interval_low’?

I ran another command to find the important stuff on node A:

mysql -Ns -e 'SHOW STATUS' | grep -E "wsrep_local_state_comment|wsrep_cluster_conf_id|wsrep_cluster_size|wsrep_cluster_state_uuid|wsrep_connected|wsrep_ready|wsrep_local_send_queue_avg|wsrep_local_recv_queue_avg|wsrep_flow_control_paused\>|wsrep_flow_control_interval_" | column -t -o' '

wsrep_local_send_queue_avg 0.00655522
wsrep_local_recv_queue_avg 84.702
wsrep_flow_control_paused 0.00024627
wsrep_flow_control_interval_low 173
wsrep_flow_control_interval_high 173
wsrep_local_state_comment Synced
wsrep_cluster_conf_id 18446744073709551615
wsrep_cluster_size 0
wsrep_cluster_state_uuid
wsrep_connected ON
wsrep_ready ON

Node B and C are almost identical, so here is B:

wsrep_local_send_queue_avg 0.00452731
wsrep_local_recv_queue_avg 1.19776
wsrep_flow_control_paused 0.00025894
wsrep_flow_control_interval_low 173
wsrep_flow_control_interval_high 173
wsrep_local_state_comment Synced
wsrep_cluster_conf_id 22
wsrep_cluster_size 3
wsrep_cluster_state_uuid 00000000-0000-0000-0000-000000000000
wsrep_connected ON
wsrep_ready ON

I do have PMM running and monitoring flow control there.
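
To put the receive queue numbers above in perspective, here is a minimal sketch that expresses wsrep_local_recv_queue_avg as a percentage of wsrep_flow_control_interval_high. The function name is made up for illustration, and the queue value is only an average, so this is a hint about which node falls behind rather than proof of who sends the flow control messages.

```shell
# Reads "name value" pairs (the format of the status output above) on
# stdin and prints the average receive queue as a percentage of the
# upper flow-control limit, or "n/a" if the limit is missing or zero.
recv_queue_pct() {
    awk '
        $1 == "wsrep_local_recv_queue_avg"       { avg = $2 }
        $1 == "wsrep_flow_control_interval_high" { high = $2 }
        END {
            if (high > 0) printf "%.1f\n", 100 * avg / high
            else print "n/a"
        }
    '
}
```

It can be fed directly from the client: mysql -Ns -e 'SHOW STATUS' | recv_queue_pct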

Depending on the DDL, the locking might not last long enough, or be visible enough, for you to notice it.

Check TOI (Total Order Isolation): see “Performing Schema Upgrades in Galera Cluster” in the MariaDB Galera Cluster documentation.

Check our blogpost for avoiding locking in PXC with pt-osc: https://www.percona.com/blog/online-ddl-tools-and-metadata-locks/
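
As a sketch of what a pt-osc run for the index from the question might look like: the helper below only prints the command so it can be reviewed first. The database name mydb is hypothetical, and --max-flow-ctl 0 (pause as soon as any flow control activity is detected) is just one possible setting, not a recommendation from the blog post.

```shell
# Prints (does not run) a pt-online-schema-change command line.
# $1: database, $2: table, $3: ALTER clause,
# $4: optional mode, --dry-run (default) or --execute.
ptosc_cmd() {
    db=$1
    table=$2
    alter=$3
    mode=${4:---dry-run}
    printf 'pt-online-schema-change %s --max-flow-ctl 0 --alter "%s" D=%s,t=%s\n' \
        "$mode" "$alter" "$db" "$table"
}

# Example: ptosc_cmd mydb foo 'ADD INDEX idx_bar (bar)'
```

Running the printed command with --dry-run first and switching to --execute afterwards keeps the first attempt free of side effects on the real table.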

For fc_limit check https://www.percona.com/blog/beware-increasing-fc_limit-can-affect-select-latency/
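
On how fc_limit relates to wsrep_flow_control_interval_low/high: fc_limit is the gcs.fc_limit entry inside wsrep_provider_options, and as far as I understand the two status values are derived from it. The sketch below assumes the upper limit scales as gcs.fc_limit * sqrt(number of nodes) in a multi-writer setup, and that gcs.fc_limit is 100. Both are assumptions about this cluster, but they would reproduce the 173 shown in the output for 3 nodes. The function names are made up for illustration.

```shell
# Extracts one option from a wsrep_provider_options string of the form
# "key = value; key = value; ...". $1 is the option name, the string
# itself is read from stdin.
provider_option() {
    tr ';' '\n' | awk -F'=' -v key="$1" '
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
        }
        k == key { print v }
    '
}

# Upper flow-control limit under the assumed sqrt scaling.
# $1: gcs.fc_limit, $2: number of nodes. The result is truncated.
fc_interval_high() {
    awk -v limit="$1" -v nodes="$2" 'BEGIN { printf "%d\n", limit * sqrt(nodes) }'
}
```

For example, mysql -Ns -e "SELECT @@wsrep_provider_options" | provider_option gcs.fc_limit shows the configured value, and fc_interval_high 100 3 gives 173 under the assumptions above.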

Regards