Delay in user synchronization across Percona XtraDB Cluster nodes

Hi everyone,

I’m currently managing a Percona XtraDB Cluster (PXC) environment with the following setup:

  • Cluster version: 8.0.43-34.1 - Percona XtraDB Cluster (GPL), Release rel34, Revision 0682ba7, WSREP version 26.1.4.3

  • Topology: 2-node master-master cluster + 1 arbitrator.

Our database is accessed by two types of servers:

  1. “Front” servers: Originally connected to Node 1 via HAProxy, with automatic failover to Node 2 if Node 1 crashes. Recently, some of these servers were reconfigured to use HAProxy for load balancing between both nodes.

  2. “Management” server: Directly connected to Node 1 only.

We have a script on the management server that does the following (rough SQL sketch after the list):

  • Creates a new database in the cluster.

  • Creates a MySQL user with access only to that database.

  • Triggers an API call on a front server, which uses the newly created user to access the database.
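
Roughly, the SQL side of that script looks like this (all names, the host pattern, and the grant scope are placeholders, not our real values):

CREATE DATABASE `customer_db`;
CREATE USER 'customer_user'@'10.%' IDENTIFIED BY '<password>';
GRANT ALL PRIVILEGES ON `customer_db`.* TO 'customer_user'@'10.%';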

Issue: Between the user creation and the API call, there is a delay of over 1 second. Recently, we encountered a bug where the API call was routed to Node 2, which returned an error stating the user did not exist. By the time we received the error and retried manually, the user had synchronized.

Questions:

  • Why isn’t user synchronization instantaneous across nodes?

  • How and where is user synchronization configured in PXC?

  • What are the limitations or key points to be aware of regarding user synchronization?

I didn’t set up the cluster myself, so I’m not entirely sure how everything is configured, but I can access the servers if more details are needed.

Thanks in advance for your insights!


Could you share your wsrep configuration?

show global variables like '%wsrep%';
show global status like '%wsrep%';

Yep :slight_smile:

show global variables like '%wsrep%';

Variable_name Value
wsrep_applier_FK_checks ON
wsrep_applier_threads 48
wsrep_applier_UK_checks OFF
wsrep_auto_increment_control ON
wsrep_causal_reads OFF
wsrep_certification_rules strict
wsrep_certify_nonPK ON
wsrep_cluster_address gcomm://10.xx.xx.3,10.xx.xx.4
wsrep_cluster_name cluster_xxxxxxxxx_01
wsrep_data_home_dir /var/lib/mysql/
wsrep_dbug_option
wsrep_debug NONE
wsrep_desync OFF
wsrep_dirty_reads OFF
wsrep_disk_pages_encrypt NONE
wsrep_gcache_encrypt NONE
wsrep_ignore_apply_errors 0
wsrep_load_data_splitting OFF
wsrep_log_conflicts ON
wsrep_max_ws_rows 0
wsrep_max_ws_size 2147483647
wsrep_min_log_verbosity 3
wsrep_mode
wsrep_node_address 10.xx.xx.4
wsrep_node_incoming_address AUTO
wsrep_node_name xxxx-04
wsrep_notify_cmd
wsrep_OSU_method TOI
wsrep_provider /usr/lib/galera4/libgalera_smm.so
wsrep_provider_options allocator.disk_pages_encryption = no; allocator.encryption_cache_page_size = 32K; allocator.encryption_cache_size = 16777216; base_dir = /var/lib/mysql/; base_host = 10.xx.xx.4; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = no; debug = no; evs.auto_evict = 0; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT7.5S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 4; evs.version = 1; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.encryption = no; gcache.encryption_cache_page_size = 32K; gcache.encryption_cache_size = 16777216; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_auto_evict_threshold = 0.75; gcs.fc_auto_evict_window = 0; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.fc_single_primary = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = 10.xx.xx.4; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.wait_restored_prim_timeout = PT0S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 11; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto;
wsrep_recover OFF
wsrep_reject_queries NONE
wsrep_replicate_myisam OFF
wsrep_restart_replica OFF
wsrep_restart_slave OFF
wsrep_retry_autocommit 1
wsrep_RSU_commit_timeout 5000
wsrep_slave_FK_checks ON
wsrep_slave_threads 48
wsrep_slave_UK_checks OFF
wsrep_SR_store table
wsrep_sst_allowed_methods xtrabackup-v2,clone
wsrep_sst_donor
wsrep_sst_donor_rejects_queries OFF
wsrep_sst_method xtrabackup-v2
wsrep_sst_receive_address AUTO
wsrep_start_position c6f4c786-00be-11f0-8ecf-a2d19a721caa:112038713
wsrep_sync_wait 0
wsrep_trx_fragment_size 0
wsrep_trx_fragment_unit bytes
wsrep_use_async_monitor ON

show global status like '%wsrep%';

Variable_name Value
wsrep_local_state_uuid c6f4c786-00be-11f0-8ecf-a2d19a721caa
wsrep_protocol_version 11
wsrep_last_applied 115069889
wsrep_protocol_application 4
wsrep_protocol_replicator 11
wsrep_protocol_GCS 5
wsrep_last_committed 115069889
wsrep_monitor_status (L/A/C) [ (3072710, 3072710), (115069889, 115069889), (115069889, 115069889) ]
wsrep_replicated 719971
wsrep_replicated_bytes 1014330176
wsrep_repl_keys 7009970
wsrep_repl_keys_bytes 73362608
wsrep_repl_data_bytes 892398608
wsrep_repl_other_bytes 0
wsrep_received 2352690
wsrep_received_bytes 3357110165
wsrep_local_commits 719898
wsrep_local_cert_failures 64
wsrep_local_replays 1
wsrep_local_send_queue 0
wsrep_local_send_queue_max 12
wsrep_local_send_queue_min 0
wsrep_local_send_queue_avg 0.000345927
wsrep_local_recv_queue 0
wsrep_local_recv_queue_max 58
wsrep_local_recv_queue_min 0
wsrep_local_recv_queue_avg 0.00685131
wsrep_local_cached_downto 114954454
wsrep_flow_control_paused_ns 446409
wsrep_flow_control_paused 7.2023e-10
wsrep_flow_control_sent 0
wsrep_flow_control_recv 1
wsrep_flow_control_active false
wsrep_flow_control_requested false
wsrep_flow_control_interval [ 141, 141 ]
wsrep_flow_control_interval_low 141
wsrep_flow_control_interval_high 141
wsrep_flow_control_status OFF
wsrep_cert_deps_distance 1.03947
wsrep_apply_oooe 0.0242157
wsrep_apply_oool 0.000235552
wsrep_apply_window 1.08584
wsrep_apply_waits 184710
wsrep_commit_oooe 0
wsrep_commit_oool 0
wsrep_commit_window 1.02072
wsrep_local_state 4
wsrep_local_state_comment Synced
wsrep_cert_index_size 230
wsrep_cert_bucket_count 85229
wsrep_gcache_pool_size 134219032
wsrep_causal_reads 64
wsrep_cert_interval 2033.03
wsrep_open_transactions 0
wsrep_open_connections 0
wsrep_ist_receive_status
wsrep_ist_receive_seqno_start 0
wsrep_ist_receive_seqno_current 0
wsrep_ist_receive_seqno_end 0
wsrep_incoming_addresses 10.xx.xx.4:3306,10.xx.xx.3:3306
wsrep_cluster_weight 3
wsrep_desync_count 0
wsrep_evs_delayed
wsrep_evs_evict_list
wsrep_evs_repl_latency 0.000368835/0.000548454/0.000815358/0.000100379/22
wsrep_evs_state OPERATIONAL
wsrep_gcomm_uuid 06cee51f-ae37-11f0-a011-2f2d9acb8a6b
wsrep_gmcast_segment 0
wsrep_cluster_capabilities
wsrep_cluster_conf_id 37
wsrep_cluster_size 3
wsrep_cluster_state_uuid c6f4c786-00be-11f0-8ecf-a2d19a721caa
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_local_bf_aborts 21
wsrep_local_index 0
wsrep_provider_capabilities :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO:
wsrep_provider_name Galera
wsrep_provider_vendor Codership Oy <info@codership.com> (modified by Percona <https://percona.com/>)
wsrep_provider_version 4.23(cb05b32)
wsrep_ready ON
wsrep_thread_count 49

It should be. CREATE USER and other DCL statements are replicated synchronously. What’s in your error logs on both nodes during user creation?

I see some flow control time; are you monitoring this? What’s your TPS/QPS?
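
For example, to see the flow-control counters (and to locate the error log for the check above):

SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
SHOW GLOBAL VARIABLES LIKE 'log_error';

A rising wsrep_flow_control_paused_ns between two samples means replication was paused by flow control during that interval.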

Checking the error log and flow control is a good idea. Additionally, you can have your API set
wsrep_sync_wait = 1 at the session level if you don’t want to enable synced reads globally.

Around 1.5k QPS on the cluster…
(We have over 1,000 databases and users in the cluster.)

I haven’t encountered any real errors. In this case, the first API call threw:
SQLSTATE[HY000] [1045] Access denied for user 'xxxxxxxx'@'10.xxx.xxx.2' (using password: YES)

This hasn’t happened again since :man_shrugging:, but I’ll dig into the logs (which go back 15 days, ugh). I’ll also add wsrep_sync_wait = 1 to the creation script to prevent this from happening again.

Run:

SELECT COUNT(*) FROM mysql.user;

If that returns over 1,000 entries, then that is probably the issue. There is an internal caching table that holds user permissions for fast lookup. With that many users, the delay would be understandable: each time you run CREATE USER or FLUSH PRIVILEGES, that table is destroyed and rebuilt.

1,067 users.
1,047 databases.

So, it looks like I’ll add a simple sleep to my script :confused:

Don’t add a simple sleep, use wsrep_sync_wait. You can do that like this:

mysql> SELECT /*+ SET_VAR(wsrep_sync_wait=1) */ user, host FROM mysql.user WHERE user = 'foobar';

or run SET SESSION wsrep_sync_wait = 1 first, then execute the query.
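
For example, scoped to just the verification read (the 'foobar' user is a placeholder, as above):

SET SESSION wsrep_sync_wait = 1;
SELECT user, host FROM mysql.user WHERE user = 'foobar';
SET SESSION wsrep_sync_wait = 0;

With wsrep_sync_wait = 1, the node waits until it has applied everything replicated up to the moment the SELECT starts, so a user created on the other node is guaranteed to be visible before the read runs.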