Performance issue with wsrep: in pre-commit stage

Hi,
I have following problem with PXC.

  1. I have following configuration
    2 x DB + Arbitrator
    PXC version 5.7.14

  2. I’m writing data to database on 50 concurrent threads.
    Writes are batched and every 10k rows there is a commit.

  3. The problem is that commit is super slow. It can take up to 60 second.
    And during commit thread is in “wsrep: in pre-commit stage” state.

  4. my.cnf

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld]
log_error = /var/log/mysql/error.log
server-id=1
log-bin
expire_logs_days=2
binlog_format=ROW

default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2

wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://list_of_nodes
wsrep_node_address=address
wsrep_sst_method=xtrabackup-v2
wsrep_cluster_name=bi-percona
wsrep_sst_auth=“login:password”
wsrep_slave_threads=15
wsrep_provider_options=“gcache.size=128G;gcs.fc_limit=128;gsc.fc_master_slave=YES;repl.commit_order=0”

skip-external-locking
sql_mode=“STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION”
pxc_strict_mode = PERMISSIVE

bind-address = 0.0.0.0
ssl = on

max_allowed_packet = 1G
group_concat_max_len = 1073741824
key_buffer_size = 8589934592
max_heap_table_size = 10737418240
max_connect_errors = 1000000
thread_stack = 192K
thread_cache_size = 8

query_cache_type = 0

max_connections = 2000
thread_cache_size = 100
open_files_limit = 10440

query_cache_size = 0

net_read_timeout = 360
net_write_timeout = 360
default-storage-engine = InnoDB
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 4
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 0
innodb_file_per_table = 1
innodb_buffer_pool_size = 128G
innodb_buffer_pool_instances = 64
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
range_optimizer_max_mem_size = 100M

What can I do to optimize writes ?
I already enabled wsrep_slave_threads. I was experimenting with wsrep_provider_options. What else can I do?

Regards,
Maciek

Hi maver1ck,

a. From the problem statement described above you are using long or big transactions (10K rows per commit).

b. Galera/PXC doesn’t work well with large trx and fix for it will be upcoming in galera-4.x

c. Easiest solution is break your commit in smaller set. Say 500 or less that works well for your environment


Said that some theory

when you commit in PXC Galera:

a. write-set needed for pxc replication is formed. For really large trx if write-set doesn’t fit in memory it will be written to disk in form of file.
b. this write-set is then replicated on group channel that needs round-trip to other nodes. (heavy operation)
c. this is followed by local certification.
d. Finally the real mysql-commit is done

(action a, b and c are termed as pre-commit)

So even though you are writing rows as part of trx there is no cost incurred immediately during your DML operation.
Real replication is done only on commit so commit is taking good amount of time.

check the performance optimized pxc-5.7

[url]https://www.percona.com/blog/2017/04/19/performance-improvements-percona-xtradb-cluster-5-7-17/[/url]