Poor writing performance on Percona XtraDB

I am testing a 3-node Galera cluster with the sysbench OLTP test and I am getting very poor write performance, around 10 times slower than a normal Percona Server. To rule out network issues, I also compared a single-node Galera cluster against the standalone server, and the cluster was still 10 times slower on writes. I don’t think the wsrep overhead should be that high, since a single-node cluster does no actual replication, so I don’t understand why the results are so bad. I hope you can give me some clues.

I am using a fairly minimal configuration, with only a few changes to sizes and threads.

binlog_format = ROW
innodb_buffer_pool_size = 400M
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 1500M
innodb_file_per_table = 1
datadir = /var/lib/mysql

wsrep_cluster_address = gcomm://
wsrep_provider = /usr/lib64/galera3/libgalera_smm.so

wsrep_slave_threads = 8
wsrep_cluster_name = Cluster
wsrep_node_name = Node1

innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2

I am using the sysbench OLTP test, which runs a mix of simple read (70%) and write (30%) queries from 64 clients against 12 different tables.

sysbench --test='/usr/share/doc/sysbench/tests/db/oltp.lua' --db-driver=mysql --oltp-table-size=1000000 --mysql-db=sysbench --mysql-user=galeratest --mysql-password='1234' --mysql-host= --max-time=60 --max-requests=0 --mysql-table-engine=innodb --oltp-nontrx-mode=complex --oltp-tables-count=12 --num-threads=64 run


I am testing the 5.7.14 cluster right now and I am facing the same problem you did.
I used mysqlslap to test write performance, and whether I run with 1, 2 or 3 nodes started, writing a few tens of thousands of queries is around 10 times slower than on a freshly installed plain Percona Server.
I tried local and remote storage, and VMs forced onto the same host, with the same result.
I couldn’t find any tweak or tuning that had an impact on performance.

Did you find anything that improved write performance, by any chance?

Let me share the results we got when we ran a simple sysbench against PS-5.7.14 and PXC-5.7.14.

PS
transactions: 11964 (239.13 per sec.)

PXC (operating as single cluster node)
transactions: 10974 (219.31 per sec.)

PXC (operating as non-cluster node)
transactions: 11625 (232.35 per sec.)

As you can see, PS and PXC (as a non-cluster node) perform comparably.

PXC as a cluster node (even in a single-node cluster) has some extra code paths to execute and so takes a small hit.
But we certainly don’t see a 10x slowdown.
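
To put rough numbers on that, the relative drop against PS can be worked out from the transactions-per-second figures reported above. This is only a quick sketch; drop_pct is a hypothetical helper written for this post, not part of sysbench.

```shell
# Relative throughput drop versus a baseline, in percent (one decimal).
# drop_pct is a hypothetical helper; the figures are the tps values quoted above.
drop_pct() {
  awk -v base="$1" -v other="$2" \
    'BEGIN { printf "%.1f", (base - other) / base * 100 }'
}

echo "PXC single-node cluster: $(drop_pct 239.13 219.31)% slower than PS"
echo "PXC non-cluster node:    $(drop_pct 239.13 232.35)% slower than PS"
```

With these figures the hit works out to roughly 8% in cluster mode and about 3% in non-cluster mode, nowhere near a factor of 10.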

BTW, PXC in clustering mode has a lot more to do than normal PS, so if you want to use a PXC node as a standalone node, please run it with wsrep_provider=none. Cluster node != standalone node, even if the cluster has a single node.
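
For the standalone case, a minimal my.cnf sketch might look like the following. It assumes the settings live under the usual [mysqld] section and reuses values from the configuration posted earlier; the only change that matters is the provider line.

```ini
[mysqld]
# Standalone (non-cluster) mode: do not load the Galera provider library.
wsrep_provider = none

# Remaining settings as in the original post, for example:
binlog_format = ROW
innodb_buffer_pool_size = 400M
innodb_flush_log_at_trx_commit = 0
```

With the provider set to none, the wsrep_cluster_address line has nothing to connect to and can be left out.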

Also check out the performance-optimized PXC-5.7.