Poor Galera performance

Hi,

I’m new to Galera, however I was able to setup 3 node cluster
(using MariaDB or Percona). My problem is that when I go from single node to Galera (for example
single node with wsrep* in my.cnf) performance have decreased about 10 fold (without wsrep node
1000% faster).
What may be wrong with my Galera setup ?
thank you,
Sergey

Hi Sergey,

Can you post your my.cnf/wsrep.cnf configurations?

–Dave

Of course. I forgot to say - this slow down only for write queries. Galera reading performance is good. In production environment I have good hardware (Xeon cpu, RAID and so on). I made experiments on Virtual machines (usual Celeron CPU, SATA HDD) and have found the same slow down at transition from stand-alone server to Galera.

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
#nice = 0

[mysqld]
#CLUSTER
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_provider_options=“gcache.size=2G; gcache.page_size=1G;gcs.fc_limit = 256; gcs.fc_factor = 0.99;”
wsrep_slave_threads= 32
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
wsrep_node_name=“Gandalf”
#wsrep_node_incoming_address=10.0.0.1
#wsrep_sst_receive_address=10.0.0.1
innodb_locks_unsafe_for_binlog=1
wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.6,10.0.0.3
binlog_format=ROW
auto_increment_offset = 1
auto_increment_increment = 1
wsrep_auto_increment_control = 0
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_node_address=10.0.0.1
wsrep_sst_method=xtrabackup-v2
wsrep_cluster_name=DB_CLUSTER
wsrep_sst_auth=“sstuserroot:842655”
#wsrep_sst_method=rsync
wsrep_replicate_myisam=1

#GENERAL
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-name-resolve
default_time_zone=’+04:00’
collation-server = utf8_general_ci
init-connect=‘SET NAMES utf8;’
#init-connect=‘SET GLOBAL sql_mode="";’
#init-connect=‘SET SESSION sql_mode="";’
#init-connect=‘SET sql_mode="";’
default-storage-engine = InnoDB
character-set-server = utf8
bind-address = 0.0.0.0
#bind-address = 127.0.0.1

#THREADS
thread_handling = pool-of-threads
#thread_pool_high_prio_mode = transactions
thread_pool_size = 16
thread-cache-size = 400

MyISAM

key-buffer-size = 32M
myisam-recover = FORCE,BACKUP

SAFETY

max-allowed-packet = 16M
max-connect-errors = 1000000

CACHES AND LIMITS

tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 1000
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 1900

INNODB

innodb-log-files-in-group = 2
innodb-log-file-size = 128M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 32G

LOGGING

log-error = /var/log/mysql/mysql-error.log
#log-queries-not-using-indexes = 1
#slow-query-log = 1
#slow-query-log-file = /var/log/mysql/mysql-slow.log

[mysqldump]
quick
quote-names
max_allowed_packet = 16M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]
key_buffer = 16M

* IMPORTANT: Additional settings that can override those from this file!

The files must end with ‘.cnf’, otherwise they’ll be ignored.

!includedir /etc/mysql/conf.d/

Hi Sergey,

Thanks for posting. Here’s a couple of performance tips for starters:

  • wsrep_slave_threads= 32

From the README-wsrep file, this number should be low per CPU core. To be safe, I’d say start with 2 threads per core, Use

cat /proc/cpuinfo | grep -c ‘^processor\s*:’

and multiply by 2 for a starting value. You may want to reduce this to 1 thread per core or lower.

  • wsrep_provider_options=“gcache.size=2G; gcache.page_size=1G;gcs.fc_limit = 256; gcs.fc_factor = 0.99;”

Refer to this guide http://www.fromdual.com/gcache_size_in_galera_cluster for these values.

Also:

  • rule out a network bandwidth issue, use the iperf tool (http://iperf.sourceforge.net/) to measure the bandwidth between your nodes.

  • make sure that values aren’t being overridden by configuration files in your /etc/mysql/conf.d directory.

Thank you, Dave.

All is ok on my system, including bandwith issue.
No one can explain me why Galera so slow on write queries…
(at least for query “insert into test.tmptable values (‘test1’,‘test2’,‘test3’)”)

Hi Sergey,

Can you provide some general specifications on the write operation? table size, row length, column definition, indexes and avg/max write operations per second would help. I could put together a randgen simulation with your configuration above.

–Dave

One other question, what is your overall transaction size?

Hi

Within my environment, the most dominant tuning parameter was repl.commit_order(although we may not change this parameter)
http://www.percona.com/doc/percona-xtradb-cluster/5.6/wsrep-provider-index.html#repl.commit_order

This configuration seems to serialize the transaction in each replicated nodes, scalability for cuncurrent transaction may lost significantly
This occurred even when the cluster node is just one node.

To judge whether the problem is caused by galera cluster or not, just to turn off all the wsrep configuration(comment out or so on) and benchmark may help you.

thank you.

Thank you very much.

Yes, I made benchmarks (before posting) with VMWare environment (Ubuntu and so on).
Expert from SeveralNines made his benchmark with sysbench (olpt), also using VMWare and he
have got ~400% slow down.

So, from my experience I can’t view database clustering as good solution for performance increase.
For example if I will setup 4 nodes I will have performance of 1 node.
If I will setup for example 20 nodes I don’t know value of performance I will have. May be I will have 4xperformance of 1 node,
but may be less than 4x.

Hi

  • scaling for read traffic is very effective and easy(this almost depends on CPU resource)
  • scaling for write seems difficult and I also need the effective and simple strategy.

thank you

In most scenarios, moving from a virtualized environment to dedicated hardware would be a logical first scaling step prior to considering multi-master replication. So a performance benchmark of a pxc/galera multi-master virtualized environment may not be practical. Aside from overall performace scaling, there is also the high availability use case in which vitalization may play a role. That being said, there are some tuning steps regarding threading and transaction size that may also be effecting your test results.

If you’re running a PXC cluster as a VM guest you should definitely reduce your slave threads to match 2x max cores (or lower) defined in the guest. Also, if you are running very large write transactions such as multiple inserts, updates or [B] to see if there’s a large memory allocation and/or swapping occurring.

There are some tuning parameters for this case. If a large writeset is created by a [B] which will set limits on the transaction size. These parameters will cause replication rejections which will be logged. If they are, you need to modify your code to split up the apps large writesets.

Thank you. It is very interesting.

Check the new performance optimized pxc-5.7

https://www.percona.com/blog/2017/04/19/performance-improvements-percona-xtradb-cluster-5-7-17/