Very, very slow inserts on a 3-node cluster :(

Hi,

We have installed and configured Percona XtraDB Cluster 5.5.34 on 3 powerful servers.
First we tested the performance on a single node outside the cluster: it can execute 22,000 INSERT queries per second.
But in a cluster of 3 nodes, performance drops to only 125 inserts per second? (O_O)
The tables use the InnoDB engine.
Replication is synchronous and works very well.
Can you help us, please?

Hardware (three identical nodes):
CPU: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz (12 threads)
RAM: 128 GB
SSD: 512 GB
LAN: 1 Gbit/s

my.cnf (InnoDB section):
innodb_buffer_pool_size=64G
innodb_buffer_pool_instances=4
innodb_additional_mem_pool_size=20M
innodb_log_buffer_size=16M
innodb_flush_log_at_trx_commit=0
innodb_support_xa=0
innodb_file_per_table=1
innodb_data_file_path=ibdata1:100M:autoextend
innodb_read_io_threads=64
innodb_write_io_threads=64
innodb_io_capacity=5000
innodb_log_file_size=512M
innodb_thread_concurrency=0
innodb_file_format=Barracuda
innodb_flush_method=O_DIRECT
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
innodb_stats_on_metadata=0

=== GALERA ===
wsrep_cluster_name="db_cluster"
wsrep_slave_threads=12
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_replicate_myisam=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_sst_method=rsync

server-id=1
wsrep_node_address=XXX.XXX.XXX.XXX
#wsrep_cluster_address="gcomm://"
wsrep_cluster_address="gcomm://XXX.XXX.XXX.XXX,YYY.YYY.YYY.YYY"
wsrep_provider_options="socket.checksum = 1; pc.weight=0; gcache.size=8G; evs.keepalive_period=PT3S; evs.inactive_check_period=PT10S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.consensus_timeout=PT1M; evs.send_window=1024; evs.user_send_window=512;"

  • Are these three nodes on the same LAN or spread out across a WAN?
  • Inserts on one node or across all of them?
  • Single threaded insert or multiple threads?
  • 1 transaction for each insert, or bulk inserts/multi-insert transaction?
  • Is flow control kicking in? If so, is it a specific node or all of them? (check wsrep_flow_control_sent; see the query sketch below)
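
You can check this on each node with something like the following (assuming the standard Galera flow-control status counters):

SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
-- wsrep_flow_control_sent   : flow-control messages this node sent (this node was throttling the cluster)
-- wsrep_flow_control_recv   : flow-control messages received from other nodes
-- wsrep_flow_control_paused : fraction of time replication was paused since the last FLUSH STATUS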

There is no question that going from a standalone node to even a “cluster” of 1 will cause a performance hit. However, I’d agree that this performance degradation is too much.

Thanks for your answer.
The three nodes are on the same LAN. The inserts were done on one node for my test. The script does not close the connection to the MySQL server: there is one connect, one disconnect, and many INSERT queries in between. So this test uses a single thread and one transaction per insert.
On each node wsrep_flow_control_sent = 0.
Do you see anything strange?

Part of the script (Perl):

use DBI;

my $dbh = DBI->connect("DBI:mysql:$base:$server", $login, $password)
    or die $DBI::errstr;

my $i;
for ($i = 1; $i <= $NbInsert; $i++)
{
    # one autocommitted INSERT (i.e. one transaction) per iteration
    my $request = "INSERT INTO City VALUES (" . $i . ", 'Paris " . $i . "', 'FRA', 'PARIS', 1000)";
    my $prepdbh = $dbh->prepare($request) or die $dbh->errstr;
    $prepdbh->execute() or die "Query failed\n";
}

$i = $i - 1;    # number of rows actually inserted

$dbh->disconnect;

When you tested the single server, did you have wsrep_provider enabled, or was it completely Galera-free? If the latter, you could test a 1-node cluster to see whether the behavior is introduced by Galera itself or only when you add the extra nodes.
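
A quick way to confirm what each test setup really is (again, just the standard wsrep variables and status counters):

SHOW GLOBAL VARIABLES LIKE 'wsrep_provider';     -- 'none' means running completely Galera-free
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';    -- 1 for a bootstrapped single node, 3 for the full cluster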

A rate of 125 commits per second for a single thread suggests about 8 ms per commit (1/125 s). The question is whether this is caused just by adding wsrep or by adding the extra node(s).

The performance hit from just adding wsrep is from adding transaction certification to the commit process.

The performance hit from adding more than 1 node to the cluster is from replication.

Can you test to see where the penalty comes from?
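
One way to run exactly the same test against the standalone server, a bootstrapped 1-node cluster, and the full 3-node cluster is a small server-side loop. This is only a sketch: it reuses the City table and column values from your Perl script, with an arbitrary ID range chosen to avoid collisions.

DELIMITER //
-- insert n rows, one autocommitted statement (one commit) each, like the Perl test
CREATE PROCEDURE insert_bench(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    INSERT INTO City VALUES (100000 + i, CONCAT('Paris ', 100000 + i), 'FRA', 'PARIS', 1000);
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

-- time this from the client in each setup:
-- CALL insert_bench(1000);   -- roughly 8 s at 125 commits/s, well under 1 s at 22,000/s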

Yes, when I tested with one node, wsrep_provider was enabled:

mysql> show variables like "%wsrep_provider%" \G
*************************** 1. row ***************************
Variable_name: wsrep_provider
Value: /usr/lib/libgalera_smm.so
*************************** 2. row ***************************
Variable_name: wsrep_provider_options
Value: base_host = 192.168.10.2; base_port = 4567; cert.log_conflicts = no; evs.causal_keepalive_period = PT3S; evs.consensus_timeout = PT1M; evs.debug_log_mask = 0x1; evs.inactive_check_period = PT10S; evs.inactive_timeout = PT1M; evs.info_log_mask = 0; evs.install_timeout = PT1M; evs.join_retrans_period = PT1S; evs.keepalive_period = PT3S; evs.max_install_timeouts = 1; evs.send_window = 1024; evs.stats_report_period = PT1M; evs.suspect_timeout = PT30S; evs.use_aggregate = true; evs.user_send_window = 512; evs.version = 0; evs.view_forget_timeout = PT5M; gcache.dir = /disks/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /disks/mysql/data//galera.cache; gcache.page_size = 128M; gcache.size = 8G; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = 37.187.28.14; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.version = 0; pc.weight = 0; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5; socket.checksum = 1

For me, the problem appears when we add the 2 other nodes.
Perhaps we have forgotten an option in the Galera configuration. I think the commit is too slow for synchronous replication across 3 nodes with this Galera configuration.

What is the ping RTT between each of your nodes? If the link between any of the nodes has a latency of 8 ms, then that is your problem.

If the ping RTT is much lower (I would expect < 1ms on a LAN), then something else is wrong, I agree. Perhaps you should reduce your custom configuration to a bare minimum to see if any of your settings are the culprit.

Hi,

I have 0.1 ms between the nodes. I have already changed the configuration in my.cnf many times :(
Perhaps the server's kernel version could degrade performance?
The kernel is Linux db1 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64 GNU/Linux, on Debian 7.

Well, something is wrong. Can you time a single transaction? time mysql -e "insert into …" ?
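
If you want to take the client startup time out of the measurement, session profiling gives a server-side number; a quick sketch (the row values are just placeholders):

SET profiling = 1;
INSERT INTO testperformance.City VALUES (6000, 'Paris 6000', 'FRA', 'PARIS', 1000);
SHOW PROFILES;    -- with autocommit on, the statement's duration should include the commit and its replication wait
SET profiling = 0;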

The performance is bad: 0.009 s per insert × 120 inserts ≈ 1.08 s, consistent with the ~125 inserts per second we measured.

time ./bin/mysql -u root -e "INSERT INTO testperformance.City VALUES (5003,'Paris 5003','FRA','PARIS',1000);"

real 0m0.009s
user 0m0.000s
sys 0m0.004s


mysql> INSERT INTO testperformance.City VALUES (5004,'Paris 5004','FRA','PARIS',1000);
Query OK, 1 row affected (0.00 sec)

I have found why the performance was so poor!
On our LAN we sometimes have 8 ms between the nodes.
It is very strange that 8 ms can degrade performance this much; is there a solution?

We have the same problem between a client on the web (public IP) and the server in our datacenter (8 ms).

That is a network problem then (and not PXC's problem). I'm glad, at least, that we found an explanation. For a solution you'll need to work with your colo provider/network team/whatever.
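
If the latency itself can't be fixed, the usual mitigation on the application side is to pay the replication round-trip less often: with one commit per row and an ~8 ms commit round-trip, a single thread can only ever do about 125 commits per second. A minimal sketch (a multi-row insert; wrapping a loop of single-row inserts in an explicit transaction works the same way):

-- one commit, and therefore one replication round-trip, for many rows
START TRANSACTION;
INSERT INTO City VALUES
  (7001, 'Paris 7001', 'FRA', 'PARIS', 1000),
  (7002, 'Paris 7002', 'FRA', 'PARIS', 1000),
  (7003, 'Paris 7003', 'FRA', 'PARIS', 1000);
COMMIT;

Just keep each batch comfortably under your wsrep_max_ws_rows / wsrep_max_ws_size limits.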

Yes, we are going to look into this. But it is sad that only 8 ms can degrade performance so much :(

Thanks a lot for your replies :)
