All the servers are in the same datacenter and have the same spec. I made a few config changes, which seemed to improve things a bit, but when I switch the applications to read from these servers, replication slows down again. PMM is set up, though I’m not an expert, so I’m not 100% sure what to look for with this particular issue. It doesn’t seem to indicate a resource problem: everything looks fine and the servers aren’t breaking a sweat. CPU usage increases when the applications read from the cluster, but it goes no higher than on the current servers.
The servers are virtual machines, and the spec is as follows:
CPU: 4 vCPUs
RAM: 15 GB
System Disk: 40 GB
Data Disk: 150 GB (data and logs are all on this disk)
MySQL settings are:
[mysqld]
server-id=1
datadir=/data/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
log-error=/data/var/log/mysql/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log_bin= /data/var/log/mysql/mysql-bin.log
log_slave_updates
expire_logs_days=2
bind-address = 0.0.0.0
sql-mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'
ssl_ca=/data/var/lib/mysql/ca.pem
ssl_cert=/data/var/lib/mysql/server-cert.pem
ssl_key=/data/var/lib/mysql/server-key.pem
lower_case_table_names = 1
max_connections = 800
skip-name-resolve
innodb_file_per_table
innodb_buffer_pool_size = 10G
innodb_buffer_pool_instances = 16
innodb_io_capacity_max=8000
innodb_io_capacity=4000
innodb_log_file_size=256M
innodb_flush_log_at_trx_commit = 2
sync_binlog = 0
wsrep_slave_threads = 32
slave-skip-errors = 1062,1047
max_binlog_size = 100M
#PMM Logging
log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1
slow_query_log_use_global_control=all
innodb_monitor_enable=all
userstat=1
pxc-encrypt-cluster-traffic=ON
early-plugin-load=keyring_file.so
keyring-file-data = /var/lib/mysql-keyring/keyring
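
As a rough sanity check on the numbers above, here is a small sketch that works out a few ratios from the posted spec and settings. The inputs are copied from this post; the variable and function names are made up for illustration, and "15 GB" is treated as 15 GiB, which is an assumption. The only outside reference is the MySQL manual's guideline that each buffer pool instance should be at least 1GB. This does not diagnose the replication slowdown; it only makes the sizing easier to read.

```python
# Values copied from the spec and [mysqld] settings posted above.
GIB = 1024 ** 3
MIB = 1024 ** 2

vcpus = 4                        # "CPU 4 vCPUs"
ram_bytes = 15 * GIB             # "RAM 15 GB" (assumed to mean GiB)
buffer_pool_bytes = 10 * GIB     # innodb_buffer_pool_size = 10G
buffer_pool_instances = 16       # innodb_buffer_pool_instances = 16
applier_threads = 32             # wsrep_slave_threads = 32


def summarize():
    """Return a few ratios derived from the posted settings."""
    per_instance = buffer_pool_bytes // buffer_pool_instances
    return {
        # Size of each buffer pool instance, in MiB.
        "per_instance_mib": per_instance // MIB,
        # MySQL manual guideline: each instance should be at least 1GB.
        "per_instance_at_least_1g": per_instance >= GIB,
        # Fraction of RAM given to the buffer pool.
        "buffer_pool_share_of_ram": round(buffer_pool_bytes / ram_bytes, 2),
        # Applier threads configured per vCPU.
        "applier_threads_per_vcpu": applier_threads / vcpus,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

With the posted values, each buffer pool instance comes out at 640 MiB, the buffer pool takes about two thirds of RAM, and there are 8 applier threads per vCPU. Whether any of that relates to the slowdown is something I can't tell from these numbers alone.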