I am here because of an application issue. To make it easy, the application is basically some sort of forum it creates and ID when someone draft a post and then it will read it.
The big deal is that we have one master and two slaves. The application will do all writes to the master and all reads to the slaves, just like unaware of replication. The case above cause issues because these two actions are done sequentially and when a lot of replication is occurring, the application will throw an obvious error that the data which were INSERT into the master are not yet available in the slave.
The delay is about 1 to 3 seconds and always varies. I know that the application should be replication aware for this case, but it is not what I am trying to solve. My first question is to find whether anything very obvious in my configuration could explain the delay.
The three servers are in the cloud and each with the exact same configuration. While I am not a fan of that, the latency and connectivity between each slave and the master is excellent (about 0.0200 ms latency average and 400 Mbit/s connectivity). Each server has 68 GB of RAM and 8 Intel® Xeon® CPU X5550 @ 2.67GHz CPU, running CentOS 5.5 x86_64 (LM flag activated) and MySQL Server is the 5.5.9-log. Each indicator clearly shows that the machine has plenty of free resources.
The configuration used is:
[mysqld]user=mysqllog_bin = mysql-binserver-id = 1000 # Each has a unique ID…#relay_log = mysql-relay-log # on slave only#log_slave_updates = 1 # on slave only#read_only = 1 # on slave onlybinlog_cache_size=1Mft_min_word_len=4innodb_additional_mem_pool_size = 32Minnodb_buffer_pool_size = 60Ginnodb_data_file_path = ibdata1:10M:autoextendinnodb_file_per_table = 1innodb_flush_log_at_trx_commit = 0innodb_log_buffer_size = 8Minnodb_log_file_size = 256Minnodb_thread_concurrency = 0innodb_status_file = 0innodb_flush_method = O_DIRECTinnodb_log_files_in_group = 2innodb_io_capacity = 10000innodb_write_io_threads = 4innodb_read_io_threads = 4join_buffer_size = 2Mmax_allowed_packet = 1Mmax_connections = 15000max_heap_table_size = 64Mmax_prepared_stmt_count = 16382query_cache_size = 0sort_buffer_size = 2048Ktable_cache = 2048thread_cache = 0thread_cache_size = 0thread_concurrency = 10thread_stack = 196Kquery_cache_size = 128Mtmp_table_size = 16Mtransaction_isolation = REPEATABLE-READlocal-infile = 1
This configuration has been heavily tuned and gave very good performances so far. Other than the replication delay (which does not really scares me more than that actually), do you see anything wrong? Anything that would make you yell?