High IO observed on one half of master-master setup

Hi All,

We have two data centers, each with an XtraDB 5.1 server, with master-master replication between them.

Load is spread dynamically, sending an increased load to the datacenter which displays the lowest latency in the fronting web application. Load is high, but plenty of hardware has been thrown at it and general performance is good.

One datacenter (let’s call it DC1) receives the majority of the traffic, averaging about 20% more than the other (DC2) due to the dynamic load splitting.

However, I consistently observe a higher IO load on the DC2 server. This is mirrored in InnoDB_Buffer_Pool_Write_Requests, which averages about 20% higher than on DC1. During absolute peak periods this is starting to cause issues: pages aren’t flushed fast enough and get beyond their age limit, so the aggressive flushing algorithm kicks in and write IO gets a bit out of control.

We have some options to speed up our IO (better fs configuration, some changed MySQL config, some application issues), which we are currently pursuing.

However, I’d like to know whether this is a normal pattern. Do replicated writes somehow cause more buffer pool writes than direct writes? Has anyone seen this symptom before, or does anyone have any ideas about it?

Thanks in advance,

Luke.

Edit: Should add that this is fully statement-based replication.

Where are the slave relay logs placed on the machines?
Are they placed on the same volume as the innodb table space?

I can’t say why you are seeing higher InnoDB_Buffer_Pool_Write_Requests on the “slave”.

But the IO load could be higher on the “slave” than on the “master”, since the “master” only has to:
Write data to the binary log file
Write data to InnoDB

While the slave will need to:
Write the data to the relay log file. (IO_THREAD fetches from master)
Read the data from the relay log file. (SQL_THREAD reads statements)
Write it to InnoDB. (SQL_THREAD executes statements)

So in total you might get more IO happening on the “slave” than the “master”.
But I don’t know how that can be translated to the InnoDB writes since I think they should be the same for the two servers.
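
To put rough numbers on the steps listed above, here is a minimal sketch of a toy model. It assumes every step costs the same, that each server only does the steps listed (no binary logging of replicated statements), and it uses a made-up load of 120 direct writes on DC1 against 100 on DC2. None of these figures come from the real servers.

```python
# Toy model only: counts the I/O steps listed above, each weighted equally.
DIRECT_STEPS = 2      # binary log write + InnoDB write
REPLICATED_STEPS = 3  # relay log write + relay log read + InnoDB write


def io_steps(direct, replicated):
    """Total I/O steps for a server handling both kinds of statement."""
    return direct * DIRECT_STEPS + replicated * REPLICATED_STEPS


# Hypothetical load: DC1 receives 20% more direct writes than DC2.
dc1_direct, dc2_direct = 120, 100

# In master-master, each server replays the other server's direct writes.
dc1 = io_steps(direct=dc1_direct, replicated=dc2_direct)
dc2 = io_steps(direct=dc2_direct, replicated=dc1_direct)

print(dc1, dc2, round(dc2 / dc1, 3))  # 540 560 1.037
```

Under these assumptions DC2 does come out slightly higher, but only by a few percent, and the InnoDB write count is identical on both sides (220 each).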

Hope you find an answer and can post it here for us to learn. :slight_smile:

The binary and relay logs are (unfortunately) on the same filesystem. However, as the servers are multi-master, the difference between the servers shouldn’t be quite as stark as what I’m observing. Also, the IO discrepancy is easily accounted for by the difference in InnoDB_Buffer_Pool_Write_Requests, which, as you said, should not differ between the two servers.

I’m trying to organise a few hours of strict 50/50 load balancing, rather than the current dynamic setup. Hopefully that will show whether the difference is related to the frontend load or not.

L.
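
For the 50/50 test, one way to compare the two servers is to take a SHOW GLOBAL STATUS snapshot on each at the start and end of the window and turn the counter into a per-second rate. The sketch below assumes the snapshots have already been loaded into plain dicts; the helper name and all the sample numbers are hypothetical, and only the status variable names (Uptime, Innodb_buffer_pool_write_requests) are real.

```python
def rate_per_second(start, end, name="Innodb_buffer_pool_write_requests"):
    """Per-second rate of a cumulative status counter between two snapshots."""
    elapsed = end["Uptime"] - start["Uptime"]
    if elapsed <= 0:
        # Uptime going backwards means a restart, which resets the counters.
        raise ValueError("snapshots out of order or server restarted")
    return (end[name] - start[name]) / elapsed


# Made-up snapshots covering a one-hour window on each server.
dc1_start = {"Uptime": 1000, "Innodb_buffer_pool_write_requests": 5_000_000}
dc1_end = {"Uptime": 4600, "Innodb_buffer_pool_write_requests": 41_000_000}
dc2_start = {"Uptime": 2000, "Innodb_buffer_pool_write_requests": 7_000_000}
dc2_end = {"Uptime": 5600, "Innodb_buffer_pool_write_requests": 50_200_000}

dc1_rate = rate_per_second(dc1_start, dc1_end)
dc2_rate = rate_per_second(dc2_start, dc2_end)

print(dc1_rate, dc2_rate, dc2_rate / dc1_rate)  # 10000.0 12000.0 1.2
```

If the ratio stays well above 1 with the frontend load split evenly, the difference is unlikely to come from the load balancing.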