MySQL replication causing massive performance drop

Hello, I wonder if someone could offer advice…

We are a small game developer and we use MySQL as our backend database, shared between our game and website…

We are by no means experts with MySQL but we have run into a problem when setting up replication…

We have two servers, quad-core CPUs, 4GB RAM… and as soon as we enabled replication in a master/slave setup, memory usage went through the roof and the average query time went up almost tenfold or more…

Our database is around 1GB in size… and at the moment I would not consider our website or our game to be anywhere near busy… with perhaps 1000 visitors a day to each…

Any thoughts?

Thanks

Memory usage on the master?
Is it MySQL that consumes all that memory or some other process?

What about CPU load?
Any particular process or high I/O wait?

Are there any other queries executed on the master?

Have you made any other changes to the config when you set up the replication?

I ask all these questions since the master is doing very little extra when used in replication and you usually don’t really notice any difference.

Hello

The memory usage was around 2.5GB and CPU load was minimal…

In terms of queries on the master… it would be running the queries associated with our game database / website… but it is not hugely busy, as mentioned… I can get a reading of the average queries being run

And no other config changes were made…

We could not understand the performance hit either…

But, interestingly BOTH master and slave had similar memory usage…

I did realize we were using 5.0 on the master and 5.1 on the slave though… not sure if that could be the problem.

What changes did you make when you “enabled replication”?

Hello

I followed the details set out on the MySQL website…

I turned on binary logging on the master

I added a replication username and password

Granted REPLICATION SLAVE permission to that username and password from the slave's IP address
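
For reference, the config and grant looked roughly like this; everything below (server-ids, user name, IP and password) is a placeholder rather than our real values:

# my.cnf on the master (the slave just gets a different server-id)
[mysqld]
server-id = 1
log-bin   = mysql-bin

-- on the master, the replication account and its privilege
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.0.2.10' IDENTIFIED BY 'slavepass';
FLUSH PRIVILEGES;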

I flushed the tables and locked the database, then did a backup

I restored the backup to the slave
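
In case it helps anyone else, the lock/backup part was roughly the following (the actual backup command is omitted, we just used our usual dump):

FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;   -- note the File and Position values, they go into CHANGE MASTER TO below
-- ... take the backup while the lock is held ...
UNLOCK TABLES;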

I stopped the slave via command line

Then ran the below on a command line

CHANGE MASTER TO
MASTER_HOST='XXXXXX',
MASTER_USER='XXXX',
MASTER_PASSWORD='XXXX',
MASTER_PORT=3306,
MASTER_LOG_FILE='XXXX-bin.000001',
MASTER_LOG_POS=98,
MASTER_CONNECT_RETRY=10;

I then started the slave via command line
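
That was just the standard commands, roughly:

START SLAVE;
SHOW SLAVE STATUS\G   -- Slave_IO_Running and Slave_SQL_Running should both say Yes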

Replication was working… but resources were being eaten up…

Enabling binary logging was probably what caused the difference. You can find a lot of details on this at mysqlperformanceblog.com. Try searching for sync_binlog and reading those articles.
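
In short, sync_binlog controls how often MySQL forces the binary log out to disk. Roughly (illustrative values, not a recommendation for your particular setup):

# my.cnf
[mysqld]
sync_binlog = 1   # fsync the binary log after every transaction: safest, but every commit waits on the disk
# sync_binlog = 0 # leave flushing to the operating system: much faster, but a crash can lose recent binlog events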

Hello

Thanks for the direction…

Seems sync_binlog is our culprit indeed…

It seems using a second drive for the log file can alleviate a lot of this problem… and, at greater expense, a battery-backed RAID controller…

There is mention of using the ext3 file system as well, but all our servers are Windows based… would using a separate drive give us a reasonable solution under Windows?
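
For example, would something along these lines in my.ini be the right idea? (the drive letter and path here are just made up):

# my.ini on Windows — binary log on its own drive (hypothetical path)
[mysqld]
log-bin = D:/mysql-binlog/mysql-bin   # forward slashes are fine in MySQL option files on Windows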

Are there any references you could point me to for a suggested setup for Windows based replication?

Thanks for your help

[B]Staggan wrote on Thu, 08 July 2010 18:51[/B]

There is mention of using the ext3 file system as well, but all our servers are Windows based… would using a separate drive give us a reasonable solution under Windows?

Even with a separate drive you will still have a limit of around 130-250 writes per second, which isn't much.

And if you are using InnoDB tables with the default innodb_flush_log_at_trx_commit=1, you have the same problem.

So if you need more speed, you can sacrifice robustness by setting sync_binlog=0 and innodb_flush_log_at_trx_commit=2.
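
In config terms that trade-off is simply the following; only do this if you can accept losing roughly the last second of transactions after a crash:

# my.cnf — speed over robustness
[mysqld]
sync_binlog = 0                      # let the OS decide when the binary log is flushed to disk
innodb_flush_log_at_trx_commit = 2   # write the InnoDB log at commit, fsync it about once per second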

Or, if you need the robustness, I would recommend the battery-backed RAID controller solution. At that point you should also buy a second disk and set up mirroring between them, and then you have real robustness. :)

Hello

Thanks for the advice…

We have moved our database onto one of our other servers now, as the load being created on the original server by another app was causing us concern. We have noticed that the second server, whilst still busy, is behaving far, far better… Compared to the original server, the cache settings on the new server are set much lower, yet it is achieving much better performance in terms of query time etc… Any thoughts on that? Or can you direct me to some material to read on optimizing those settings?

In terms of the replication, given the costs and the fact that we are intending to start pushing vastly more players into our game, we are thinking that MySQL Cluster may be a better solution, as it gives us the redundancy we need and can scale the throughput… if my research is correct :smiley:

I think that if we were not looking to scale massively past where we are, replication alone might be a good solution, but we are planning for 10-100 times (all going well) the player numbers we are seeing now, possibly more…

We can then use MySQL Cluster to replicate regional versions of the database allowing us to improve access to the game from multiple geographic regions…

Any thoughts or advice on this would be greatly appreciated…

Great forum by the way

[B]Staggan wrote on Fri, 09 July 2010 11:28[/B]

In terms of the replication, given the costs and the fact that we are intending to start pushing vastly more players into our game, we are thinking that MySQL Cluster may be a better solution, as it gives us the redundancy we need and can scale the throughput… if my research is correct :smiley:

Well, I think MySQL Cluster is overkill for you, but it's your business decision. Let's look at it this way: you said you have 1000 visitors/day today. Multiply that by your upper growth estimate of 100 times and you get 100,000 visitors/day. Divide that by 86,400 (seconds per day) and you get about 1.16 users/second, which means each user would get roughly 860ms of CPU time (if you only have one core; double that with two cores, and so on). That should be more than enough to support one user (depending, of course, on how many queries each user generates). And since you can buy a dual quad-core CPU machine quite cheaply, you can easily extend that by 8 times, giving almost 7 seconds of CPU time per user, which should be more than enough. Especially since your database is only 1GB, which means all of it will be in RAM during operation and will be served very fast.
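
If you want to sanity-check that arithmetic yourself, it is easy to run directly in MySQL (the inputs are just the figures from this thread):

SELECT
    1000 * 100           AS visitors_per_day,     -- 1000/day today, times the 100x growth estimate
    1000 * 100 / 86400   AS users_per_second,     -- about 1.16
    86400 / (1000 * 100) AS cpu_seconds_per_user; -- about 0.86 per core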
[B]Staggan wrote on Fri, 09 July 2010 11:28[/B]

We can then use MySQL Cluster to replicate regional versions of the database allowing us to improve access to the game from multiple geographic regions…

This sounded a lot like you were thinking about placing different nodes of the cluster in different geographical locations, which is not a good idea. MySQL Cluster is basically designed to run entirely on one dedicated LAN where you have very low latency and high bandwidth between the nodes. Trying to spread it out over WAN connections is going to give you a lot of headaches.

Summary:
So from a performance perspective I don't think that you are going to need a cluster in the near future.
And you are still going to need to use replication if you want the data spread out over geographically separated locations.

Then the remaining aspect for you is failover time in case your primary server crashes.

But compared to the amount of extra cost, extra work, administration etc of having a cluster, I don’t think it’s a valid investment.
My suggestion for you is instead to spend part of the money that you would spend on the cluster to buy two hefty servers with RAID and a battery-backed controller, and run simple master->slave replication the same way you have already started.
And if that isn’t enough in a couple of years then you can start thinking about cluster. :wink: