We are looking into migrating some of our MySQL servers to AWS EC2, but our initial tests are not promising.
We attached 4 x EBS volumes (2000 PIOPS each) to an EBS-optimized instance, prewarmed the volumes as Amazon recommends, set up a RAID 0 array with mdadm, created an XFS filesystem on it, restored a backup, and configured the replication parameters (roughly the commands sketched below). The slave is not able to catch up to the master.
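For reference, the array and filesystem were created roughly like this. The chunk size and device names are taken from the mdstat output below; the dd prewarm loop is an approximation of what the AWS docs recommended for new volumes at the time, and the mount point is just an example:

# prewarm each (new, empty) volume by writing every block -- destructive, only for fresh volumes
$ for d in xvdf xvdg xvdh xvdi; do sudo dd if=/dev/zero of=/dev/$d bs=1M; done
# build the 4-disk RAID 0 array with a 256k chunk
$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=256 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
# XFS on top, mounted where the MySQL datadir lives
$ sudo mkfs.xfs /dev/md0
$ sudo mount /dev/md0 /var/lib/mysql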
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid0 xvdi[3] xvdh[2] xvdg[1] xvdf[0]
419429376 blocks super 1.2 256k chunks
unused devices: <none>
iostat output:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.20 0.00 0.60 0.00 3.20 10.67 0.00 1.33 0.00 1.33 1.33 0.08
xvdf 0.00 0.00 1.20 496.20 19.20 16995.20 68.41 1.23 2.47 0.67 2.47 0.66 32.72
xvdg 0.00 0.00 1.20 404.20 19.20 16628.00 82.13 0.85 2.09 0.67 2.10 0.44 17.92
xvdh 0.00 0.00 1.00 427.20 16.00 16776.80 78.43 1.03 2.39 0.00 2.40 0.53 22.64
xvdi 0.00 0.20 0.60 493.80 9.60 16947.20 68.60 1.14 2.30 0.00 2.30 0.63 31.04
md0 0.00 0.00 4.00 1820.40 64.00 67310.40 73.86 0.00 0.00 0.00 0.00 0.00 0.00
Write throughput on md0 is 67310 kB/s (about 67 MB/s), but iotop attributes much less than that to the mysqld threads:
Total DISK READ: 79.73 K/s | Total DISK WRITE: 64.24 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
5464 be/4 mysql 0.00 B/s 886.63 K/s 0.00 % 16.37 % mysqld --basedir=/usr --datadir=/var/lib/mys~ocket=/var/run/mysqld/mysqld.sock --port=3306
5465 be/4 mysql 54.22 K/s 173.82 K/s 0.00 % 3.18 % mysqld --basedir=/usr --datadir=/var/lib/mys~ocket=/var/run/mysqld/mysqld.sock --port=3306
5101 be/4 mysql 19.14 K/s 0.00 B/s 0.00 % 0.09 % mysqld --basedir=/usr --datadir=/var/lib/mys~ocket=/var/run/mysqld/mysqld.sock --port=3306
5100 be/4 mysql 6.38 K/s 11.96 K/s 0.00 % 0.07 % mysqld --basedir=/usr --datadir=/var/lib/mys~ocket=/var/run/mysqld/mysqld.sock --port=3306
I have no idea what is generating the remaining ~63 MB/s of writes. Interestingly, when I stop the slave, the write throughput drops to 0; after starting the slave again, the same strange I/O pattern returns.
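For what it's worth, here is a rough way to cross-check where the writes are coming from, beyond iotop (a sketch: pidstat comes from the sysstat package, and the pgrep call assumes a single mysqld process):

# per-process block I/O rates, 1-second samples
$ pidstat -d 1
# accumulated per-thread totals since iotop started, only threads actually doing I/O
$ sudo iotop -a -o
# kernel I/O counters for the whole mysqld process
$ sudo cat /proc/$(pgrep -x mysqld)/io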
Next, I tried a High I/O instance with ephemeral SSD storage. I used only one of the SSD disks, and MySQL was able to catch up to the master: it wrote about 30 MB/s until it caught up and sustained about 10 MB/s after that.
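(For completeness, that SSD test was just one ephemeral disk formatted and mounted directly, roughly as below; the /dev/xvdb device name is an assumption and may differ on other instance types.)

# single ephemeral SSD, no RAID -- device name assumed
$ sudo mkfs.xfs /dev/xvdb
$ sudo mount /dev/xvdb /var/lib/mysql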
So I suspect the EBS RAID 0 setup, but I have no idea how to fix it. Has anyone else run into this issue?
Both master and slave are running Percona Server 5.5.30-30.1-log (GPL, Release 30.1); the OS is Ubuntu 12.04.