HDD 100% utilisation problem, CPU WA over 90% on all cores

I have a big problem. I finally traced our next bottleneck, which is the HDD. First the proof, then the problem:

Proof

Not from peak load:

iostat -x 1 10

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.53    0.00    1.26   48.74    0.00   47.47

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda        0.00   40.00   70.00  263.00   784.00  2504.00     9.87    35.64  114.39    3.00  100.00
sdb        0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.29    0.00    2.03   32.91    0.00   61.77

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda        0.99    3.96  119.80    0.00  1275.25     0.00    10.64     3.82   25.17    8.19   98.12
sdb        0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.05    0.00    2.28   36.71    0.00   56.96

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda        0.00    0.00   95.00    2.00  1136.00    48.00    12.21     3.34   42.62   10.23   99.20
sdb        0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.58    0.00    1.53   41.73    0.00   52.16

The CPU spends ~40% of its time waiting for the disk, and disk utilisation is 100%.

Am I right that this is HDD saturation?

Cpu0 :  7.1%us,  2.4%sy,  0.0%ni, 38.3%id, 50.5%wa,  0.0%hi,  1.7%si,  0.0%st
Cpu1 :  2.4%us,  1.0%sy,  0.0%ni,  3.1%id, 93.2%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2 :  0.7%us,  1.0%sy,  0.0%ni, 76.1%id, 22.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3 :  1.0%us,  0.3%sy,  0.0%ni, 84.7%id, 13.9%wa,  0.0%hi,  0.0%si,  0.0%st
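
For anyone who wants to check the same thing on their own box, here is a rough Python sketch of the reasoning: a disk looks saturated when %util is near 100 and requests are queueing (avgqu-sz well above zero, await much larger than svctm). The thresholds util_limit and queue_limit are my own assumptions, not official values, and the column order is assumed to match the older sysstat layout shown in the iostat output above.

```python
# Sketch: flag a saturated disk from one `iostat -x` device row.
# Assumes the older sysstat column layout (rsec/s, wsec/s, svctm present).
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]


def parse_device_row(line):
    """Split one device row into (device_name, {column: float})."""
    fields = line.split()
    name = fields[0]
    values = [float(v) for v in fields[1:]]
    if len(values) != len(COLUMNS):
        raise ValueError("unexpected column count: %d" % len(values))
    return name, dict(zip(COLUMNS, values))


def looks_saturated(stats, util_limit=95.0, queue_limit=1.0):
    """Hypothetical rule of thumb: busy almost all the time AND requests queueing."""
    return stats["%util"] >= util_limit and stats["avgqu-sz"] >= queue_limit


# First sda row and the sdb row from the iostat output in this post.
sda_name, sda = parse_device_row(
    "sda 0.00 40.00 70.00 263.00 784.00 2504.00 9.87 35.64 114.39 3.00 100.00")
sdb_name, sdb = parse_device_row(
    "sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00")

for name, stats in ((sda_name, sda), (sdb_name, sdb)):
    print(name, "saturated:", looks_saturated(stats))

# await far above svctm means most of the latency is time spent in the queue.
print("sda await/svctm ratio: %.1f" % (sda["await"] / sda["svctm"]))
```

With the numbers above, sda trips both conditions while sdb is idle, which fits the picture of one overloaded disk next to an unused one.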

Solution
The solution is a problem too: our DC does not allow a third HDD in the server (otherwise we would build a RAID 0 from the unused disk (sdb) and the new one; low budget).
But we cannot, as our DC is not flexible about hardware changes.

So what can we do? We have the unused sdb; is there a good technique for splitting MySQL data across two HDDs without RAID?

My idea: move some tables from the MySQL data folder to sdb and create symbolic links where they were before (the question is whether MySQL treats symbolic links like regular files).
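
A minimal Python sketch of the move-and-symlink idea, run here on throwaway temp directories only. The helper name relocate_with_symlink and the directory names are hypothetical. On a real server I would assume MySQL has to be stopped first and that file ownership and permissions must stay the same after the move; whether symlinks are honoured depends on the storage engine and MySQL version, so treat this as an illustration, not a procedure.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(src_dir, dest_parent):
    """Move src_dir under dest_parent and leave a symlink at the old path."""
    src_dir = os.path.abspath(os.path.normpath(src_dir))
    dest = os.path.join(os.path.abspath(dest_parent), os.path.basename(src_dir))
    if os.path.exists(dest):
        raise FileExistsError(dest)
    shutil.move(src_dir, dest)                 # the data now lives on the other disk
    os.symlink(dest, src_dir, target_is_directory=True)  # old path still resolves
    return dest


# Demonstration with stand-in directories for the two disks.
with tempfile.TemporaryDirectory() as root:
    datadir = os.path.join(root, "sda_mysql_data")   # stand-in for the data folder
    sdb_mount = os.path.join(root, "sdb_mount")      # stand-in for the sdb mount
    os.makedirs(os.path.join(datadir, "maindb"))
    os.makedirs(sdb_mount)
    with open(os.path.join(datadir, "maindb", "t1.MYD"), "w") as f:
        f.write("rows")

    old_path = os.path.join(datadir, "maindb")
    new_path = relocate_with_symlink(old_path, sdb_mount)

    print("old path is a symlink:", os.path.islink(old_path))
    print("file reachable via old path:",
          os.path.isfile(os.path.join(old_path, "t1.MYD")))
```

The point of keeping the old path as a symlink is that nothing referring to the original location has to change; only the physical disk serving the files does.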

We could also move the binlog from sda to sdb.

Any other ideas?

Here are the results of vmstat 1 (maybe swappiness?)

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 0  0 125900 253252  30304 2303804   44    0   132     0 1245  379  0  0 96  3  0
 0  1 125900 253132  30304 2303804    0    0    68     0 1279  455  0  0 98  2  0
 0  0 125900 253012  30316 2303944    0    0   152     0 1291  491  9  1 85  5  0
 0  0 125900 252464  30392 2304088    0    0   320     0 1352  535  1  0 91  8  0
 0  0 125900 252344  30408 2304468    0    0   168   248 1275  428  1  0 95  4  0
 0  0 125900 252284  30416 2304460    0    0    52    12 1394  610  1  0 97  1  0
 0  0 125900 252044  30496 2304720    0    0   292    36 1268  446  1  0 91  8  0
 0  2 125900 251744  30520 2304916    0    0   348     0 1418  702  1  0 90  9  0
 0  0 125900 251444  30520 2305164    0    0   284     0 1344  559  1  0 90 10  0
 0  1 125900 251444  30540 2305364    0    0    68   208 1315  512  1  0 97  2  0
 0  0 125900 251012  30540 2305448    0    0    64     0 1214  351  1  0 98  2  0
 0  0 125900 250892  30540 2305448    0    0   108     0 1338  572  3  0 94  3  0
 0  0 125900 250832  30540 2305564    0    0    56     0 1089  151  0  0 98  2  0
 0  0 125896 250404  30540 2305564    4    0   128     0 1443  672  2  1 93  4  0
 0  0 125896 250404  30564 2305724    0    0    20   504 1360  495  1  0 97  2  0
 0  0 125896 250404  30580 2305708    0    0    36     0 1268  463  2  0 97  1  0
 0  0 125896 250228  30580 2306084    0    0   268     0 1337  537  1  0 90 10  0
 1  0 125896 249724  30580 2306084    0    0    48     0 1182  290  1  1 97  2  0
 0  0 125896 249312  30584 2306440    0    0   360     0 1371  566  2  0 92  7  0
 0  0 125896 249132  30600 2306644    0    0   216   320 1495  750  2  1 92  6  0
 0  0 125896 249072  30600 2306696    0    0    60     0 1396  609  1  0 97  2  0
 0  1 125896 248652  30720 2307016    0    0   576     0 1399  680  2  0 90  8  0
 0  0 125896 248532  30728 2307240    0    0    68    16 1260  439  0  0 98  2  0
 0  0 125896 248400  30728 2307240    0    0    48     0 1381  582  2  1 96  2  0
 1  0 125896 248408  30744 2307320    0    0     8   376 1413  535  1  0 98  1  0
 0  0 125896 248288  30744 2307320    0    0    20     0 1398  653  1  0 98  1  0
 0  1 125896 247964  30748 2307380    0    0   104     0 1233  465  2  0 94  4  0
 0  0 125896 247364  30748 2307820    0    0   376     0 1390  630  1  0 90  8  0
 0  0 125896 247372  30764 2307848    0    0     0   100 1213  361  0  0 99  0  0
 0  0 125892 247124  30780 2307836    0    0    12   264 1425  617  2  1 96  1  0
 1  0 125892 246764  30780 2308092    0    0   352     0 1292  470  1  0 97  3  0
 0  0 125892 246460  30780 2308312    0    0   200     0 1389  614  1  0 96  3  0
 0  0 125892 246460  30780 2308432    0    0     8     0 1244  402  1  1 98  0  0
 1  1 125852 244880  30804 2309464   84    0  1288     0 1501  701  2  0 92  6  0

The system is now under almost no load (late night), but how should I interpret these values? The swap columns are mostly 0 0, but as you can see swap-in goes up to 44, then 4, and then 84. How should I interpret these vmstat results?
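
One way to read the swap columns is to pull out si (swapped in from disk) and so (swapped out to disk) and see how often they are non-zero. The Python sketch below does that for a few of the vmstat rows from this post. It assumes the standard 17-column vmstat layout (r b swpd free buff cache si so bi bo in cs us sy id wa st) and the default units, where si/so are KiB per second as far as I know.

```python
# Sketch: summarise swap activity from `vmstat 1` rows.
# Assumes the standard 17-column layout; header lines are skipped.
FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat(lines):
    """Return one dict per data row, ignoring anything that is not 17 integers."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS) or not all(p.isdigit() for p in parts):
            continue
        rows.append(dict(zip(FIELDS, map(int, parts))))
    return rows


# Four rows taken from the vmstat output in this post.
sample = [
    "procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------",
    "0 0 125900 253252 30304 2303804 44 0 132 0 1245 379 0 0 96 3 0",
    "0 1 125900 253132 30304 2303804 0 0 68 0 1279 455 0 0 98 2 0",
    "0 0 125896 250404 30540 2305564 4 0 128 0 1443 672 2 1 93 4 0",
    "1 1 125852 244880 30804 2309464 84 0 1288 0 1501 701 2 0 92 6 0",
]

rows = parse_vmstat(sample)
swap_in = [r["si"] for r in rows if r["si"] > 0]
swap_out_total = sum(r["so"] for r in rows)

print("samples with swap-in:", len(swap_in), "of", len(rows))
print("swap-in values:", swap_in, "total:", sum(swap_in))
print("total swap-out:", swap_out_total)
```

Read this way, the spikes are small occasional swap-ins with no swap-out at all, which looks like pages that were swapped out earlier being read back on demand rather than ongoing memory pressure.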

About the system and MySQL: it was all on sda. I have now moved the main database to sdb and symlinked it into /sda/mysqldirectory/data/, and it is working as described at http://dev.mysql.com/doc/refman/5.0/en/symbolic-links.html. I could move all of MySQL to sdb, but wouldn't that be too much? I mean, the /sda/mysqldirectory/mysql-bin.000661 files are quite big and are read by the slave over the network.

Below is iostat after moving the main database to sdb; we will see how it behaves under peak load later today.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.26    0.00    0.00   10.61    0.00   88.13

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda        0.00    0.00    1.00    0.00     8.00     0.00     8.00     0.02   18.00   18.00    1.80
sdb        0.00    0.00   66.00    0.00   840.00     0.00    12.73     0.60    9.05    5.26   34.70

PS
I'm very curious why WA goes up like this under almost no load. Could network overhead (web server (server 1) <==> MySQL (server 2)) be causing WA, or is it only the HDD?

top - 02:45:38 up 273 days, 22:56, 1 user, load average: 0.09, 0.20, 0.43
Tasks: 105 total, 1 running, 103 sleeping, 0 stopped, 1 zombie
Cpu0 :  2.0%us,  0.0%sy,  0.0%ni, 95.6%id,  2.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1 :  0.3%us,  0.0%sy,  0.0%ni, 83.2%id, 16.5%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem: 3973292k total, 3875892k used, 97400k free, 38128k buffers
Swap: 4192956k total, 125476k used, 4067480k free, 2426856k cached

Improve your database so it needs less disk I/O, or install (SLC) SSDs instead.

I moved the main database to the separate, unused sdb and the bottleneck is gone. Thanks for the help.