IOWait so high

Hi again community.
I’m getting Disk I/O overload alerts from Zabbix because our xtrabackup process is taking more and more time, and we want to know if our settings are OK or if there is some tuning we can do to avoid this issue.

How is it possible that it spends all this time calculating the LSN?

We do backups every 30 minutes: one full backup at midnight, and then every 30 minutes an incremental (differences) backup so we can recover more recent data.
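Our schedule is roughly equivalent to the following (the paths and target directories here are illustrative only; our real script streams via xbstream over SSH instead of writing to a local target dir):

```shell
# Full backup at midnight (illustrative local paths, not our real setup)
xtrabackup --backup --target-dir=/backups/full

# Every 30 minutes: incremental backup against the last full backup.
# xtrabackup copies only the pages whose LSN is newer than the
# full backup's final LSN, which is why it scans to compare LSNs.
xtrabackup --backup \
    --target-dir=/backups/inc_0030 \
    --incremental-basedir=/backups/full
```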

This creates disk reads of 40–60 MB/s with IOWait at 90–99% for approximately 3 minutes.

Here is what we got
1 Percona server - 4 CPUs, 6 GB RAM, disk in RAID 5 on SSD over SAN
1 Percona slave - 8 CPUs, 10 GB RAM, disk in RAID 5 on SSD over SAN

Our slave RAM right now is:

              total        used        free      shared  buff/cache   available
Mem:           9866        5481        2604         392        1780        3831
Swap:          3967         109        3858
- Our application runs only against the Percona master server.
- The slave is, of course, for replication and for creating the backups.
The Percona slave configuration is:

# Percona Custom server configuration
#
# Ansible managed

# Percona 5x specific config
innodb_temp_data_file_path=…/ibgtemp/gtemp:50M:autoextend:max:500M

Our backup script runs with these values: 80% of capacity for the parallel, compress, and encrypt processes, using 7 CPUs. It takes the files and sends them by xbstream over SSH to a repository on another server.

Logs from the script using xtrabackup:


xtrabackup: recognized server arguments: --log_bin=/mysql/binlogs/binlog_node2 --server-id=2 --innodb_undo_directory=/mysql/innodb/ibundologs --innodb_log_files_in_group=5 --innodb_log_group_home_dir=/mysql/innodb/ibredologs --innodb_write_io_threads=8 --innodb_buffer_pool_size=3G --innodb_flush_method=O_DIRECT --innodb_data_home_dir=/mysql/innodb/ibdata --innodb_log_file_size=250M --innodb_autoextend_increment=50 --innodb_flush_log_at_trx_commit=1 --innodb_data_file_path=data1:100M:autoextend --innodb_read_io_threads=8 --innodb_file_per_table=1 --datadir=/mysql/data --parallel=4

xtrabackup: recognized client arguments: --socket=/mysql/mysql.sock --backup=1 --user=xbuser --password=* --stream=xbstream --lock-ddl=1 --slave-info=1 --compress --compress-threads=4 --encrypt=AES256 --encrypt-key=* --encrypt-threads=4


Any ideas?


First, IOWait is not a very good metric to alert on. When you copy data, spending most of the time waiting on disk I/O is exactly the expected outcome. Looking at the disk I/O latencies and ensuring they do not spike while the backup is running is a much better choice.
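One way to watch latency instead of IOWait is extended device statistics (a sketch; the device name is an assumption for your system):

```shell
# Sample extended device stats every 5 seconds while the backup runs.
# r_await / w_await are average read/write latencies in milliseconds.
# It is normal for %util to approach 100% during a bulk copy; what
# matters is whether await spikes, since that is what slows other queries.
iostat -x 5 /dev/sda
```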

If your disk is indeed overloaded, or if you do not want to adjust your monitoring, you can enable throttling to back up the data at the desired speed.
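In xtrabackup this is the --throttle option, which limits the number of read/write I/O operation pairs per second during the backup. A minimal sketch (the value 40 is just a starting point to tune against your SAN, not a recommendation):

```shell
# Limit the backup to roughly 40 I/O operation pairs per second.
# Add --throttle to the existing xtrabackup invocation and adjust the
# value until backup duration and latency on the SAN are both acceptable.
xtrabackup --backup --stream=xbstream --throttle=40
```

The trade-off is that throttling makes the backup take longer, so for a 30-minute incremental schedule the backup window still needs to fit inside the interval.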

Thanks Peter…