Peter remarks in his blog that write scaling can be a serious problem regarding MySQL replication (especially replication delay) since the replication works single threaded which means only one CPU on the slave side may effectively process the incoming replication data which is a problem, if the master is a e.g. a 8 core system with 20 hard drives:
http://www.mysqlperformanceblog.com/2006/07/07/thoughts-on-m ysql-replication/
Now, we exactly ran into this problem recently while inserting 220 million rows of stock realtime data each day into our so called “intraday” tables - which means that the data is cleared every night and only a small amount of historic data is written to some other tables.
The beginning of each day is no problem since the intraday tables are empty and both INSERTs and UPDATESs are fast of course. As tables get filled the slave significantly falls back to a peak of 1 hour!! delay.
Both machines have identical hardware (4 disks RAID 10, dual core xeon, 8 GB RAM, mysql 4.0). Tables are InnoDB, manually partitioned by MOD’ing integer primary keys and quite small in row size:
Into these types of tables, bulk inserts are executed:
CREATE TABLE intraday_0 ( instruments_id int(10) unsigned NOT NULL default ‘0’, hour tinyint(2) unsigned zerofill NOT NULL default ‘00’, minute tinyint(2) unsigned zerofill NOT NULL default ‘00’, second tinyint(2) unsigned zerofill NOT NULL default ‘00’, bid float unsigned NOT NULL default ‘0’, KEY instruments_id (instruments_id,hour,minute)) TYPE=InnoDB;
These types of tables are constantly updated with the most recent values:
CREATE TABLE snapshot_0 ( instruments_id int(10) unsigned NOT NULL default ‘0’, date date NOT NULL default ‘0000-00-00’, bid float unsigned NOT NULL default ‘0’, bid_high float unsigned NOT NULL default ‘0’, bid_low float unsigned NOT NULL default ‘0’, bid_open float unsigned NOT NULL default ‘0’, bid_close float unsigned NOT NULL default ‘0’, bid_trend enum(‘up’,‘down’,‘inv’) NOT NULL default ‘inv’, bid_time time NOT NULL default ‘00:00:00’, ask float unsigned NOT NULL default ‘0’, ask_high float unsigned NOT NULL default ‘0’, ask_low float unsigned NOT NULL default ‘0’, ask_open float unsigned NOT NULL default ‘0’, ask_close float unsigned NOT NULL default ‘0’, ask_trend enum(‘up’,‘down’,‘inv’) NOT NULL default ‘inv’, ask_time time NOT NULL default ‘00:00:00’, last float unsigned NOT NULL default ‘0’, last_high float unsigned NOT NULL default ‘0’, last_low float unsigned NOT NULL default ‘0’, last_open float unsigned NOT NULL default ‘0’, last_close float unsigned NOT NULL default ‘0’, last_trend enum(‘up’,‘down’,‘inv’) NOT NULL default ‘inv’, last_time time NOT NULL default ‘00:00:00’, volume_last_unit bigint(20) unsigned NOT NULL default ‘0’, volume_cumulative_unit bigint(20) unsigned NOT NULL default ‘0’, volume_cumulative_price bigint(20) unsigned NOT NULL default ‘0’, PRIMARY KEY (instruments_id)) ENGINE=InnoDB;
InnoDB settings:
innodb_data_home_dir = /var/lib/mysqlinnodb_data_file_path = ibdata1:60Ginnodb_log_file_size = 2000Minnodb_log_buffer_size = 8Minnodb_buffer_pool_size = 1800Minnodb_additional_mem_pool_size = 16Minnodb_file_io_threads = 4innodb_lock_wait_timeout = 50innodb_flush_log_at_trx_commit = 0innodb_flush_method = O_DIRECT
To solve this problem I am considering the following two solutions:
- Set up a true master-master replication and distribute the write load among two servers: each server receives half of all inserts / updates
- In this case the replication thread of both servers would only need to insert / update half of the data - am I right? )
- buy new machines with 16 RAID 10 disks
- although Peter remarked that more disks and cpu’s won’t help, I think more disks working in a RAID 10 will certainly do. of course only one cpu will work on the replication thread, but data will be inserted / updated much faster with 16 RAID 10 disks instead of 4 - am I right? )
Thanks for any further suggestions about distributing write load or enhancing write speed in general.