History list length / RSEG


I am currently experiencing problems making backups of some of our heavy-write-load databases.

Our backup strategy is making full dumps in a single transaction using mysqldump. During that time the history list grows rapidly, which is expected with tx_isolation = REPEATABLE-READ.
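For reference, the dump is taken roughly like this (the target path and the gzip compression are placeholders, not the exact command from our scripts):

```shell
# Full dump in one consistent REPEATABLE READ snapshot;
# --quick streams rows instead of buffering whole tables in memory.
# Output path and gzip are placeholders.
mysqldump --single-transaction --quick --all-databases \
  | gzip > /backup/full-$(date +%F).sql.gz
```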

It now seems that the undo space is “swapped out” to ibdata1.

After the backup is finished I end up with a server doing heavy disk read I/O from the undo space to “feed” the purge thread so it can work off the history list.

This conflicts with the heavy regular write workload on the server, ending in a max_connections situation and a more or less useless server that needs time without any workload to purge the history list.

How can this problem be solved?

Possible known options:

  • Do the backup on the replication slave, which I don't like because of data consistency
  • Do the backup with XtraBackup, which I don't like because it wastes a lot of backup space

The server is a 24 GB box with RAID 6 (4 disks).
5.1.54-rel12.5-log (Percona Server (GPL), 12.5 , Revision 188)

According to http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size/ the peak-time write load (not during backup) is

(none)@srv017224> show engine innodb status\G select sleep(60); show engine innodb status\G
Log sequence number 2477563846622
1 row in set (0.01 sec)

1 row in set (1 min 0.04 sec)

Log sequence number 2477612161440
1 row in set (0.00 sec)

(none)@srv017224> select (2477612161440 - 2477563846622) / 1024 / 1024 as MB_per_min;
+-------------+
| MB_per_min  |
+-------------+
| 46.07660103 |
+-------------+
1 row in set (0.00 sec)

The current innodb_log_file_size=288MB is definitely too small, but that should not be the cause of this problem, right?
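Plugging the measured LSN delta into that blog post's rule of thumb (the redo logs should hold roughly an hour of writes, split across innodb_log_files_in_group, default 2) gives something like the following; the LSN values are the ones from the session above:

```shell
# LSN values captured 60 s apart (taken from the session above)
lsn1=2477563846622
lsn2=2477612161440
mb_per_min=$(( (lsn2 - lsn1) / 1024 / 1024 ))   # ~46 MB of redo per minute
# One hour of redo, split across the (default) 2 log files:
echo "suggested innodb_log_file_size = $(( mb_per_min * 60 / 2 ))M"
```

So roughly 1.4 GB per log file at this write rate, versus the current 288 MB.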

Another “funny” thing is the following on one of the servers:

Trx id counter 2C1CAD773
Purge done for trx’s n:o < 2C1CAD6E4 undo n:o < 0
History list length 95

information_schema@srv017224> select * from INNODB_RSEG;
+---------+----------+----------+---------+------------+-----------+
| rseg_id | space_id | zip_size | page_no | max_size   | curr_size |
+---------+----------+----------+---------+------------+-----------+
|       0 |        0 |        0 |       6 | 4294967294 |   2889502 |
+---------+----------+----------+---------+------------+-----------+
If the history list length is only 95, how is it possible to have such a huge RSEG?
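A quick way to watch whether purge is keeping up is to poll the status output for that counter (connection options omitted; a sketch, not a monitoring solution):

```shell
# Print the history list length once a minute. The awk pattern matches
# the "History list length N" line of SHOW ENGINE INNODB STATUS.
while true; do
  hll=$(mysql -e 'SHOW ENGINE INNODB STATUS\G' \
        | awk '/History list length/ {print $NF}')
  echo "$(date +%T) history list length: $hll"
  sleep 60
done
```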

It's a little bit confusing.

Best Regards

SyncMaster72 wrote on Tue, 08 February 2011 20:22

Which consistency?
That you started the backup a millisecond earlier :wink: (compared to a non-replicated backup on the master).

Just monitor the master-slave replication so that you know it hasn't fallen behind and you get an up-to-date backup.

I set up 3 master-slave pairs that way about 3.5 years ago (when xtrabackup wasn't available) and they've only lost sync a handful of times since. About 70% of those times it was due to running out of disk on either the master or the slave (before monitoring was implemented at the OS level, don't ask :wink: ); the rest of the times the application issued some statement that failed on the master but still somehow got replicated to the slave, and basically all I did in those cases was tell the SQL_THREAD to skip the next statement and continue.

Now the most heavily loaded of these server pairs averages about 3,500 qps 24x7 with about 30% writes. A mysqldump takes about 45 minutes, and since that doesn't affect the master in any way, there is a lot of increased uptime compared to a backup dump on the master every night.

The only thing is that the slave is about 45 minutes behind the master after a backup and takes about 2 hours to catch up (mostly due to single-threaded replication); after that they are just milliseconds apart.

I don't care about a possible replication lag.

As you mentioned yourself, there are always situations where replication fails, and this can happen without even being noticed.

The project where these servers are running is a shard of 34 master-slave pairs with ~3 kqps and almost 95% writes, due to maximum memcache optimization.

In my opinion the data on the master should always be considered the only valid reference data and should therefore also be used as the backup source.
Switching over to the slave is only an emergency option, and doing so then defines the data on the new master as the valid copy.

But if you have to switch, you are in deep trouble if you haven't enabled the log-slave-updates option.

Doing the backups on the slave is just a temporary workaround and is absolutely not acceptable as the final solution for me.

I know of mk-table-checksum but have never used it in production.
How much impact does it have?