I’ve been trying to hunt down this issue for a few days and believe that I have found an issue. Here is my setup for the particular cluster I have been having issues with:
4 Total Servers
- 2 Masters in Active/Passive M-M
- 2 slaves (each off of 1 master)
- All running Percona MySQL 5.1.54
- Statement Based Replication
- All writes go to single master - replicated to passive master and primary slave
Documentation says that when using SBR and you know the number of rows to be inserted and those are all the same type, innodb_autoinc_lock_mode 0 OR 1 should be safe. I have been using mode 1 (default) and have seen my slaves get out of sync rapidly (when I define a few skip-slaves) or die on duplicate primary key. I verified that the only user connecting was the mysql system user (replication thread) and found that the PKs were different on the master and the slave.
The query that is causing the issue is a multi row insert, where no PKs are specified, so I would assume it is not multi-type (this could be the problem)?
Here is some sample data (with column names changed):
Master Data
pk, fk_1, fk_2, fk_3, att_1, att_2, fk_4, att_3, att_4
885357437, 105516509, 186757, 230399771, ‘0’, ‘new’, 0, 2011-02-21 11:57:15, 1298311035
885357431, 2453251, 186757, 230399771, ‘1’, ‘sent’, 0, 2011-02-21 11:57:15, 1298311035
Master Binlog Events:
‘mysql-bin.000047’, 214491072, ‘Query’, 5001, 214491155, ‘BEGIN’
‘mysql-bin.000047’, 214491155, ‘Intvar’, 5001, 214491183, ‘INSERT_ID=885357431’
‘mysql-bin.000047’, 214491183, ‘Query’, 5001, 214491470, ‘use my_db
; INSERT INTO table_1
(fk_1, fk_2, fk_3, att_1, att_2, att_4) VALUES (‘2453251’, 186757, 230399771, 1, ‘sent’, 1298311035),(‘105516509’, 186757, 230399771, 0, ‘new’, 1298311035)’
‘mysql-bin.000047’, 214491470, ‘Xid’, 5001, 214491497, ‘COMMIT /* xid=12719492 */’
…
‘mysql-bin.000047’, 214494305, ‘Query’, 5001, 214494388, ‘BEGIN’
‘mysql-bin.000047’, 214494388, ‘Intvar’, 5001, 214494416, ‘INSERT_ID=885357433’
‘mysql-bin.000047’, 214494416, ‘Query’, 5001, 214494704, ‘use my_db
; INSERT INTO table_1
(fk_1, fk_2, fk_3, att_1, att_2, att_4) VALUES (‘105620951’, 186757, 230399773, 1, ‘sent’, 1298311040),(‘38049876’, 186757, 230399773, 0, ‘new’, 1298311040)’
‘mysql-bin.000047’, 214494704, ‘Xid’, 5001, 214494731, ‘COMMIT /* xid=12719926 */’
Slave Data:
885357433, 105516509, 186757, 230399771, ‘0’, ‘new’, 0, 2011-02-21 11:57:15, 1298311035
885357431, 2453251, 186757, 230399771, ‘1’, ‘sent’, 0, 2011-02-21 11:57:15, 1298311035
As you can see, the slave got the event with 2 rows and did the inserts with 2, sequential IDs (as expected). However, the master did not and then later tried to do another insert with the already used PK and killed replication.
After changing the mode from 1 to 0 (traditional using table-level AUTO-INC locks), this issue has vanished. I’m wondering if this is a Percona/MySQL/XtraDB bug or if my inserts are technically considered multi-type (in which case documentation is correct in saying it is not safe for mode 1).
Any thoughts or comments would be appreciated!
Thanks