Hi
I think I have a similar problem. We have a three node Xtradb Cluster (5.6.15-25.5-759.wheezy). Clients Connect to only two nodes via a ipvs load balancer.
Every week we import two times a bunch of data (1500-3000 rows) in a single transaction to only one of the two active nodes. Sometimes we got this bug while we import the data. Sometimes not.
Befor we installed the PXC we used the version directy from galera (5.1.62-23.4 with galera-23.2.1) and got the bug only one time in a year. After we switched to PXC on the 16 june we got the bug three times (23 june, 7 july and 10 july).
If the bug occur we can start the failed node and do the import once again an it runs without a error.
Here is the Log from today:
2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
2014-07-10 10:29:16 29911 [ERROR] WSREP: Failed to apply trx: source: 6afdc17d-0674-11e4-8528-13d0bc22aafe version: 3 local: 0 state: APPLYING flags: 1 conn_id: 44231 trx_id: 415884858 seqno
s (l: 664181, g: 11963863, s: 11963862, d: 11963858, ts: 1978184313228671)
2014-07-10 10:29:16 29911 [ERROR] WSREP: Failed to apply trx 11963863 4 times
2014-07-10 10:29:16 29911 [ERROR] WSREP: Node consistency compromized, aborting...
2014-07-10 10:29:16 29911 [Note] WSREP: Closing send monitor...
2014-07-10 10:29:16 29911 [Note] WSREP: Closed send monitor.
2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: terminating thread
2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: joining thread
2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: closing backend
2014-07-10 10:29:16 29911 [Note] WSREP: view(view_id(NON_PRIM,6afdc17d-0674-11e4-8528-13d0bc22aafe,22) memb {
f4d2e14b-0673-11e4-99b4-d28700569b85,0
} joined {
} left {
} partitioned {
6afdc17d-0674-11e4-8528-13d0bc22aafe,0
a89cedd7-0674-11e4-8e61-ab5da27ec321,0
})
2014-07-10 10:29:16 29911 [Note] WSREP: view((empty))
2014-07-10 10:29:16 29911 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: closed
2014-07-10 10:29:16 29911 [Note] WSREP: Flow-control interval: [16, 16]
2014-07-10 10:29:16 29911 [Note] WSREP: Received NON-PRIMARY.
2014-07-10 10:29:16 29911 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 11963863)
2014-07-10 10:29:16 29911 [Note] WSREP: Received self-leave message.
2014-07-10 10:29:16 29911 [Note] WSREP: Flow-control interval: [0, 0]
2014-07-10 10:29:16 29911 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2014-07-10 10:29:16 29911 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 11963863)
2014-07-10 10:29:16 29911 [Note] WSREP: RECV thread exiting 0: Success
2014-07-10 10:29:16 29911 [Note] WSREP: recv_thread() joined.
2014-07-10 10:29:16 29911 [Note] WSREP: Closing replication queue.
2014-07-10 10:29:16 29911 [Note] WSREP: Closing slave action queue.
2014-07-10 10:29:16 29911 [Note] WSREP: /usr/sbin/mysqld: Terminated.
Aborted
140710 10:29:16 mysqld_safe Number of processes running now: 0
140710 10:29:16 mysqld_safe WSREP: not restarting wsrep node automatically
140710 10:29:16 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
Here the output from the GRA_*.log file:
mysqlbinlog --base64-output=DECODE-ROWS -vvv /root/GRA_2_10902541.log
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#120715 9:45:56 server id 1 end_log_pos 107 Start: binlog v 4, server v 5.5.25-debug-log created 120715 9:45:56 at startup
# Warning: this binlog is either in use or was not closed properly.
ROLLBACK/*!*/;
# at 107
#140707 10:23:29 server id 0 end_log_pos 80 Query thread_id=335318 exec_time=0 error_code=0
SET TIMESTAMP=1404721409/*!*/;
SET @@session.pseudo_thread_id=335318/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=3, @@session.auto_increment_offset=3/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 187
#140707 10:23:29 server id 0 end_log_pos 149 Table_map: `lottohelden`.`kkb` mapped to number 74
ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 129, event_type: 30
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
The create statement:
Create Table: CREATE TABLE `kkb` (
`nummer` bigint(20) unsigned NOT NULL DEFAULT '0',
`letzte_nummer` bigint(20) unsigned DEFAULT NULL,
`user` varchar(12) NOT NULL DEFAULT '',
`betrag` bigint(20) NOT NULL DEFAULT '0',
`gk` tinyint(6) unsigned NOT NULL DEFAULT '0',
`gk_ref` bigint(20) unsigned NOT NULL DEFAULT '0',
`gv_typ` tinyint(6) unsigned NOT NULL DEFAULT '0',
`gv_ref` bigint(20) unsigned NOT NULL DEFAULT '0',
`saldo` bigint(20) NOT NULL DEFAULT '0',
`alter_saldo` bigint(20) NOT NULL DEFAULT '0',
`gebucht_um` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`gebucht_von` varchar(12) NOT NULL DEFAULT '',
`is_storno` tinyint(1) unsigned NOT NULL DEFAULT '0',
`storniert_nummer` bigint(20) unsigned DEFAULT NULL,
`kommentar` bigint(20) unsigned DEFAULT NULL,
`auszahlbar` bigint(20) NOT NULL DEFAULT '0',
PRIMARY KEY (`nummer`),
UNIQUE KEY `storniert_nummer` (`storniert_nummer`),
UNIQUE KEY `letzte_nummer` (`letzte_nummer`),
KEY `gebucht_um` (`gebucht_um`),
KEY `gk` (`gk`),
KEY `gv_typ` (`gv_typ`),
KEY `user` (`user`),
KEY `gv_ref` (`gv_ref`),
KEY `gk_ref` (`gk_ref`),
KEY `is_storno` (`is_storno`),
KEY `gk_ref_2` (`gk_ref`,`gk`),
KEY `user_2` (`user`,`gk`,`nummer`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I found this Post with a similar error message.
[URL=“Percona XtraDB Cluster (PXC): what about GRA_*.log files ? - Percona Database Performance Blog”]http://www.mysqlperformanceblog.com/...ra_-log-files/[/URL]
But Frederic wrote: “In this case it’s obvious why it failed but it’s not always the case” and this is not very helpfull for me ;-(
Do you need any additional information?
regards
ralph