Hello all,
Sorry in advance for the long post. I’m having issues with a stored procedure causing certification errors when testing a 3 nodes Percona XtraDB cluster using multiple concurent connection threads. The affected stored procedure does the following: 1/ create table if it doesn’t exist 2/ run multiple select statements to set variables to be inserted in the table 3/ delete all data in the table
And creates the table with: CREATE TABLE IF NOT EXISTS my-sp-table ( var1 INT NOT NULL AUTO_INCREMENT, var2 BINARY(16), var3 INT, var4 INT, var5 BIGINT, PRIMARY KEY (var1) ) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = InnoDB;
All writes are done on the cluster node with wsrep_local_index = 0 (wsrep_gcomm_uuid is 88487ed5-85ae-11ea-87d9-e7555a59a6c2) while reads are going to one fo the other cluster nodes. So far, regardless of number of concurent connection threads, table always gets created without error. Same as inserts. Deletes from the table however, usually generates more certification errors as the number of concurent connection threads increases.
I have tried various config changes including setting wsrep_certification_rules from STRICT to OPTIMIZED, try different values of wsrep_slave_threads, wsrep_retry_autocommit and innodb_autoinc_lock_mode without any luck.
I have been banging my head against this problem for weeks so any insights or suggestions will be greatly appreciated. Looking at the logs (see below. obfuscated for obvious reason), it looks like certification fails on the writer node itself when it tries the apply the transactions. I will be very grateful to anyone who can shine some light on this issue and help me resolve it.
-Bob
************************* Cluster Settings *************************Variable_name: wsrep_provider_name Value: Galera
Variable_name: wsrep_provider_vendor
Value: Codership Oy <info@codership.com>
Variable_name: wsrep_provider_version Value: 3.41(rb3295e6)
Server version 5.7.28-31-57.2-log Percona XtraDB Cluster (GPL), Release rel31, Revision 939b65f, WSREP version 31.41, wsrep_31.41
OS: Ubuntu 18.04.4 LTS
SHOW GLOBAL STATUS LIKE ‘wsrep_local_%r_s’;±--------------------------±------+| Variable_name | Value |±--------------------------±------+| wsrep_local_cert_failures | 5 || wsrep_local_bf_aborts | 0 |±--------------------------±------+2 rows in set (0.01 sec)
show variables like “wsrep_slave_threads”;±--------------------±------+| Variable_name | Value |±--------------------±------+| wsrep_slave_threads | 128 |±--------------------±------+1 row in set (0.00 sec)
show variables like “%innodb_autoinc_lock_mode%”;±-------------------------±------+| Variable_name | Value |±-------------------------±------+| innodb_autoinc_lock_mode | 2 |±-------------------------±------+1 row in set (0.01 sec)
show variables like “wsrep_auto_increment_control”;±-----------------------------±------+| Variable_name | Value |±-----------------------------±------+| wsrep_auto_increment_control | ON |±-----------------------------±------+1 row in set (0.00 sec)
show variables like “wsrep_slave_fk_checks”;±----------------------±------+| Variable_name | Value |±----------------------±------+| wsrep_slave_FK_checks | ON |±----------------------±------+1 row in set (0.00 sec)
show variables like “%foreign%”;±-------------------±------+| Variable_name | Value |±-------------------±------+| foreign_key_checks | ON |±-------------------±------+1 row in set (0.00 sec)
show variables like “%auto_increment_increment%”;±-------------------------±------+| Variable_name | Value |±-------------------------±------+| auto_increment_increment | 3 |±-------------------------±------+1 row in set (0.01 sec)
show variables like “%wsrep_certification_rules%”;±--------------------------±----------+| Variable_name | Value |±--------------------------±----------+| wsrep_certification_rules | optimized |±--------------------------±----------+1 row in set (0.01 sec)
************************* Logs Relevant Sections *************************===CLUSTER-WRITE_NODE===2020-04-23T19:26:11.784201Z 971 [Note] WSREP: set_query_id(), assigned new next trx id: 1435562020-04-23T19:26:11.790425Z 971 [Note] WSREP: Executing Query (CREATE TABLE IF NOT EXISTS my-sp-table ( var1 INT NOT NULL AUTO_INCREMENT, var2 BINARY(16), var3 INT, var4 INT, var5 BIGINT, PRIMARY KEY (var1) ) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = InnoDB) with write-set (-1) and exec_mode: LOCAL_STATE in TO Isolation mode2020-04-23T19:26:11.792398Z 971 [Note] WSREP: Query (CREATE TABLE IF NOT EXISTS my-sp-table (
var1 INT NOT NULL AUTO_INCREMENT, var2 BINARY(16), var3 INT, var4 INT, var5 BIGINT, PRIMARY KEY (var1) ) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = InnoDB) with write-set (70542) and exec_mode: TOTAL_ORDER replicated in TO Isolation mode2020-04-23T19:26:11.792430Z 971 [Note] WSREP: wsrep: initiating TOI for write set (70542)2020-04-23T19:26:11.793425Z 971 [Note] WSREP: wsrep: completed TOI write set (70542)2020-04-23T19:26:11.793461Z 971 [Note] WSREP: Setting WSREPXid (InnoDB): f3dfd56b-75e0-11ea-ade3-3246be302bf4:705422020-04-23T19:26:11.795324Z 971 [Note] WSREP: Completed query (CREATE TABLE IF NOT EXISTS my-sp-table ( var1 INT NOT NULL AUTO_INCREMENT, var2 BINARY(16), var3 INT, var4 INT, var5 BIGINT, PRIMARY KEY (var1) ) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = InnoDB) replication with write-set (70542) and exec_mode: TOTAL_ORDER in TO Isolation mode2020-04-23T19:26:11.795364Z 971 [Note] WSREP: Assigned new next trx-id (146606) to Store-Procedure execution2020-04-23T19:26:11.796132Z 971 [Note] WSREP: cleanup transaction for LOCAL_STATE: SELECT id FROM other_table WHERE org-id = _org_id AND status = my_status ORDER BY display_order LIMIT 1 INTO @var42020-04-23T19:26:11.796188Z 971 [Note] WSREP: Assigned new next trx-id (149256) to Store-Procedure execution2020-04-23T19:26:11.830587Z 971 [Note] WSREP: cleanup transaction for LOCAL_STATE: CALL my-sp-table(‘5964’, ‘in’, ‘2020-03-24 05:00:00’, ‘2020-04-24 04:59:59’)2020-04-23T19:26:11.830625Z 971 [Note] WSREP: Assigned new next trx-id (149436) to Store-Procedure execution2020-04-23T19:26:12.146230Z 971 [Note] WSREP: wsrep: replicating commit (-1)2020-04-23T19:26:12.146542Z 971 [Note] WSREP: wsrep: initiating replication for write set (-1)2020-04-23T19:26:12.158894Z 971 [Note] WSREP: wsrep: write set replicated (70551)2020-04-23T19:26:12.158942Z 971 [Note] WSREP: wsrep: initiating pre-commit for write set (70551)2020-04-23T19:26:12.158990Z 971 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 1 state: CERTIFYING flags: 1 conn_id: 971 trx_id: 149436 seqnos (l: 135, g: 70551, s: 70549, d: 70537, ts: 10652749573830401) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 1 state: APPLYING flags: 65 conn_id: 977 trx_id: -1 seqnos (l: 134, g: 70550, s: 70549, d: 70549, ts: 10652749571481014)2020-04-23T19:26:12.159013Z 971 [Note] WSREP: pre-commit action failed for reason: WSREP_TRX_FAIL THD: 971 Query: COMMIT2020-04-23T19:26:12.159023Z 971 [Note] WSREP: conflict state: NO_CONFLICT2020-04-23T19:26:12.159031Z 971 [Note] WSREP: --------- CONFLICT DETECTED --------2020-04-23T19:26:12.159039Z 971 [Note] WSREP: cluster conflict due to certification failure for threads:2020-04-23T19:26:12.159050Z 971 [Note] WSREP: Victim thread: THD: 971, mode: local, state: executing, conflict: cert failure, seqno: 70551 SQL: COMMIT2020-04-23T19:26:12.196016Z 971 [Note] WSREP: cleanup transaction for LOCAL_STATE: COMMIT2020-04-23T19:26:12.196039Z 971 [Note] WSREP: COMMIT command failed, MDL released: 9712020-04-23T19:26:12.196090Z 971 [Note] WSREP: Assigned new next trx-id (404507) to Store-Procedure execution
===CLUSTER-WRITE_NODE===2020-04-23T19:26:12.158990Z 971 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 1 state: CERTIFYING flags: 1 conn_id: 971 trx_id: 149436 seqnos (l: 135, g: 70551, s: 70549, d: 70537, ts: 10652749573830401) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 1 state: APPLYING flags: 65 conn_id: 977 trx_id: -1 seqnos (l: 134, g: 70550, s: 70549, d: 70549, ts: 10652749571481014)
===CLUSTER-READ-NODE1===2020-04-23T19:26:13.837102Z 63 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 1 conn_id: 1007 trx_id: 2177603 seqnos (l: 235, g: 70644, s: 70642, d: 70598, ts: 10652751242588179) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 65 conn_id: 1017 trx_id: -1 seqnos (l: 234, g: 70643, s: 70642, d: 70642, ts: 10652751239191817)
===CLUSTER-READ-NODE2===2020-04-23T19:26:12.161926Z 86 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 1 conn_id: 971 trx_id: 149436 seqnos (l: 131, g: 70551, s: 70549, d: 70537, ts: 10652749573830401) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 65 conn_id: 977 trx_id: -1 seqnos (l: 130, g: 70550, s: 70549, d: 70549, ts: 10652749571481014)
2020-04-23T19:26:12.408773Z 3 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 1 conn_id: 974 trx_id: 299394 seqnos (l: 155, g: 70574, s: 70572, d: 70544, ts: 10652749822545683) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 65 conn_id: 998 trx_id: -1 seqnos (l: 154, g: 70573, s: 70572, d: 70572, ts: 10652749820871318)
2020-04-23T19:26:12.434432Z 44 [Note] WSREP: SH-EX trx conflict for key (0,FLAT8)443c93c2 f2b030b8: source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 1 conn_id: 973 trx_id: 287623 seqnos (l: 159, g: 70577, s: 70575, d: 70546, ts: 10652749847941914) <—> source: 86018775-8591-11ea-88a6-abfdd6aa92f2 version: 4 local: 0 state: CERTIFYING flags: 65 conn_id: 1000 trx_id: -1 seqnos (l: 158, g: 70576, s: 70575, d: 70575, ts: 10652749845781985)