WAN implementation

We have two datacenters with two nodes each and one datacenter with a single node (five nodes in total).

Sometimes an insert on one node would take down all of the other nodes with a “Node consistency compromized” error, bringing down the cluster.
Looking at the logs we could see the error:

[ERROR] Slave SQL: Could not execute Write_rows event on table main.documents; Duplicate entry ‘3729-3882600-01P2017-17040’ for key ‘sequence’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 455, Error_code: 1062

It seemed as though nodes were accepting the same write more than once (from different nodes), and the error was raised when the replicated transaction was applied.

So we investigated, read more about Galera’s WAN implementation, and made some changes: defining segments, tuning the network configuration, and using explicit transactions instead of autocommitted statements (a sketch of that last change follows the configuration below).

[server] (sysctl network settings)

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_slow_start_after_idle = 0

[my.cnf]

innodb_flush_log_at_trx_commit = 2
binlog_row_image = minimal
wsrep_sync_wait = 2

wsrep_provider_options = '
gmcast.segment=1|2|3;
gcache.size=1G;
evs.user_send_window=256;
evs.send_window=512;
evs.suspect_timeout=PT15S;
evs.inactive_timeout=PT45S;
gcs.max_packet_size=1048576'
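
For the “explicit transactions” part, a minimal sketch of what we mean, with hypothetical table and column names (the real schema is different, and the documents table is assumed to have an AUTO_INCREMENT id):

-- Illustrative only: group related writes into one explicit transaction so
-- they replicate as a single Galera writeset, instead of one autocommitted
-- writeset per statement. Table and column names are hypothetical.
START TRANSACTION;
INSERT INTO app.documents (branch_id, customer_id, series, seq_no)
VALUES (3729, 3882600, '01P2017', 17040);
INSERT INTO app.document_lines (document_id, line_no, amount)
VALUES (LAST_INSERT_ID(), 1, 100.00);
COMMIT;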

Despite that, we keep getting this kind of error…

Any thoughts?

Thank you.

Are there any statements running locally on the affected datacenter or node?

At the time, the only node running local statements was the one that stayed up (the DC3 node).
All of the others (2x DC1 and 2x DC2) were not receiving local statements (see the status check after the log excerpt below) and have log entries such as:

2017-10-13T12:54:16.308289Z 5 [ERROR] Slave SQL: Could not execute Write_rows event on table main.documents; Duplicate entry ‘3729-3882600-01P2017-17040’ for key ‘sequence’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 455, Error_code: 1062
2017-10-13T12:54:16.308425Z 5 [Warning] WSREP: RBR event 7 Write_rows apply warning: 121, 15569840
2017-10-13T12:54:16.309214Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 15569840, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2017-10-13T12:54:16.309771Z 5 [ERROR] Slave SQL: Could not execute Write_rows event on table main.documents; Duplicate entry ‘3729-3882600-01P2017-17040’ for key ‘sequence’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 455, Error_code: 1062
2017-10-13T12:54:16.309795Z 5 [Warning] WSREP: RBR event 7 Write_rows apply warning: 121, 15569840
2017-10-13T12:54:16.310171Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 15569840, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2017-10-13T12:54:16.310663Z 5 [ERROR] Slave SQL: Could not execute Write_rows event on table main.documents; Duplicate entry ‘3729-3882600-01P2017-17040’ for key ‘sequence’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 455, Error_code: 1062
2017-10-13T12:54:16.310684Z 5 [Warning] WSREP: RBR event 7 Write_rows apply warning: 121, 15569840
2017-10-13T12:54:16.311046Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 15569840, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2017-10-13T12:54:16.311589Z 5 [ERROR] Slave SQL: Could not execute Write_rows event on table main.documents; Duplicate entry ‘3729-3882600-01P2017-17040’ for key ‘sequence’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 455, Error_code: 1062
2017-10-13T12:54:16.311612Z 5 [Warning] WSREP: RBR event 7 Write_rows apply warning: 121, 15569840
2017-10-13T12:54:16.312261Z 5 [ERROR] WSREP: Failed to apply trx: source: e67403b7-a91a-11e7-bb23-6e69a82d477f version: 3 local: 0 state: APPLYING flags: 1 conn_id: 256554 trx_id: 16103959 seqnos (l: 1208502, g: 15569840, s: 15569839, d: 15569797, ts: 773497920874558)
2017-10-13T12:54:16.312287Z 5 [ERROR] WSREP: Failed to apply trx 15569840 4 times
2017-10-13T12:54:16.312297Z 5 [ERROR] WSREP: Node consistency compromized, aborting…
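
To cross-check which nodes actually originate writes versus only apply them, the per-node writeset counters can help (generic wsrep status variables, nothing specific to our schema):

-- wsrep_replicated: writesets this node originated (local client writes).
-- wsrep_received:   writesets this node received from the rest of the cluster.
SHOW GLOBAL STATUS LIKE 'wsrep_replicated';
SHOW GLOBAL STATUS LIKE 'wsrep_received';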

The RTT between DC1 and DC2 is 12 ms, and between them and DC3 it is 30 ms.
The instances have low CPU and network usage. Everything runs well for a few days before such an error occurs.

I wanted to chime in because I had a similar error, but I have a little extra debugging information that might help with the search. I believe we are hitting the same bug in MySQL/Percona.

Like Jorang, I have one DC with two nodes (node A and node B) and a second DC with a single node (node C): three nodes in total.

  • No local scripts were running on the nodes. That would explain the inconsistency, but as far as I know it is not happening. And even if it were, the sync should have worked the other way around as well (this is multi-master writing, right?).
  • All nodes are running:
    Server version: 5.7.18-15-57 Percona XtraDB Cluster (GPL),
    Release rel15, Revision 7693d6e,
    WSREP version 29.20, wsrep_29.20
  • I have checked the release notes of version 5.7.19 but I can’t see any mention of this error.
    - Jorang: could you paste your version as well?
  • All nodes are Ubuntu 16.04
  • Only node B remained online.
  • Node A and node C were shut down because of the node consistency problem.
  • Upon restarting node A or node C, a full SST was performed (according to the log, an IST was not possible because of the unexpected shutdown).
  • I got the same HA_ERR_FOUND_DUPP_KEY error, but in my case it is clearly on a unique index over two columns. Jorang, is that “sequence” key something you know? Is it a simple unique key, or a unique key on two columns?
  • Node A and Node C had the same error in /var/log/mysql/error.log (similar to Jorang’s error):

2017-10-20T08:06:43.877112Z 6 [ERROR] Slave SQL: Could not execute Write_rows event on table himalaya_tdv_renault.inspection; Duplicate entry ‘224-560744’ for key ‘un_inspection_repairorder’, Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event’s master log FIRST, end_log_pos 490, Error_code: 1062
2017-10-20T08:06:43.877135Z 6 [Warning] WSREP: RBR event 5 Write_rows apply warning: 121, 32518800
2017-10-20T08:06:43.877598Z 6 [Warning] WSREP: Failed to apply app buffer: seqno: 32518800, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time

Retrying 3th time

Retrying 4th time

2017-10-20T08:06:43.878931Z 6 [ERROR] WSREP: Failed to apply trx 32518800 4 times
2017-10-20T08:06:43.878936Z 6 [ERROR] WSREP: Node consistency compromized, aborting…
2017-10-20T08:06:43.878943Z 6 [Note] WSREP: Closing send monitor…
2017-10-20T08:06:43.878947Z 6 [Note] WSREP: Closed send monitor.
2017-10-20T08:06:43.878953Z 6 [Note] WSREP: gcomm: terminating thread
2017-10-20T08:06:43.878969Z 6 [Note] WSREP: gcomm: joining thread
2017-10-20T08:06:43.879100Z 6 [Note] WSREP: gcomm: closing backend

I’m guessing the developers would like a reproducible test case, but I don’t have one at the moment and I don’t know where to begin… Any ideas on how we can narrow this error down?

Like Jorang, our system is running again, but since I don’t know exactly what caused the problem, I’m pretty sure it can happen again.

Hi,

1 - We are running the same version on Ubuntu 16.04.

5.7.18-15-57-log
Percona XtraDB Cluster (GPL), Release rel15, Revision 7693d6e,
WSREP version 29.20, wsrep_29.20

2 - The “sequence” is a unique key on 4 columns.
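
For context, the shape of that key is roughly the following (the column names here are made up; only the four-column UNIQUE constraint matters):

-- Hypothetical reconstruction of the "sequence" key: a composite UNIQUE
-- constraint over four columns, matching the four parts of the duplicate
-- entry value '3729-3882600-01P2017-17040'. The real column names differ.
ALTER TABLE main.documents
  ADD UNIQUE KEY `sequence` (branch_id, customer_id, series, seq_no);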

I think this is a problem that occurs with multi-node write traffic: a “slave” accepts a transaction and then receives the same transaction from another node, which is also accepted; when it then tries to apply both, it fails.

I assumed wsrep certification would take care of such conflicts, but it seems it does not, at least not always (a way to watch the conflict counters is sketched below).
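
In theory, Galera’s certification step should resolve exactly this kind of conflict before commit. One way to see whether conflicts are being detected at all is to watch the certification counters on each node (standard wsrep status variables):

-- Certification failures (deadlock errors returned to the client) and
-- brute-force aborts are the expected outcome of conflicting multi-node
-- writes; an applier-side duplicate-key error like the one above is not.
SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
SHOW GLOBAL STATUS LIKE 'wsrep_local_bf_aborts';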

Sadly, we had to migrate to a “one-way transaction flow” architecture to avoid these problems:

  • We now have two datacenters, DC1 and DC2. DC1 has two nodes, A and B, and DC2 has a single node, C.
  • Writes are performed only against node A, so a transaction flows from A to B within DC1 and from DC1 to node C in DC2 (see the sketch after this list for how the other nodes are protected against stray writes).
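
A minimal sketch, assuming the application connects with a non-SUPER account, of how the non-writer nodes can be protected (routing can of course also be enforced at the proxy or application level):

-- On nodes B and C only (the non-writers): reject writes from regular clients.
-- read_only does not affect the Galera applier threads, so writesets
-- replicated from node A are still applied.
SET GLOBAL read_only = ON;
-- Accounts with SUPER can still write; super_read_only (5.7+) closes that
-- gap as well, if it fits your operational model.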

In theory, we could also write to B, but we’ve noticed some problems/conflicts caused by gap locks when writing to multiple nodes.

Horizontal scaling of writes is not really possible this way, but for us the goal is redundancy and high availability. For now, at least…

In your case, it seems the transaction was originally written to B, so I think the error should only occur if you haven’t configured segments for each node.

Jorang,

You might be on to something. I had not configured gmcast.segment; I have done that now, thanks for the tip. For future googlers: see http://galeracluster.com/documentation-webpages/galeraparameters.html#gmcast-segment
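
In case it helps, note that gmcast.segment takes a single integer per node (for example 1 for the DC1 nodes, 2 for DC2, 3 for DC3), is set through wsrep_provider_options in my.cnf, and is not dynamic, so it needs a restart. A quick way to confirm what a node actually picked up:

-- The provider options string echoes back every option, including
-- gmcast.segment, as the running node sees it.
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'\G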

That said, I am not confident that this will solve the issue. I find it intriguing that your “sequence” key is a multi-column unique key; the same is true for my un_inspection_repairorder index (a unique key on two columns). That doesn’t look like a coincidence, right? I wonder how to formulate a bug report to Percona (or Galera?) about this.
Some extra information: some people experience the same issue using MariaDB.
See: MDEV-9309 (“Galera node crashed with error code 1062”) in the MariaDB Jira.
That makes me think this is a Galera issue. I’ve also found a Percona bug report describing something very similar, https://bugs.launchpad.net/codership-mysql/+bug/1179361, which should already be fixed in 5.6.19-25.6. But I am unsure whether it is the same error: my unique index is on a mediumint and a smallint, while that report talks about UTF8. It might still be the cause in your case, though. Do any of your unique key columns contain UTF-8 characters? If so, try upgrading to 5.6.19. … Edit: forget it, we’re both on 5.7.18.

In my use case I need to be able to write to all nodes, but in normal operation we pin each individual database (schema) to a single database server; we use ProxySQL for this (a rough sketch follows). As the documentation and advice from Percona mention, it is considered a bad idea to have multiple nodes receiving writes at the same time, so we try to avoid it as best we can.
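
For reference, a rough sketch of the kind of ProxySQL rules involved; hostgroup numbers, hostnames, and schema names are made up, and the real configuration has more servers and rules:

-- Run against the ProxySQL admin interface (port 6032).
-- Each hostgroup contains one cluster node; a query rule pins all traffic
-- for a given schema to a single hostgroup, i.e. a single writer node.
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, 'node-a', 3306);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, 'node-b', 3306);

INSERT INTO mysql_query_rules (rule_id, active, schemaname, destination_hostgroup, apply)
VALUES (1, 1, 'customer_db_1', 10, 1);
INSERT INTO mysql_query_rules (rule_id, active, schemaname, destination_hostgroup, apply)
VALUES (2, 1, 'customer_db_2', 20, 1);

LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;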