Deadlocking in Multi-Master Replication Read/Write to ALL Nodes

We have 3 PXC nodes and 2 HAProxy load balancers in front, with Keepalived for failover between them, so at any given moment all reads/writes go to a single node (to avoid deadlocking).

Is it possible to use the “least connections” algorithm on HAProxy to read/write to all 3 nodes without deadlocking?

Please advise a solution.

For very short transactions (like a single-row update), the chance of a cross-node deadlock is very low. For inserts into a table with an auto_increment PK, there is no way to get a conflict if you let the PXC cluster automatically control the increment/offset. The problem is long transactions that may modify a large number of rows. Either make the application deadlock-aware and handle those errors (a sketch is shown below), or route writes to different data hotspots on each node. For example, if several applications are using the same PXC cluster, make each of them use its own separate write node.
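As a rough illustration of the “deadlock aware” approach, here is a minimal sketch assuming a Python application using PyMySQL; the host name, credentials, and the UPDATE statement are placeholders, not anything from the original setup. Galera reports a cross-node certification failure to the client with the same error code (1213) as a local InnoDB deadlock, so one retry loop can cover both cases.

```python
import time

import pymysql  # assumption: PyMySQL client; any MySQL driver exposing error codes works similarly

DEADLOCK_ERRORS = {1213, 1205}  # ER_LOCK_DEADLOCK, ER_LOCK_WAIT_TIMEOUT


def run_with_retry(conn, sql, params=(), max_attempts=5):
    """Run a short transaction and replay it when the cluster reports a deadlock."""
    for attempt in range(1, max_attempts + 1):
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params)
            conn.commit()
            return
        except pymysql.MySQLError as exc:
            conn.rollback()
            code = exc.args[0] if exc.args else None
            if code not in DEADLOCK_ERRORS or attempt == max_attempts:
                raise  # not a retriable conflict, or out of attempts
            time.sleep(0.05 * attempt)  # simple backoff before replaying the transaction


# Hypothetical usage: connection details are placeholders.
conn = pymysql.connect(host="pxc-haproxy", user="app", password="secret", database="app_db")
run_with_retry(conn, "UPDATE accounts SET balance = balance - %s WHERE id = %s", (100, 42))
```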

As I understand your answer, I could write to all 3 PXC nodes at the same time in multi-master synchronous replication without any conflicts?

No. przemek’s comment about “no way to get conflicts” is only for inserts. That’s because Percona/Galera will set auto_increment_increment and auto_increment_offset, so node1 will create IDs 1, 4, 7, etc., node2 will create IDs 2, 5, 8, etc., and node3 will create IDs 3, 6, 9, etc. HOWEVER, “writes” can be inserts, updates, or deletes. And if you try to update the same record on two different nodes simultaneously, you have the potential for a conflict. The best approach is to assume that if it can happen, it will, especially if you’re expecting any level of scale.
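A quick way to confirm the interleaved-ID behaviour is to check the auto_increment variables on each node. The sketch below is only illustrative: it assumes PyMySQL and hypothetical node host names and credentials.

```python
import pymysql

# Hypothetical node addresses; replace with your actual PXC node hosts.
NODES = ["pxc-node1", "pxc-node2", "pxc-node3"]

for host in NODES:
    conn = pymysql.connect(host=host, user="app", password="secret")
    try:
        with conn.cursor() as cur:
            # With wsrep_auto_increment_control=ON (the default), Galera adjusts these
            # per node so the ID streams interleave (1,4,7 / 2,5,8 / 3,6,9) instead of colliding.
            cur.execute("SHOW VARIABLES LIKE 'auto_increment_%'")
            settings = dict(cur.fetchall())
        # Expect auto_increment_increment = 3 and a distinct auto_increment_offset per node.
        print(host, settings)
    finally:
        conn.close()
```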

You could hypothetically distribute writes to different nodes by designating an “owner node” for each DB or even table (though the latter case would be more complicated). For example, if your cluster is hosting three DBs, you could direct all DB1 writes to node1, all DB2 writes to node2, and all DB3 writes to node3. Row-based replication (RBR) makes applying updates faster on the other nodes, so this would improve your scalability somewhat, though it may not be worth the extra management effort.
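One way to implement that owner-node split at the application layer (rather than in HAProxy) is a simple lookup table mapping each database to its write node. This is only a sketch under assumed names: the node hosts, database names, credentials, and the UPDATE statement are all hypothetical.

```python
import pymysql

# Hypothetical mapping of each database to its designated "owner" write node.
WRITE_OWNER = {
    "db1": "pxc-node1",
    "db2": "pxc-node2",
    "db3": "pxc-node3",
}


def write_connection(db_name):
    """Open a connection to the node that owns writes for this database.

    Reads can still be balanced across all nodes (e.g. by HAProxy leastconn);
    only the write path is pinned here, so conflicting updates stay on one node.
    """
    host = WRITE_OWNER[db_name]
    return pymysql.connect(host=host, user="app", password="secret", database=db_name)


# Hypothetical usage: all writes for db2 land on pxc-node2.
conn = write_connection("db2")
with conn.cursor() as cur:
    cur.execute("UPDATE settings SET value = %s WHERE name = %s", ("on", "feature_x"))
conn.commit()
conn.close()
```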