Percona XtraDB Cluster susceptible to disk-bound nodes

We have a three-node cluster, but for the purposes of experimentation I'm only sending traffic to one node (which I'll call the active node; the other two are inactive).

If I perform disk-heavy operations on the inactive nodes, the active node gets bogged down behind a lot of pending WSREP commits.

All three nodes are on RAID10 EBS volumes, but only one of them is running with Provisioned IOPS. I'm going to replace the storage on the other two nodes so that they use Provisioned IOPS as well, then repeat the experiments. Is there anything else I should be looking into?


Ideally, all nodes in a PXC cluster should have identical hardware. Galera uses flow control to keep replication lag bounded, so the write throughput of the whole cluster is limited by its slowest node. If you overload one node, it becomes slower at applying writesets, its receive queue grows, and it triggers Flow Control, which pauses replication across the cluster until that node catches up.
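One way to confirm that Flow Control is what is holding the active node back is to look at the wsrep status counters on each node. The sketch below is only an illustration: it assumes you already have a DB-API cursor from whichever MySQL driver you use, the function names are hypothetical, and the 0.1 threshold is an arbitrary starting point rather than a recommended value.

```python
def read_wsrep_status(cursor):
    """Return the wsrep_* status variables as a dict, using any DB-API cursor."""
    cursor.execute("SHOW GLOBAL STATUS LIKE 'wsrep_%'")
    return {name: value for name, value in cursor.fetchall()}


def flow_control_summary(status, paused_threshold=0.1):
    """Summarise the flow-control counters of one node.

    paused_threshold is an assumed cut-off, not an official value.
    """
    # Fraction of time replication was paused since the last FLUSH STATUS.
    paused = float(status.get("wsrep_flow_control_paused", 0))
    # Number of flow-control pause messages this node has sent.
    sent = int(status.get("wsrep_flow_control_sent", 0))
    # Writesets currently waiting to be applied on this node.
    recv_queue = int(status.get("wsrep_local_recv_queue", 0))
    return {
        "paused_fraction": paused,
        "fc_messages_sent": sent,
        "recv_queue": recv_queue,
        "throttled": paused > paused_threshold,
        # A node that sends pauses or has a backlog is a likely bottleneck.
        "likely_slow_node": sent > 0 or recv_queue > 0,
    }
```

Running this against all three nodes while the disk-heavy job is active should show which node is sending the pauses: that node is the one with a non-zero wsrep_flow_control_sent and a growing wsrep_local_recv_queue.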
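You can, however, allow a particular node to fall behind by enabling desync mode on it (the wsrep_desync variable). A desynced node keeps receiving writesets but no longer sends Flow Control pauses to the rest of the cluster.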
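As a rough sketch of how desync mode could be wrapped around a disk-heavy job on an inactive node: the helper name `desynced` is hypothetical, and the cursor is assumed to be a DB-API cursor connected to the node that will run the job, with enough privileges to set global variables.

```python
from contextlib import contextmanager


@contextmanager
def desynced(cursor):
    """Keep this node out of flow control while the wrapped block runs."""
    # The node stops sending flow-control pauses and may fall behind.
    cursor.execute("SET GLOBAL wsrep_desync = ON")
    try:
        yield
    finally:
        # Always rejoin flow control, even if the heavy job failed.
        cursor.execute("SET GLOBAL wsrep_desync = OFF")
```

Usage would look like `with desynced(cursor): run_heavy_job()`, where `run_heavy_job` stands for whatever disk-bound operation you are testing. Keep in mind that the node still has to apply the queued writesets afterwards, and reads from it may return stale data while it is behind.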