We would like to describe and understand the impact of large writes (megabytes of data or thousands of rows) submitted as a single transaction, which could severely impact cluster performance and stability.
How do these large writes impact the cluster and its replication mechanism, and which status variables and logs are worth checking?
Are there any tuning parameters that can make the cluster more robust against such large writes?
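The history list length (HLL), reported in SHOW ENGINE INNODB STATUS, is a good metric to check. A value that keeps growing usually means long-running transactions are holding back purge.
SHOW PROCESSLIST can also reveal long-running statements.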
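If you want to track the history list length over time rather than eyeball it, a small helper can pull the number out of the status text. This is only a sketch: it assumes you have already fetched the SHOW ENGINE INNODB STATUS output as a string with whatever client you use, and the function name and sample text are made up for illustration.

```python
import re

def history_list_length(innodb_status_text):
    """Return the 'History list length' value from the status text, or None if absent."""
    match = re.search(r"History list length\s+(\d+)", innodb_status_text)
    return int(match.group(1)) if match else None

# Illustrative excerpt only; the surrounding lines vary by server version.
sample = """
------------
TRANSACTIONS
------------
Trx id counter 4242
History list length 1375
"""

print(history_list_length(sample))
```

Sampling this every few seconds while a large write is running shows whether the value climbs and how long it takes to drain afterwards.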
Keep transactions small: split a large write into smaller chunks and commit periodically, rather than biting off more than you can chew in a single transaction.
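A minimal sketch of the chunked-commit idea, assuming a Python DB-API connection. The table name `events`, the helper `insert_in_chunks`, and the chunk size of 1000 are all hypothetical. sqlite3 is used here only so the example runs on its own; with a MySQL driver the placeholder style is typically `%s` instead of `?`.

```python
import sqlite3

def insert_in_chunks(conn, rows, chunk_size=1000):
    """Insert rows, committing after every chunk_size rows. Returns the number of commits."""
    cur = conn.cursor()
    commits = 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        cur.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", chunk)
        conn.commit()  # each chunk becomes its own, much smaller transaction
        commits += 1
    return commits

# Stand-in database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, "row %d" % i) for i in range(2500)]
commits = insert_in_chunks(conn, rows, chunk_size=1000)
print(commits)
```

Keep in mind that chunking gives up all-or-nothing behaviour for the write as a whole: if a later chunk fails, the earlier ones are already committed, so the job should be safe to resume or re-run.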