We are running a Percona cluster with 3 nodes, and today 2 of the nodes went down at the same time.
In the log I only see that there is an active transaction being rolled back, followed by a report about the crash.
I have attached the log.
Any ideas?
debug_txt.txt (13.5 KB)
The tail of the error log does not tell much. I am interested in what happened shortly before these InnoDB monitor entries started to appear in the error log.
Were there any Galera conflicts reported? Were there any “WSREP: BF lock wait long” messages logged maybe?
I can see one local transaction in ROLLING BACK state, while wsrep appliers cannot commit, so clearly some write conflict was not resolved properly here.
Do you use stored procedures perhaps?
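If the full error log is large, a small script can help find the messages asked about above. This is only a rough sketch: the "BF lock wait long" text is the one quoted in this reply, while the generic "conflict" pattern is an assumption about how conflicts are worded in your log, and the function name `scan_log` is made up for illustration.

```python
import re
from collections import Counter

# Patterns to search for. The first string is quoted in the reply above;
# the second is an assumed, deliberately loose match for conflict messages.
PATTERNS = {
    "bf_lock_wait": re.compile(r"WSREP: BF lock wait long"),
    "conflict": re.compile(r"WSREP:.*conflict", re.IGNORECASE),
}


def scan_log(lines):
    """Return (counts per pattern, list of (line number, pattern name, text))."""
    counts = Counter()
    hits = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                hits.append((number, name, line.rstrip("\n")))
    return counts, hits
```

To use it, open the error log of each crashed node (the path depends on your configuration), pass the file object to `scan_log`, and look at the hits that come shortly before the first InnoDB monitor output.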