Any reason not to set default wsrep-osu-method to NBO?

I am trying to understand if there is any reason to leave wsrep_osu_method as TOI when we have NBO as an option.

I don’t think RSU is a good option for us because it requires users to understand and remember to run the update on all nodes.

We sometimes have users running DDL on non-critical tables during business hours, on the assumption that it will not impact the production schema. With TOI this is causing outages for us. My thought is to change the default to NBO to avoid this.

I’m trying to understand if there is any downside.

Version 8.0.33 on Debian

Hello @Patrick_C,
I don’t see any reason to not use NBO in this case.

I’m testing the NBO method and also using it in production systems. After a DDL operation completed on one of the nodes in the cluster, I noticed that transactions remained open on the other nodes. Because of that, binlog files are not relocated on these nodes.

Now I’m trying to restore the latest xtrabackup from the problematic node, and I’m getting the following error when I try to start the restored database.

I will start a topic to discuss this issue. I think this method has some bugs.

2023-12-12T09:25:53.810621Z 0 [ERROR] [MY-013909] [Server] Found invalid event sequence while recovering from binary log file ‘./binlog.000416’, between positions 109079998 and 109080475: Xid_log_event holds an invalid XID. The recovery process was stopped early and no transaction was recovered. Side effects may be transactions in an inconsistent state between the binary log and the storage engines, or transactions kept by storage engines in a prepared state (possibly holding locks). Either fix the issues with the binary log or, to release possibly acquired locks, disable the binary log during server recovery. Note that disabling the binary log may lead to loss of transactions that were already acknowledged as successful to client connections and may have been replicated to other servers in the topology.

I’m just following up on this. I did not realize that NBO will not let you run statements like “truncate table” or “create user”, so this is definitely not a good solution to use as the default.

I guess we are back to setting wsrep_osu_method to NBO as a session variable, only for the statements that support it.
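
For anyone who wants to script the session-level approach, here is a rough Python sketch of what I mean. It only builds the list of SQL statements to send; it does not connect to anything. The function name `statements_for` and the `NBO_PREFIXES` list are my own inventions for illustration, and the set of statement types I treat as NBO-capable (ALTER TABLE, CREATE INDEX, DROP INDEX) is an assumption that should be checked against the documentation for your PXC version. Everything else, including “truncate table” and “create user”, is left to run under whatever method the session already has.

```python
from typing import List

# ASSUMPTION: statement types treated as NBO-capable. Verify this list
# against the Percona XtraDB Cluster documentation for your version.
NBO_PREFIXES = ("ALTER TABLE", "CREATE INDEX", "DROP INDEX")


def statements_for(ddl: str) -> List[str]:
    """Return the SQL statements to run, in order, for one DDL statement.

    If the DDL looks NBO-capable, wrap it so that wsrep_osu_method is
    switched to NBO for this session only and switched back to TOI after.
    Otherwise return the DDL unchanged.
    """
    # Collapse whitespace and upper-case only for the prefix check;
    # the original statement text is what gets returned.
    normalized = " ".join(ddl.split()).upper()
    if normalized.startswith(NBO_PREFIXES):
        return [
            "SET SESSION wsrep_osu_method = 'NBO'",
            ddl,
            "SET SESSION wsrep_osu_method = 'TOI'",
        ]
    return [ddl]


if __name__ == "__main__":
    for stmt in statements_for("ALTER TABLE reports ADD COLUMN note TEXT"):
        print(stmt)
```

The prefix check is deliberately conservative: anything it does not recognize (for example CREATE UNIQUE INDEX) simply stays on the existing method instead of failing under NBO. The table name `reports` is just a placeholder.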

So did this happen on every node except the one where the transaction was run?