Missing/Errant GTIDs in Multi-Master Galera Cluster

Hi all, looking for some advice on the following problem.

I have 4 PXC nodes: 3 masters (a Galera cluster) and one slave. I’ve been researching why the slave is unable to switch between masters, and it comes down to differing GTID_EXECUTED values across the three masters. Most posts and blogs on this topic deal with a slave that is missing transactions, rather than with an inconsistency between the masters themselves.

// Master 1
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
gtid_executed | 039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812217,
43a41fb7-a283-11e9-ba9a-245ebe295fb1:1-6,
9ca3401d-a332-11e7-8170-00155de15408:1-2,
c47112cb-abf7-11ea-a6df-245ebe295fb1:1-11
// Master 2
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
gtid_executed | 039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812127,
9ca3401d-a332-11e7-8170-00155de15408:1-2,
d11311e3-9fda-11e8-bb79-245ebe2912cd:1-3
// Master 3
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
gtid_executed | 039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812127,
43a41fb7-a283-11e9-ba9a-245ebe295fb1:1-6,
9ca3401d-a332-11e7-8170-00155de15408:1-2,
c47112cb-abf7-11ea-a6df-245ebe295fb1:1-3,
d31ef178-704d-11eb-9f55-002590c9381e:1-2
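
To make the differences easier to see, here is a minimal Python sketch that parses the three sets above and reports what each master has that another lacks. The helper names (parse_gtid_set, subtract_intervals, gtid_subtract, format_gtid_set) are made up for this illustration and are not part of any MySQL or Percona tool. As far as I understand, MySQL's built-in GTID_SUBTRACT() function does the same comparison server-side.

```python
from itertools import permutations


def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:1-3' into {uuid: [(start, end), ...]} (inclusive)."""
    gtids = {}
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *ranges = entry.split(":")
        intervals = gtids.setdefault(uuid.lower(), [])
        for r in ranges:
            start, _, end = r.partition("-")
            intervals.append((int(start), int(end or start)))
    return {uuid: sorted(iv) for uuid, iv in gtids.items()}


def subtract_intervals(a, b):
    """Return the parts of the intervals in a that are not covered by b."""
    result = []
    for start, end in a:
        cur = start
        for b_start, b_end in sorted(b):
            if b_end < cur or b_start > end:
                continue  # no overlap with what is left of this interval
            if b_start > cur:
                result.append((cur, b_start - 1))
            cur = max(cur, b_end + 1)
            if cur > end:
                break
        if cur <= end:
            result.append((cur, end))
    return result


def gtid_subtract(a, b):
    """GTIDs present in a but absent from b, keyed by source UUID."""
    diff = {}
    for uuid, intervals in a.items():
        remaining = subtract_intervals(intervals, b.get(uuid, []))
        if remaining:
            diff[uuid] = remaining
    return diff


def format_gtid_set(gtids):
    """Render {uuid: intervals} back into the usual GTID set notation."""
    parts = []
    for uuid in sorted(gtids):
        ranges = ":".join(f"{s}-{e}" if e > s else str(s) for s, e in gtids[uuid])
        parts.append(f"{uuid}:{ranges}")
    return ",\n".join(parts)


# gtid_executed values copied from the output above.
MASTERS = {
    "Master 1": """039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812217,
        43a41fb7-a283-11e9-ba9a-245ebe295fb1:1-6,
        9ca3401d-a332-11e7-8170-00155de15408:1-2,
        c47112cb-abf7-11ea-a6df-245ebe295fb1:1-11""",
    "Master 2": """039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812127,
        9ca3401d-a332-11e7-8170-00155de15408:1-2,
        d11311e3-9fda-11e8-bb79-245ebe2912cd:1-3""",
    "Master 3": """039c8dc4-5cda-ee18-77f0-78972002092e:1-1057812127,
        43a41fb7-a283-11e9-ba9a-245ebe295fb1:1-6,
        9ca3401d-a332-11e7-8170-00155de15408:1-2,
        c47112cb-abf7-11ea-a6df-245ebe295fb1:1-3,
        d31ef178-704d-11eb-9f55-002590c9381e:1-2""",
}

parsed = {name: parse_gtid_set(text) for name, text in MASTERS.items()}
for left, right in permutations(parsed, 2):
    extra = gtid_subtract(parsed[left], parsed[right])
    print(f"On {left} but not on {right}:")
    print(format_gtid_set(extra) or "(nothing)")
```

Run against the values above, this shows for example that d11311e3-9fda-11e8-bb79-245ebe2912cd:1-3 exists only on Master 2, d31ef178-704d-11eb-9f55-002590c9381e:1-2 only on Master 3, and that Master 1 has c47112cb-abf7-11ea-a6df-245ebe295fb1:4-11 which Master 3 lacks. The gap in the 039c8dc4-5cda-ee18-77f0-78972002092e range might simply reflect the outputs being captured at different moments, but that is an assumption on my part.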

So question 1 is: how is this possible? I’ve been trying to reproduce the problem on a similar 3-node cluster that I’ve set up in virtual machines before doing anything in production.

My expectation is that these errant GTIDs are possibly years old and long since gone. So question 2 is: how can I be sure the cluster doesn’t have any data drift? I took a look at pt-table-checksum, but it can’t run with binlog_format=ROW, and I’m not sure whether it’s safe to change that setting (perhaps worth noting that the DB is around 1TB in size).

And finally, question 3 is: how do I resolve this problem so that all masters are in sync? Obviously I would like to be as sure as possible about the state of the cluster before choosing one node as the single source of truth and, should it be necessary, restoring the whole cluster from it.