Hi,
I have a Percona XtraDB Cluster 8.0.33-25.1.
My cluster has two master nodes and an arbitrator.
The cluster contains ~1000 databases with ~180 tables in each database (it's a SaaS app: one database per client).
We did a lot of cleanup and refactoring in the code, and now I have to DROP 100 tables from each database.
Under its high write activity, my cluster can't really take DDL locks without crashing.
So DROP TABLE takes a DDL lock, and I have to run it 100,000 times.
What is your advice for dropping these 100,000 tables?
Can I use an external tool to perform the drops without taking a DDL lock?
(Some tables have complex foreign keys, so I can't really remove data without disabling FK checks.)
Hi @TAndreotti,
DDL locks occur even on regular Percona Server for MySQL: any time you ALTER or DROP a table, locks are created and managed. The situation is magnified by the cluster's nature.
If you are 100% certain that no writes are occurring on the tables, you can switch PXC to 'RSU' mode, which will not replicate DDLs. You must then run the DROP TABLE on both pxc1 and pxc2.
pxc1> SET SESSION wsrep_OSU_method = 'RSU';
pxc1> DROP TABLE ...;
pxc1> SET SESSION wsrep_OSU_method = 'TOI';
Then repeat on pxc2.
WARNING!! You will suffer a far worse cluster failure if you drop a table on pxc1 while writes to that table are still happening on pxc2. You must 100000000% ensure that no writes are hitting a table when dropping it under RSU.
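As a rough sanity check before dropping a table, performance_schema can show whether it has seen any writes (the schema and table names below are placeholders):

```sql
-- Placeholder names; these counters accumulate since server start (or since
-- the summary table was truncated), so zero only means "no writes since then".
SELECT OBJECT_SCHEMA, OBJECT_NAME, COUNT_WRITE
FROM performance_schema.table_io_waits_summary_by_table
WHERE OBJECT_SCHEMA = 'client_0001'
  AND OBJECT_NAME   = 'old_audit_log';
```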
If you cannot ensure no writes, then you must continue to use TOI mode, and drop the tables slowly over time.
One other option would be to shut down pxc2, make all changes on pxc1, erase pxc2's datadir, and start pxc2 again. This forces pxc2 to take a fresh copy of the data, which will not contain the dropped tables.
Thanks for your answer.
Shutting down one node, performing the drops, and restarting the node is also a solution. However, the last time our cluster performed an IST sync, it took 8 hours. So while this solution is viable, having 8 hours without a backup node would be quite stressful for me.
The 100,000 tables I want to drop are no longer in use, but there are another 80,000 tables that are intensively used. At this point, I see two possible solutions:
- A script that drops one table at a time and waits 200 ms between drops. This approach would take at least 6 hours, but it seems safe.
- Redirect all traffic to node1, set node2 to RSU, perform the drop, then redirect all traffic to node2, set node1 to RSU, and perform the drop on it.
In the second solution, if one node crashes during the process, would it force an IST sync?
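Option 1 could be sketched as a small driver script. Everything here is an assumption to adapt: the database and table name lists, and the `execute` callback, which you would wire to a real MySQL connection (e.g. a connector cursor's `execute` method):

```python
import time

# Assumed inputs: in practice ~1000 databases and ~100 obsolete tables.
DATABASES = ["client_0001", "client_0002"]
OBSOLETE_TABLES = ["old_audit_log", "tmp_import"]

def drop_statements(databases, tables):
    """Yield one fully qualified DROP per table. IF EXISTS makes the whole
    run safe to restart after an interruption."""
    for db in databases:
        for tbl in tables:
            yield f"DROP TABLE IF EXISTS `{db}`.`{tbl}`;"

def run(execute, pause_s=0.2):
    """Feed each DROP to `execute` (your MySQL connector call), sleeping
    between statements so the cluster can absorb the DDL load."""
    for stmt in drop_statements(DATABASES, OBSOLETE_TABLES):
        execute(stmt)
        time.sleep(pause_s)
```

Ordering matters if the obsolete tables reference each other through foreign keys: either list children before parents, or disable `foreign_key_checks` for the session doing the drops.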
Make sure you understand the difference between SST and IST. These are two very different things. An SST makes a complete copy of the dataset, while an IST is only the delta, or incremental changes. If the gcache file on pxc1 is too small, then IST cannot happen, and an SST will take place instead automatically. 8 hours sounds like an SST, not an IST.
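For reference, the gcache is sized through `wsrep_provider_options` in my.cnf; the value below is only an illustration, not a recommendation:

```ini
# /etc/my.cnf -- size gcache to cover your longest expected node outage,
# so a rejoining node can use IST (delta) instead of a full SST.
[mysqld]
wsrep_provider_options="gcache.size=8G"
```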
This would force an SST, not an IST.
I would go with option #1, as it is the safest.