Partitioning problems after Percona XtraDB Cluster upgrade

Problem

After upgrading Percona XtraDB Cluster from version 5.7 to version 8.0 (8.0.28-19.1), I have some problems with the cluster in the production environment:

  1. The time to update partitions increased about fivefold (the average used to be about 2 seconds; now it is about 10-11 seconds).
  2. When a partition update starts, all active transactions are aborted with the error "Deadlock found when trying to get lock; try restarting transaction".

An example to reproduce the deadlock problem

-- create a table to test the problem
CREATE TABLE test_table
(
    id BIGINT UNSIGNED AUTO_INCREMENT,
    value BIGINT NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id, created_at)
)
COLLATE = utf8_unicode_ci
PARTITION BY RANGE COLUMNS (`created_at`) (
    PARTITION p20221105 VALUES LESS THAN ('2022-11-06 00:00:00'),
    PARTITION p20221106 VALUES LESS THAN ('2022-11-07 00:00:00'),
    PARTITION p20221107 VALUES LESS THAN ('2022-11-08 00:00:00'),
    PARTITION p20221108 VALUES LESS THAN ('2022-11-09 00:00:00'),
    PARTITION p20221109 VALUES LESS THAN ('2022-11-10 00:00:00')
);

CREATE INDEX idx_created_at ON test_table (created_at);

In one session, start a transaction with a sleep so that the transaction and the partition update run at the same time:

START TRANSACTION;
INSERT INTO test_table (value, created_at) VALUES (1, '2022-11-09 13:00:00');
SELECT SLEEP(30);
COMMIT;

In another session, run the partition update: drop the old partition and add a new one:

ALTER TABLE test_table DROP PARTITION p20221105;
ALTER TABLE test_table ADD PARTITION (PARTITION p20221110 VALUES LESS THAN ('2022-11-11 00:00:00'));

After running a query to add/drop the partition, the transaction instantly fails with a deadlock error.
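
In the client, the failure looks like this (1213 is MySQL's ER_LOCK_DEADLOCK code):

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction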

Structure of the cluster
[Image: diagram of the three-node cluster]

wsrep_cluster_address has the same value on each node: gcomm://node1,node2,node3
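
A quick way to confirm the membership on each node is:

SHOW GLOBAL VARIABLES LIKE 'wsrep_cluster_address';
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';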

Hi @qonand

Thank you for joining our community. Regarding your problem, I have a few questions:

  • Are you executing both the select and the alters on the same node?
  • Are the other nodes accepting writes?
  • What is the value of wsrep_OSU_method?

Thank you!


Hi!

Are you executing both the select and the alters on the same node?

Yes, I do it on the same node

Are the other nodes accepting writes?

Yes, if I run a query to write some data, all three nodes accept it.

What is the value of wsrep_OSU_method?

TOI
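
For reference, the value can be read on any node with:

SHOW GLOBAL VARIABLES LIKE 'wsrep_OSU_method';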


Hi @qonand,

One more question… when you say:
“After running a query to add/drop the partition, the transaction instantly fails with a deadlock error.”

Do you mean that the insert fails? If so, this is expected behavior:

https://galeracluster.com/library/documentation/schema-upgrades.html

“Transactions that were in progress while the DDL was running and that involved the same database resource will get a deadlock error at commit time and will be rolled back.”
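
As the error message suggests, the expected remedy is for the application to retry the transaction. A minimal sketch of a server-side retry wrapper (insert_with_retry is a hypothetical name; 1213 is MySQL's deadlock error code):

DELIMITER //
CREATE PROCEDURE insert_with_retry(IN p_value BIGINT, IN p_created DATETIME)
BEGIN
    DECLARE attempts INT DEFAULT 0;
    DECLARE done BOOLEAN DEFAULT FALSE;
    WHILE NOT done AND attempts < 3 DO
        BEGIN
            -- On a deadlock (error 1213), roll back and exit this inner
            -- block, so the WHILE loop retries the whole transaction.
            DECLARE EXIT HANDLER FOR 1213
            BEGIN
                ROLLBACK;
                SET attempts = attempts + 1;
            END;
            START TRANSACTION;
            INSERT INTO test_table (value, created_at) VALUES (p_value, p_created);
            COMMIT;
            SET done = TRUE;
        END;
    END WHILE;
END //
DELIMITER ;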

Thank you,

Pep


Thank you for the answer Pep.

Do you mean that the insert fails?

Yes. In fact, any transaction in progress while a partition is being added or dropped fails with the error "Deadlock found when trying to get lock; try restarting transaction".

Schema Upgrades — Galera Cluster Documentation

“Transactions that were in progress while the DDL was running and that involved the same database resource will get a deadlock error at commit time and will be rolled back.”

Thanks, that explains the behavior. But before upgrading the cluster to version 8.0, TOI was also used as the wsrep_OSU_method, and there weren't any deadlock problems (I mean on version 5.7). It is hard to understand why different versions with the same wsrep_OSU_method give different results.

And one more question: what can cause the increase in partition update time?


It is hard to tell; it could be for a number of different reasons. Do you know which step takes the time, dropping the partition or adding the new one? Sometimes dropping tables takes a while because the filesystem is slow at removing files. If this is a problem for you, you can rearchitect the change a little and check whether a process like this could be faster (see the sketch after the list):

  • Create an empty table.
  • Alter table exchange partition between the empty table and the partition you want to remove.
  • Alter table drop partition (this should be faster as the file is smaller).
  • Alter table add partition.
  • Drop the exchanged table at a later time. (You can even run this step using RSU, but beware of the GTID side effects; you may need to disable the binlog for that session.)

And remember to test this process before putting it in production! RSU changes, if not executed properly, can lead to nodes being expelled from the cluster or inconsistencies.
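
A sketch of that sequence for the test_table example above (the swap table name is illustrative):

-- 1. Create an empty, non-partitioned copy of the table structure
CREATE TABLE test_table_swap LIKE test_table;
ALTER TABLE test_table_swap REMOVE PARTITIONING;

-- 2. Swap the old partition's data into the empty table
--    (the swap table is empty, so this is effectively a metadata change)
ALTER TABLE test_table EXCHANGE PARTITION p20221105 WITH TABLE test_table_swap;

-- 3. Drop the now-empty partition (fast, the data file is tiny)
ALTER TABLE test_table DROP PARTITION p20221105;

-- 4. Add the new partition
ALTER TABLE test_table ADD PARTITION
    (PARTITION p20221110 VALUES LESS THAN ('2022-11-11 00:00:00'));

-- 5. Remove the swapped-out data at a quieter time
DROP TABLE test_table_swap;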


Thanks for your advice!

But I still can't understand why I didn't have the deadlock problem during partition updates on version 5.7. The cluster had the same structure, and wsrep_OSU_method was also TOI. Can you explain this situation, please?


The link to the documentation is specific to Galera, so deadlocks should happen on both 5.7 and 8.0. The only possible explanation is that some queries on 8.0 take longer than they took on 5.7, which would explain why we now detect more deadlock errors than we detected previously.
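
To pin down which statement got slower, the timings can be compared on both versions, e.g. via the performance schema (this assumes the events_statements_history_long consumer is enabled):

-- TIMER_WAIT is in picoseconds
SELECT sql_text, timer_wait / 1e12 AS seconds
FROM performance_schema.events_statements_history_long
WHERE sql_text LIKE 'ALTER TABLE test_table%'
ORDER BY timer_start DESC;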


Thanks a lot for your explanation!

You proposed the following schema:

  • Create an empty table.
  • Alter table exchange partition between the empty table and the partition you want to remove.
  • Alter table drop partition (this should be faster as the file is smaller).
  • Alter table add partition.
  • Drop the exchanged table at a later time. (You can even run this step using RSU, but beware of the GTID side effects; you may need to disable the binlog for that session.)

Are there any tools for this? I mean tools that can do it automatically, or does this schema have to be implemented from scratch?


I don’t know of any tools that do anything like that.
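
The rotation usually has to be scripted from scratch, e.g. with a cron job or the built-in event scheduler. A hypothetical sketch, assuming the exchange/drop/add steps are wrapped in a stored procedure named rotate_partitions:

-- Requires event_scheduler = ON; on a cluster, enable it on a single
-- node only so the rotation DDL is issued from one place.
CREATE EVENT IF NOT EXISTS rotate_partitions_daily
    ON SCHEDULE EVERY 1 DAY STARTS '2022-11-10 00:05:00'
    DO CALL rotate_partitions();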

Thanks!
