I have a few terabytes database (5.7.37, running on CentOS 7.6), I would say pretty active, with ~1000-3000 queries per minute.
This is part of master to master replication.
I would like to partition all tables using list partitioning
Is there any special issue about partitioning and replication I need to consider?
Any thoughts more then welcome
Hello, do you write to both masters at the same time? in that case I suggest looking at Resolving the MySQL Active-Active Replication Dilemma - Percona Database Performance Blog to understand why that is not a good idea. That being said there are no special considerations for a partitioned table in terms of replication.
Yes, absolutely agree, unfortunately this is not something I can change in the near future.
In regards to partitioning. I am worry about the partitioning manipulations mostly, exchange partition, add partition, drop partition. Perhaps I would run all the DDL’s always under one master, is it get replicated properly in busy system? I would simulate synthetic load on my system before actual change but I woud love to hear about other people experience.
Welcome to Percona forums. As @Ivan_Groenewold said, there are no special considerations for a partitioned table in terms of replication. All the queries, updates, deletes, and inserts should have the same behaviors.
But you still are affected by the impact of DDL when using replication. DDLs need to be replicated, and if they take a long time on the source server, they will take a long time on the replica servers.
You can use the WITHOUT VALIDATION option for some partition operations, making the operation faster.
Exchanging Partitions and Subpartitions with Tables
And, of course, you can use pt-online-schema-change to avoid incurring extra latency. Here you have an interesting blog post from @David_Ducos that both explains pt-online-schema-change and presents a different approach to testing partitioning.
Testing MySQL partitioning with pt-online-schema-change
Make sure to test everything in a non-productive environment because master-master replication has a lot of side effects that can impact the database’s consistency.
So the partitioning wan’t “break” replication. Good to hear. Thanks
I’ll check the mentioned tool pt-online-schema-change, but seems it would introduce huge overhead on my replication (obviously I would need to run it on one master only ).
In regards to the tool pt-online-schema-change.
It can’t handle the case when partitoning is by list using non PK column
For example, my table is:
autoid bigint(20) unsigned NOT NULL AUTO_INCREMENT,
id char(36) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
name varchar(30) NOT NULL
PRIMARY KEY (autoid),
UNIQUE KEY id_unique (id)
) ENGINE = InnoDB
I would like to add partitioning by list:
–alter “PARTITION BY LIST COLUMNS(name) ( PARTITION p_XXX VALUES IN (‘XXX’), PARTITION p_YYY VALUES IN (‘YYY’), …)”
in this case the column “name” should be added to all unique keys. I.E. unique keys need to be created/altered during creation of “new” table (btw, then the autoid need to be altered as you can’t remove index when auto increment column exists if you using create table like approach and then creating indexes). Or you may generate create table DDL including new columns in indexes (can’t do it using create like)
But pt-online-schema-change failing with:
Error altering new table
_test_new: DBD::mysql::db do failed: A PRIMARY KEY must include all columns in the table’s partitioning function .
As a general recommendation, the alter you are going to run using pt-osc most be able to run without pt-osc. In this case you can’t add partitioning without changing the primary key first. This is not a limitation of pt-osc, it is a limitation of MySQL.
This section discusses the relationship of partitioning keys with primary keys and unique keys. The rule governing this relationship can be expressed as follows: All columns used in the partitioning expression for a partitioned table must be part of every unique key that the table may have.
In other words, every unique key on the table must use every column in the table’s partitioning expression. (This also includes the table’s primary key, since it is by definition a unique key…
In this cases, usually what is done is:
- Add the missing columns to the primary key
- Configure replication
This is a two step process.
Sure, there is no limitation at all. And I didn’t wrote that there is any limitation.
I would say there is missing documentation, the documentation of the tool, probably, should mention the type of DDL’s the tool can handle.
In this case, obviously, the work is little more complex that the tool can handle, since in MySQL partition key should be part of any unique index existing on table (primary index is just particular case of unique index). This is, probably, because MySQL tables is clustered and not heap.
Btw, except tables in memory there is any chance that mySQL would support heap tables?
Thanks for such amazing and free set of tools as you have there in Percona.