mysqlcheck on percona xtradb cluster?

We’ve historically run mysqlcheck once per day on our MySQL servers. Now that we’ve migrated to Percona XtraDB Cluster I’m assuming we should continue that practice, although I’m a little unclear: do we run mysqlcheck on each Percona node, or is running it on one node sufficient, with replication taking care of the rest?


Brad: mysqlcheck is not replicated, so it could be the case that tables are corrupt on different nodes.

That being said, mysqlcheck was really designed for MyISAM. I’m not aware of people using it with much regularity with Innodb (which I would assume you are using if you are using PXC).
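Since mysqlcheck is not replicated, each node has to be checked individually. A minimal sketch of what that per-node run could look like (the hostnames and options here are placeholder assumptions, not a tested setup — by default it only prints the commands):

```shell
#!/bin/sh
# Sketch only: build the mysqlcheck command for each PXC node.
# Hostnames are placeholders; because mysqlcheck is not replicated
# by Galera, every node needs its own run.
NODES="pxc-node1 pxc-node2 pxc-node3"

check_cmd() {
    # Print the command for one node; pipe to sh to actually run it.
    echo "mysqlcheck --all-databases --check --host=$1"
}

for node in $NODES; do
    check_cmd "$node"          # dry run: just print the command
    # check_cmd "$node" | sh   # uncomment to really run the check
done
```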

I should probably have been clearer - I am not referring to the analyze flag but rather the optimize flag, which as I understand it re-organizes/restructures data for better performance.

In the case of INNODB I guess mysqlcheck --optimize does a recreate + analyze. So I suppose I’m still unclear on whether this is necessary on all nodes and whether it is doing anything beneficial performance-wise, particularly for PXC nodes.
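For what it’s worth, the server itself reports this mapping. A small sketch of the statement involved (the table name is a made-up example):

```shell
#!/bin/sh
# Sketch: the SQL that mysqlcheck --optimize issues per table.
# Table name is a placeholder.
optimize_sql() {
    echo "OPTIMIZE TABLE $1;"
}

optimize_sql drupal.node
# Run against an InnoDB table, the server notes:
#   Table does not support optimize, doing recreate + analyze instead
```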

We’re running a Drupal-based website, and I keep seeing suggestions to schedule a regular mysqlcheck --optimize on the Drupal database when using traditional MySQL, so I’m just trying to decide whether this makes sense to do with PXC.

I also stumbled upon this suggestion:

The suggestion seems to be to drop all indexes besides the primary key, optimize the table, and add them back - however there is a stipulation that it’s only safe to do this on a slave, where it is OK for the table to be exposed without indexes for some time. All our PXC nodes are read/write, so I’m not sure that would be a good idea for us.

Any thoughts or opinions would be appreciated.


Well, first of all, OPTIMIZE is not replicated in Galera currently, but this is coming in a future release.

As far as whether it’s a good idea: I’d say most people don’t optimize Innodb that often. What Drupal does or does not suggest doesn’t mean all that much to me in that regard, but certainly it is possible that better performance may result. I’d say that rather than guessing, you should measure. Does the average query response time on the server actually improve after all this optimize work or not?

Another interesting way to accomplish something like an optimize in a more online fashion might be to use pt-online-schema-change to rebuild the table:

pt-online-schema-change D=database,t=table --alter "ENGINE=InnoDB"
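Note that the tool defaults to a dry run here; you have to pass --execute to actually change anything. A sketch of the two invocations (database and table names are placeholders):

```shell
#!/bin/sh
# Sketch: the same rebuild, first as a dry run, then for real.
# Database and table names are placeholders.
pt_osc() {
    echo "pt-online-schema-change D=$1,t=$2 --alter \"ENGINE=InnoDB\" $3"
}

pt_osc drupal node --dry-run   # sanity-check the plan first
pt_osc drupal node --execute   # actually rebuild the table online
```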

Optimizing is going to lock the table in question, and that’s going to be negative for flow control in the cluster. The FLUSH TABLES WITH READ LOCK example shows how I would expect an optimize to affect the cluster.

Hmmm so I’m a bit uncertain how to proceed now.

On one hand you’re saying optimize is going to lock tables and isn’t replicated, yet you’re adding it in a future release? Does that mean you’re addressing the fact that it’s not replicated and that it locks tables? And when this new version of Galera is released, is it better to use optimize vs pt-online-schema-change?

Or even when the new optimize code is available - would it still be better to do pt-online-schema-change?

Will a pt-online-schema-change be replicated across all PXC nodes?


Some more thoughts - what does pt-online-schema-change D=database,t=table --alter "ENGINE=InnoDB" actually do?

I had a look at the manual but I don't completely understand what the --alter flag is doing. Don't I have to specify what to alter?


When I just run it with --alter I see output similar to this:

	Starting a dry run. `drupal`.`workflow_scheduled_transition` will not be altered. Specify --execute instead of --dry-run to alter the table.
	Creating new table...
	Created new table drupal._workflow_scheduled_transition_new OK.
	Altering new table...
	Altered `drupal`.`_workflow_scheduled_transition_new` OK.
	Dropping new table...
	Dropped new table OK.
	Dry run complete. `drupal`.`workflow_scheduled_transition` was not altered.
	The new table `drupal`.`_workflow_scheduled_transition_new` does not have a PRIMARY KEY or a unique index which is required for the DELETE trigger.
	Operation, tries, wait:
	  copy_rows, 10, 0.25
	  create_triggers, 10, 1
	  drop_triggers, 10, 1
	  swap_tables, 10, 1
	  update_foreign_keys, 10, 1

I'm not a DBA and my experience with MySQL is quite limited; however, in the past when I've used Microsoft SQL Server it was necessary to perform regularly scheduled maintenance (which I'm assuming may be similar to a MySQL optimize?) that rebuilds indexes and reorganizes tables. A job like this typically runs once per week and keeps the database healthy. If you don't run a job like this, over time performance will slowly degrade.


Again, I assume the same would be true of performing mysqlcheck --optimize or pt-online-schema-change? In other words, if you hadn't run this maintenance in 6 months you might see a noticeable improvement in performance, but if you had been running it regularly you probably wouldn't see a big difference.


This is a roundabout way of asking: how do I know what this is doing and whether it's helping?


It sounds like if I had specified --execute it would have copied a bunch of rows, created some triggers, dropped some triggers, swapped some tables and updated some foreign keys. Other than trying to run a load test against the server - which again may or may not show a meaningful change - how can I tell if this is "helping"?




Optimize will lock tables, period, just like every DDL. Currently DDLs (CREATE/DROP/ALTER) are replicated by Galera by default in TOI mode. TOI effectively halts the entire cluster while the DDL is replicated with statement-based replication and executed at the same time on each node.
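To see which DDL replication mode a node is using, the wsrep_OSU_method variable can be inspected; a sketch of the query (connection options omitted — the script only prints the command):

```shell
#!/bin/sh
# Sketch: query the DDL replication mode on a node. TOI (the default)
# applies the DDL on every node at the same point in the replication stream.
osu_query() {
    echo "mysql -e \"SHOW GLOBAL VARIABLES LIKE 'wsrep_OSU_method'\""
}

osu_query   # prints the command; pipe to sh to actually run it
```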

Optimize is not currently replicated in this fashion, but support for it to be replicated with TOI is what I was talking about, and may be added in a future release by Codership.

Probably. Though pt-osc does DDL, it does it on empty tables, so it’s much more efficient.


pt-osc is doing the equivalent of an ALTER TABLE, hence the need for the --alter flag. ALTER TABLE foo ENGINE=Innodb simply rebuilds the table from scratch as if it was converting the table from another engine. This is what OPTIMIZE does on Innodb tables.

The larger question of “how do I know if it’s worth it” starts to get more philosophical:

My take on this is to measure. Use a slow-query analysis technique or something similar to determine your slowest queries on the table(s) in question. If you can see any significant performance change after your optimize, then you know for sure.
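The original link isn’t shown here, but one common approach (my assumption, using Percona Toolkit’s pt-query-digest against the slow query log; log path and --limit are placeholders) looks roughly like:

```shell
#!/bin/sh
# Sketch: summarize the slowest queries before and after an OPTIMIZE run.
# Log path and --limit are placeholder assumptions.
digest_cmd() {
    echo "pt-query-digest --limit 10 $1"
}

digest_cmd /var/log/mysql/mysql-slow.log   # prints the command to run
# Compare the per-query response times in a report taken before the
# optimize with one taken after; only a real improvement justifies the work.
```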