SST-Problem, memory question

Hi.

I am currently doing a trial setup of Percona Cluster and I am facing three problems, with the third one probably caused by the other two.

The setup is:

OS: CentOS 6.2
Cluster: Percona-XtraDB-Cluster-5.5.27-23.6.356, installed from tarball

The HW is 3 servers with Intel i7-920 Quad-Core w/ Hyperthreading (8 logical CPUs) and 24GB of RAM.

MySQL configuration resides in /etc/mysql/my.cnf, the root user has a private configuration file in /root/.my.cnf

The size of the database on disk is about 1.3TB

The intended initial setup is:

clusterdb1: slave of production server
clusterdb2, clusterdb3: cluster members without special purpose

The cluster gets started on all servers by root issuing /etc/init.d/mysql.server, with the MySQL server configured to run as user “mysql”.

The problems I am facing are:

  1. SST does not work - it crashes when applying the logs because wsrep_sst_xtrabackup gets called with /root/.my.cnf as the configuration file.

  2. innodb_buffer_pool_size is currently set to 14GB (on dedicated servers with 24GB of RAM), yet clusterdb1 keeps crashing with an out-of-memory error after running for about 15 hours.

  3. clusterdb1 gets restarted immediately by mysqld_safe and should definitely be able to do an IST, however that fails with:

[Warning] WSREP: Failed to prepare for incremental state transfer: Local state seqno is undefined: 1 (Operation not permitted) at galera/src/replicator_str.cpp:prepare_for_IST():449. IST will be unavailable.

After that, clusterdb1 starts an SST (which will fail because of --defaults-file=/root/.my.cnf).

So my questions are:

  1. How do I configure which my.cnf file PXC uses for SST? Why does it choose /root/.my.cnf?

  2. What is the recommended memory setting for Percona XtraDB Cluster? How much memory has to be reserved for “cluster purposes”? (Non-cluster MySQL works fine with an 18GB buffer pool on these machines.)

  3. Why doesn’t IST work? Exactly what operation is not permitted?

If the answer to 3. requires a more detailed log I will post it; however, I suspect that this is probably related to 1. (as SST fails with “operation not permitted” because the mysql user cannot read /root/.my.cnf).

Regards,
Robert.

Ok, answering 3. myself: IST obviously does not work because the server in question crashed. When one server is stopped and then started again, a clean and fast IST is done.

The most pressing question for me is still the one about /root/.my.cnf being used as config file for SST. From other threads in this forum I gather that it works for some people. Is PXC supposed to be run as root?

Robert.

I found the reason for the excessive memory use / memory leak: when using Percona XtraDB Cluster as a slave you have to enable the binary log (log_bin) and set log_slave_updates=1.

If binlog is not enabled the server will start to accumulate memory until it gets killed by the OOM killer.
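
A minimal sketch of how these two settings might look in /etc/mysql/my.cnf on clusterdb1. The log base name "mysql-bin" and the server_id value are placeholders chosen for illustration, not values taken from the actual setup:

```ini
[mysqld]
# Placeholder value; must differ from the production master and any other slave
server_id         = 101
# Enable the binary log; "mysql-bin" is only an example base name
log_bin           = mysql-bin
# Also write changes received from the production master to this node's binary log
log_slave_updates = 1
```

Only clusterdb1 replicates from the production server in this setup, so this fragment is aimed at that node; whether the other cluster members need the same settings is not something established here.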

Either this is a bug or I have missed something very fundamental about the workings of Percona Cluster / Galera replication.

Robert.