Hi,
I have a three-node cluster, and one of the servers regularly loses synchronisation. Now I have corruption in the InnoDB tablespace. I am reluctant to run with innodb_force_recovery=6 because I have two important production databases on these servers. I removed the failed server from the cluster and installed a new one, but all the symptoms reappeared: synchronisation lost, and corruption once again.
The servers run Ubuntu 10.04.4 LTS, and I use percona-xtradb-cluster-server-5.5 (version 5.5.31-23.7.5-438).
You can find the error log in the attached file, and the my.cnf file is below:
[client]
password = 'xxxxx'
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
wsrep_urls=gcomm://192.168.183.40:4567,gcomm://192.168.183.41:4567,gcomm://192.168.183.42:4567
[mysqld]
datadir=/var/lib/mysql
user=mysql
binlog_format=ROW
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_slave_threads=2
wsrep_cluster_name=prod_pa
wsrep_sst_method=rsync
wsrep_node_name=lxpadb03
default_storage_engine=InnoDB
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
#tuning
max_allowed_packet = 16M
max_connect_errors = 1000000
skip_name_resolve
query_cache_size=0
query_cache_type=0
tmp_table_size = 32M
max_heap_table_size = 32M
max_connections = 500
thread_cache_size = 50
open_files_limit = 65535
table_definition_cache = 4096
table_open_cache = 4096
# INNODB #
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 3072M
Thanks in advance.
Laeti
lxpadb03.err.zip (60 KB)