Hello,
We tested Percona XtraDB Cluster in the lab for some time and decided to roll it out to production.
The production environment consists of 5 servers.
Only the main node handles reads and writes; the others sit behind HAProxy as backups.
The initial migration from a standalone node (Percona MySQL Server) went fine with no glitches, and soon we had all 5 nodes up and running.
A week later some serious issues started to occur.
We rolled out a new web server (a pure clone of the old one, just with a different IP). It worked fine alongside the old one (behind HAProxy).
We forgot to change one connection string on a subsite, and as soon as it went online, ALL nodes went offline with the following error:
Slave SQL: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '? WHERE (codeId= 4967241)' at line 1' on query. Default database: '-----------'. Query: 'UPDATE --------.codes SET codeUsed = ? WHERE (codeId= xxxxxx)', Error_code: 1064
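To illustrate what the error seems to show (this is my reading of the log, not something I have confirmed): the UPDATE reaches the server with a literal `?` placeholder that was never bound to a value. Below is a minimal Python sketch of a check for such statements on the application side. The function name `has_unbound_placeholder` is hypothetical and the quote handling is deliberately simplified; it is not part of our application or of any MySQL library.

```python
def has_unbound_placeholder(sql: str) -> bool:
    """Return True if `sql` contains a '?' outside of quoted text.

    Simplified scanner (an illustration only): it tracks single quotes,
    double quotes and backticks, and skips backslash-escaped characters
    inside single- or double-quoted strings. SQL comments are not handled.
    """
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(sql):
        ch = sql[i]
        if quote is not None:
            if ch == "\\" and quote != "`":
                i += 2  # skip the escaped character
                continue
            if ch == quote:
                quote = None  # closing quote
        elif ch in ("'", '"', "`"):
            quote = ch  # opening quote
        elif ch == "?":
            return True  # bare placeholder that was never substituted
        i += 1
    return False


if __name__ == "__main__":
    # Same shape as the failing statement from the log (names shortened).
    print(has_unbound_placeholder("UPDATE codes SET codeUsed = ? WHERE (codeId = 1)"))
    # A question mark inside a string literal is not a placeholder.
    print(has_unbound_placeholder("UPDATE codes SET note = 'used?' WHERE (codeId = 1)"))
```

A check like this would only flag the statement before it is sent; it does not explain why the nodes fail to apply it.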
After that we bootstrapped the main node and joined the other nodes (full SST). Everything worked until we restarted Apache, at which point the same error occurred (same query, different id) and all nodes went offline simultaneously.
Since then the bootstrapped node has been running fine on its own, but we cannot join the other nodes. SST completes fine, but the same error is produced on the subsequent IST (the DB is live) whenever we try to join a node.
As of today, I get an even more confusing error when doing SST:
print() on closed filehandle XTRABACKUP_PID at /usr/bin/innobackupex line 1084.
log scanned up to (741995496503)
log scanned up to (741995496503)
log scanned up to (741995496503)
log scanned up to (741995496503)
log scanned up to (741995496503)
…which goes on forever. If I interrupt this process on the joining node, the donor crashes.
my.cnf snippet:
default_storage_engine=InnoDB
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
wsrep_replicate_myisam = 1
##Per node conf
wsrep_node_address=10.42.71.68 # IP address of THIS node, not the remote one.
wsrep_sst_method=xtrabackup
wsrep_cluster_name=Cluster
mysqldump works fine and without errors on ALL databases on the donor. All databases in question are InnoDB (there is one MyISAM table for full-text search, but the server never complained about it in the logs).
But SST/IST no longer works, even after multiple restarts. I have cleared the data directories on the joining nodes.
We are running Percona-XtraDB-Cluster-5.5.31-23.7.5.438.Linux.x86_64
Any ideas why this is happening and how to get the cluster up and running again?
Regards,
Marko