Cluster crashed and will not restart

ManuelRighi
Hello,
I have XtraDB Cluster 5.6 on 3 nodes running Ubuntu Server 12.04.2 LTS.
Everything was working fine.
Today one node crashed, and when I tried to restart the service, the other 2 nodes went down.
I have now restarted the first node with "bootstrap-pxc" and this node is OK.
When I try to add the second node, the join fails and the first node crashes :|

On the first node, I find this in the error log:


2015-02-27 16:02:02 7f87a406c700 InnoDB: Error: page 7595 log sequence number 138747407668
InnoDB: is in the future! Current system log sequence number 119269942957.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/...-recovery.html
InnoDB: for more information.
2015-02-27 16:02:04 15322 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-02-27 16:02:04 15322 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
15:02:04 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=25165824
read_buffer_size=131072
max_used_connections=33
max_threads=202
thread_count=35
connection_count=33
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 105196 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7735e50
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
WSREP_SST: [ERROR] innobackupex finished with error: 9. Check /var/lib/mysql//innobackup.backup.log (20150227 16:02:04.738)
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20150227 16:02:04.743)
WSREP_SST: [INFO] Cleaning up temporary directories (20150227 16:02:04.751)
Segmentation fault
150227 16:02:04 mysqld_safe Number of processes running now: 0
150227 16:02:04 mysqld_safe WSREP: not restarting wsrep node automatically
150227 16:02:04 mysqld_safe mysqld from pid file /var/lib/mysql/cls-mysql1-db1.pid ended



On the second node, I find this in the error log:

WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20150227 16:01:03.021)
2015-02-27 16:01:03 26922 [Note] WSREP: (711054fc, 'tcp://0.0.0.0:4567') turning message relay requesting off
grep: /var/lib/mysql//xtrabackup_checkpoints: No such file or directory
WSREP_SST: [INFO] Preparing the backup at /var/lib/mysql/ (20150227 16:02:04.742)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20150227 16:02:04.744)
WSREP_SST: [ERROR] Cleanup after exit with status:1 (20150227 16:02:04.965)
2015-02-27 16:02:04 26922 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.10.20.80' --auth 'sstuser:xxxaaaaassss' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '26922' '' : 1 (Operation not permitted)
2015-02-27 16:02:04 26922 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2015-02-27 16:02:04 26922 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2015-02-27 16:02:04 26922 [ERROR] Aborting

2015-02-27 16:02:05 26922 [Note] WSREP: (711054fc, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.20.79:4567
2015-02-27 16:02:06 26922 [Note] WSREP: (711054fc, 'tcp://0.0.0.0:4567') reconnecting to 805bc8ba (tcp://10.10.20.79:4567), attempt 0
2015-02-27 16:02:06 26922 [Note] WSREP: Closing send monitor...
2015-02-27 16:02:06 26922 [Note] WSREP: Closed send monitor.
2015-02-27 16:02:06 26922 [Note] WSREP: gcomm: terminating thread
2015-02-27 16:02:06 26922 [Note] WSREP: gcomm: joining thread
2015-02-27 16:02:06 26922 [Note] WSREP: gcomm: closing backend
2015-02-27 16:02:09 26922 [Note] WSREP: evs::proto(711054fc, LEAVING, view_id(REG,711054fc,5)) suspecting node: 805bc8ba
2015-02-27 16:02:09 26922 [Note] WSREP: evs::proto(711054fc, LEAVING, view_id(REG,711054fc,5)) suspected node without join message, declaring inactive
2015-02-27 16:02:09 26922 [Note] WSREP: view(view_id(NON_PRIM,711054fc,5) memb {
711054fc,0
} joined {
} left {
} partitioned {
805bc8ba,0
})
2015-02-27 16:02:09 26922 [Note] WSREP: view((empty))
2015-02-27 16:02:09 26922 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2015-02-27 16:02:09 26922 [Note] WSREP: gcomm: closed
2015-02-27 16:02:09 26922 [Note] WSREP: Flow-control interval: [16, 16]
2015-02-27 16:02:09 26922 [Note] WSREP: Received NON-PRIMARY.
2015-02-27 16:02:09 26922 [Note] WSREP: Shifting JOINER -> OPEN (TO: 2268)
2015-02-27 16:02:09 26922 [Note] WSREP: Received self-leave message.
2015-02-27 16:02:09 26922 [Note] WSREP: Flow-control interval: [0, 0]
2015-02-27 16:02:09 26922 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2015-02-27 16:02:09 26922 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 2268)
2015-02-27 16:02:09 26922 [Note] WSREP: RECV thread exiting 0: Success
2015-02-27 16:02:09 26922 [Note] WSREP: recv_thread() joined.
2015-02-27 16:02:09 26922 [Note] WSREP: Closing replication queue.
2015-02-27 16:02:09 26922 [Note] WSREP: Closing slave action queue.
2015-02-27 16:02:09 26922 [Note] WSREP: Service disconnected.
2015-02-27 16:02:09 26922 [Note] WSREP: rollbacker thread exiting
2015-02-27 16:02:10 26922 [Note] WSREP: Some threads may fail to exit.
2015-02-27 16:02:10 26922 [Note] Binlog end
2015-02-27 16:02:10 26922 [Note] /usr/sbin/mysqld: Shutdown complete

Error in my_thread_global_end(): 1 threads didn't exit
150227 16:02:16 mysqld_safe mysqld from pid file /var/lib/mysql/cls-mysql1-db2.pid ended

Comments

  • ManuelRighi
    Hello,
    can anyone help me?
  • mmike
    This error message: "2015-02-27 16:02:04 26922 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.10.20.80' --auth 'sstuser:Arght64dGyTR32P' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '26922' '' : 1 (Operation not permitted)" suggests that something is wrong with the SST (state snapshot transfer). Unfortunately, the error messages are not very descriptive and are often misleading. (I have had situations where a socat process was still running, so a new one could not be started.) You may want to check the innobackup logs on the donor (most likely the bootstrap node in your case) and see whether there is any useful information there.

    I'd start over: if your bootstrap node is running fine, check that there are no 'extra' processes running (such as socat, innobackupex, wsrep_sst_* or similar). Also, make sure that there are no mysql processes running on the 'joiner' node. Then try starting the joiner and watch the logs for any failures.

    Also, make sure you change the sstuser password as you compromised it by posting it here.
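
The checks suggested above (no leftover SST helper processes, no mysqld still running on the joiner, and a look at the innobackup logs) could be sketched in shell roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical and not part of Percona XtraDB Cluster, the process names are the ones mentioned in this thread (plus xbstream, which appears in the joiner log), and the log paths used in the usage note are the ones shown in the log excerpts in the question.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not shipped with PXC.

# Print the lines read from stdin (a process listing such as the output of
# `ps -eo pid,args`) whose command matches one of the names in regex $1.
_match_procs() {
    grep -E "(^|[ /])($1)( |\$)" || true
}

# Leftover SST helpers: socat, innobackupex, xbstream, wsrep_sst_* scripts.
list_sst_leftovers() {
    _match_procs 'socat|innobackupex|xbstream|wsrep_sst_[^ ]*'
}

# Any mysqld or mysqld_safe still running (should be none on the joiner
# before it is started again).
list_mysqld_procs() {
    _match_procs 'mysqld(_safe)?'
}

# Print every line containing "error" (case-insensitive) from the given log
# files, prefixed with the file name; unreadable or missing files are skipped.
sst_log_errors() {
    local f
    for f in "$@"; do
        [ -r "$f" ] || continue
        grep -iH 'error' "$f" || true
    done
}
```

With these functions loaded, one might run `ps -eo pid,args | list_sst_leftovers` and `sst_log_errors /var/lib/mysql/innobackup.backup.log` on the donor, and `ps -eo pid,args | list_mysqld_procs` and `sst_log_errors /var/lib/mysql/innobackup.prepare.log` on the joiner. Empty output from the process checks means nothing matching was found; anything listed would need to be stopped before starting the joiner again.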