I reviewed the Launchpad report and saw that Valerii highlighted some problems with the MySQL configuration. Signal 11 (segmentation fault) can also be caused by a memory-related misconfiguration, although I didn't see high memory usage in the global status variables published on Launchpad. Can you share the my.cnf of the node that crashed most recently?
I'm no longer sure the problem lies in the configuration. I had doubts about Query Cache support, but PXC has supported it since 5.6.14-25.1. Still, you can try disabling it and check whether the crashes stop. It is also a good moment to look at the error log from the crashed node; can you share it?
PS: this is not mandatory and it is not what makes your cluster crash, but try to organize your config file better by grouping all the variables under a single section, e.g., one [mysqld].
Correct the unit of max_binlog_size as well; it should be M or G, probably M.
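If it helps, here is a rough Python sketch of how the two config checks mentioned above could be automated. It is only an illustration: the function name `check_my_cnf` and the sample config text are made up, and the unit check assumes the server accepts a plain byte count or a single K, M or G suffix for max_binlog_size.

```python
import re
from collections import Counter


def check_my_cnf(text):
    """Return a list of warnings for a my.cnf-style text (hypothetical helper)."""
    warnings = []
    sections = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and !include / !includedir directives.
        if not line or line.startswith(("#", ";", "!")):
            continue
        header = re.fullmatch(r"\[(.+)\]", line)
        if header:
            sections[header.group(1).strip().lower()] += 1
            continue
        key, _, value = line.partition("=")
        # Treat max-binlog-size and max_binlog_size as the same option.
        key = key.strip().replace("-", "_").lower()
        value = value.strip()
        # Assumption: a valid size is digits with an optional K, M or G suffix.
        if key == "max_binlog_size" and not re.fullmatch(r"\d+[KMG]?", value, re.IGNORECASE):
            warnings.append(f"max_binlog_size has an unexpected value: {value!r}")
    for name, count in sections.items():
        if count > 1:
            warnings.append(f"section [{name}] appears {count} times")
    return warnings


if __name__ == "__main__":
    # Made-up config used only to exercise the checks.
    sample = """
[mysqld]
datadir = /var/lib/mysql
[mysqld_safe]
log-error = /var/log/mysql/error.log
[mysqld]
max_binlog_size = 100MB
"""
    for warning in check_my_cnf(sample):
        print(warning)
```

The script only reports what it finds; merging the duplicated sections and picking the right size is still a manual decision.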
Thanks. Here is the crashing node's log. It contains only the last 100 lines of the file. I will update my.cnf according to your recommendations and restart all nodes.
Here is the crash log from one crashing node (both crashed within 10 minutes). When I manually restarted one node, another one crashed, until all nodes had crashed (and been restarted).
2015-04-17 19:09:34 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:36 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:36 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:37 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-04-17 19:09:38 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:39 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:41 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:41 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:43 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:44 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:46 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:46 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:46 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:46 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:47 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:47 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:48 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:48 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:48 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:48 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:48 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:49 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:49 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:49 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:49 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:49 8719 [Note] WSREP: (7e63ca95, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid 7e63ca95 is blacklisted, skipping
2015-04-17 19:09:51 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:51 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:53 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:54 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:56 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:56 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:58 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:09:59 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:10:01 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
2015-04-17 19:10:01 8719 [Warning] Client failed to provide its character set. 'latin1' will be used as client character set.
17:10:03 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=145
max_threads=502
thread_count=45
connection_count=43
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 331448 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f18501e4fa0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f1842ff1e50 thread_stack 0x1000000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8ccd9e]
/usr/sbin/mysqld(handle_fatal_signal+0x36c)[0x6828dc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7f1947ed20a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x784ad)[0x7f19461524ad]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70)[0x7f1946154a70]
/usr/sbin/mysqld(my_malloc+0x32)[0x8c8bc2]
/usr/sbin/mysqld(alloc_root+0x8c)[0x8c4b9c]
/usr/sbin/mysqld(_ZN4ItemnwEm+0x9)[0x6021e9]
/usr/sbin/mysqld[0x8358ab]
/usr/sbin/mysqld[0x836082]
/usr/sbin/mysqld(_Z17build_equal_itemsP3THDP4ItemP10COND_EQUALbP4ListI10TABLE_LISTEPS4_+0x34)[0x836504]
/usr/sbin/mysqld(_Z13optimize_condP3THDP4ItemPP10COND_EQUALP4ListI10TABLE_LISTEbPNS1_11cond_resultE+0x25d)[0x83689d]
/usr/sbin/mysqld(_ZN4JOIN8optimizeEv+0x2e8)[0x83bd98]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x21c)[0x71e5fc]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x71e8f5]
/usr/sbin/mysqld[0x59a85f]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0xb60)[0x6f8810]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x5c8)[0x6fd7f8]
/usr/sbin/mysqld[0x6fd8f2]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1076)[0x6fef36]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x202)[0x6ffbe2]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x2ad)[0x6d11dd]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6d1262]
/usr/sbin/mysqld(pfs_spawn_thread+0x140)[0xb195d0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7f1947ec9b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f19461b595d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (55f9470): [QUERY REMOVED]
Connection ID (thread ID): 114279
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
150417 19:10:03 mysqld_safe Number of processes running now: 0
150417 19:10:03 mysqld_safe WSREP: not restarting wsrep node automatically
150417 19:10:03 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
Here is the crash log from one crashing node after I switched “sst_method” to “rsync”.
mysqld: malloc.c:4993: _int_free: Assertion `p->bk_nextsize->fd_nextsize == p' failed.
20:25:53 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=143
max_threads=502
thread_count=53
connection_count=44
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 331448 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x1000000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8ccd9e]
/usr/sbin/mysqld(handle_fatal_signal+0x36c)[0x6828dc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7f137458b0a0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f13727c5165]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x180)[0x7f13727c83e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x75dea)[0x7f1372808dea]
/lib/x86_64-linux-gnu/libc.so.6(+0x7782f)[0x7f137280a82f]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7f137280d98c]
/usr/lib/libgalera_smm.so(_ZN5boost6detail17sp_counted_impl_pISt6vectorIhSaIhEEE7disposeEv+0x17)[0x7f13523c7177]
/usr/lib/libgalera_smm.so(_ZNSt8_Rb_treeIN5gcomm14InputMapMsgKeyESt4pairIKS1_NS0_3evs11InputMapMsgEESt10_Select1stIS6_ESt4lessIS1_ESaIS6_EE12_M_erase_auxESt23_Rb_tree_const_iteratorIS6_ESE_+0x11c)[0x7f13523c888c]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs8InputMap22cleanup_recovery_indexEv+0x51)[0x7f13523c6ea1]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs8InputMap12set_safe_seqEml+0x95)[0x7f13523c6fa5]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs5Proto18update_im_safe_seqEml+0x35)[0x7f13523d0625]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs5Proto11handle_userERKNS0_11UserMessageESt17_Rb_tree_iteratorISt4pairIKNS_4UUIDENS0_4NodeEEERKNS_8DatagramE+0x466)[0x7f13523ec316]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs5Proto10handle_msgERKNS0_7MessageERKNS_8DatagramEb+0x5ef)[0x7f13523f199f]
/usr/lib/libgalera_smm.so(_ZN5gcomm3evs5Proto9handle_upEPKvRKNS_8DatagramERKNS_11ProtoUpMetaE+0x2dc)[0x7f13523f26ac]
/usr/lib/libgalera_smm.so(_ZN5gcomm8Protolay7send_upERKNS_8DatagramERKNS_11ProtoUpMetaE+0x36)[0x7f13523f3d16]
/usr/lib/libgalera_smm.so(_ZN5gcomm6GMCast9handle_upEPKvRKNS_8DatagramERKNS_11ProtoUpMetaE+0x221)[0x7f1352409b41]
/usr/lib/libgalera_smm.so(_ZN5gcomm10Protostack8dispatchEPKvRKNS_8DatagramERKNS_11ProtoUpMetaE+0x58)[0x7f1352430208]
/usr/lib/libgalera_smm.so(_ZN5gcomm12AsioProtonet8dispatchERKPKvRKNS_8DatagramERKNS_11ProtoUpMetaE+0x4b)[0x7f135245963b]
/usr/lib/libgalera_smm.so(_ZN5gcomm13AsioTcpSocket12read_handlerERKN4asio10error_codeEm+0x669)[0x7f135243e319]
/usr/lib/libgalera_smm.so(_ZN4asio6detail7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceIS4_EEEEN5boost5arrayINS_14mutable_bufferELm1EEENS8_3_bi6bind_tImNS8_4_mfi3mf2ImN5gcomm13AsioTcpSocketERKNS_10error_codeEmEENSC_5list3INSC_5valueINS8_10shared_ptrISH_EEEEPFNS8_3argILi1EEEvEPFNSR_ILi2EEEvEEEEENSD_IvNSF_IvSH_SK_mEESY_EEEclESK_mi+0x93)[0x7f135244aca3]
/usr/lib/libgalera_smm.so(_ZN4asio6detail23reactive_socket_recv_opINS0_17consuming_buffersINS_14mutable_bufferEN5boost5arrayIS3_Lm1EEEEENS0_7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceISB_EEEES6_NS4_3_bi6bind_tImNS4_4_mfi3mf2ImN5gcomm13AsioTcpSocketERKNS_10error_codeEmEENSF_5list3INSF_5valueINS4_10shared_ptrISK_EEEEPFNS4_3argILi1EEEvEPFNSU_ILi2EEEvEEEEENSG_IvNSI_IvSK_SN_mEES11_EEEEE11do_completeEPNS0_15task_io_serviceEPNS0_25task_io_service_operationESL_m+0xdd)[0x7f135244afdd]
/usr/lib/libgalera_smm.so(_ZN4asio6detail15task_io_service3runERNS_10error_codeE+0x3cc)[0x7f135245bc0c]
/usr/lib/libgalera_smm.so(_ZN5gcomm12AsioProtonet10event_loopERKN2gu8datetime6PeriodE+0x1a2)[0x7f135245a012]
/usr/lib/libgalera_smm.so(_ZN9GCommConn3runEv+0x60)[0x7f1352473ff0]
/usr/lib/libgalera_smm.so(_ZN9GCommConn6run_fnEPv+0x9)[0x7f13524767b9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7f1374582b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f137286e95d]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
150417 22:25:53 mysqld_safe Number of processes running now: 0
150417 22:25:53 mysqld_safe WSREP: not restarting wsrep node automatically
150417 22:25:53 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
Sorry for the delay in answering your last message. This time PXC crashed with signal 6, not signal 11. Signal 6 (SIGABRT) usually appears when there is a problem with memory allocation, and not only in mysqld: it can hit any other process running on the same machine as mysqld. From some research on the web:
[QUOTE]
abort()
If this is not production, try building new virtual machines and bringing up a new cluster on them…
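To compare crashes across nodes more easily, something like the following Python sketch could pull the signal number and the binaries seen in each backtrace out of an error log. It is a rough illustration only: `summarize_crashes` is a made-up name, and the two patterns are assumptions based on the "mysqld got signal" lines and stack frame lines that appear in the logs posted in this thread.

```python
import re
import signal

# Assumed log shapes, taken from the crash logs posted in this thread.
SIGNAL_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) UTC - mysqld got signal (\d+)")
FRAME_LINE = re.compile(r"^(/\S+?)(?:\(|\[)")


def summarize_crashes(log_text):
    """Return one dict per crash: time, signal number, signal name, modules."""
    crashes = []
    for line in log_text.splitlines():
        hit = SIGNAL_LINE.match(line)
        if hit:
            number = int(hit.group(2))
            try:
                # Signal names are looked up on the machine running the script.
                name = signal.Signals(number).name
            except ValueError:
                name = "UNKNOWN"
            crashes.append(
                {"time": hit.group(1), "signal": number, "name": name, "modules": []}
            )
            continue
        frame = FRAME_LINE.match(line)
        if frame and crashes:
            # Keep only the file name of the binary or library in the frame.
            module = frame.group(1).rsplit("/", 1)[-1]
            if module not in crashes[-1]["modules"]:
                crashes[-1]["modules"].append(module)
    return crashes


if __name__ == "__main__":
    # A few lines copied from the logs in this thread.
    sample = """17:10:03 UTC - mysqld got signal 11 ;
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8ccd9e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70)[0x7f1946154a70]
20:25:53 UTC - mysqld got signal 6 ;
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7f137280d98c]
/usr/lib/libgalera_smm.so(_ZN9GCommConn3runEv+0x60)[0x7f1352473ff0]
"""
    for crash in summarize_crashes(sample):
        print(crash["time"], crash["signal"], crash["name"], crash["modules"])
```

Seeing which library shows up in each backtrace (for example libgalera_smm.so versus only mysqld and libc) can help decide whether the crashes share a common pattern, though it does not by itself identify the root cause.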
08:10:03 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=136
max_threads=502
thread_count=92
connection_count=83
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 331448 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x20b39d0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f1fd03c9e50 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8ccd9e]
/usr/sbin/mysqld(handle_fatal_signal+0x36c)[0x6828dc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7f209c3090a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x78a8e)[0x7f209a589a8e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70)[0x7f209a58ba70]
/usr/sbin/mysqld(my_malloc+0x32)[0x8c8bc2]
/usr/sbin/mysqld(alloc_root+0x8c)[0x8c4b9c]
/usr/sbin/mysqld(_ZN4ItemnwEmP11st_mem_root+0x12)[0x614972]
/usr/sbin/mysqld(_Z10MYSQLparseP3THD+0x119cf)[0x7a776f]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x268)[0x6fd498]
/usr/sbin/mysqld[0x6fd8f2]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x132f)[0x6ff1ef]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x202)[0x6ffbe2]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x2ad)[0x6d11dd]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6d1262]
/usr/sbin/mysqld(pfs_spawn_thread+0x140)[0xb195d0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7f209c300b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f209a5ec95d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f1fbc31ed9e): is an invalid pointer
Connection ID (thread ID): 394517
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
150424 10:10:03 mysqld_safe Number of processes running now: 0
150424 10:10:03 mysqld_safe WSREP: not restarting wsrep node automatically
150424 10:10:03 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended