
Production Percona DB crashing repeatedly

dctechtest Entrant Current User Role Beginner
Hi all, I've got a production box that crashes repeatedly with the following:
Mar 15 12:13:01 box863 mysqld: 18:13:01 UTC - mysqld got signal 11 ;
Mar 15 12:13:01 box863 mysqld: This could be because you hit a bug. It is also possible that this binary
Mar 15 12:13:01 box863 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Mar 15 12:13:01 box863 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Mar 15 12:13:01 box863 mysqld: We will try our best to scrape up some info that will hopefully help
Mar 15 12:13:01 box863 mysqld: diagnose the problem, but since we have already crashed,
Mar 15 12:13:01 box863 mysqld: something is definitely wrong and this may fail.
Mar 15 12:13:01 box863 mysqld: Please help us make Percona Server better by reporting any
Mar 15 12:13:01 box863 mysqld: bugs at http://bugs.percona.com/
Mar 15 12:13:01 box863 mysqld:
Mar 15 12:13:01 box863 mysqld: key_buffer_size=268435456
Mar 15 12:13:01 box863 mysqld: read_buffer_size=4194304
Mar 15 12:13:01 box863 mysqld: max_used_connections=0
Mar 15 12:13:01 box863 mysqld: max_threads=1502
Mar 15 12:13:01 box863 mysqld: thread_count=0
Mar 15 12:13:01 box863 mysqld: connection_count=0
Mar 15 12:13:01 box863 mysqld: It is possible that mysqld could use up to
Mar 15 12:13:01 box863 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 12585173 K bytes of memory
Mar 15 12:13:01 box863 mysqld: Hope that's ok; if not, decrease some variables in the equation.
Mar 15 12:13:01 box863 mysqld:
Mar 15 12:13:01 box863 mysqld: Thread pointer: 0x0
Mar 15 12:13:01 box863 mysqld: Attempting backtrace. You can use the following information to find out
Mar 15 12:13:01 box863 mysqld: where mysqld died. If you see no messages after this, something went
Mar 15 12:13:01 box863 mysqld: terribly wrong...
Mar 15 12:13:01 box863 mysqld: stack_bottom = 0 thread_stack 0x40000
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7cbfc5]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6a0ec4]
Mar 15 12:13:01 box863 mysqld: /lib64/libpthread.so.0[0x344aa0f710]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld(_ZN3THD15raise_conditionEjPKcN11MYSQL_ERROR18enum_warning_levelES1_+0x3b)[0x56ed6b]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld(_Z19push_warning_printfP3THDN11MYSQL_ERROR18enum_warning_levelEjPKcz+0xdd)[0x57c08d]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld(ib_warn_row_too_big+0x86)[0x8348a6]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x8e7753]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x8f3c70]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x8f1c52]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x8f2785]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x96f9f5]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x963d27]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x882cf7]
Mar 15 12:13:01 box863 mysqld: /usr/sbin/mysqld[0x877ebc]
Mar 15 12:13:01 box863 mysqld: /lib64/libpthread.so.0[0x344aa079d1]
Mar 15 12:13:01 box863 mysqld: /lib64/libc.so.6(clone+0x6d)[0x344a2e88fd]
Mar 15 12:13:01 box863 mysqld: You may download the Percona Server operations manual by visiting
Mar 15 12:13:01 box863 mysqld: http://www.percona.com/software/percona-server/. You may find information
Mar 15 12:13:01 box863 mysqld: in the manual which will help you identify the cause of the crash.
Mar 15 12:13:04 box863 mysqld_safe: mysqld from pid file /var/lib/mysql/box863.bluehost.com.pid ended

Comments

  • dctechtest Entrant Current User Role Beginner
    Sorry, I forgot to include that I've had to recover InnoDB tables several times over the past few days due to this issue.
  • scott.nemes MySQL Sage Current User Role Patron
    Hi dctechtest;

    What version are you using? I think you might be hitting bug #20144839 (can't link to it as Oracle is not publishing the details). Try upgrading to 5.5.42 or 5.6.23 (if you are not there already) and see if it still happens.

    -Scott
  • dctechtest Entrant Current User Role Beginner
    Updated to mysql Ver 14.14 Distrib 5.5.42-37.1 for Linux (x86_64), using readline 5.1, and my crashes stopped. Thanks scott.nemes.
  • scott.nemes MySQL Sage Current User Role Patron
    Hi dctechtest;

    Excellent! Glad that worked out for you. =)

    -Scott
  • MySQL-User Entrant Current User Role Beginner
    Hi Scott,
    Could you please help me with my issue? I am on 5.6.21 and am experiencing a similar kind of problem.
  • lolomin Contributor Current User Role Supporter
    Hi,

    I'm getting the exact same problem on a Percona XtraDB Cluster node that was upgraded to v5.5.41 a few days ago. I tried downgrading back to v5.5.37, but the same problem occurs again.

    The node syncs OK via xtrabackup from another node in the cluster (420 GB of data), but when it tries to start after the xtrabackup transfer it crashes with the following logs:

    150413 22:24:20 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (71f6d011-dc25-11e3-9cad-be804f0c3a69): 1 (Operation not permitted)
    at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be unavailable.
    150413 22:24:20 [Note] WSREP: Node 0 (altair) requested state transfer from '*any*'. Selected 1 (atlas)(SYNCED) as donor.
    150413 22:24:20 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 19776226)
    150413 22:24:20 [Note] WSREP: Requesting state transfer: success, donor: 1
    150413 23:30:34 [Note] WSREP: 1 (atlas): State transfer to 0 (altair) complete.
    150413 23:30:34 [Note] WSREP: Member 1 (atlas) synced with group.
    WSREP_SST: [INFO] Proceeding with SST (20150413 23:30:34.796)
    WSREP_SST: [INFO] Removing existing ib_logfile files (20150413 23:30:34.800)
    WSREP_SST: [INFO] Preparing the backup at /data/mysql/ (20150413 23:30:34.878)
    WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20150413 23:30:34.881)
    WSREP_SST: [INFO] Total time on joiner: 0 seconds (20150413 23:31:14.622)
    WSREP_SST: [INFO] Removing the sst_in_progress file (20150413 23:31:14.626)
    150413 23:31:14 [Note] WSREP: SST complete, seqno: 19776226
    150413 23:31:14 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
    150413 23:31:14 [Note] Plugin 'FEDERATED' is disabled.
    150413 23:31:14 InnoDB: The InnoDB memory heap is disabled
    150413 23:31:14 InnoDB: Mutexes and rw_locks use GCC atomic builtins
    150413 23:31:14 InnoDB: Compressed tables use zlib 1.2.7
    150413 23:31:14 InnoDB: Using Linux native AIO
    150413 23:31:14 InnoDB: Initializing buffer pool, size = 20.0G
    150413 23:31:16 InnoDB: Completed initialization of buffer pool
    150413 23:31:16 InnoDB: highest supported file format is Barracuda.
    150413 23:31:17 InnoDB: Waiting for the background threads to start
    21:31:17 UTC - mysqld got signal 11 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    We will try our best to scrape up some info that will hopefully help
    diagnose the problem, but since we have already crashed,
    something is definitely wrong and this may fail.
    Please help us make Percona XtraDB Cluster better by reporting any
    bugs at https://bugs.launchpad.net/percona-xtradb-cluster

    key_buffer_size=268435456
    read_buffer_size=131072
    max_used_connections=0
    max_threads=702
    thread_count=2
    connection_count=0
    It is possible that mysqld could use up to
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1798672 K bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.

    Thread pointer: 0x0
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 0 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x29)[0x7abeb9]
    /usr/sbin/mysqld(handle_fatal_signal+0x372)[0x6aa6d2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fbdbf8728d0]
    /usr/sbin/mysqld(_ZN3THD15raise_conditionEjPKcN11MYSQL_ERROR18enum_warning_levelES1_+0x22)[0x589062]
    /usr/sbin/mysqld(_Z19push_warning_printfP3THDN11MYSQL_ERROR18enum_warning_levelEjPKcz+0xc7)[0x596d17]
    /usr/sbin/mysqld(ib_warn_row_too_big+0x6d)[0x7d2d5d]
    /usr/sbin/mysqld[0x869bbc]
    /usr/sbin/mysqld[0x8774d3]
    /usr/sbin/mysqld[0x8748cd]
    /usr/sbin/mysqld[0x873a88]
    /usr/sbin/mysqld[0x874e56]
    /usr/sbin/mysqld[0x8753da]
    /usr/sbin/mysqld[0x8e4220]
    /usr/sbin/mysqld[0x8dd158]
    /usr/sbin/mysqld[0x80e126]
    /usr/sbin/mysqld[0x8040fc]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fbdbf86b0a4]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbdbdab804d]
    You may download the Percona XtraDB Cluster operations manual by visiting
    http://www.percona.com/software/percona-xtradb-cluster/. You may find information
    in the manual which will help you identify the cause of the crash.
    150413 23:31:17 mysqld_safe mysqld from pid file /data/mysql/altair.pid ended


    At the moment I'm unable to restart the node because of this problem, and since we are now running on a two-node cluster we are in a very degraded mode. Is there a bug entry open for this? Is there any workaround (other than MySQL 5.5.42, which is not yet available for Percona XtraDB Cluster)?

    Regards,

    Laurent MINOST
    IPD - Infopro Digital
  • wagnerbianchi Remote DBA Current User Role Patron
    Can you share your my.cnf and the server's hardware configuration? Some time ago I hit this same bug across several 5.5 releases, caused by a large thread count and misconfigured memory settings. Most of the time, signal 11 (a segmentation fault) means a process tried to access a piece of data at a memory address that isn't actually mapped for it, and in mysqld's case that means a crash.

    I recommend reviewing your configuration against the math shown in the error message:
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1798672 K bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.

    If the other nodes have the same machine resources and the same MySQL configuration in my.cnf, you should check them as well.

    Just my 2 cents.
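    To sanity-check that estimate, you can plug this node's logged values into the same formula. A quick sketch (sort_buffer_size is not printed in the log, so the MySQL 5.5 default of 2 MiB is assumed here):

```python
# Worst-case memory estimate from the mysqld crash report formula:
#   key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
# Values below come from the crash log; sort_buffer_size is an assumption
# (the MySQL 5.5 default of 2 MiB), since the log does not print it.
key_buffer_size = 268435456   # 256 MiB
read_buffer_size = 131072     # 128 KiB
sort_buffer_size = 2097152    # 2 MiB (assumed, not in the log)
max_threads = 702

worst_case_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
print(worst_case_bytes // 1024, "K bytes")  # 1789696 K, close to the logged 1798672 K
```

    Either way, a worst case of roughly 1.8 GB on a 64 GB machine (even alongside a 20 GB buffer pool) leaves plenty of headroom, which suggests memory exhaustion alone may not explain this particular crash.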
  • lolomin Contributor Current User Role Supporter
    Hi wagnerbianchi,

    Thanks for your answer.

    Please note that this problem only started after upgrading PXC to v5.5.41; after that, the node could not complete an SST. I tried downgrading back to v5.5.37, without any success. The node is now stuck: every time I try to restart it, it does an SST from another node and then crashes right after the SST completes. So the configuration is probably not the culprit here, as it has not changed since the node was working properly on v5.5.37!

    Here is the hardware output for each node:

    [email protected]:~# lscpu
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Little Endian
    CPU(s): 32
    On-line CPU(s) list: 0-31
    Thread(s) per core: 2
    Core(s) per socket: 8
    Socket(s): 2
    NUMA node(s): 2
    Vendor ID: GenuineIntel
    CPU family: 6
    Model: 62
    Model name: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
    Stepping: 4
    CPU MHz: 2600.174
    BogoMIPS: 5199.87
    Virtualization: VT-x
    L1d cache: 32K
    L1i cache: 32K
    L2 cache: 256K
    L3 cache: 20480K
    NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
    NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31

    [email protected]:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:         64516      62101       2414          5        782      58067
    -/+ buffers/cache:       3251      61264
    Swap:        15257          1      15256

    The dataset is located on a volume of four Toshiba MK3001GRRB 300 GB SAS 6 Gb/s disks.

    Attached is the my.cnf file for this node; please rename it to altair_my.cnf.7z. The file was compressed because the upload size limit here is 20 KB, and since the .7z extension is forbidden I renamed it to .txt.

    I'm now considering taking a full backup of one of the still-healthy nodes, restoring that backup onto this node, and editing its grastate.dat so there will be no SST at all at startup, then seeing whether it crashes again ... but even if that works, this situation is not normal.
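    For reference, that workaround amounts to restoring the backup into the node's datadir and then writing a grastate.dat along these lines before starting mysqld (the uuid is the group state UUID from the log above; the seqno shown is only illustrative and must match the position the backup was actually taken at):

```
# GALERA saved state
version: 2.1
uuid:    71f6d011-dc25-11e3-9cad-be804f0c3a69
seqno:   19776226
```

    With a valid local state like this, the joining node should attempt an IST (or no transfer at all) instead of a full SST.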

    Please tell me if there is any other information that would be useful.
    Thanks!

    Regards,

    Laurent
  • lolomin Contributor Current User Role Supporter
    Hi,

    Does anyone have any idea about this problem, please?

    Regards,

    Laurent
  • lolomin Contributor Current User Role Supporter
    Hi,

    I'm still having the problem when doing an SST from another node in the cluster. Does anyone have any idea how to solve this crash, please?

    150630 8:55:17 [Note] Plugin 'FEDERATED' is disabled.
    150630 8:55:17 InnoDB: The InnoDB memory heap is disabled
    150630 8:55:17 InnoDB: Mutexes and rw_locks use GCC atomic builtins
    150630 8:55:17 InnoDB: Compressed tables use zlib 1.2.7
    150630 8:55:17 InnoDB: Using Linux native AIO
    150630 8:55:17 InnoDB: Initializing buffer pool, size = 20.0G
    150630 8:55:18 InnoDB: Completed initialization of buffer pool
    150630 8:55:18 InnoDB: highest supported file format is Barracuda.
    150630 8:55:19 InnoDB: Waiting for the background threads to start
    06:55:20 UTC - mysqld got signal 11 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    We will try our best to scrape up some info that will hopefully help
    diagnose the problem, but since we have already crashed,
    something is definitely wrong and this may fail.
    Please help us make Percona XtraDB Cluster better by reporting any
    bugs at https://bugs.launchpad.net/percona-xtradb-cluster

    key_buffer_size=268435456
    read_buffer_size=131072
    max_used_connections=0
    max_threads=702
    thread_count=2
    connection_count=0
    It is possible that mysqld could use up to
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1798672 K bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.

    Thread pointer: 0x0
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 0 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x29)[0x7abeb9]
    /usr/sbin/mysqld(handle_fatal_signal+0x372)[0x6aa6d2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f5e078e78d0]
    /usr/sbin/mysqld(_ZN3THD15raise_conditionEjPKcN11MYSQL_ERROR18enum_warning_levelES1_+0x22)[0x589062]
    /usr/sbin/mysqld(_Z19push_warning_printfP3THDN11MYSQL_ERROR18enum_warning_levelEjPKcz+0xc7)[0x596d17]
    /usr/sbin/mysqld(ib_warn_row_too_big+0x6d)[0x7d2d5d]
    /usr/sbin/mysqld[0x869bbc]
    /usr/sbin/mysqld[0x8774d3]
    /usr/sbin/mysqld[0x8748cd]
    /usr/sbin/mysqld[0x873a88]
    /usr/sbin/mysqld[0x874e56]
    /usr/sbin/mysqld[0x873a88]
    /usr/sbin/mysqld[0x874e56]
    /usr/sbin/mysqld[0x873a88]
    /usr/sbin/mysqld[0x874e56]
    /usr/sbin/mysqld[0x8753da]
    /usr/sbin/mysqld[0x8e4220]
    /usr/sbin/mysqld[0x8dd158]
    /usr/sbin/mysqld[0x80e126]
    /usr/sbin/mysqld[0x8040fc]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f5e078e00a4]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f5e05b2d04d]
    You may download the Percona XtraDB Cluster operations manual by visiting
    http://www.percona.com/software/percona-xtradb-cluster/. You may find information
    in the manual which will help you identify the cause of the crash.
    150630 08:55:21 mysqld_safe mysqld from pid file /var/lib/mysql/altair.pid ended

    Regards,

    Laurent
