We have observed that Percona XtraDB Cluster keeps crashing periodically. Every crash creates a core.*
dump file under /var/lib/mysql,
which eventually fills the entire mounted volume. After that, the pxc container no longer starts.
My version is percona-xtradb-cluster:8.0.21-12.
Here are the logs I see:
bash-4.2$ tail -100 wsrep_recovery_verbose_history.log
2021-06-02T19:52:20.409005Z 1 [Note] [MY-012550] [InnoDB] Doing recovery: scanned up to log sequence number 45650818
2021-06-02T19:52:20.412539Z 1 [Note] [MY-013083] [InnoDB] Log background threads are being started...
2021-06-02T19:52:20.412933Z 1 [Note] [MY-012532] [InnoDB] Applying a batch of 0 redo log records ...
2021-06-02T19:52:20.413134Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed!
2021-06-02T19:52:20.414374Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'.
2021-06-02T19:52:20.415450Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'.
2021-06-02T19:52:20.418079Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces.
2021-06-02T19:52:20.418234Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 101643
2021-06-02T19:52:20.671923Z 1 [Note] [MY-012255] [InnoDB] Removed temporary tablespace data file: "ibtmp1"
2021-06-02T19:52:20.672093Z 1 [Note] [MY-012923] [InnoDB] Creating shared tablespace for temporary tables
2021-06-02T19:52:20.672267Z 1 [Note] [MY-012265] [InnoDB] Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-06-02T19:52:20.704599Z 1 [Note] [MY-012266] [InnoDB] File './ibtmp1' size is now 12 MB.
2021-06-02T19:52:20.704829Z 1 [Note] [MY-013627] [InnoDB] Scanning temp tablespace dir:'./#innodb_temp/'
2021-06-02T19:52:20.765755Z 1 [Note] [MY-013018] [InnoDB] Created 128 and tracked 128 new rollback segment(s) in the temporary tablespace. 128 are now active.
2021-06-02T19:52:20.766444Z 1 [Note] [MY-012976] [InnoDB] Percona XtraDB (http://www.percona.com) 8.0.21-12 started; log sequence number 45650818
2021-06-02T19:52:20.767118Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-06-02T19:52:20.779823Z 1 [Note] [MY-011089] [Server] Data dictionary restarting version '80021'.
2021-06-02T19:52:20.924000Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2021-06-02T19:52:20.940090Z 1 [Note] [MY-012356] [InnoDB] Validated 39/39 tablespaces
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
2021-06-02T19:52:20.941005Z 1 [Note] [MY-000000] [WSREP] Initiating SST cancellation
19:52:20 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Build ID: 5a2199b1784b967a713a3bde8d996dc517c41adb
Server Version: 8.0.21-12.1 Percona XtraDB Cluster (GPL), Release rel12, Revision 4d973e2, WSREP version 26.4.3, wsrep_26.4.3
Thread pointer: 0x7d05720
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa836952d20 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x20b4cf1]
/usr/sbin/mysqld(handle_fatal_signal+0x3c3) [0x128c8e3]
/lib64/libpthread.so.0(+0x12b20) [0x7fa844f1fb20]
/lib64/libc.so.6(gsignal+0x10f) [0x7fa842bed7ff]
/lib64/libc.so.6(abort+0x127) [0x7fa842bd7c35]
/lib64/libstdc++.so.6(+0x9009b) [0x7fa8435a309b]
/lib64/libstdc++.so.6(+0x9653c) [0x7fa8435a953c]
/lib64/libstdc++.so.6(+0x96597) [0x7fa8435a9597]
/lib64/libstdc++.so.6(+0x967f8) [0x7fa8435a97f8]
/lib64/libstdc++.so.6(+0x92235) [0x7fa8435a5235]
/lib64/libstdc++.so.6(+0xc2e9d) [0x7fa8435d5e9d]
/usr/sbin/mysqld(IB_thread create_detached_thread<void (&)(ib_wqueue_t*), ib_wqueue_t*&>(mysql_pfs_key_t, void (&)(ib_wqueue_t*), ib_wqueue_t*&)+0x353) [0x255b9a3]
/usr/sbin/mysqld(fts_optimize_init()+0x6f) [0x255c52f]
/usr/sbin/mysqld(srv_start_threads(bool)+0x13f) [0x23634ff]
/usr/sbin/mysqld() [0x21b237b]
/usr/sbin/mysqld() [0x1e5f0a2]
/usr/sbin/mysqld(dd::bootstrap::restart(THD*)+0x119) [0x1e68429]
/usr/sbin/mysqld() [0x2078f80]
/usr/sbin/mysqld(dd::upgrade_57::do_pre_checks_and_initialize_dd(THD*)+0xc40) [0x207e7c0]
/usr/sbin/mysqld() [0x1395bb8]
/usr/sbin/mysqld() [0x25d49f8]
/lib64/libpthread.so.0(+0x814a) [0x7fa844f1514a]
/lib64/libc.so.6(clone+0x43) [0x7fa842cb2f23]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 1
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file using lib coredumper
PATH: (null)
Error writting coredump: -1 Signal: 6
2021-06-02T19:57:26.136375Z 0 [Note] [MY-010949] [Server] Basedir set to /usr/.
2021-06-02T19:57:26.136398Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.21-12.1) starting as process 145
2021-06-02T19:57:26.142848Z 0 [Note] [MY-012366] [InnoDB] Using Linux native AIO
2021-06-02T19:57:26.143046Z 0 [Note] [MY-010747] [Server] Plugin 'FEDERATED' is disabled.
2021-06-02T19:57:26.144762Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-06-02T19:57:26.144826Z 1 [Note] [MY-013546] [InnoDB] Atomic write enabled
2021-06-02T19:57:26.144892Z 1 [Note] [MY-012932] [InnoDB] PUNCH HOLE support available
2021-06-02T19:57:26.144944Z 1 [Note] [MY-012943] [InnoDB] Mutexes and rw_locks use GCC atomic builtins
2021-06-02T19:57:26.144991Z 1 [Note] [MY-012944] [InnoDB] Uses event mutexes
2021-06-02T19:57:26.145039Z 1 [Note] [MY-012945] [InnoDB] GCC builtin __atomic_thread_fence() is used for memory barrier
2021-06-02T19:57:26.145148Z 1 [Note] [MY-012948] [InnoDB] Compressed tables use zlib 1.2.11
2021-06-02T19:57:26.147252Z 1 [Note] [MY-013251] [InnoDB] Number of pools: 1
2021-06-02T19:57:26.147420Z 1 [Note] [MY-012951] [InnoDB] Using CPU crc32 instructions
2021-06-02T19:57:26.147819Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2021-06-02T19:57:26.147947Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2021-06-02T19:57:26.150045Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 37 files.
2021-06-02T19:57:26.151383Z 1 [Note] [MY-012955] [InnoDB] Initializing buffer pool, total size = 384.000000M, instances = 1, chunk size =128.000000M
2021-06-02T19:57:26.168601Z 1 [Note] [MY-012957] [InnoDB] Completed initialization of buffer pool
2021-06-02T19:57:26.171069Z 0 [Note] [MY-011952] [InnoDB] If the mysqld execution user is authorized, page cleaner and LRU manager thread priority can be changed. See the man page of setpriority().
2021-06-02T19:57:26.172348Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_0.dblwr' for doublewrite
2021-06-02T19:57:26.174079Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_1.dblwr' for doublewrite
2021-06-02T19:57:26.299027Z 1 [Note] [MY-013566] [InnoDB] Double write buffer files: 2
2021-06-02T19:57:26.299145Z 1 [Note] [MY-013565] [InnoDB] Double write buffer pages per instance: 4
2021-06-02T19:57:26.299257Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_0.dblwr' for doublewrite
2021-06-02T19:57:26.299365Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_1.dblwr' for doublewrite
2021-06-02T19:57:26.300434Z 1 [Note] [MY-012560] [InnoDB] The log sequence number 44751731 in the system tablespace does not match the log sequence number 45650818 in the ib_logfiles!
2021-06-02T19:57:26.300545Z 1 [Note] [MY-012551] [InnoDB] Database was not shutdown normally!
2021-06-02T19:57:26.300648Z 1 [Note] [MY-012552] [InnoDB] Starting crash recovery.
2021-06-02T19:57:26.425114Z 1 [Note] [MY-013086] [InnoDB] Starting to parse redo log at lsn = 45650460, whereas checkpoint_lsn = 45650818
2021-06-02T19:57:26.425257Z 1 [Note] [MY-012550] [InnoDB] Doing recovery: scanned up to log sequence number 45650818
and from the second log file:
bash-4.2$ tail -100 wsrep_recovery_verbose.log
2021-06-02T19:57:26.136375Z 0 [Note] [MY-010949] [Server] Basedir set to /usr/.
2021-06-02T19:57:26.136398Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.21-12.1) starting as process 145
2021-06-02T19:57:26.142848Z 0 [Note] [MY-012366] [InnoDB] Using Linux native AIO
2021-06-02T19:57:26.143046Z 0 [Note] [MY-010747] [Server] Plugin 'FEDERATED' is disabled.
2021-06-02T19:57:26.144762Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-06-02T19:57:26.144826Z 1 [Note] [MY-013546] [InnoDB] Atomic write enabled
2021-06-02T19:57:26.144892Z 1 [Note] [MY-012932] [InnoDB] PUNCH HOLE support available
2021-06-02T19:57:26.144944Z 1 [Note] [MY-012943] [InnoDB] Mutexes and rw_locks use GCC atomic builtins
2021-06-02T19:57:26.144991Z 1 [Note] [MY-012944] [InnoDB] Uses event mutexes
2021-06-02T19:57:26.145039Z 1 [Note] [MY-012945] [InnoDB] GCC builtin __atomic_thread_fence() is used for memory barrier
2021-06-02T19:57:26.145148Z 1 [Note] [MY-012948] [InnoDB] Compressed tables use zlib 1.2.11
2021-06-02T19:57:26.147252Z 1 [Note] [MY-013251] [InnoDB] Number of pools: 1
2021-06-02T19:57:26.147420Z 1 [Note] [MY-012951] [InnoDB] Using CPU crc32 instructions
2021-06-02T19:57:26.147819Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2021-06-02T19:57:26.147947Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2021-06-02T19:57:26.150045Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 37 files.
2021-06-02T19:57:26.151383Z 1 [Note] [MY-012955] [InnoDB] Initializing buffer pool, total size = 384.000000M, instances = 1, chunk size =128.000000M
2021-06-02T19:57:26.168601Z 1 [Note] [MY-012957] [InnoDB] Completed initialization of buffer pool
2021-06-02T19:57:26.171069Z 0 [Note] [MY-011952] [InnoDB] If the mysqld execution user is authorized, page cleaner and LRU manager thread priority can be changed. See the man page of setpriority().
2021-06-02T19:57:26.172348Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_0.dblwr' for doublewrite
2021-06-02T19:57:26.174079Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_1.dblwr' for doublewrite
2021-06-02T19:57:26.299027Z 1 [Note] [MY-013566] [InnoDB] Double write buffer files: 2
2021-06-02T19:57:26.299145Z 1 [Note] [MY-013565] [InnoDB] Double write buffer pages per instance: 4
2021-06-02T19:57:26.299257Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_0.dblwr' for doublewrite
2021-06-02T19:57:26.299365Z 1 [Note] [MY-013532] [InnoDB] Using './#ib_16384_1.dblwr' for doublewrite
2021-06-02T19:57:26.300434Z 1 [Note] [MY-012560] [InnoDB] The log sequence number 44751731 in the system tablespace does not match the log sequence number 45650818 in the ib_logfiles!
2021-06-02T19:57:26.300545Z 1 [Note] [MY-012551] [InnoDB] Database was not shutdown normally!
2021-06-02T19:57:26.300648Z 1 [Note] [MY-012552] [InnoDB] Starting crash recovery.
2021-06-02T19:57:26.425114Z 1 [Note] [MY-013086] [InnoDB] Starting to parse redo log at lsn = 45650460, whereas checkpoint_lsn = 45650818
2021-06-02T19:57:26.425257Z 1 [Note] [MY-012550] [InnoDB] Doing recovery: scanned up to log sequence number 45650818
2021-06-02T19:57:26.428781Z 1 [Note] [MY-013083] [InnoDB] Log background threads are being started...
2021-06-02T19:57:26.429201Z 1 [Note] [MY-012532] [InnoDB] Applying a batch of 0 redo log records ...
2021-06-02T19:57:26.429366Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed!
2021-06-02T19:57:26.430443Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'.
2021-06-02T19:57:26.431564Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'.
2021-06-02T19:57:26.434375Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces.
2021-06-02T19:57:26.434521Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 101643
2021-06-02T19:57:26.731364Z 1 [Note] [MY-012255] [InnoDB] Removed temporary tablespace data file: "ibtmp1"
2021-06-02T19:57:26.731533Z 1 [Note] [MY-012923] [InnoDB] Creating shared tablespace for temporary tables
2021-06-02T19:57:26.731708Z 1 [Note] [MY-012265] [InnoDB] Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-06-02T19:57:26.765648Z 1 [Note] [MY-012266] [InnoDB] File './ibtmp1' size is now 12 MB.
2021-06-02T19:57:26.765882Z 1 [Note] [MY-013627] [InnoDB] Scanning temp tablespace dir:'./#innodb_temp/'
2021-06-02T19:57:26.828702Z 1 [Note] [MY-013018] [InnoDB] Created 128 and tracked 128 new rollback segment(s) in the temporary tablespace. 128 are now active.
2021-06-02T19:57:26.829355Z 1 [Note] [MY-012976] [InnoDB] Percona XtraDB (http://www.percona.com) 8.0.21-12 started; log sequence number 45650818
2021-06-02T19:57:26.830051Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-06-02T19:57:26.842937Z 1 [Note] [MY-011089] [Server] Data dictionary restarting version '80021'.
2021-06-02T19:57:26.999655Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2021-06-02T19:57:27.016396Z 1 [Note] [MY-012356] [InnoDB] Validated 39/39 tablespaces
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
2021-06-02T19:57:27.017410Z 1 [Note] [MY-000000] [WSREP] Initiating SST cancellation
19:57:27 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Build ID: 5a2199b1784b967a713a3bde8d996dc517c41adb
Server Version: 8.0.21-12.1 Percona XtraDB Cluster (GPL), Release rel12, Revision 4d973e2, WSREP version 26.4.3, wsrep_26.4.3
Thread pointer: 0x6d15720
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f5ee4dc0d20 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x20b4cf1]
/usr/sbin/mysqld(handle_fatal_signal+0x3c3) [0x128c8e3]
/lib64/libpthread.so.0(+0x12b20) [0x7f5ef338db20]
/lib64/libc.so.6(gsignal+0x10f) [0x7f5ef105b7ff]
/lib64/libc.so.6(abort+0x127) [0x7f5ef1045c35]
/lib64/libstdc++.so.6(+0x9009b) [0x7f5ef1a1109b]
/lib64/libstdc++.so.6(+0x9653c) [0x7f5ef1a1753c]
/lib64/libstdc++.so.6(+0x96597) [0x7f5ef1a17597]
/lib64/libstdc++.so.6(+0x967f8) [0x7f5ef1a177f8]
/lib64/libstdc++.so.6(+0x92235) [0x7f5ef1a13235]
/lib64/libstdc++.so.6(+0xc2e9d) [0x7f5ef1a43e9d]
/usr/sbin/mysqld(IB_thread create_detached_thread<void (&)()>(mysql_pfs_key_t, void (&)())+0x33b) [0x224e8bb]
/usr/sbin/mysqld(srv_start_threads(bool)+0xc0) [0x2363480]
/usr/sbin/mysqld() [0x21b237b]
/usr/sbin/mysqld() [0x1e5f0a2]
/usr/sbin/mysqld(dd::bootstrap::restart(THD*)+0x119) [0x1e68429]
/usr/sbin/mysqld() [0x2078f80]
/usr/sbin/mysqld(dd::upgrade_57::do_pre_checks_and_initialize_dd(THD*)+0xc40) [0x207e7c0]
/usr/sbin/mysqld() [0x1395bb8]
/usr/sbin/mysqld() [0x25d49f8]
/lib64/libpthread.so.0(+0x814a) [0x7f5ef338314a]
/lib64/libc.so.6(clone+0x43) [0x7f5ef1120f23]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 1
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file using lib coredumper
PATH: (null)
Error writting coredump: -1 Signal: 6
Any idea what could be going wrong?