PerconaFT recovery fail after disk failure

harryc · January 16, 2018, 6:09am

We had a disk failure on a RAID array which caused our server to go down. On rebuilding the array we cannot start up percona due to a problem with the recovery log file. We have to start innodb with innodb_force_recovery to reach the tokudb loading stage but then receive this error (after innodb startup):

2018-01-16T11:34:32.800498Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180116 11:34:32
Tue Jan 16 11:34:32 2018 PerconaFT recovery starting in env /var/local/percona/mysql/current/
Tue Jan 16 11:34:32 2018 PerconaFT recovery failed -30975
2018-01-16T11:34:32.896964Z 0 [ERROR] TokuDB unknown error -30975
/mnt/workspace/percona-server-5.7-binaries-release/label_exp/debian-wheezy-x64/percona-server-5.7.18-15/storage/tokudb/PerconaFT/portability/file.cc:425 file_fsync_internal: Assertion `get_error
_errno() == EINTR' failed (errno=22)
: Invalid argument
Backtrace: (Note: toku_do_assert=0x0x7fca994bb010)
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z19db_env_do_backtraceP8_IO_FILE+0x1b)[0x7fca994b9fcb]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x91153)[0x7fca994ba153]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x8fb1a)[0x7fca994b8b1a]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x916b4)[0x7fca994ba6b4]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z23toku_logger_maybe_fsyncP10tokulogger10__toku_lsnib+0x1e7)[0x7fca994fa007]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z23toku_log_end_checkpointP10tokuloggerP10__toku_lsniS1_mjj+0x141)[0x7fca9951fb71]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_ZN12checkpointer18log_end_checkpointEv+0x31)[0x7fca9951ff71]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_ZN12checkpointer14end_checkpointEPFvPvES0_+0x66)[0x7fca9951dbc6]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z15toku_checkpointP12checkpointerP10tokuloggerPFvPvES3_S5_S3_19checkpoint_caller_t+0x1d8)[0x7fca9951dde8]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x11afe5)[0x7fca99543fe5]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x66c69)[0x7fca9948fc69]
/opt/percona/mysql/current/bin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x51)[0x99f9f1]
/opt/percona/mysql/current/bin/mysqld[0xe1c6c6]
/opt/percona/mysql/current/bin/mysqld[0xe1e627]
/opt/percona/mysql/current/bin/mysqld(_Z11plugin_initPiPPci+0x949)[0xe224d9]
/opt/percona/mysql/current/bin/mysqld(_Z11mysqld_mainiPPc+0xab8)[0x9402d8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fcab487e830]
/opt/percona/mysql/current/bin/mysqld[0x937955]
Engine status function not available
Memory usage:
Arena 0:
system bytes = 0
in use bytes = 0
Total (incl. mmap):
system bytes = 0
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
11:34:32 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=215
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 93396 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/opt/percona/mysql/current/bin/mysqld(my_print_stacktrace+0x2e)[0xffbc1e]
/opt/percona/mysql/current/bin/mysqld(handle_fatal_signal+0x4a1)[0x9475d1]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fcab67f2390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fcab4893428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fcab489502a]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x91158)[0x7fca994ba158]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x8fb1a)[0x7fca994b8b1a]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x916b4)[0x7fca994ba6b4]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z23toku_logger_maybe_fsyncP10tokulogger10__toku_lsnib+0x1e7)[0x7fca994fa007]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z23toku_log_end_checkpointP10tokuloggerP10__toku_lsniS1_mjj+0x141)[0x7fca9951fb71]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_ZN12checkpointer18log_end_checkpointEv+0x31)[0x7fca9951ff71]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_ZN12checkpointer14end_checkpointEPFvPvES0_+0x66)[0x7fca9951dbc6]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(_Z15toku_checkpointP12checkpointerP10tokuloggerPFvPvES3_S5_S3_19checkpoint_caller_t+0x1d8)[0x7fca9951dde8]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x11afe5)[0x7fca99543fe5]
/opt/percona/mysql/current/lib/mysql/plugin/ha_tokudb.so(+0x66c69)[0x7fca9948fc69]
/opt/percona/mysql/current/bin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x51)[0x99f9f1]
/opt/percona/mysql/current/bin/mysqld[0xe1c6c6]
/opt/percona/mysql/current/bin/mysqld[0xe1e627]
/opt/percona/mysql/current/bin/mysqld(_Z11plugin_initPiPPci+0x949)[0xe224d9]
/opt/percona/mysql/current/bin/mysqld(_Z11mysqld_mainiPPc+0xab8)[0x9402d8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fcab487e830]
/opt/percona/mysql/current/bin/mysqld[0x937955]
You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.

There doesn’t seem to be much documentation about recovering from this point. Any ideas how to fix the recovery log?

Peter · January 16, 2018, 7:17am

The recommended recovery for TokuDB in case of hardware failure is to recover from backup (or from replica)

George_Lorch · January 17, 2018, 9:29am

Peter is correct, at this point, I would strongly suggest recovering from backup both your InnoDB and TokuDB data. The fact that you had to force recover InnoDB all the way up to 6 illustrates some serious damage exists. TokuDB does not have any means of forceful or lossy recovery implemented. What is curious is that the call stack and assertion indicate that an fsync failed with an “EINVAL : Invalid Argument”. The only argument to fsync is the file descriptor (fd). From the fsync documentation :

“EROFS, EINVAL fd is bound to a special file which does not support synchronization.”

From this, my best guess is that something is still wrong with the hardware or you may have found some specific case of recovery where possibly the file is opened in such a way that fsync is not permitted. I have seen this happen once before where the recovery log was severely damaged (0 bytes or torn/partially written header) and was never able to diagnose the path it took to get there.

One trick that might get you past such a scenario would be to remove all existing TokuDB/PerconaFT recovery log files and copy a fresh, empty, undamaged recovery log file from a clean install that was properly shutdown (renaming is not necessary). See [url]TokuDB files and file types — Percona Server 8.0 Documentation for the naming convention and how to find it. Please note that this is a very dangerous suggestion. It may not work and it could permanently and irreparably damage your existing data set. If it does work, all changes since the last checkpoint will be lost (default checkpointing period is 60 seconds).

–
George O. Lorch III
Principal Software Engineer, Percona

Topic		Replies	Views
toku_recover_fassociate: Assertion `r==0' failed (errno=2) Percona Server for MySQL 5.7	3	1356	January 9, 2017
Mariadb + tokudb startup error MySQL & MariaDB percona , closed-no-reply	0	673	April 14, 2022
Percona mysql server crashes abruptly and keeps restarting and doing recovery Percona Distribution for MySQL	2	1336	September 14, 2022
Percona 5.7 w/ TokuDB - "Got Error 1152 from storage engine" ErrorNr 1030 Percona Server for MySQL 5.7	4	2943	May 1, 2016
percona cluster crashed when loading data in batch Percona XtraDB Cluster 5.x	0	510	December 8, 2015

PerconaFT recovery fail after disk failure

Related topics