Hi,
we have a problem with the prepare process on a freshly taken backup. The prepare starts and runs for a while and then suddenly it gets stuck. The process is still running, but it is not doing anything.
I waited for more than 12 hours, to be sure that there is no long lasting process in the log, but still nothing happened.
We are using xtrabackup 8.0.5 with a fresh installation of MySQL 8.0.14.
The data for the MySQL-Server has been restored from a backup of a MySQL 5.7.22 and then upgraded with mysql_upgrade.
After preparing the data we installed xtrabackup 8.0.5 and took a backup, which we tried to prepare then.
Here is the last output from prepare process:
xtrabackup: recognized server arguments: --innodb_checksum_algorithm=innodb --innodb_log_checksums=1 --innodb_data_file_path=ibdata1:12M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=2684354560 --innodb_page_size=16384 --innodb_undo_directory=./ --innodb_undo_tablespaces=2 --server-id=303 --innodb_log_checksums=ON --innodb_redo_log_encrypt=0 --innodb_undo_log_encrypt=0
xtrabackup: recognized client arguments: --prepare=1 --use-memory=2GB --target-dir=/backup-storage/backup_dir/
/usr/bin/xtrabackup version 8.0.5 based on MySQL server 8.0.14 Linux (x86_64) (revision id: 40ec8a3)
xtrabackup: cd to /backup-storage/backup_dir//
xtrabackup: This target seems to be not prepared yet.
Number of pools: 1
xtrabackup: xtrabackup_logfile detected: size=10854268928, start_lsn=(65697107320549)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = .
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 10854268928
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = .
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 10854268928
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 2147483648 bytes for buffer pool (set by --use-memory parameter)
PUNCH HOLE support available
Mutexes and rw_locks use GCC atomic builtins
Uses event mutexes
GCC builtin __atomic_thread_fence() is used for memory barrier
Compressed tables use zlib 1.2.7
Number of pools: 1
Using CPU crc32 instructions
Directories to scan '.;.;.'
Scanning './'
Thread# 0 - Checked 487/2473 files
Thread# 0 - Checked 884/2473 files
Thread# 0 - Checked 1308/2473 files
Thread# 0 - Checked 1823/2473 files
Thread# 0 - Checked 2408/2473 files
Checked 2473 files
Completed space ID check of 2475 files.
Initializing buffer pool, total size = 2.000000G, instances = 1, chunk size =128.000000M
Completed initialization of buffer pool
page_cleaner coordinator priority: -20
page_cleaner worker priority: -20
page_cleaner worker priority: -20
page_cleaner worker priority: -20
The log sequence number 65634873689279 in the system tablespace does not match the log sequence number 65697107320549 in the ib_logfiles!
Database was not shutdown normally!
Starting crash recovery.
Starting to parse redo log at lsn = 65697107320655, whereas checkpoint_lsn = 65697107320549
Doing recovery: scanned up to log sequence number 65697112563200
...
Doing recovery: scanned up to log sequence number 65697815109120
Applying a batch of 83028 redo log records ...
4.6%
10%
...
100%
Apply batch completed!
Doing recovery: scanned up to log sequence number 65697820352000
...
Doing recovery: scanned up to log sequence number 65698056281600
my_realpath(./analysis/#sql-ib1616234-2716536971.ibd) failed!
my_realpath(./analysis/dcd_inv.ibd) failed!
Creating the tablespace : ./analysis/dcd_inv.ibd, space_id : 292372
my_realpath(./analysis/#sql-ib1616234-2716536971.ibd) failed!
Doing recovery: scanned up to log sequence number 65698061524480
...
Doing recovery: scanned up to log sequence number 65698155896320
my_realpath(./analysis/#sql-ib1616235-2716536972.ibd) failed!
my_realpath(./analysis/dcd_loc.ibd) failed!
Creating the tablespace : ./analysis/dcd_loc.ibd, space_id : 292373
my_realpath(./analysis/#sql-ib1616235-2716536972.ibd) failed!
Doing recovery: scanned up to log sequence number 65698161139200
...
Doing recovery: scanned up to log sequence number 65698622512640
Applying a batch of 190541 redo log records ...
1.9%
6%
...
30%
Extending tablespace : 150110 space name: onion/customer_visit_line_upload_log to new size: 8228608 pages during recovery.
...
Extending tablespace : 150110 space name: onion/customer_visit_line_upload_log to new size: 8229120 pages during recovery.
40%
50%
Extending tablespace : 220694 space name: mwf/belpos#P#p201904 to new size: 31744 pages during recovery.
59%
...
70%
Extending tablespace : 119368 space name: mwf_sessions/sessions to new size: 670976 pages during recovery.
...
Extending tablespace : 119368 space name: mwf_sessions/sessions to new size: 671744 pages during recovery.
76%
Extending tablespace : 119368 space name: mwf_sessions/sessions to new size: 672000 pages during recovery.
...
Extending tablespace : 119368 space name: mwf_sessions/sessions to new size: 673792 pages during recovery.
80%
80%
Here is the actual stacktrace:
Thread 16 (Thread 0x7f1ca900f700 (LWP 8070)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000012f8760 in os_event::wait_low(long) ()
#2 0x00000000012f326b in os_aio_handler(unsigned long, fil_node_t**, void**, IORequest*) ()
#3 0x00000000011b0472 in fil_aio_wait(unsigned long) ()
#4 0x00000000013abcb8 in io_handler_thread ()
#5 0x0000000000b99913 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)(unsigned long), unsigned long> > >::_M_run() ()
#6 0x00000000021cc0af in execute_native_thread_routine ()
#7 0x00007f1cacbe3dd5 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x7f1ca880e700 (LWP 8071)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
...
Thread 6 (Thread 0x7f1c1cef7700 (LWP 8080)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000012f8760 in os_event::wait_low(long) ()
#2 0x0000000001111561 in buf_flush_page_coordinator_thread(unsigned long) ()
#3 0x0000000000b99913 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)(unsigned long), unsigned long> > >::_M_run() ()
#4 0x00000000021cc0af in execute_native_thread_routine ()
#5 0x00007f1cacbe3dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f1c17fff700 (LWP 8081)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000012f8760 in os_event::wait_low(long) ()
#2 0x0000000001110967 in buf_flush_page_cleaner_thread() ()
#3 0x0000000000b9980e in std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run() ()
#4 0x00000000021cc0af in execute_native_thread_routine ()
#5 0x00007f1cacbe3dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f1c177fe700 (LWP 8082)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
...
#6 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f1c16ffd700 (LWP 8083)):
#0 0x00007f1cacbe7965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
...
#6 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f1be7fff700 (LWP 8084)):
#0 0x00007f1cacbeae3d in nanosleep () from /lib64/libpthread.so.0
#1 0x00000000012cfc25 in recv_writer_thread() ()
#2 0x0000000000b9980e in std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run() ()
#3 0x00000000021cc0af in execute_native_thread_routine ()
#4 0x00007f1cacbe3dd5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f1caa92df6d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f1cacff5940 (LWP 8051)):
#0 0x00007f1caa912e07 in sched_yield () from /lib64/libc.so.6
#1 0x00000000011c069d in Fil_shard::mutex_acquire_and_get_space(unsigned int, fil_space_t*&) ()
#2 0x00000000011c0934 in Fil_shard::space_extend(fil_space_t*, unsigned int) ()
#3 0x00000000011c1345 in Fil_shard::get_file_for_io(IORequest const&, fil_space_t*, unsigned int*, fil_node_t*&) ()
#4 0x00000000011c1e79 in Fil_shard::do_io(IORequest const&, bool, page_id_t const&, page_size_t const&, unsigned long, unsigned long, void*, void*) ()
#5 0x00000000011c277b in fil_io(IORequest const&, bool, page_id_t const&, page_size_t const&, unsigned long, unsigned long, void*, void*) ()
#6 0x0000000001120e53 in buf_read_page_low(dberr_t*, bool, unsigned long, unsigned long, page_id_t const&, page_size_t const&, bool) ()
#7 0x0000000001123215 in buf_read_recv_pages(bool, unsigned int, unsigned int const*, unsigned long) ()
#8 0x00000000012d42ce in recv_apply_hashed_log_recs(log_t&, bool) ()
#9 0x00000000012d74cd in recv_recovery_from_checkpoint_start(log_t&, unsigned long, unsigned long) ()
#10 0x00000000013b7e8d in srv_start(bool, std::string const&, unsigned long) ()
#11 0x0000000000ba10fc in innodb_init(bool, bool) ()
#12 0x0000000000ba8990 in xtrabackup_prepare_func(int, char**) ()
#13 0x0000000000b6abb0 in main ()
I added the full stacktrace as attachment: xtrackup-8-prepare-stacktrace.txt
Could you please give us a hint, what is going wrong?
xtrackup-8-prepare-stacktrace.txt (11.3 KB)