I have a three-node Galera cluster. Yesterday I upgraded one node from MariaDB 10.3 with Galera 3 (Debian Buster) to MariaDB 10.5 with Galera 4 (Debian Bullseye). The upgrade went fine and everything worked.
Today I had to re-create the cluster and redo SST for the other nodes, using a 10.3 node as the donor. The second node, also running 10.3, joined and completed SST just fine. The node I upgraded to 10.5 cannot join: once it finishes its SST, it complains:
InnoDB: Upgrade after a crash is not supported. The redo log was created with Backup 10.3.27-MariaDB.
and then aborts.
I’ve tried removing the data directory and all logs entirely and redoing the state transfer from scratch, but the same problem happens.
Here is a complete log:
Feb 23 12:13:30 pochard mariadbd[528012]: 2021-02-23 12:13:30 1 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> c962afef-9de9-11ea-a9b2-2fa479531940:65036604
Feb 23 12:13:32 pochard mariadbd[528012]: 2021-02-23 12:13:32 0 [Note] WSREP: (980beb0c-b112, 'tcp://0.0.0.0:4567') turning message relay requesting off
Feb 23 12:14:11 pochard mariadbd[528012]: 2021-02-23 12:14:11 0 [Note] WSREP: 0.0 (scaup): State transfer to 2.0 (pochard) complete.
Feb 23 12:14:11 pochard mariadbd[528012]: 2021-02-23 12:14:11 0 [Note] WSREP: Member 0.0 (scaup) synced with group.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 3 [Note] WSREP: SST received
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 3 [Note] WSREP: Server status change joiner -> initializing
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Using Linux native AIO
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Uses event mutexes
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Number of pools: 1
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Initializing buffer pool, total size = 5368709120, chunk size = 134217728
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Completed initialization of buffer pool
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with Backup 10.3.27-MariaDB.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] InnoDB: Starting shutdown...
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] Plugin 'InnoDB' init function returned error.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Note] Plugin 'FEEDBACK' is disabled.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [Warning] 'thread-concurrency' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] Unknown/unsupported storage engine: InnoDB
Feb 23 12:14:12 pochard mariadbd[528012]: 2021-02-23 12:14:12 0 [ERROR] Aborting
Feb 23 12:14:12 pochard mariadbd[528012]: terminate called after throwing an instance of 'wsrep::runtime_error'
Feb 23 12:14:12 pochard mariadbd[528012]: what(): State wait was interrupted
Feb 23 12:14:12 pochard mariadbd[528012]: 210223 12:14:12 [ERROR] mysqld got signal 6 ;
Feb 23 12:14:12 pochard mariadbd[528012]: This could be because you hit a bug. It is also possible that this binary
Feb 23 12:14:12 pochard mariadbd[528012]: or one of the libraries it was linked against is corrupt, improperly built,
Feb 23 12:14:12 pochard mariadbd[528012]: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 23 12:14:12 pochard mariadbd[528012]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Feb 23 12:14:12 pochard mariadbd[528012]: We will try our best to scrape up some info that will hopefully help
Feb 23 12:14:12 pochard mariadbd[528012]: diagnose the problem, but since we have already crashed,
Feb 23 12:14:12 pochard mariadbd[528012]: something is definitely wrong and this may fail.
Feb 23 12:14:12 pochard mariadbd[528012]: Server version: 10.5.8-MariaDB-3-log
Feb 23 12:14:12 pochard mariadbd[528012]: key_buffer_size=536870912
Feb 23 12:14:12 pochard mariadbd[528012]: read_buffer_size=786432
Feb 23 12:14:12 pochard mariadbd[528012]: max_used_connections=0
Feb 23 12:14:12 pochard mariadbd[528012]: max_threads=2002
Feb 23 12:14:12 pochard mariadbd[528012]: thread_count=4
Feb 23 12:14:12 pochard mariadbd[528012]: It is possible that mysqld could use up to
Feb 23 12:14:12 pochard mariadbd[528012]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3650182 K bytes of memory
Feb 23 12:14:12 pochard mariadbd[528012]: Hope that's ok; if not, decrease some variables in the equation.
Feb 23 12:14:12 pochard mariadbd[528012]: Thread pointer: 0x7feb08002128
Feb 23 12:14:12 pochard mariadbd[528012]: Attempting backtrace. You can use the following information to find out
Feb 23 12:14:12 pochard mariadbd[528012]: where mysqld died. If you see no messages after this, something went
Feb 23 12:14:12 pochard mariadbd[528012]: terribly wrong...
Feb 23 12:14:12 pochard mariadbd[528012]: stack_bottom = 0x7feb0fffeb48 thread_stack 0x30000
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(my_print_stacktrace)[0x55811ec0647e]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(handle_fatal_signal)[0x55811e7172d5]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(__restore_rt)[0x7feb34a4d140]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(gsignal)[0x7feb34596ce1]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(abort)[0x7feb34580537]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(__cxa_throw_bad_array_new_length)[0x7feb349007ec]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(std::rethrow_exception(std::__exception_ptr::exception_ptr))[0x7feb3490b966]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(std::terminate())[0x7feb3490b9d1]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(__cxa_throw)[0x7feb3490bc65]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x55811e421112]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(wsrep::server_state::sst_received(wsrep::client_service&, int))[0x55811ec7a63b]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(void std::vector<char, std::allocator<char> >::_M_realloc_insert<char const&>(__gnu_cxx::__normal_iterator<char*, std::vector<char, std::allocator<char> > >, char const&))[0x55811e9bb00a]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(void std::vector<char, std::allocator<char> >::_M_realloc_insert<char const&>(__gnu_cxx::__normal_iterator<char*, std::vector<char, std::allocator<char> > >, char const&))[0x55811e9bbc64]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55811e94eee2]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(start_thread)[0x7feb34a41ea7]
Feb 23 12:14:12 pochard mariadbd[528012]: ??:0(clone)[0x7feb34658def]
Feb 23 12:14:12 pochard mariadbd[528012]: Trying to get some variables.
Feb 23 12:14:12 pochard mariadbd[528012]: Some pointers may be invalid and cause the dump to abort.
Feb 23 12:14:12 pochard mariadbd[528012]: Query (0x0): (null)
Feb 23 12:14:12 pochard mariadbd[528012]: Connection ID (thread ID): 3
Feb 23 12:14:12 pochard mariadbd[528012]: Status: NOT_KILLED
Feb 23 12:14:12 pochard mariadbd[528012]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
Feb 23 12:14:12 pochard mariadbd[528012]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Feb 23 12:14:12 pochard mariadbd[528012]: information that should help you find out what is causing the crash.
Feb 23 12:14:12 pochard mariadbd[528012]: We think the query pointer is invalid, but we will try to print it anyway.
Feb 23 12:14:12 pochard mariadbd[528012]: Query:
Feb 23 12:14:12 pochard mariadbd[528012]: Writing a core file...
Feb 23 12:14:12 pochard mariadbd[528012]: Working directory at /var/lib/mysql
Feb 23 12:14:12 pochard mariadbd[528012]: Resource Limits:
Feb 23 12:14:12 pochard mariadbd[528012]: Limit Soft Limit Hard Limit Units
Feb 23 12:14:12 pochard mariadbd[528012]: Max cpu time unlimited unlimited seconds
Feb 23 12:14:12 pochard mariadbd[528012]: Max file size unlimited unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max data size unlimited unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max stack size 8388608 unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max core file size 0 unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max resident set unlimited unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max processes 47790 47790 processes
Feb 23 12:14:12 pochard mariadbd[528012]: Max open files 16384 16384 files
Feb 23 12:14:12 pochard mariadbd[528012]: Max locked memory 65536 65536 bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max address space unlimited unlimited bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max file locks unlimited unlimited locks
Feb 23 12:14:12 pochard mariadbd[528012]: Max pending signals 47790 47790 signals
Feb 23 12:14:12 pochard mariadbd[528012]: Max msgqueue size 819200 819200 bytes
Feb 23 12:14:12 pochard mariadbd[528012]: Max nice priority 0 0
Feb 23 12:14:12 pochard mariadbd[528012]: Max realtime priority 0 0
Feb 23 12:14:12 pochard mariadbd[528012]: Max realtime timeout unlimited unlimited us
Feb 23 12:14:12 pochard mariadbd[528012]: Core pattern: core
How can I get this node to rejoin?
I do plan on upgrading the other nodes, but now I’m worried that I did something wrong. If I can’t get this third node back into the cluster before upgrading the others, I may run into more issues.