InnoDB assertion failure crash when altering then optimizing table after upgrade from MySQL 5.7 to 8.0

resuni · August 7, 2023, 11:10pm

Hello,

I’m currently performing some test upgrades between MySQL 5.7 and 8.0 before upgrading a production database. In my testing, I’ve found that a specific series of actions after an upgrade will cause a crash.

Upgrade from 5.7 to 8.0 by removing 5.7 Percona packages, changing to the 8.0 yum repo, then installing 8.0 packages (This is being done on RHEL 8, so yum/dnf packages are being used).
Start MySQL. I watch mysqld.log while it’s starting. It successfully imports the 5.7 data with no errors.
Alter a table using the INSTANT algorithm, e.g.: ALTER TABLE tableNameRedacted ADD COLUMN columnNameRedacted TINYINT, ALGORITHM=INSTANT;
Run an OPTIMIZE on that same table: OPTIMIZE TABLE tableName

This procedure immediately crashes mysqld, writing the following to mysqld.log:

2023-08-07T20:33:41.038422Z 18 [ERROR] [MY-013183] [InnoDB] Assertion failure: ddl0builder.cc:992:len <= col->len || DATA_LARGE_MTYPE(col->mtype) || (col->mtype == DATA_POINT && len == DATA_MBR_LEN) || ((col->mtype == DATA_VARCHAR || col->mtype == DATA_BINARY || col->mtype == DATA_VARMYSQL) && (col->len == 0 || len <= col->len + prtype_get_compression_extra(col->prtype))) thread 140190688913152
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to [http://bugs.mysql.com](http://bugs.mysql.com/).
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2023-08-07T20:33:41Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=828db7db80124cd9a47f09b8a4a6dc29ead7ca38
Server Version: 8.0.33-25 Percona Server (GPL), Release 25, Revision 60c9e2c5

Thread pointer: 0x7f2e34000ff0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f80b0369c10 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x2185f51]
/usr/sbin/mysqld(print_fatal_signal(int)+0x2c2) [0x12128b2]
/usr/sbin/mysqld(my_server_abort()+0x75) [0x1212b75]
/usr/sbin/mysqld(my_abort()+0xe) [0x217fb8e]
/usr/sbin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x309) [0x23d8219]
/usr/sbin/mysqld(ddl::Builder::copy_columns(ddl::Copy_ctx&, unsigned long&, unsigned long&)+0x278) [0x25d1568]
/usr/sbin/mysqld(ddl::Builder::copy_row(ddl::Copy_ctx&, unsigned long&)+0xde) [0x25d1afe]
/usr/sbin/mysqld(ddl::Builder::add_to_key_buffer(ddl::Copy_ctx&, unsigned long&)+0x26) [0x25d1dc6]
/usr/sbin/mysqld(ddl::Builder::bulk_add_row(ddl::Cursor&, ddl::Row&, unsigned long, std::function<dberr_t ()>&&)+0xf8) [0x25d77f8]
/usr/sbin/mysqld(ddl::Builder::add_row(ddl::Cursor&, ddl::Row&, unsigned long, std::function<dberr_t ()>&&)+0xc5) [0x25d7f45]
/usr/sbin/mysqld() [0x25262b4]
/usr/sbin/mysqld() [0x2526609]
/usr/sbin/mysqld(Parallel_reader::Ctx::traverse_recs(PCursor*, mtr_t*)+0x466) [0x23382d6]
/usr/sbin/mysqld(Parallel_reader::Ctx::traverse()+0x148) [0x2338608]
/usr/sbin/mysqld(Parallel_reader::worker(Parallel_reader::Thread_ctx*)+0x248) [0x233e768]
/usr/sbin/mysqld(Parallel_reader::spawn(unsigned long)+0x24) [0x233f124]
/usr/sbin/mysqld(Parallel_reader::run(unsigned long)+0x36) [0x233f1b6]
/usr/sbin/mysqld(ddl::Parallel_cursor::scan(std::vector<ddl::Builder*, ut::allocator<ddl::Builder*, ut::detail::allocator_base_pfs<ddl::Builder*> > >&)+0xa2b) [0x2527eeb]
/usr/sbin/mysqld(ddl::Loader::scan_and_build_indexes()+0xd2) [0x2523822]
/usr/sbin/mysqld(ddl::Loader::build_all()+0xf0) [0x2523960]
/usr/sbin/mysqld(ddl::Context::build()+0x38) [0x25176d8]
/usr/sbin/mysqld(bool ha_innobase::inplace_alter_table_impl<dd::Table>(TABLE*, Alter_inplace_info*)+0x4ba) [0x220ba5a]
/usr/sbin/mysqld(ha_innobase::inplace_alter_table(TABLE*, Alter_inplace_info*, dd::Table const*, dd::Table*)+0x51) [0x220be11]
/usr/sbin/mysqld() [0x113a7cd]
/usr/sbin/mysqld(mysql_alter_table(THD*, char const*, char const*, HA_CREATE_INFO*, Table_ref*, Alter_info*)+0x4d46) [0x11503b6]
/usr/sbin/mysqld(mysql_recreate_table(THD*, Table_ref*, bool)+0x34e) [0x1150e9e]
/usr/sbin/mysqld() [0xfdbab7]
/usr/sbin/mysqld(Sql_cmd_optimize_table::execute(THD*)+0x115) [0xfdc535]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0xb09) [0x108a979]
/usr/sbin/mysqld(dispatch_sql_command(THD*, Parser_state*, bool)+0x5cb) [0x108e57b]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2487) [0x1091027]
/usr/sbin/mysqld(do_command(THD*)+0x1d6) [0x1091bb6]
/usr/sbin/mysqld() [0x1202668]
/usr/sbin/mysqld() [0x261b0a4]
/lib64/libpthread.so.0(+0x81ca) [0x7f85b23561ca]
/lib64/libc.so.6(clone+0x43) [0x7f85b06fae73]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f2e34be8350): OPTIMIZE TABLE tableNameRedacted
Connection ID (thread ID): 18
Status: NOT_KILLED

Please help us make Percona Server better by reporting any
bugs at https://bugs.percona.com/You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.

I’ve been able to reproduce this issue multiple times. Curiously enough, if I use Percona xtrabackup to transfer the data from a 5.7 instance to a fresh 8.0 instance, this problem does not happen. I even see messages in the log indicating that it’s importing 5.7 data when I first start the 8.0 instance, but I can run through the same ALTER INSTANT/OPTIMIZE procedure without a crash as long as I’ve run the data through xtrabackup first.

I’d like to better understand this behavior before upgrading our production database servers. Have I encountered a bug? Any ideas why xtrabackup makes this problem go away?

matthewb · August 8, 2023, 12:26am

Hello @resuni,
Can you repeat your steps above, but BEFORE #3 ALTER, run OPTIMIZE TABLE first, then ALTER it. Does that also result in a crash? That is “test A”. I have another one. Before step #1, while still on 5.7, OPTIMIZE TABLE, then do all steps. This is “test B”. Does that cause crash?

Mughees_Ahmed · August 8, 2023, 2:05am

@resuni can you please mention the exact version of MySQL 8.0 please e.g 8.0.32

lalit.choudhary · August 8, 2023, 7:03am

Hello @resuni

It looks like this is the bug [PS-8770] INSTANT DDL crashes 8.0.32 after upgrading from 5.7.41 - Percona JIRA
Fixed in PS-8.0.34

resuni · August 8, 2023, 2:59pm

@lalit.choudhary This is good news, thank you. I’m eagerly anticipating this update. Can you speculate as to why I wouldn’t experience this bug after importing data with xtrabackup?

@matthewb I will try your tests as soon as I find a way to run this with less data. Right now I’m testing on production data, which takes hours to restore every time I crash a database. I plan to attempt to reproduce this with a smaller sample size.

matthewb · August 8, 2023, 5:40pm

@resuni,
If you’re on ZFS, BTRFS, EBS, or use LVM, you can use snapshots to make this process quicker.

Here is the patch for the fix with comments from Oracle

github.com/mysql/mysql-server

Bug#35410528 : INSTANT DDL crashes 8.0.32 after upgrading from 5.7.41

committed 06:35AM - 31 May 23 UTC

+28 -28

Background: When tables are loaded in mem cache (dict_table_t) they go through … fill_dict_table() which makes sure all in mem metadata is updated for the tables correctly. But in case of upgrade from 5.7, tables are loaded in dict_table_t cache from dict_load_table(). And then it remains in cache to be used. Issue: During upgrade, when the table is loaded from dict_load_table(), it doesn't set the initial/current/total/_column_count metadata of dict_table_t. Which causes issues when table is used after INSTANT ADD/DROP DDL. Note this is done only once and next server restarts, the table is loaded from fill_dict_table() function. And this is why if there is a restart after upgrade, the issue is not seen. Fix: Made sure when the table is loaded from dict_load_table() during upgrade, the column count metadata is also set correctly. Change-Id: I7729a9e03683737bfe8b3025203f31551b710f70

Based on Oracle’s comments, my guess is that since Xtrabackup “cleans” the state of InnoDB before starting MySQL 8, the issue isn’t seen. But when you do a direct 5.7->8, the incorrect state from 5.7 remains in cache and mysql 8, on first start, did not clean that cache properly.

resuni · August 8, 2023, 6:20pm

If you’re on ZFS, BTRFS, EBS, or use LVM, you can use snapshots to make this process quicker.

Unfortunately, it’s just an XFS file system, but it certainly wouldn’t be a bad idea to consider switching the next time I have to perform a restore.

Based on Oracle’s comments, my guess is that since Xtrabackup “cleans” the state of InnoDB before starting MySQL 8, the issue isn’t seen. But when you do a direct 5.7->8, the incorrect state from 5.7 remains in cache and mysql 8, on first start, did not clean that cache properly.

I’m wondering if I can use xtrabackup to “clean” the data directory without actually transferring from another server. This will be something else I test once I am able.

Topic		Replies	Views
Crashing after reverting back to MySQL [Updated] Other MySQL® Questions	2	618	October 22, 2011
After mysql.5.6.28 upgrade to 8.0.23, percona-xtrabackup-8.0.23 could not be used with mysql 8.0.23 Percona XtraBackup percona	4	1054	June 28, 2021
Repeatable crash with percona Server 5.5.13-15 Other MySQL® Questions	2	657	September 6, 2011
Update from 7 to 8 Other MySQL® Questions	1	538	March 11, 2019
Upgrading 5.7 to 8.0.21 getting errors when starting up Percona Server for MySQL 8.0	3	3242	December 14, 2020

InnoDB assertion failure crash when altering then optimizing table after upgrade from MySQL 5.7 to 8.0

Related topics