Data corruption after Percona Server 5.7.32-35 to 8.0.32-24 upgrade

Hi,

We recently upgraded two replicas from Percona Server 5.7.32 to 8.0.32. The replica with old_alter_table = off frequently shows data corruption: after a column with DEFAULT NULL is added to a table, the new column ends up filled with data from the column next to it, and OPTIMIZE TABLE then crashes mysqld with the message below. The other replica, with old_alter_table = on, does not have the issue. We can’t reliably reproduce the problem, but we are wondering whether anyone else has seen something similar.

2023-04-29T00:11:54Z UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=f5a4b8541445e3f1cd10798a1d9f0ca415d73180
Server Version: 8.0.32-24 Percona Server (GPL), Release 24, Revision e5c6e9d2

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x2153fed]
/usr/sbin/mysqld(print_fatal_signal(int)+0x39f) [0x11f020f]
/usr/sbin/mysqld(handle_fatal_signal+0xc5) [0x11f02e5]
/lib64/libpthread.so.0(+0x118e0) [0x7fb6e81578e0]
/usr/sbin/mysqld(page_cur_delete_rec(page_cur_t*, dict_index_t const*, unsigned long const*, mtr_t*)+0x201) [0x2339581]
/usr/sbin/mysqld(page_cur_parse_delete_rec(unsigned char*, unsigned char*, buf_block_t*, dict_index_t*, mtr_t*)+0xf6) [0x2339b26]
/usr/sbin/mysqld() [0x23099ce]
/usr/sbin/mysqld(recv_recover_page_func(bool, buf_block_t*)+0x64b) [0x230c46b]
/usr/sbin/mysqld(buf_page_io_complete(buf_page_t*, bool)+0x4a0) [0x2479140]
/usr/sbin/mysqld(fil_aio_wait(unsigned long)+0x18e) [0x2562c4e]
/usr/sbin/mysqld() [0x23c7218]
/usr/sbin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Detached_thread, void (*)(unsigned long), unsigned long> > >::_M_run()+0xae) [0x23c7a2e]
/usr/sbin/mysqld() [0x2b367d4]
/lib64/libpthread.so.0(+0x744b) [0x7fb6e814d44b]
/lib64/libc.so.6(clone+0x3f) [0x7fb6e647652f]
Please help us make Percona Server better by reporting any
bugs at https://bugs.percona.com/

Hi @Chehai

Can you clarify how you are doing the ALTER TABLE?

Can you test the two variants below and let us know whether the COPY algorithm also crashes the server?

1. ALTER TABLE ... ADD COLUMN ... ALGORITHM=INSTANT;
2. ALTER TABLE ... ADD COLUMN ... ALGORITHM=COPY;
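
For a concrete test, the two variants could look roughly like the sketch below. The table and column names (`t1`, `c_new`, `c_new2`) are placeholders for illustration, not taken from the report, and the statements are meant for a scratch schema. Requesting the algorithm explicitly should make the server return an error, rather than quietly pick another algorithm, if the requested one is not supported for the operation.

```sql
-- Hypothetical scratch table; names are placeholders.
CREATE TABLE t1 (
  id INT PRIMARY KEY,
  c1 VARCHAR(32)
) ENGINE=InnoDB;

INSERT INTO t1 VALUES (1, 'a'), (2, 'b');

-- Variant 1: metadata-only change, existing rows are not rewritten.
ALTER TABLE t1 ADD COLUMN c_new INT DEFAULT NULL, ALGORITHM=INSTANT;

-- Variant 2: full rebuild by copying rows into a new table.
ALTER TABLE t1 ADD COLUMN c_new2 INT DEFAULT NULL, ALGORITHM=COPY;

-- Both new columns are expected to read NULL for the pre-existing rows.
SELECT id, c1, c_new, c_new2 FROM t1;
```

If the new columns show anything other than NULL for the old rows after variant 1, that would match the symptom described above.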

Hi, @Marcelo_Altmann

COPY does not crash: the replica with old_alter_table = on, which makes ALTER TABLE use the COPY algorithm by default, has had no issues so far. Only the replica with old_alter_table = off has crashes, so I think INSTANT (the default ALTER TABLE algorithm) is the problem.
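
One way to see which tables on each replica carry instantly added columns is to compare the setting and the InnoDB metadata, as in the rough sketch below. It assumes the 8.0.29+ layout of INFORMATION_SCHEMA.INNODB_TABLES, where TOTAL_ROW_VERSIONS counts instant ADD/DROP COLUMN operations and INSTANT_COLS reflects instant columns added under the older format. It only reads metadata and is not a fix.

```sql
-- Confirm how ALTER TABLE behaves by default on this replica.
SELECT @@GLOBAL.old_alter_table AS old_alter_table;

-- List tables that have had columns added or dropped with ALGORITHM=INSTANT.
SELECT NAME, INSTANT_COLS, TOTAL_ROW_VERSIONS
FROM INFORMATION_SCHEMA.INNODB_TABLES
WHERE INSTANT_COLS > 0
   OR TOTAL_ROW_VERSIONS > 0
ORDER BY NAME;
```

If the corrupted tables all appear in this list on the old_alter_table = off replica and none appear on the other one, that would be consistent with INSTANT being involved.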
