Mysqld crash on 8.0.30

Hi,

We are seeing constant database crashes with the following error:
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.

Build ID: Not Available
Server Version: 8.0.30-22 Percona Server (GPL), Release 22, Revision 7e301439b65

Thread pointer: 0x7fef6c4dd000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong…
stack_bottom = 7fef44afcbb0 thread_stack 0x100000
/home/mysqld/sys/8.0.30_percona/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x206883d]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(print_fatal_signal(int)+0x30f) [0x114a4af]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(handle_fatal_signal+0xc5) [0x114a585]
/lib64/libpthread.so.0(+0xf680) [0x7ff2722ad680]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PTI_expr_with_alias::itemize(Parse_context*, Item**)+0x35) [0x12e4455]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_select_item_list::contextualize(Parse_context*)+0x7e) [0xea328e]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_query_specification::contextualize(Parse_context*)+0x91) [0xea2c21]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_query_expression::contextualize(Parse_context*)+0x93) [0xea1313]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_select_stmt::make_cmd(THD*)+0x57) [0xe9acc7]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(LEX::make_sql_cmd(Parse_tree_root*)+0x27) [0xf9e157]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(THD::sql_parser()+0x5a) [0xf43f8a]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(parse_sql(THD*, Parser_state*, Object_creation_ctx*)+0x127) [0xfc3f77]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(dispatch_sql_command(THD*, Parser_state*, bool)+0x2ed) [0xfc904d]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x21dc) [0xfcb9cc]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(do_command(THD*)+0x282) [0xfcc862]
/home/mysqld/sys/8.0.30_percona/bin/mysqld() [0x113a880]
/home/mysqld/sys/8.0.30_percona/bin/mysqld() [0x253e925]
/lib64/libpthread.so.0(+0x7dd5) [0x7ff2722a5dd5]
/lib64/libc.so.6(clone+0x6d) [0x7ff2705f0b3d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fef6c477030): SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.run_id AS task_instance_run_id, task_instance.map_index AS task_instance_map_index, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.pool_slots AS task_instance_pool_slots, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.queued_by_job_id AS task_instance_queued_by_job_id,
Connection ID (thread ID): 5703208
Status: NOT_KILLED

Please help us find a workaround.

Thanks in advance!

Try to rebuild the tables (either using ALTER or mydumper/myloader) for all the associated tables which are part of the reported SELECT query.
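
For illustration, here is a minimal Python sketch that prints one rebuild statement per table. The table names are taken from the query posted in this thread. It assumes the tables are InnoDB, where `ALTER TABLE ... ENGINE=InnoDB` rebuilds the table. The helper name `rebuild_statements` is made up for this example.

```python
# Tables referenced by the crashing SELECT (names taken from the query in this thread).
TABLES = ["task_instance", "dag_run", "dag"]


def rebuild_statements(tables):
    """Return one table-rebuild statement per table (assumption: InnoDB tables)."""
    return [f"ALTER TABLE `{name}` ENGINE=InnoDB;" for name in tables]


for statement in rebuild_statements(TABLES):
    print(statement)
```

Review the printed statements before running them, ideally in a maintenance window, since a rebuild of a busy table can take a while.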

@Chanakya In the error message you show, I see this PTI_expr_with_alias::itemize as the top line in the backtrace. Can you please execute your SQL but remove all ‘AS’ parts? Execute the query without any aliases. Does that work? If so, put back only 1 ‘AS’ and repeat. Keep doing this until you crash again. Then remove 1 ‘AS’. Does that work? This might be a memory allocation error with too many ‘AS’ parts.
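
The alias-by-alias test described above is tedious to do by hand on a query this long. Here is a rough Python sketch that generates the variants. The function name `strip_aliases` and the shortened select list are assumptions made for this example. It relies on a simple regex, not a SQL parser, so pass it only the select list: the table alias in the FROM clause must stay in place.

```python
import re

# Matches " AS some_alias" after a column reference (simple regex, not a SQL parser).
ALIAS_RE = re.compile(r"\s+AS\s+\w+", re.IGNORECASE)


def strip_aliases(select_list: str, keep: int = 0) -> str:
    """Remove every column alias from select_list except the first `keep` ones."""
    seen = 0

    def repl(match):
        nonlocal seen
        seen += 1
        # Keep the alias text for the first `keep` matches, drop it afterwards.
        return match.group(0) if seen <= keep else ""

    return ALIAS_RE.sub(repl, select_list)


# Shortened select list, for illustration only.
select_list = (
    "task_instance.try_number AS task_instance_try_number, "
    "task_instance.task_id AS task_instance_task_id, "
    "task_instance.dag_id AS task_instance_dag_id"
)

# keep=0 removes all aliases; each later variant restores one more alias.
for keep in range(4):
    print(keep, strip_aliases(select_list, keep=keep))
```

Each printed variant can be pasted in front of the original FROM clause and run manually to see at which point, if any, the crash reappears.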

@matthewb

Here there is no alias, but the rest of the error is the same.

/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_select_item_list::contextualize(Parse_context*)+0x7b) [0xea328b]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_query_specification::contextualize(Parse_context*)+0x91) [0xea2c21]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_query_expression::contextualize(Parse_context*)+0x93) [0xea1313]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(PT_select_stmt::make_cmd(THD*)+0x57) [0xe9acc7]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(LEX::make_sql_cmd(Parse_tree_root*)+0x27) [0xf9e157]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(THD::sql_parser()+0x5a) [0xf43f8a]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(parse_sql(THD*, Parser_state*, Object_creation_ctx*)+0x127) [0xfc3f77]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(dispatch_sql_command(THD*, Parser_state*, bool)+0x2ed) [0xfc904d]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x21dc) [0xfcb9cc]
/home/mysqld/sys/8.0.30_percona/bin/mysqld(do_command(THD*)+0x282) [0xfcc862]
/home/mysqld/sys/8.0.30_percona/bin/mysqld() [0x113a880]
/home/mysqld/sys/8.0.30_percona/bin/mysqld() [0x253e925]
/lib64/libpthread.so.0(+0x7ea5) [0x7fb7b372aea5]
/lib64/libc.so.6(clone+0x6d) [0x7fb7b1a73b2d]

Can you provide the entire query please?

SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.run_id AS task_instance_run_id, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.pool_slots AS task_instance_pool_slots, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.queued_by_job_id AS task_instance_queued_by_job_id, task_instance.pid AS task_instance_pid, task_instance.executor_config AS task_instance_executor_config, task_instance.external_executor_id AS task_instance_external_executor_id, task_instance.trigger_id AS task_instance_trigger_id, task_instance.trigger_timeout AS task_instance_trigger_timeout, task_instance.next_method AS task_instance_next_method, task_instance.next_kwargs AS task_instance_next_kwargs, dag_run_1.state AS dag_run_1_state, dag_run_1.id AS dag_run_1_id, dag_run_1.dag_id AS dag_run_1_dag_id, dag_run_1.queued_at AS dag_run_1_queued_at, dag_run_1.execution_date AS dag_run_1_execution_date, dag_run_1.start_date AS dag_run_1_start_date, dag_run_1.end_date AS dag_run_1_end_date, dag_run_1.run_id AS dag_run_1_run_id, dag_run_1.creating_job_id AS dag_run_1_creating_job_id, dag_run_1.external_trigger AS dag_run_1_external_trigger, dag_run_1.run_type AS dag_run_1_run_type, dag_run_1.conf AS dag_run_1_conf, dag_run_1.data_interval_start AS dag_run_1_data_interval_start, dag_run_1.data_interval_end AS 
dag_run_1_data_interval_end, dag_run_1.last_scheduling_decision AS dag_run_1_last_scheduling_decision, dag_run_1.dag_hash AS dag_run_1_dag_hash

FROM task_instance INNER JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id INNER JOIN dag ON task_instance.dag_id = dag.dag_id INNER JOIN dag_run AS dag_run_1 ON dag_run_1.dag_id = task_instance.dag_id AND dag_run_1.run_id = task_instance.run_id

WHERE dag_run.run_type != 'backfill' AND dag_run.state != 'queued' AND dag.is_paused = 0 AND task_instance.state = 'scheduled' ORDER BY -task_instance.priority_weight, dag_run.execution_date

LIMIT 36 FOR UPDATE OF task_instance SKIP LOCKED;

Tried it… :crossed_fingers: Thank you

You said you removed all the aliases, yet there is an alias on just about every column. Please remove all aliases and try the query again.

Unfortunately, we have no control over the query the app sends, so we cannot modify it :frowning: The server has abundant memory. Is there any parameter I can tweak to avoid the constant crashes?
Also, this started happening in 8.0.30, after upgrading from 8.0.23.

Run the query manually, first with aliases then without aliases. When you run it manually, does it crash?

We tried running it manually as is, but it does not crash the DB. It seems a certain data set is causing the crash; the data in these tables is very dynamic, as this is the Airflow backend database.

Any other suggestions for debugging this issue further?

Without a repeatable test case, it is hard to debug. I suggest you keep trying the query manually without aliases to see if that is the cause.

Is there any known bug around this issue?

You can search here: https://bugs.mysql.com

We could not find one. Do you know of any bug for this issue?

If you found none on mysql.com then there’s none reported. You will need a repeatable test case to submit a bug report. You also need to identify the specific version where things stopped working. Roll back to 8.0.23 and confirm there are no issues. Then upgrade to 8.0.24 and check. If no crashes, upgrade to 8.0.25, etc. Doing this will help narrow down when the change to MySQL took place that is affecting your query.
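
The reply above steps through the releases one at a time. As a variation, the Python sketch below uses a binary search instead, which needs fewer installs when the range is long. Everything in it is an assumption for illustration: the helper name `first_bad`, the version list (check which releases actually exist for your distribution), and the `crashes` callback, which stands in for "install this version, run the workload, and report whether mysqld crashed". It also assumes that once the crash appears in a release it persists in all later ones.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], crashes: Callable[[str], bool]) -> str:
    """Binary search for the first version where crashes(version) is True.

    Assumes versions[0] is known good and versions[-1] is known bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Candidate releases between the known-good and known-bad versions from this thread.
versions = [f"8.0.{minor}" for minor in range(23, 31)]


def pretend_crashes(version: str) -> bool:
    # Made-up stand-in for a manual test; 8.0.28 is NOT a finding, only a demo value.
    return int(version.split(".")[2]) >= 28


print(first_bad(versions, pretend_crashes))
```

In practice the callback would be a manual step: test each candidate version on a separate instance restored from a backup, not on the production server.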

Is it safe to roll back from 8.0.30 to 8.0.23, and if so, do you have the steps?

Hi @Sanjay_Sheoran ,

Thanks for reaching out. In addition to what my colleagues have said, please allow me to share my views on this.

My understanding is that this particular query (the one you pasted) crashes your MySQL server when it is executed from the application server, but not when it is executed manually. Also, this started after the MySQL upgrade to 8.0.30. I also understand that you have no control over the application server and so cannot modify this query.

Do you see any corruption-related errors in your error log? Did the crashes start immediately after the upgrade, or some days later? Are there any other queries against this table being executed from the app server?

Hi @Ankit_Kapoor1
Yes, your understanding is correct.

We don’t see any corruption-related errors in the error log. Yes, we have been seeing these crashes since the upgrade to 8.0.30.

Hi @Ankit_Kapoor1,
Do you have any suggestions for mitigating this issue?
