Hi all,
I have a 3-node cluster:
mysql> SHOW VARIABLES LIKE 'version%';
+-------------------------+---------------------------------------------------------------------------------------+
| Variable_name           | Value                                                                                 |
+-------------------------+---------------------------------------------------------------------------------------+
| version                 | 8.0.37-29.1                                                                           |
| version_comment         | Percona XtraDB Cluster (GPL), Release rel29, Revision d29a325, WSREP version 26.1.4.3 |
| version_compile_machine | x86_64                                                                                |
| version_compile_os      | Linux                                                                                 |
| version_compile_zlib    | 1.2.13                                                                                |
| version_suffix          | .1                                                                                    |
+-------------------------+---------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)
In front of these 3 nodes I have a ProxySQL instance:
+--------------+------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname   | port | gtid_port | status  | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 10           | prod-db-01 | 3306 | 0         | SHUNNED | 10000  | 0           | 1000            | 0                   | 0       | 0              |         |
| 10           | prod-db-02 | 3306 | 0         | ONLINE  | 10000  | 0           | 1000            | 0                   | 0       | 0              |         |
| 10           | prod-db-03 | 3306 | 0         | SHUNNED | 9999   | 0           | 1000            | 0                   | 0       | 0              |         |
| 20           | prod-db-01 | 3306 | 0         | ONLINE  | 10000  | 0           | 1000            | 0                   | 0       | 0              |         |
| 20           | prod-db-03 | 3306 | 0         | ONLINE  | 9999   | 0           | 1000            | 0                   | 0       | 0              |         |
| 30           | prod-db-01 | 3306 | 0         | ONLINE  | 10000  | 0           | 1000            | 0                   | 0       | 0              |         |
| 30           | prod-db-03 | 3306 | 0         | ONLINE  | 9999   | 0           | 1000            | 0                   | 0       | 0              |         |
+--------------+------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
I configured a read/write split with these query rules:
*************************** 1. row ***************************
rule_id: 100
active: 1
username: NULL
schemaname: NULL
flagIN: 0
client_addr: NULL
proxy_addr: NULL
proxy_port: NULL
digest: NULL
match_digest: NULL
match_pattern: ^SELECT .* FOR UPDATE
negate_match_pattern: 0
re_modifiers: CASELESS
flagOUT: NULL
replace_pattern: NULL
destination_hostgroup: 10
cache_ttl: NULL
cache_empty_result: NULL
cache_timeout: NULL
reconnect: NULL
timeout: NULL
retries: NULL
delay: NULL
next_query_flagIN: NULL
mirror_flagOUT: NULL
mirror_hostgroup: NULL
error_msg: NULL
OK_msg: NULL
sticky_conn: NULL
multiplex: NULL
gtid_from_hostgroup: NULL
log: NULL
apply: 1
attributes:
comment: NULL
*************************** 2. row ***************************
rule_id: 200
active: 1
username: NULL
schemaname: NULL
flagIN: 0
client_addr: NULL
proxy_addr: NULL
proxy_port: NULL
digest: NULL
match_digest: NULL
match_pattern: ^SELECT .*
negate_match_pattern: 0
re_modifiers: CASELESS
flagOUT: NULL
replace_pattern: NULL
destination_hostgroup: 30
cache_ttl: NULL
cache_empty_result: NULL
cache_timeout: NULL
reconnect: NULL
timeout: NULL
retries: NULL
delay: NULL
next_query_flagIN: NULL
mirror_flagOUT: NULL
mirror_hostgroup: NULL
error_msg: NULL
OK_msg: NULL
sticky_conn: NULL
multiplex: NULL
gtid_from_hostgroup: NULL
log: NULL
apply: 1
attributes:
comment: NULL
*************************** 3. row ***************************
rule_id: 300
active: 1
username: NULL
schemaname: NULL
flagIN: 0
client_addr: NULL
proxy_addr: NULL
proxy_port: NULL
digest: NULL
match_digest: NULL
match_pattern: .*
negate_match_pattern: 0
re_modifiers: CASELESS
flagOUT: NULL
replace_pattern: NULL
destination_hostgroup: 10
cache_ttl: NULL
cache_empty_result: NULL
cache_timeout: NULL
reconnect: NULL
timeout: NULL
retries: NULL
delay: NULL
next_query_flagIN: NULL
mirror_flagOUT: NULL
mirror_hostgroup: NULL
error_msg: NULL
OK_msg: NULL
sticky_conn: NULL
multiplex: NULL
gtid_from_hostgroup: NULL
log: NULL
apply: 1
attributes:
comment: NULL
3 rows in set (0.00 sec)
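To make the routing explicit, here is a minimal Python sketch of how I understand these three rules to be evaluated: in rule_id order, case-insensitively (CASELESS), stopping at the first match because every rule has apply = 1. It is a simplified model and not ProxySQL's actual matching engine. The sample queries, including the column name shortname and the index name idx_demo, are hypothetical.

```python
import re

# Simplified copies of the three rules above:
# (rule_id, match_pattern, destination_hostgroup)
RULES = [
    (100, r"^SELECT .* FOR UPDATE", 10),
    (200, r"^SELECT .*", 30),
    (300, r".*", 10),
]


def route(query):
    """Return (rule_id, hostgroup) for the first rule whose pattern matches."""
    # Rules are tried in ascending rule_id order; apply = 1 on every rule
    # means evaluation stops at the first match.
    for rule_id, pattern, hostgroup in sorted(RULES):
        if re.search(pattern, query, re.IGNORECASE):
            return rule_id, hostgroup
    return None


if __name__ == "__main__":
    # Hypothetical sample queries, for illustration only.
    samples = [
        "SELECT * FROM catalogitems WHERE id = 1 FOR UPDATE",
        "select name from catalogitems",
        "UPDATE catalogitems SET name = 'x' WHERE id = 1",
        "CREATE INDEX idx_demo ON catalogitems (shortname)",
    ]
    for q in samples:
        print(route(q), q)
```

Under this model, plain SELECTs go to hostgroup 30, while SELECT ... FOR UPDATE, writes, and any DDL sent through ProxySQL (such as CREATE INDEX) go to hostgroup 10.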
Before the crash, we ran a script directly on Node02 (over SSH), and it failed because of a missing index:
ERROR 1822 (HY000) at line 94: Failed to add the foreign key constraint. Missing index for constraint 'fk_ms_shortname' in the referenced table 'catalogitems'
Then we tried to create and drop indexes with DBeaver while users were still using the system (running queries and updates against the DB), and the cluster crashed (around 2025-07-10T12:45).
Here are the log files for Node01, Node02 and Node03:
Node01 log
Node02 log
Node03 log
To bring the cluster back, I had to restart Node01 and Node03 (the two nodes that had aborted), after which SST started.
Please help me identify the root cause. I tried to reproduce the issue by running queries while creating and dropping indexes, but I could not get the same result (nodes aborting).
Thanks.