PXC freezes during large DB restore

Hello,

I have 3 nodes in a PXC cluster (5.7.26-29-57-log) + 2 ProxySQL + Keepalived. It works great, but I'm experiencing an issue when I restore databases (>200 MB), either through a dump or through my application (which has a built-in restore feature).
I'm not using XtraBackup for now because my application takes priority for managing backup and restore.

When restoring, everything works well at the beginning, but after a while the replication latency grows and then all nodes go OFFLINE_SOFT, then OFFLINE_HARD. In this state I can't perform any operation; all servers are down. I have to shut everything down and then bootstrap the good node (by checking grastate.dat).
But sometimes no node is marked "safe_to_bootstrap", so I have to check gvwstate.dat or the GTIDs.
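For reference, the bootstrap decision described above boils down to a quick grep of grastate.dat on each node. This is a minimal sketch; the /tmp file below simulates a state file for illustration only, while on a real node the file lives under the datadir (e.g. /data/grastate.dat in the configs posted here):

```shell
# Sketch: decide which node to bootstrap by inspecting grastate.dat.
# We simulate a state file here; on a real node, read it from the datadir.
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    6b6f0a4e-0000-0000-0000-000000000000
seqno:   1234
safe_to_bootstrap: 1
EOF

# A node with "safe_to_bootstrap: 1" is the one to bootstrap; if none has it,
# compare "seqno" across nodes and bootstrap the one with the highest value.
grep -E 'seqno|safe_to_bootstrap' /tmp/grastate.dat
```

If every node shows seqno: -1 (unclean shutdown), the position has to be recovered with mysqld's wsrep-recover mode instead.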

The only workaround I have found so far is to shut down 2 nodes when I want to restore a database, then restart them afterwards, and it's OK.

So I'd say it's a configuration problem for sure, but I don't know what to adjust.

Thanks for your help.

David

Hello DavidD

Thanks for your question.

Could you upload the MySQL config for each of the nodes, please, along with any error log files that are being written?
Also, details of the server environment.

Please obfuscate any information that would identify your servers, etc., though.
Thanks!

Lorraine

Hello Lorraine,

Thank you.

These are my config files.

node 1:
my.cnf

[mysqld]
# general
server-id=1

datadir=/data
socket=/var/run/mysqld/mysqld.sock
pid-file=/var/run/mysqld/mysqld.pid

back_log=2000
connect_timeout=15
skip-name-resolve=1
ssl=0
table_definition_cache=2000
table_open_cache=10000
metadata_locks_hash_instances=256
max_connections=900
max_allowed_packet=1G
net_buffer_length=1048576

transaction-isolation=READ-COMMITTED

core-file
symbolic-links=0
bind-address = 0.0.0.0


character-set-server=utf8
collation-server=utf8_general_ci
# log
log-error=/var/log/mysqld.log
general-log-file= /datalog/mysql.log
pid-file=/var/run/mysqld/mysqld.pid
slow_query_log=ON
long_query_time=1
#slow_query_log_file=/datalog/mysql-slow.log
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

sql_mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

net_read_timeout=1440
net_write_timeout=2880

#pooling
#thread_handling=pool-of-threads
#thread_pool_size=36

# innodb
innodb_strict_mode=0
innodb_buffer_pool_size=8G
innodb_buffer_pool_instances=8
innodb_log_file_size=1G 
innodb_write_io_threads=8
innodb_read_io_threads=8

innodb_thread_concurrency=2
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=1
innodb-page-cleaners=8
innodb_purge_threads=4
innodb_lru_scan_depth=2048
innodb_io_capacity=8000
innodb_io_capacity_max=16000
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_adaptive_hash_index=OFF
innodb-change-buffering=none
innodb_flush_neighbors=0
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10

# binlog
log-bin=/datalog/mysql-bin
log-bin-index=/datalog/mysql-bin.index
binlog-checksum=NONE
log_slave_updates
expire_logs_days=14
max_binlog_files=20
max_binlog_size=500M

log_output=file
binlog-format=ROW
relay-log=relay-1
enforce-gtid-consistency
gtid-mode=on
master-info-repository=TABLE
relay-log-info-repository=TABLE

# monitoring
performance_schema=ON
innodb_monitor_enable='%'
performance_schema_instrument='%synch%=on'
userstat=1


wsrep.cnf

[mysqld]
# Path to Galera library
wsrep_provider=/usr/lib/galera3/libgalera_smm.so

# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://192.168.100.10,192.168.100.11,192.168.100.12

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# Slave thread to use
wsrep_slave_threads= 8

wsrep_log_conflicts

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=192.168.100.10
# Cluster name
wsrep_cluster_name=pxc-cluster

#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=pxc1

#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=DISABLED

# SST method
wsrep_sst_method=xtrabackup-v2

#Authentication for SST method
wsrep_sst_auth="sstuser:*********"


node 2:
my.cnf

[mysqld]
# general
server-id=1

datadir=/data
socket=/var/run/mysqld/mysqld.sock
pid-file=/var/run/mysqld/mysqld.pid

back_log=2000
connect_timeout=15
skip-name-resolve=1
ssl=0
table_definition_cache=2000
table_open_cache=10000
metadata_locks_hash_instances=256
max_connections=900
max_allowed_packet=1G
net_buffer_length=1048576

transaction-isolation=READ-COMMITTED

core-file
symbolic-links=0
bind-address = 0.0.0.0


character-set-server=utf8
collation-server=utf8_general_ci
# log
log-error=/var/log/mysqld.log
general-log-file= /datalog/mysql.log
pid-file=/var/run/mysqld/mysqld.pid
slow_query_log=ON
long_query_time=1
#slow_query_log_file=/datalog/mysql-slow.log
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

sql_mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

net_read_timeout=1440
net_write_timeout=2880

#pooling
#thread_handling=pool-of-threads
#thread_pool_size=36

# innodb
innodb_strict_mode=0
innodb_buffer_pool_size=8G
innodb_buffer_pool_instances=8
innodb_log_file_size=1G 
innodb_write_io_threads=8
innodb_read_io_threads=8

innodb_thread_concurrency=2
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=1
innodb-page-cleaners=8
innodb_purge_threads=4
innodb_lru_scan_depth=2048
innodb_io_capacity=8000
innodb_io_capacity_max=16000
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_adaptive_hash_index=OFF
innodb-change-buffering=none
innodb_flush_neighbors=0
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10

# binlog
log-bin=/datalog/mysql-bin
log-bin-index=/datalog/mysql-bin.index
binlog-checksum=NONE
log_slave_updates
expire_logs_days=14
max_binlog_files=20
max_binlog_size=500M

log_output=file
binlog-format=ROW
relay-log=relay-1
enforce-gtid-consistency
gtid-mode=on
master-info-repository=TABLE
relay-log-info-repository=TABLE

# monitoring
performance_schema=ON
innodb_monitor_enable='%'
performance_schema_instrument='%synch%=on'
userstat=1


wsrep.cnf

[mysqld]
# Path to Galera library
wsrep_provider=/usr/lib/galera3/libgalera_smm.so

# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://192.168.100.10,192.168.100.11,192.168.100.12

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# Slave thread to use
wsrep_slave_threads= 8

wsrep_log_conflicts

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=192.168.100.11
# Cluster name
wsrep_cluster_name=pxc-cluster

#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=pxc2

#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=DISABLED

# SST method
wsrep_sst_method=xtrabackup-v2

#Authentication for SST method
wsrep_sst_auth="sstuser:*********"


node 3:
my.cnf

[mysqld]
# general
server-id=1

datadir=/data
socket=/var/run/mysqld/mysqld.sock
pid-file=/var/run/mysqld/mysqld.pid

back_log=2000
connect_timeout=15
skip-name-resolve=1
ssl=0
table_definition_cache=2000
table_open_cache=10000
metadata_locks_hash_instances=256
max_connections=900
max_allowed_packet=1G
net_buffer_length=1048576

transaction-isolation=READ-COMMITTED

core-file
symbolic-links=0
bind-address = 0.0.0.0


character-set-server=utf8
collation-server=utf8_general_ci
# log
log-error=/var/log/mysqld.log
general-log-file= /datalog/mysql.log
pid-file=/var/run/mysqld/mysqld.pid
slow_query_log=ON
long_query_time=1
#slow_query_log_file=/datalog/mysql-slow.log
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

sql_mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

net_read_timeout=1440
net_write_timeout=2880

#pooling
#thread_handling=pool-of-threads
#thread_pool_size=36

# innodb
innodb_strict_mode=0
innodb_buffer_pool_size=8G
innodb_buffer_pool_instances=8
innodb_log_file_size=1G 
innodb_write_io_threads=8
innodb_read_io_threads=8

innodb_thread_concurrency=2
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=1
innodb-page-cleaners=8
innodb_purge_threads=4
innodb_lru_scan_depth=2048
innodb_io_capacity=8000
innodb_io_capacity_max=16000
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_adaptive_hash_index=OFF
innodb-change-buffering=none
innodb_flush_neighbors=0
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10

# binlog
log-bin=/datalog/mysql-bin
log-bin-index=/datalog/mysql-bin.index
binlog-checksum=NONE
log_slave_updates
expire_logs_days=14
max_binlog_files=20
max_binlog_size=500M

log_output=file
binlog-format=ROW
relay-log=relay-1
enforce-gtid-consistency
gtid-mode=on
master-info-repository=TABLE
relay-log-info-repository=TABLE

# monitoring
performance_schema=ON
innodb_monitor_enable='%'
performance_schema_instrument='%synch%=on'
userstat=1


wsrep.cnf

[mysqld]
# Path to Galera library
wsrep_provider=/usr/lib/galera3/libgalera_smm.so

# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://192.168.100.10,192.168.100.11,192.168.100.12

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# Slave thread to use
wsrep_slave_threads= 8

wsrep_log_conflicts

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=192.168.100.12
# Cluster name
wsrep_cluster_name=pxc-cluster

#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=pxc3

#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=DISABLED

# SST method
wsrep_sst_method=xtrabackup-v2

#Authentication for SST method
wsrep_sst_auth="sstuser:*********"


Thanks

Are you not seeing any log files being written when the DB freezes?

Hello David,

pxc_strict_mode=DISABLED - any particular reason for disabling it? You may simply be asking for trouble by allowing potentially dangerous commands.
innodb_io_capacity=8000 - how many random write IOPS can your disk actually do? Have you tested it? In general, you should not spend all of the disk's capacity on that setting.

I wonder what that freeze actually looks like. Can you at least log in to those nodes and capture the basic status outputs, like:

SHOW PROCESSLIST;
SHOW ENGINE INNODB STATUS\G
SHOW STATUS LIKE 'ws%';

Hello,

Sorry, I wasn't available these past days.
lorraine.pocklington -> I have customers on my DB, so I have to plan when to do a large restore. The last time the crash happened was 2 months ago, and my logs rotate every 15 days.
I will perform a crash test soon.
przemek -> I host an application whose vendor requires 'pxc_strict_mode=DISABLED', so it's not a personal choice.

When the cluster freezes I can't perform any MySQL operation (not even a SELECT).
IO capacity has been benchmarked; I stay 30% below the maximum value measured in read/write mode with large samples, according to my engineer. I have SSD + NVMe.

The basic outputs for each node are in the attached file.

Regards

David

basics.txt (91.5 KB)

Hi,

I just tried restoring a full database (45 MB) and I hit the issue.

No errors in the MySQL error log.

I have 1 active master (from ProxySQL), and when I run SHOW ENGINE INNODB STATUS\G I get the output in the attached file.

David

engine status.txt (30 KB)

And then I get "WSREP detected deadlock/conflict and aborted the transaction. Try restarting the transaction" in my application.

David

You can increase wsrep_slave_threads to 32
and try restoring your database directly to one node (not through ProxySQL) again!

Add the following to make your I/O faster:

innodb_flush_log_at_trx_commit=2
sync_binlog=0
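Pulling these suggestions together, the change would look something like this in the [mysqld] section (a sketch using the values proposed in this thread, not a drop-in recommendation; the durability settings are commented out because they trade crash safety for speed):

```ini
[mysqld]
# More parallel applier threads so the restore's write-sets apply faster.
wsrep_slave_threads=32

# Optional speed-ups that relax durability: up to ~1s of transactions can be
# lost on an OS crash or power failure. Leave commented if you need full ACID.
#innodb_flush_log_at_trx_commit=2
#sync_binlog=0
```

A rolling restart of each node is needed for innodb settings; wsrep_slave_threads can also be raised dynamically with SET GLOBAL.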

Thanks Steven Lin, but I can't change innodb_flush_log_at_trx_commit to 2 because of the related risks (1 is ACID-guaranteed, while 2 can lose up to 1 second's worth of transactions, according to the MySQL documentation), and I handle ~4k questions/s, with 15k q/s peaks (~10% writes, so 400 to 1500 writes/s).
Write performance is not the problem, I think.
This is my datacenter config:
SAN Dell Unity 450F + 24 SSD + NVMe cache managed by ESX + 6x 25 Gb/s links between ESXi hosts + Xeon 6154
I checked with my engineer, and he benchmarked the SAN at 200k IOPS on a database.

I'll try increasing wsrep_slave_threads.

innodb_flush_log_at_trx_commit=0
can lose up to 1 second's worth of transactions even when only the mysqld process crashes, because the log is neither written nor flushed at commit.

innodb_flush_log_at_trx_commit=2
writes the log to the OS at commit but flushes it only about once per second, so the data survives a mysqld crash; only an OS crash or power failure can lose up to 1 second of transactions. In most situations your data will be safe.

Hi, increasing wsrep_slave_threads to 32 did the job. No stall during restoration today. Great.
I can now go ahead and sysbench the PXC with some big workloads.
Thanks
David

Hi, I'm updating this post. We investigated further with our engineers and found that pmm-client, and especially mysqld_exporter (Prometheus), triggers the bug: sometimes the process takes a lot of memory, invoking the OOM killer:


Nov. 28 09:46:38 pxc2 kernel: mysqld_exporter invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
nov. 28 09:46:38 pxc2 kernel: mysqld_exporter cpuset=/ mems_allowed=0
...
nov. 28 09:46:38 pxc2 kernel: Out of memory: Kill process 2718 (mysqld) score 946 or sacrifice child
nov. 28 09:46:38 pxc2 kernel: Killed process 2718 (mysqld) total-vm:19560856kB, anon-rss:9663284kB, file-rss:0kB, shmem-rss:0kB
nov. 28 09:46:38 pxc2 kernel: oom_reaper: reaped process 2718 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

At that time I had 10 GB on the server and 8 GB allocated to the InnoDB buffer pool.
After I changed to 12 GB on the server and 9 GB for the buffer pool, leaving 3 GB free, I have had no problems.

Is there a way to control the amount of memory PMM uses?

David

OK, thank you for this update. Let me get someone in the PMM team to check in on this, as it’s crossing into their patch and they won’t by default be reviewing PXC threads.

Hello David,

Could you please tell us which PMM version you are using? There were some issues with older PMM versions where memory consumption was high.
ref: https://www.percona.com/doc/percona-monitoring-and-management/faq.html#how-to-control-memory-consumption-for-pmm-relevant-to-versions-lower-than-1-13-of-pmm
If you are using an old version of PMM, I would suggest upgrading to the latest version.

Also, as you mentioned, you had 10 GB on the server with 8 GB allocated to the InnoDB buffer pool. We should also consider MySQL's additional buffers, which need memory beyond the buffer pool, so the real memory usage for MySQL will be innodb_buffer_pool_size + additional buffer memory.
Check this blog post: https://www.percona.com/blog/2014/01/24/mysql-server-memory-usage-2/

So overall you were left with a very low amount of memory for other OS processes and the PMM exporter processes, which could lead to an out-of-memory issue.
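To make that concrete, a back-of-the-envelope ceiling for MySQL memory can be sketched like this. The 16 MB per-connection figure is an assumption for illustration only; the real number depends on your sort/join/read buffer settings:

```shell
# Rough upper bound on MySQL memory, using numbers from this thread.
# The per-connection figure is an assumed illustration, not a measured value.
buffer_pool_mb=$((9 * 1024))   # innodb_buffer_pool_size = 9G
per_conn_mb=16                 # assumed per-connection buffers (sort/join/read)
max_conn=900                   # max_connections from the posted my.cnf
echo "worst case: $((buffer_pool_mb + per_conn_mb * max_conn)) MB"
```

On a 12 GB host this worst case obviously cannot be reached by all 900 connections at once, but that is the point: the buffer pool alone badly understates the ceiling, and a burst of busy connections plus the exporter processes can push the box into OOM territory.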

Hello,
I'm using pmm-admin 1.17.2.

I'll check the other buffers to see how much memory MySQL uses and then adjust my parameters.

Thanks