Need help: there was a crash issue when importing big data into a Percona cluster

Yesterday we built a Percona cluster with three nodes in a test box, then used mysqldump to back up one database from MySQL 5.5 on the production box. The resulting SQL file was 11 GB and had 30,000 lines. We used the command "mysql -uroot -p123456 dbname < mysql_dump.sql" to load the file into the Percona cluster. We had already created a database with the same name in the cluster:

"CREATE DATABASE dbname DEFAULT CHARSET utf8"

OK, so we ran "mysql -uroot -p123456 dbname < mysql_dump.sql" on node1. Unluckily, it crashed in the middle: node1's mysql service went down and had to be restarted. We tried running the command many times, so we wonder: does the Percona cluster not allow importing SQL files larger than 1 GB? If we decide to use Percona cluster on our production box, the first step is to load the current DB data into the cluster so we can migrate our data. Can someone help us? Thanks very much in advance!

You didn't specify what kind of crash it was. The primary source of information for that is the MySQL error log, by the way.
There is no limit on how big a SQL dump can be imported into a PXC node. However, long imports may affect cluster performance, or even cause cluster interruptions if the nodes are using slow disks, network, etc.
By the way, I would rather recommend using mydumper/myloader, as in case of any problem in the middle it would be easier to resume later.
Also, as Galera doesn't really support table locks, it is better to use mysqldump options that do not put LOCK TABLES clauses in the dump.
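For the dump side, a minimal sketch of mysqldump flags that keep LOCK TABLES out of the file (the database name and credentials are the ones used earlier in this thread; adjust as needed):

```shell
# --single-transaction takes a consistent InnoDB snapshot without locking;
# --skip-lock-tables and --skip-add-locks keep LOCK TABLES / UNLOCK TABLES
# statements out of the dump, which suits Galera better.
DUMP_FLAGS="--single-transaction --skip-lock-tables --skip-add-locks"
echo "mysqldump $DUMP_FLAGS -uroot -p123456 dbname > mysql_dump.sql"
```

With mydumper/myloader the rough equivalent is a parallel dump and restore of the same database, which can also be resumed more easily after a mid-import failure.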

Today I tried to import a SQL file of more than 600 MB into the PXC cluster and still got a crash. The error log doesn't have more details;
I only got: ERROR 2013, Lost connection to MySQL server during query.

Try enabling wsrep_debug=1 on the crashing server. Also check whether you have relevant logs about the mysql crash in the system log (/var/log/messages or /var/log/syslog) and/or in dmesg.
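For reference, a sketch of the two ways to switch the debug logging on (wsrep_debug is an ON/OFF switch in PXC 5.5/5.6; the commands that need a live server are shown as comments):

```shell
# At runtime, from any client session with SUPER privileges:
#   mysql -uroot -p -e "SET GLOBAL wsrep_debug=ON;"
# Or persist it across restarts by adding this line under [mysqld] in my.cnf:
cfg_line='wsrep_debug=1'
echo "$cfg_line"
```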

Today I tried to import a SQL file into the PXC cluster again. The file is only 700 MB, but it still crashed, and I can't get more details from the error log. Only got: ERROR 2013, Lost connection to MySQL server during query.

I wonder whether the long import time is what makes node1 crash.

Last login: Wed May 31 13:10:06 2017 from 172.29.36.198
[ftpuser@osspclu1 ~]$ su - root
Password:
[root@osspclu1 ~]# mysql -uroot -pmysql bpmtest1 <sys_message_doto.sql
-bash: sys_message_doto.sql: No such file or directory
[root@osspclu1 ~]# cd /data/export/
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <sys_message_doto.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <bpm_exe_stack_executor.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <biz_statement_bill_detail.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <bpm_def_data.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <act_ru_variable.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <bpm_pro_inst.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <bpm_pro_inst_hi.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <biz_edit_record.sql
Warning: Using a password on the command line interface can be insecure.
[root@osspclu1 export]# mysql -uroot -pmysql bpmtest1 <act_hi_identitylink.sql
Warning: Using a password on the command line interface can be insecure.
ERROR 2013 (HY000) at line 125: Lost connection to MySQL server during query

Still got a crash when I continued to load SQL files into the PXC cluster; it seems that a heavy stream of queries makes the mysql service crash.
And the error log shows nothing, even though I set wsrep_debug = 1.
Only this:
170531 13:29:18 mysqld_safe Number of processes running now: 0
170531 13:29:18 mysqld_safe WSREP: not restarting wsrep node automatically
170531 13:29:18 mysqld_safe mysqld from pid file /data/mysql/osspclu1.pid ended

This looks like an OOM. Check your system log (/var/log/messages or /var/log/syslog), and also check dmesg, to see whether mysql was killed by the OOM killer (out of memory).
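The check itself is just a grep over the system log and dmesg. A runnable sketch, using an embedded sample line in place of the real /var/log/messages so the pattern can be seen matching:

```shell
# On the real box you would run:
#   grep -i -e 'out of memory' -e 'oom-killer' /var/log/messages
#   dmesg | grep -i oom
# Sample kernel line of the kind the OOM killer writes:
sample='May 31 13:29:16 osspclu1 kernel: Out of memory: Kill process 19792 (mysqld)'
hits=$(printf '%s\n' "$sample" | grep -ci 'out of memory')
echo "oom lines found: $hits"
```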

Thanks Jrivera.
I saw these error messages in /var/log/messages:
May 31 13:29:16 osspclu1 kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
May 31 13:29:16 osspclu1 kernel: mysqld cpuset=/ mems_allowed=0
May 31 13:29:16 osspclu1 kernel: Pid: 19848, comm: mysqld Tainted: G --------------- H 2.6.32-358.el6.x86_64 #1
May 31 13:29:16 osspclu1 kernel: Call Trace:
May 31 13:29:16 osspclu1 kernel: [] ? cpuset_print_task_mems_allowed+0x91/0xb0
May 31 13:29:16 osspclu1 kernel: [] ? dump_header+0x90/0x1b0
May 31 13:29:16 osspclu1 kernel: [] ? security_real_capable_noaudit+0x3c/0x70
May 31 13:29:16 osspclu1 kernel: [] ? oom_kill_process+0x82/0x2a0
May 31 13:29:16 osspclu1 kernel: [] ? select_bad_process+0xe1/0x120
May 31 13:29:16 osspclu1 kernel: [] ? out_of_memory+0x220/0x3c0
May 31 13:29:16 osspclu1 kernel: [] ? __alloc_pages_nodemask+0x8ac/0x8d0
May 31 13:29:16 osspclu1 kernel: [] ? alloc_pages_current+0xaa/0x110
May 31 13:29:16 osspclu1 kernel: [] ? __page_cache_alloc+0x87/0x90
May 31 13:29:16 osspclu1 kernel: [] ? find_get_page+0x1e/0xa0
May 31 13:29:16 osspclu1 kernel: [] ? filemap_fault+0x1a7/0x500
May 31 13:29:16 osspclu1 kernel: [] ? __do_fault+0x54/0x530
May 31 13:29:16 osspclu1 kernel: [] ? handle_pte_fault+0xf7/0xb50
May 31 13:29:16 osspclu1 kernel: [] ? __sb_end_write+0x3d/0x70
May 31 13:29:16 osspclu1 kernel: [] ? generic_file_aio_write+0xba/0x100
May 31 13:29:16 osspclu1 kernel: [] ? handle_mm_fault+0x23a/0x310
May 31 13:29:16 osspclu1 kernel: [] ? __do_page_fault+0x139/0x480
May 31 13:29:16 osspclu1 kernel: [] ? do_page_fault+0x3e/0xa0
May 31 13:29:16 osspclu1 kernel: [] ? page_fault+0x25/0x30
May 31 13:29:16 osspclu1 kernel: Mem-Info:
May 31 13:29:16 osspclu1 kernel: Node 0 DMA per-cpu:
May 31 13:29:16 osspclu1 kernel: CPU 0: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 1: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 2: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 3: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 4: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 5: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 6: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 7: hi: 0, btch: 1 usd: 0
May 31 13:29:16 osspclu1 kernel: Node 0 DMA32 per-cpu:
May 31 13:29:16 osspclu1 kernel: CPU 0: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 1: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 2: hi: 186, btch: 31 usd: 31
May 31 13:29:16 osspclu1 kernel: CPU 3: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 4: hi: 186, btch: 31 usd: 30
May 31 13:29:16 osspclu1 kernel: CPU 5: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 6: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 7: hi: 186, btch: 31 usd: 14
May 31 13:29:16 osspclu1 kernel: Node 0 Normal per-cpu:
May 31 13:29:16 osspclu1 kernel: CPU 0: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 1: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 2: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 3: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 4: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 5: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 6: hi: 186, btch: 31 usd: 0
May 31 13:29:16 osspclu1 kernel: CPU 7: hi: 186, btch: 31 usd: 32

May 31 13:29:16 osspclu1 kernel: active_anon:1647254 inactive_anon:324046 isolated_anon:32
May 31 13:29:16 osspclu1 kernel: active_file:124 inactive_file:742 isolated_file:0
May 31 13:29:16 osspclu1 kernel: unevictable:0 dirty:0 writeback:24 unstable:0
May 31 13:29:16 osspclu1 kernel: free:25701 slab_reclaimable:3960 slab_unreclaimable:11205
May 31 13:29:16 osspclu1 kernel: mapped:380 shmem:223 pagetables:13946 bounce:0
May 31 13:29:16 osspclu1 kernel: Node 0 DMA free:15688kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15296kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 31 13:29:16 osspclu1 kernel: lowmem_reserve[]: 0 3000 8050 8050
May 31 13:29:16 osspclu1 kernel: Node 0 DMA32 free:44816kB min:25140kB low:31424kB high:37708kB active_anon:2296016kB inactive_anon:580660kB active_file:20kB inactive_file:1824kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072160kB mlocked:0kB dirty:0kB writeback:0kB mapped:896kB shmem:212kB slab_reclaimable:1740kB slab_unreclaimable:1096kB kernel_stack:40kB pagetables:6584kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2016 all_unreclaimable? yes
May 31 13:29:16 osspclu1 kernel: lowmem_reserve[]: 0 0 5050 5050
May 31 13:29:16 osspclu1 kernel: Node 0 Normal free:42300kB min:42316kB low:52892kB high:63472kB active_anon:4293000kB inactive_anon:715524kB active_file:476kB inactive_file:1144kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:5171200kB mlocked:0kB dirty:0kB writeback:96kB mapped:624kB shmem:680kB slab_reclaimable:14100kB slab_unreclaimable:43724kB kernel_stack:3720kB pagetables:49200kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2560 all_unreclaimable? yes
May 31 13:29:16 osspclu1 kernel: lowmem_reserve[]: 0 0 0 0
May 31 13:29:16 osspclu1 kernel: Node 0 DMA: 4*4kB 3*8kB 2*16kB 0*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15688kB
May 31 13:29:16 osspclu1 kernel: Node 0 DMA32: 432*4kB 407*8kB 328*16kB 274*32kB 162*64kB 45*128kB 14*256kB 3*512kB 4*1024kB 0*2048kB 0*4096kB = 44344kB
May 31 13:29:16 osspclu1 kernel: Node 0 Normal: 1079*4kB 692*8kB 457*16kB 271*32kB 143*64kB 47*128kB 6*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 42540kB
May 31 13:29:16 osspclu1 kernel: 5623 total pagecache pages
May 31 13:29:16 osspclu1 kernel: 4444 pages in swap cache
May 31 13:29:16 osspclu1 kernel: Swap cache stats: add 3838714, delete 3834270, find 708177/738093
May 31 13:29:16 osspclu1 kernel: Free swap = 0kB
May 31 13:29:16 osspclu1 kernel: Total swap = 2097144kB
May 31 13:29:16 osspclu1 kernel: 2097136 pages RAM
May 31 13:29:16 osspclu1 kernel: 48900 pages reserved
May 31 13:29:16 osspclu1 kernel: 2023 pages shared
May 31 13:29:16 osspclu1 kernel: 2015620 pages non-shared

May 31 13:29:16 osspclu1 kernel: [24735] 0 24735 57333 681 4 0 0 mysql
May 31 13:29:16 osspclu1 kernel: Out of memory: Kill process 19792 (mysqld) score 952 or sacrifice child
May 31 13:29:16 osspclu1 kernel: Killed process 19792, UID 496, (mysqld) total-vm:28389780kB, anon-rss:7822640kB, file-rss:460kB

[root@osspclu1 export]# free -m
total used free shared buffers cached
Mem: 8001 7824 176 0 84 3085
-/+ buffers/cache: 4655 3346
Swap: 2047 161 1886

OK, so if your server runs out of memory, triggering the OOM killer, you should double-check the relevant MySQL settings as well as how much memory other processes use.
Can you post the full my.cnf here?

[mysqld]
datadir=/data/mysql
socket=/data/mysql/mysql.sock
user=mysql
lower_case_table_names=1
max_connections=1000
log-bin=/data/mysql-bin/mysql-bin
max_binlog_size=50M
expire_logs_days=3
log-error=/data/mysqld.err
max_allowed_packet=500M
skip-name-resolve
innodb_buffer_pool_size=14G
query_cache_type=1
query_cache_size=1024M
wsrep_provider_options="gcache.size=2G"

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://172.29.9.41,172.29.9.42,172.29.9.43

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #3 address
wsrep_node_address=172.29.9.41 # local IP address

# SST method
wsrep_sst_method=xtrabackup-v2

# Cluster name
wsrep_cluster_name=my_centos_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:s3cret"

wsrep_slave_threads=8
wsrep_on=ON
wsrep_causal_reads=ON
#wsrep_certify_nonPK=ON

[client]
socket=/data/mysql/mysql.sock

Yesterday we added memory to node1, node2, and node3:

[root@osspclu1 ~]# free -m
total used free shared buffers cached
Mem: 20121 19906 214 0 160 3853
-/+ buffers/cache: 15892 4228
Swap: 2047 875 1172
[root@osspclu1 ~]#

I have posted it. Can you help me check it to improve performance?

Sorry, missed that update.
So you have:
innodb_buffer_pool_size = 14G
query_cache_size=1024M

So when fully populated, these alone may easily contribute about 16 GB of memory usage, plus session buffers depending on the number of connections, and other allocations. On top of that, large transactions cause memory overhead on the Galera side for certification.
Though the RSS size in the OOM killer report looks pretty small: what else was running on this server?
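Rough back-of-the-envelope arithmetic for that config (the per-connection figure is an assumption for illustration; actual session buffer usage varies with the workload):

```shell
bp_mb=$((14 * 1024))   # innodb_buffer_pool_size = 14G
qc_mb=1024             # query_cache_size = 1024M
conns=1000             # max_connections
per_conn_mb=3          # assumed ~3 MB of sort/join/read buffers per connection
total_mb=$((bp_mb + qc_mb + conns * per_conn_mb))
echo "worst case ~${total_mb} MB vs the 8192 MB of RAM at the time of the OOM"
```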

Thanks. Actually, we have added more memory on this box; now we have 48 GB, and no OOM kill happens. Thanks!
But I need your help to check the parameters for this DB, to see if there is something that can improve the DB performance.
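One possible starting point, sketched against the 48 GB box. These values are illustrative guesses, not tested on this workload; the usual rules of thumb are a buffer pool around 60-70% of RAM on a dedicated DB host, and a small or disabled query cache, which many Galera/PXC guides recommend:

```ini
# Illustrative sizing only -- verify against your own workload before applying.
[mysqld]
innodb_buffer_pool_size = 28G   # ~60% of 48 GB on a dedicated DB host
query_cache_type        = 0     # query cache is commonly disabled on PXC
query_cache_size        = 0
max_allowed_packet      = 64M   # 500M is rarely needed
max_connections         = 300   # size to actual concurrency, not a round 1000
```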