gcache.page.some_number files filling up the disk

Hello,

We have a 3-node cluster, and the nodes sometimes create a lot of gcache.page files, which fills up the disks and crashes the node. The node then has to be reinitialised and the gcache.page.some_number files have to be deleted. These gcache.page files can take up to 90 GB of disk space, although when the node starts the MySQL data directory takes only 12 GB. This is Percona XtraDB Cluster 5.6. How can I limit the number of these files?

Thank you!

Hi,

How about gcache.keep_pages_size?
http://galeracluster.com/documentation-webpages/galeraparameters.html#gcache-keep-pages-size

I also think it might be good to fully synchronize the data with an SST when you set up the node.

Hi,

I’m facing the same problem, which seems to happen in MariaDB as well:
https://mariadb.atlassian.net/browse/MDEV-6822

@taka-h: I don’t really understand the documentation for gcache.keep_pages_size. Is this the number of files, or is it the overall amount of space used by gcache?

Cheers
B

Is there any news on this issue? We have a three-node cluster, and invariably, after a few days to a week, we end up with a bunch of gcache.page.XXX files being created. The cluster itself is fine. I end up shutting down the node with the files (which is the one getting all the writes), manually deleting them, and starting it back up.

-rw-------. 1 mysql mysql 134219048 Dec 11 18:27 galera.cache
-rw------- 1 mysql mysql 134217728 Dec 12 12:26 gcache.page.000000
-rw------- 1 mysql mysql 134217728 Dec 13 12:27 gcache.page.000001
-rw------- 1 mysql mysql 134217728 Dec 14 10:27 gcache.page.000002
-rw------- 1 mysql mysql 134217728 Dec 14 12:54 gcache.page.000003
-rw------- 1 mysql mysql 134217728 Dec 14 14:34 gcache.page.000004
-rw------- 1 mysql mysql 134217728 Dec 14 15:56 gcache.page.000005
-rw------- 1 mysql mysql 134217728 Dec 14 18:59 gcache.page.000006
-rw------- 1 mysql mysql 134217728 Dec 15 10:41 gcache.page.000007
-rw------- 1 mysql mysql 134217728 Dec 15 12:31 gcache.page.000008
-rw------- 1 mysql mysql 134217728 Dec 15 15:11 gcache.page.000009
-rw-rw---- 1 mysql mysql 133 Dec 10 19:32 GRA_9_888385.log
-rw-rw----. 1 mysql mysql 108 Dec 15 00:30 grastate.dat
-rw-rw---- 1 mysql mysql 265 Dec 3 20:24 gvwstate.dat
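
For reference, the ten page files in the listing above are 134217728 bytes each, so together they take 10 × 134217728 = 1342177280 bytes (1280 MiB). A minimal Python sketch for tracking this on a node follows; the function name `gcache_page_usage` is made up for illustration, and it assumes the page files sit directly in the data directory, as in the listing.

```python
from pathlib import Path


def gcache_page_usage(datadir):
    """Return (page_count, total_bytes) for gcache.page.* files in datadir.

    Hypothetical helper: it only looks at file names and sizes and knows
    nothing about whether Galera still needs the pages.
    """
    pages = sorted(Path(datadir).glob("gcache.page.*"))
    sizes = [p.stat().st_size for p in pages if p.is_file()]
    return len(sizes), sum(sizes)
```

Calling `gcache_page_usage("/x/mysql")` on the node above would report the count and the total size, which could be logged periodically to see how fast the pages accumulate.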

[root@host src]# rpm -qa | grep -i percona
Percona-XtraDB-Cluster-56-5.6.26-25.12.1.el7.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.26-25.12.1.el7.x86_64
percona-xtrabackup-2.3.2-1.el7.x86_64
Percona-XtraDB-Cluster-client-56-5.6.26-25.12.1.el7.x86_64
Percona-XtraDB-Cluster-galera-3-3.12.2-1.rhel7.x86_64
Percona-XtraDB-Cluster-server-56-5.6.26-25.12.1.el7.x86_64

[root@host src]# cat /etc/my.cnf
[client]
default-character-set = utf8mb4
socket = /x/mysql/mysql.sock

[mysql]
default-character-set = utf8mb4

[mysqld]
port = 3306
socket = /x/mysql/mysql.sock
datadir = /x/mysql
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
max_connections = 1000
log_bin = mysql-bin
expire_logs_days = 3

binlog_format = ROW
default_storage_engine = innodb
innodb_buffer_pool_size = 5G 
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 20M
innodb_file_per_table = 1
innodb_large_prefix = true
innodb_file_format = Barracuda
innodb_autoinc_lock_mode = 2

wsrep_cluster_address = gcomm://x.x.x.138,x.x.x.139,x.x.x.140
wsrep_node_address = x.x.x.138
wsrep_provider = /usr/lib64/galera3/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "x:x"
wsrep_sst_receive_address = x.x.x.138:14444

wsrep_slave_threads = 8
wsrep_cluster_name = xx 
wsrep_node_name = xxx

wsrep_provider_options="gcache.keep_pages_count=4"

[mysqld_safe]
pid-file = /run/mysqld/mysql.pid

Guys, have a look at: https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1488530
Upgrading to the latest PXC with Galera 3.12 should already help in this situation.

I still have the same issue with this version:

Percona-XtraDB-Cluster-client-56-5.6.35-26.20.2.el6.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.35-26.20.2.el6.x86_64
Percona-XtraDB-Cluster-56-5.6.35-26.20.2.el6.x86_64
percona-toolkit-2.2.15-2.noarch
Percona-XtraDB-Cluster-galera-3-3.20-2.el6.x86_64
percona-xtrabackup-2.3.5-1.el6.x86_64
Percona-XtraDB-Cluster-server-56-5.6.35-26.20.2.el6.x86_64

The first setting was:

wsrep_provider_options = gcache.size=2G;gcache.keep_pages_size=100M;gcache.keep_pages_count=20

I changed it to:

wsrep_provider_options = gcache.size=2G;gcache.keep_pages_size=2G;gcache.keep_pages_count=10

Both configurations show the same problem.
Any suggestions?
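
To sanity-check settings like these, here is a small Python sketch that splits a wsrep_provider_options string into its parts and converts the size values to bytes. The helper names are made up, the K/M/G suffixes are assumed to be binary multiples, and the page size of 134217728 bytes is taken from the file listing earlier in the thread, so it may differ on other setups.

```python
SUFFIXES = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_size(text):
    """Convert a value such as '2G' or '100M' to bytes (binary multiples assumed)."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1].upper()]
    return int(text)


def parse_provider_options(options):
    """Split 'key=value;key=value' into a dict, ignoring surrounding whitespace."""
    result = {}
    for item in options.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            result[key.strip()] = value.strip()
    return result


opts = parse_provider_options(
    "gcache.size=2G;gcache.keep_pages_size=100M;gcache.keep_pages_count=20"
)
keep_bytes = parse_size(opts["gcache.keep_pages_size"])
page_bytes = 134217728  # assumption: page file size seen in the earlier listing
# Whole page files that fit under keep_pages_size.
print(keep_bytes // page_bytes)
```

With the first setting, 100M is smaller than a single 134217728-byte page, so the size-based limit and the count-based limit of 20 pull in different directions. The same parser can be fed the value reported by SHOW VARIABLES LIKE 'wsrep_provider_options' to confirm what the running node actually uses.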

The algorithm for gcache page cleanup was changed in 5.6.35. However, gcache will not release the pages until it is safe to do so (that is, until the data has been safely replicated to all the nodes). Do you have very large transactions? Also, the pages are freed in the order they were created, so one early transaction could block the rest from being freed up.
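
The ordering effect can be illustrated with a toy model in Python. This is only a sketch of the "freed in creation order" rule described above, not Galera's actual implementation; the function name and the `in_use` flag are invented for illustration, and the keep_pages limits are ignored.

```python
def free_pages_in_order(pages):
    """Toy model: drop pages from the oldest end until one is still in use.

    `pages` is a list of (name, in_use) tuples in creation order.
    Returns (freed_names, remaining_names).
    """
    freed = []
    index = 0
    while index < len(pages) and not pages[index][1]:
        freed.append(pages[index][0])
        index += 1
    remaining = [name for name, _ in pages[index:]]
    return freed, remaining


# The oldest page is still referenced, so nothing behind it can be freed.
blocked = [("page.0", True), ("page.1", False), ("page.2", False)]
print(free_pages_in_order(blocked))

# Once the oldest page is released, the pages behind it can go as well.
unblocked = [("page.0", False), ("page.1", False), ("page.2", True)]
print(free_pages_in_order(unblocked))
```

In the first case no page is freed even though two of the three are idle, which matches the symptom of page files piling up behind one long-lived transaction.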