
2 Node cluster locks db on 1 end, fails to reconnect automatically

joseph7:

I have a 2 node Percona cluster (percona-xtradb-cluster-56, 5.6.26-25.12-1.wheezy).

I had an issue where it seems db2 became unavailable due to some network problem.

While this was happening, db1 did not crash, but it completely locked down the database that FreeRADIUS was using. I guess this is normal behaviour.

During this time the database on db2 was accessible.

I have a feeling there was no long network outage between the nodes; rather, the auto-reconnect mechanism was failing, because after I restarted db1 (just 4 minutes after the last auto-reconnect attempt) the cluster resynced and the databases on db1 became accessible again.

My questions are:

1. How can I keep the database at least in read-only mode on both ends when the cluster splits? In my case it would be useful for RADIUS to still be able to do authentication without updating info in the database.
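(By default, a Galera node that loses quorum — which in a 2-node cluster means both nodes during a split — moves to a non-primary component and rejects all queries. A minimal sketch of a possible workaround, assuming a PXC 5.6 build where the `wsrep_dirty_reads` variable is available; verify it exists on your version before relying on it:)

```ini
# my.cnf sketch -- assumes wsrep_dirty_reads is supported by this PXC build
[mysqld]
# Allow SELECTs to be answered even while the node is in a non-primary
# component. Writes are still rejected, so RADIUS authentication reads
# could keep working during a split without risking divergent data.
wsrep_dirty_reads = ON
```

A more robust option for a 2-node setup is adding a Galera arbitrator (garbd) as a third voting member, so a single node failure no longer costs the survivor its quorum.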

2. Could this be caused in any way by my setup using wsrep_sst_method=rsync instead of wsrep_sst_method=xtrabackup-v2?
I had no problem with this before.
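(The SST method only matters during a full state transfer, so it is unlikely to affect reconnect behaviour itself; rsync does, however, block the donor for the duration of the transfer. A sketch of switching to xtrabackup-v2, with a hypothetical SST user — create your own with the needed privileges:)

```ini
# my.cnf sketch -- "sstuser:sstpassword" is a placeholder, not a real account
[mysqld]
wsrep_sst_method = xtrabackup-v2
# The SST user typically needs RELOAD, LOCK TABLES, PROCESS and
# REPLICATION CLIENT privileges on the donor node.
wsrep_sst_auth   = "sstuser:sstpassword"
```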

3. How can I increase the reconnect retry tolerance to a very high value?
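(Galera's tolerance for a silent peer is controlled by EVS timeouts in `wsrep_provider_options` rather than a single retry counter. A sketch with illustrative values only — longer timeouts also delay detection of genuinely dead nodes, and `evs.inactive_timeout` must be at least `evs.suspect_timeout`:)

```ini
# my.cnf sketch -- durations are ISO 8601 (PT30S = 30 s, PT1M = 1 minute)
[mysqld]
# Wait longer before suspecting and then evicting an unresponsive peer.
wsrep_provider_options = "evs.suspect_timeout=PT30S;evs.inactive_timeout=PT1M;evs.keepalive_period=PT3S"
```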



  • joseph7:
    The problem came up once again. It seems the nodes lost connectivity; memory usage and CPU load went up high on the second node, then it suddenly came back, with no restart required this time. I had a ping running from node1 -> node2 with no packet loss at all, so it might not be a network issue.

    2016-08-19 13:57:27 24658 [Note] WSREP: (5fab4bf1, 'tcp://') turning message relay requesting on, nonlive peers:


    2016-08-19 13:57:31 7f729615a700 INNODB MONITOR OUTPUT
    Per second averages calculated from the last 4 seconds
    srv_master_thread loops: 1977149 srv_active, 0 srv_shutdown, 6148 srv_idle
    srv_master_thread log flush and writes: 1983165
    OS WAIT ARRAY INFO: reservation count 6052078
    OS WAIT ARRAY INFO: signal count 6049543
    Mutex spin waits 6425218, rounds 170770335, OS waits 5623549
    RW-shared spins 329083, rounds 9871237, OS waits 328969
    RW-excl spins 98958, rounds 2979655, OS waits 99116
    Spin rounds per wait: 26.58 mutex, 30.00 RW-shared, 30.11 RW-excl
    Trx id counter 66703779
    Purge done for trx's n:o < 65349398 undo n:o < 0 state: running but idle
    History list length 632497
    ---TRANSACTION 66703520, not started

    Nothing useful; after reconnecting it cleans up the transactions. Would an upgrade help with this at all? Since this cluster is in production, upgrading now is not that easy.

    ii percona-xtrabackup 2.2.12-1.wheezy amd64 Open source backup tool for InnoDB and XtraDB
    ii percona-xtradb-cluster-56 5.6.26-25.12-1.wheezy amd64 Percona XtraDB Cluster with Galera
    ii percona-xtradb-cluster-client-5.6 5.6.26-25.12-1.wheezy amd64 Percona XtraDB Cluster database client binaries
    ii percona-xtradb-cluster-common-5.6 5.6.26-25.12-1.wheezy amd64 Percona XtraDB Cluster database common files (e.g. /etc/mysql/my.cnf)
    ii percona-xtradb-cluster-galera-3 3.12.2-1.wheezy amd64 Metapackage for latest version of galera3.
    ii percona-xtradb-cluster-galera-3.x 3.12.2-1.wheezy amd64 Galera components of Percona XtraDB Cluster
    ii percona-xtradb-cluster-server-5.6 5.6.26-25.12-1.wheezy amd64 Percona XtraDB Cluster database server binaries
  • joseph7:
    I have discovered that in my situation the second node becomes unavailable due to high CPU load and memory usage by mysqld. It is not a network issue between the nodes: I had a ping running for days between the two hosts and there is no packet loss at all.
