Not the answer you need?
Register and ask your own question!

Failed to rejoin the cluster

pavelkpavelk EntrantInactive User Role Beginner
Dear all,
we have 4node Percona XtraDB Cluster v5.5.x and until yesterday all worked fine. Yesterday we had to temporarily disconnect one node due to maintenance and now we are not able to rejoin the cluster again.
Each time we start the node it fails after SST. I tried to completely deleted all files inside the MySQL datadir but the result is the same.
Could you please someone help?
Thanks in advance!

Comments

  • pavelkpavelk Entrant Inactive User Role Beginner
    mysql_error.log
    ============
    160405 22:06:36 [Note] WSREP: Read nil XID from storage engines, skipping position init
    160405 22:06:36 [Note] WSREP: wsrep_load(): loading provider library 'none'
    160405 22:06:36 [Note] WSREP: Service disconnected.
    160405 22:06:37 [Note] WSREP: Some threads may fail to exit.
    Error in my_thread_global_end(): 1 threads didn't exit
    160405 22:06:42 [Note] WSREP: Read nil XID from storage engines, skipping position init
    160405 22:06:42 [Note] WSREP: wsrep_load(): loading provider library 'none'
    160405 22:06:43 [Note] WSREP: Service disconnected.
    160405 22:06:44 [Note] WSREP: Some threads may fail to exit.
    Error in my_thread_global_end(): 1 threads didn't exit
    160405 22:06:49 mysqld_safe Starting mysqld daemon with databases from /galera
    160405 22:06:49 mysqld_safe Skipping wsrep-recover for empty datadir: /galera
    160405 22:06:49 mysqld_safe Assigning 00000000-0000-0000-0000-000000000000:-1 to wsrep_start_position
    160405 22:06:49 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
    160405 22:06:49 [Note] WSREP: Read nil XID from storage engines, skipping position init
    160405 22:06:49 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
    160405 22:06:49 [Note] WSREP: wsrep_load(): Galera 2.12(r318911d) by Codership Oy <[email protected]> loaded successfully.
    160405 22:06:49 [Warning] WSREP: Could not open saved state file for reading: /galera//grastate.dat
    160405 22:06:49 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    160405 22:06:49 [Note] WSREP: Preallocating 5368710440/5368710440 bytes in '/galera//galera.cache'...
    160405 22:06:49 [Note] WSREP: Passing config to GCS: base_host = 10.1.1.4; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /galera/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /galera//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.backend = asio; protonet.version
    160405 22:06:50 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    160405 22:06:50 [Note] WSREP: wsrep_sst_grab()
    160405 22:06:50 [Note] WSREP: Start replication
    160405 22:06:50 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    160405 22:06:50 [Note] WSREP: protonet asio version 0
    160405 22:06:50 [Note] WSREP: backend: asio
    160405 22:06:50 [Note] WSREP: GMCast version 0
    160405 22:06:50 [Note] WSREP: (eebb1260, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    160405 22:06:50 [Note] WSREP: (eebb1260, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    160405 22:06:50 [Note] WSREP: EVS version 0
    160405 22:06:50 [Note] WSREP: PC version 0
    160405 22:06:50 [Note] WSREP: gcomm: connecting to group 'My_Galera_Cluster', peer '10.1.1.1:,10.1.1.2:,10.1.1.3:'
    160405 22:06:50 [Note] WSREP: (eebb1260, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
    160405 22:06:50 [Note] WSREP: declaring 7c575c0d at tcp://10.1.1.1:4567 stable
    160405 22:06:50 [Note] WSREP: declaring eed4f866 at tcp://10.1.1.2:4567 stable
    160405 22:06:50 [Note] WSREP: declaring f203440b at tcp://10.1.1.3:4567 stable
    160405 22:06:50 [Note] WSREP: Node 7c575c0d state prim
    160405 22:06:50 [Note] WSREP: gcomm: connected
    160405 22:06:50 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
    160405 22:06:50 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
    160405 22:06:50 [Note] WSREP: Opened channel 'Jablotron_Galera_Cluster'
    160405 22:06:50 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 4
    160405 22:06:50 [Note] WSREP: Waiting for SST to complete.
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: sent state msg: eebc36bc-fb69-11e5-93d0-d34c86a21acd
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: got state msg: eebc36bc-fb69-11e5-93d0-d34c86a21acd from 0 (db3)
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: got state msg: eebc36bc-fb69-11e5-93d0-d34c86a21acd from 2 (db1)
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: got state msg: eebc36bc-fb69-11e5-93d0-d34c86a21acd from 3 (db2)
    160405 22:06:50 [Note] WSREP: STATE EXCHANGE: got state msg: eebc36bc-fb69-11e5-93d0-d34c86a21acd from 1 (db4)
    160405 22:06:50 [Note] WSREP: Quorum results:
    version = 2,
    component = PRIMARY,
    conf_id = 178,
    members = 3/4 (joined/total),
    act_id = 247014377,
    last_appl. = -1,
    protocols = 0/4/3 (gcs/repl/appl),
    group UUID = 04f450ca-def9-11e5-95f1-b33677ec468f
    160405 22:06:50 [Note] WSREP: Flow-control interval: [32, 32]
    160405 22:06:50 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 247014377)
    160405 22:06:50 [Note] WSREP: State transfer required:
    Group state: 04f450ca-def9-11e5-95f1-b33677ec468f:247014377
    Local state: 00000000-0000-0000-0000-000000000000:-1
    160405 22:06:50 [Note] WSREP: New cluster view: global state: 04f450ca-def9-11e5-95f1-b33677ec468f:247014377, view# 179: Primary, number of nodes: 4, my index: 1, protocol version 3
    160405 22:06:50 [Warning] WSREP: Gap in state sequence. Need state transfer.
    160405 22:06:50 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.1.1.4:4444' --auth 'xbuser:acRHRlm5aBX' --datadir '/galera/' --defaults-file '/etc/my.cnf' --parent '21480''
    WSREP_SST: [INFO] Streaming with xbstream (20160405 22:06:50.622)
    WSREP_SST: [INFO] Using socat as streamer (20160405 22:06:50.623)
    WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160405 22:06:50.645)
    160405 22:06:50 [Note] WSREP: Prepared SST request: xtrabackup-v2|10.1.1.4:4444/xtrabackup_sst
    160405 22:06:50 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    160405 22:06:50 [Note] WSREP: Assign initial position for certification: 247014377, protocol version: 2
    160405 22:06:50 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (04f450ca-def9-11e5-95f1-b33677ec468f): 1 (Operation not permitted)
    at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be unavailable.
    160405 22:06:50 [Note] WSREP: Node 1 (db4) requested state transfer from '*any*'. Selected 0 (db3)(SYNCED) as donor.
    160405 22:06:50 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 247014456)
    160405 22:06:50 [Note] WSREP: Requesting state transfer: success, donor: 0
    WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160405 22:06:51.061)
    WSREP_SST: [INFO] Proceeding with SST (20160405 22:06:51.076)
    WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20160405 22:06:51.078)
    removed ‘/galera/mysql/db.frm’
    removed ‘/galera/mysql/db.MYI’
    removed ‘/galera/mysql/db.MYD’
    removed ‘/galera/mysql/host.frm’
    removed ‘/galera/mysql/host.MYI’
    removed ‘/galera/mysql/host.MYD’
    removed ‘/galera/mysql/user.frm’
    removed ‘/galera/mysql/user.MYI’
    removed ‘/galera/mysql/user.MYD’
    removed ‘/galera/mysql/func.frm’
    removed ‘/galera/mysql/func.MYI’
    removed ‘/galera/mysql/func.MYD’
    removed ‘/galera/mysql/plugin.frm’
    removed ‘/galera/mysql/plugin.MYI’
    removed ‘/galera/mysql/plugin.MYD’
    removed ‘/galera/mysql/servers.frm’
    removed ‘/galera/mysql/servers.MYI’
    removed ‘/galera/mysql/servers.MYD’
    removed ‘/galera/mysql/tables_priv.frm’
    removed ‘/galera/mysql/tables_priv.MYI’
    removed ‘/galera/mysql/tables_priv.MYD’
    removed ‘/galera/mysql/columns_priv.frm’
    removed ‘/galera/mysql/columns_priv.MYI’
    removed ‘/galera/mysql/columns_priv.MYD’
    etc.

    WSREP_SST: [INFO] Waiting for SST streaming to complete! (20160405 22:06:51.089)
    160405 22:06:53 [Note] WSREP: (eebb1260, 'tcp://0.0.0.0:4567') turning message relay requesting off
    160406 0:13:09 [Note] WSREP: 0 (db3): State transfer to 1 (db4) complete.
    WSREP_SST: [INFO] Preparing the backup at /galera//.sst (20160406 00:13:09.207)
    WSREP_SST: [INFO] Evaluating innobackupex --no-version-check --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20160406 00:13:09.209)
    160406 0:13:09 [Note] WSREP: Member 0 (db3) synced with group.
    Terminated
    WSREP_SST: [ERROR] Removing /galera//.sst/xtrabackup_galera_info file due to signal (20160406 00:36:32.563)
    WSREP_SST: [ERROR] Cleanup after exit with status:143 (20160406 00:36:32.566)
    160406 0:36:32 [ERROR] WSREP: Process was aborted.
    160406 0:36:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.134.203.213:4444' --auth 'xbuser:8JS5PiGacRHRlm5aBXOZviaU' --datadir '/galera/' --defaults-file '/etc/my.cnf' --parent '21480': 10 (No child processes)
    160406 0:36:32 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
    160406 0:36:32 [ERROR] WSREP: SST failed: 10 (No child processes)
    160406 0:36:32 [ERROR] Aborting
  • pavithrapavithra Entrant Current User Role Beginner
    Hi pavelk even i am facing the same issue. Could you plz let me know if u found any solution for this.thanks on advance
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.