
Percona Cluster Node Timeout in Azure VM

kvigneshs, Contributor
I have a Percona XtraDB Cluster running on Azure VMs with three nodes: two MySQL nodes (NodeA and NodeB) and one Galera Arbitrator (garbd) node (NodeC).
NodeA and NodeC are in the same VNet and are connected to NodeB through a single VNet peering.
I joined the nodes into a cluster over the VNet peering using private IP addresses. The issue is that a MySQL node loses its connection and mysqld stops; the logs below are what I found. Is there any fix for this?

In Stopped Node (NodeA)

WSREP: Failed to report last committed 20712650, -110 (Connection timed out)
WSREP: last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT2.35911S), skipping check
Log of wsrep recovery (--wsrep-recover)

In Running Node (NodeB) - Bootstrap

2020-02-25T10:50:58.405580Z 0 [Note] WSREP: (9288914c, 'tcp://0.0.0.0:4567') connection to peer 807f459c with addr tcp://NodeB:4567 timed out, no messages seen in PT3S (gmcast.peer_timeout)
2020-02-25T10:50:58.406359Z 0 [Note] WSREP: (9288914c, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://NodeB:4567
2020-02-25T10:50:59.630488Z 0 [Note] WSREP: (9288914c, 'tcp://0.0.0.0:4567') reconnecting to 807f459c (tcp://NodeB:4567), attempt 0
2020-02-25T10:51:00.906739Z 0 [Note] WSREP: declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2020-02-25T10:51:00.906800Z 0 [Note] WSREP: evs:: proto(9288914c, GATHER, view_id(REG,807f459c,3)) suspecting node: 807f459c
2020-02-25T10:51:00.906815Z 0 [Note] WSREP: evs:: proto(9288914c, GATHER, view_id(REG,807f459c,3)) suspected node without join message, declaring inactive
2020-02-25T10:51:01.406965Z 0 [Note] WSREP: declaring node with index 0 inactive (evs.inactive_timeout)
2020-02-25T10:51:01.527169Z 0 [Note] WSREP: declaring a4893046 at tcp://NodeC:4444 stable
2020-02-25T10:51:01.647292Z 0 [Note] WSREP: Node 9288914c state primary
2020-02-25T10:51:01.767513Z 0 [Note] WSREP: Current view of cluster as seen by this node

In Running Arbitrator Node (Node C)

2020-02-25 10:50:58.355 INFO: (a4893046, 'tcp://0.0.0.0:4444') connection to peer 807f459c with addr tcp://NodeB:4567 timed out, no messages seen in PT3S (gmcast.peer_timeout)
2020-02-25 10:50:58.355 INFO: (a4893046, 'tcp://0.0.0.0:4444') turning message relay requesting on, nonlive peers: tcp://NodeB:4567
2020-02-25 10:50:59.356 INFO: (a4893046, 'tcp://0.0.0.0:4444') reconnecting to 807f459c (tcp://NodeB:4567), attempt 0
2020-02-25 10:51:00.356 INFO: declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2020-02-25 10:51:00.356 INFO: evs:: proto(a4893046, OPERATIONAL, view_id(REG,807f459c,3)) suspecting node: 807f459c
2020-02-25 10:51:00.356 INFO: evs:: proto(a4893046, OPERATIONAL, view_id(REG,807f459c,3)) suspected node without join message, declaring inactive
2020-02-25 10:51:00.856 INFO: declaring node with index 0 inactive (evs.inactive_timeout)

Comments

  • kvigneshs, Contributor
    Any help on this, please?
  • lorraine.pocklington, Percona Community Manager
    Hi kvigneshs

    Could you please add this information:
    • version of Percona XtraDB Cluster
    • version or other information about the Azure environment
    • copies of the my.cnf for each of the nodes in the cluster
    Meanwhile, I will share this link with the team in case they have any suggestions.
  • lorraine.pocklington, Percona Community Manager
    Please also see this JIRA post. If you have answers to Przemyslaw's questions, that could help a great deal: https://jira.percona.com/browse/PXC-2285
  • kvigneshs, Contributor
    Hi Lorraine,

    Thanks for your reply. Please find the MySQL configuration below; the same configuration is used for Node A and Node B. The Percona XtraDB Cluster version I'm using is 5.7.28.
    ############### mysqld.cnf ###############
    
    # Template my.cnf for PXC
    # Edit to your requirements.
    [client]
    socket=/var/lib/mysql/mysql.sock
    [mysqld]
    server-id=1
    datadir=/var/lib/mysql
    socket=/var/lib/mysql/mysql.sock
    log-error=/var/log/mysql/mysql-error.log
    pid-file=/var/run/mysqld/mysqld.pid
    log-bin
    log_slave_updates
    expire_logs_days=7
    sql_mode=''
    innodb_buffer_pool_size = 4G # (adjust value here, 50%-70% of total RAM)
    innodb_buffer_pool_instances=10
    innodb_log_file_size = 1G
    #innodb_log_file_size = 50331648
    innodb_flush_log_at_trx_commit = 1 # may change to 2 or 0
    innodb_flush_method = O_DIRECT
    innodb_read_io_threads = 16
    innodb_write_io_threads = 16
    innodb_io_capacity = 3000
    innodb_io_capacity_max = 6000
    innodb_temp_data_file_path=ibtmp1:12M:autoextend:max:1G
    # Disabling symbolic-links is recommended to prevent assorted security risks
    symbolic-links=0
    log_bin_trust_function_creators = 1
    query_cache_type = 1
    query_cache_size =125M
    innodb_autoinc_lock_mode = 2
    hot_cache.key_buffer_size=1G
    slow_query_log = 1
    slow_query_log_file=/var/log/mysql/slow-query.log
    long_query_time=10
    log-queries-not-using-indexes
    
    
    
    ############### wsrep.cnf ###############
    
    [mysqld]
    # Path to Galera library
    wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
    # Cluster connection URL contains IPs of nodes
    #If no IP is found, this implies that a new cluster needs to be created,
    #in order to do that you need to bootstrap this node
    wsrep_cluster_address=gcomm://NODE_A,NODE_B,NODE_C
    wsrep_provider_options="gcache.size=3G;gcache.page_size=1G;gcache.recover=yes"
    # In order for Galera to work correctly binlog format should be ROW
    binlog_format=ROW
    # MyISAM storage engine has only experimental support
    default_storage_engine=InnoDB
    # Slave thread to use
    wsrep_slave_threads=16
    wsrep_log_conflicts=ON
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    # Node IP address
    wsrep_node_address=IPADDRESS_A/B
    # Cluster name
    wsrep_cluster_name=decodeglobal-cluster
    #If wsrep_node_name is not specified, then system hostname will be used
    wsrep_node_name=CLUSTERNAME
    #pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
    pxc_strict_mode=DISABLED
    # SST method
    wsrep_sst_method=xtrabackup-v2
    #Authentication for SST method
    wsrep_sst_auth="USER:PASSWORD"
    max_connections=500
    max_connect_errors=500
    sql_mode=''
    
    
    ############### mysqld_safe.cnf ###############
    #
    # The Percona Server 5.7 configuration file.
    #
    # One can use all long options that the program supports.
    # Run program with --help to get a list of available options and with
    # --print-defaults to see which it would actually understand and use.
    #
    # For explanations see
    # http://dev.mysql.com/doc/mysql/en/server-system-variables.html
    
    [mysqld_safe]
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/lib/mysql/mysql.sock
    nice = 0
    

    Azure Environment:

    Node A and Node C are in the same region and the same VNet, but Node B is in a different region and is connected through Azure VNet peering.
    I have tried clustering these machines using both public and private IP addresses.
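
    A possible direction to test, not something confirmed in this thread: the logs above show peers being dropped after gmcast.peer_timeout (PT3S) and evs.suspect_timeout (PT5S), which are Galera's default values. For nodes in different regions, the Galera documentation suggests relaxing the EVS timeouts. The sketch below merges such values into the existing wsrep_provider_options line from wsrep.cnf. The specific durations are illustrative assumptions and should be checked against the measured round-trip time between the two regions.

    ```ini
    [mysqld]
    # Existing gcache settings kept; EVS timeouts relaxed for a cross-region link.
    # The durations are illustrative assumptions, not values tested on this cluster.
    # Keep evs.keepalive_period < evs.suspect_timeout <= evs.inactive_timeout.
    wsrep_provider_options="gcache.size=3G;gcache.page_size=1G;gcache.recover=yes;evs.keepalive_period=PT3S;evs.suspect_timeout=PT30S;evs.inactive_timeout=PT1M;evs.install_timeout=PT1M"
    ```

    The same EVS values would need to be set on every member, including the arbitrator (garbd takes provider options through its --options argument), with each node restarted afterwards. Longer timeouts also mean a genuinely failed node takes longer to be evicted, so this trades failure-detection speed for tolerance of brief network stalls.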
